VipulNaik - LessWrong

The first 10 minutes or so of https://podcasts.apple.com/us/podcast/episode-51-what-happened-at-the-arena-group-former/id1615989259?i=1000639450752 also provide related context.

Send us example gnarly bugs

VipulNaik1y110

One particularly amusing bug I was involved with was in an early version of the content recommendation engine at the company I worked at (this is used by websites to recommend related content on the website, such as related videos, articles, etc.). One of the customers for the recommendation engine was a music video service, and we/they noticed that One Direction's song called Infinity was showing up at the top of our recommendations a little too often. (I think this was triggered by the release of another One Direction song bringing the Infinity song into circulation, but I don't remember what that other song was.)

It turned out this was due to a bug where we were taking a dot product of feature values with feature weights, where the feature value was being cast from a string to a numeric, with a fallback to zero if it was non-numeric, and then multiplied by the feature weight. For the "song title" feature, the feature weight was zero, and the feature value was anyway non-numeric, but even if it were numeric, it shouldn't matter, because anything times zero is ... zero, right? But the programming language in question treated "Infinity" as a numeric value, and it defined Infinity * 0 to be NaN (not a number) [ETA: A colleague who was part of the discovery process highlights that this behavior is in fact part of the IEEE 754 standard, so it would hold even for other programming languages that were compliant with the standard]. And NaN + anything would still be NaN, so the dot product would be NaN. And the way the sorting worked, NaN would always rank on top, so whenever the song got considered for recommendation it would rank on top.
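To make the failure mode concrete, here is a minimal Python sketch of the same chain of events. It is not the original code (the comment above doesn't name the language), and the helper names, feature names, weights, and play counts are all made up for illustration. The sort key that places NaN above every number is also an assumption, standing in for whatever the original sorting did; Java's `Double.compare`, for example, orders NaN above all other values.

```python
import math

def to_number(value: str) -> float:
    """Cast a feature value to a float, falling back to 0.0 if non-numeric."""
    try:
        return float(value)  # float("Infinity") parses successfully as inf
    except ValueError:
        return 0.0

def dot_product(features: dict, weights: dict) -> float:
    """Sum of feature value times feature weight over all weighted features."""
    return sum(to_number(features[name]) * weights[name] for name in weights)

def nan_high_key(score: float) -> tuple:
    """Hypothetical sort key that orders NaN above every ordinary number."""
    return (1, 0.0) if math.isnan(score) else (0, score)

# Made-up weights: the title is supposed to be ignored (weight zero).
WEIGHTS = {"title": 0.0, "play_count": 1.0}

# Made-up catalog: only the title "Infinity" parses as a number.
SONGS = [
    {"title": "Some Popular Song", "play_count": "900"},
    {"title": "Infinity", "play_count": "5"},
    {"title": "Another Song", "play_count": "700"},
]

scores = {song["title"]: dot_product(song, WEIGHTS) for song in SONGS}
ranking = sorted(scores, key=lambda title: nan_high_key(scores[title]), reverse=True)

print(scores)   # the "Infinity" entry is nan: inf * 0.0 = nan, and nan + 5.0 = nan
print(ranking)  # ['Infinity', 'Some Popular Song', 'Another Song']
```

In this sketch, one plausible guard is to skip zero-weight features entirely instead of multiplying by zero, or to reject non-finite values with `math.isfinite` at parse time.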

AI #41: Bring in the Other Gemini

VipulNaik1y30

He points to recent events at Sports Illustrated. But to me the SI incident was the opposite. It indicated that we cannot do this yet. The AI content is not good. Not yet. Nor are we especially close. Instead people are using AI to produce garbage that fools us into clicking on it. How close are we to the AI content actually being as good as the human content? Good question.

I happen to work at the company (The Arena Group) that operates Sports Illustrated. I wasn't involved with any of this stuff directly (and may not even have been at the company over some of the relevant time period), and obviously I speak only for myself (and will restrict myself to the public information available on this, which is anyway most of what I know). With that said: most of the media coverage of this issue was pretty bad (as it is for most issues, but then, Gell-Mann Amnesia ...). In particular, the Futurism article has a lot of issues. The best article I found on this was https://www.theverge.com/2023/11/27/23978389/sports-illustrated-ai-fake-authors-advon-commerce-gannett-usa-today that properly explains who the vendor in question was and the vendor's track record on another site. Overall, the Verge article helps corroborate the Arena Group's official statement that the articles were from AdVon, despite the seeming reluctance of Futurism (and other publications) to accept that.

The maybe-AI-maybe-not-AI content was written way way before the whole ChatGPT thing raised the profile of AI. Futurism links to https://web.archive.org/web/20221004090814/https://www.si.com/review/full-size-volleyball/ which is a snapshot of the article from October 2022, before the wide release of ChatGPT, and well before The Arena Group made public announcements about experimenting with AI. But in fact the article is even way older than October 2022; the oldest snapshot https://web.archive.org/web/20210916150227/https://www.si.com/review/full-size-volleyball/ of the article is from September 16, 2021, and the article publication date is September 2, 2021. Way before ChatGPT or any of the AI hype. And even that earliest version shows the same photo and a similar bio.

So to my mind this comes down to the use of a vendor (AdVon) with shady practices, and either the lack of due diligence in vendor selection or not caring about the details of their practices as long as they're driving revenue to the site and not hurting the site's brand. The reason to use a vendor like this is simply that they drive affiliate marketing revenue (people find the recommended content interesting, they click on it, maybe buy it, everybody gets a cut of the revenue, everybody is happy). This simply isn't even part of the editorial content and basically has nothing to do with replacing real writers of sports content with AI writers -- it's simply an effort to leverage the brand of the site by running a side business from one corner of it. Also, to the extent it is or isn't ethical, the issue probably has more to do with whether the reviews were genuine rather than whether the authors were human or AI -- even if the authors were human, if the reviews were fraudulent, it would be a problem in equal measure. So overall I think the vendor selection was problematic, but this has little to do with AI.

Separately, many sports sites, including Sports Illustrated, have used automatically (/ "AI")-generated content for routine content such as game summaries, e.g., using the services of Data Skrive: https://www.si.com/author/data-skrive -- this is probably a little closer to the idea of replacing human writers, but the kind of content being created is pretty much the kind of content that humans wouldn't want to spend time creating.

The Arena Group has done some AI experimentation with the goal of trying to use AI-like tools to write normal content (not things like game summaries), as Futurism critiqued at https://futurism.com/neoscope/magazine-mens-journal-errors-ai-health-article but this AdVon thing is completely separate in time, in space, and in purpose.

ChatGPT 4 solved all the gotcha problems I posed that tripped ChatGPT 3.5

VipulNaik1y40

Thanks for reviewing and catching these subtle issues!

Technically, it is not true that the prime numbers being multiplied need to be distinct. For example, 2*2=4 is the product of two prime numbers, but it is not the product of two distinct prime numbers.

Good point, I've marked this as an error. My prompt about gcd did specify distinctness but the prompt about product did not, so this is indeed an error.

This seems wrong: "neither can be definitively identified" makes it sound like they exist but just can't be identified...

I passed on this one as being too minor to mark.

Safe primes are a subset of Sophie Germain primes

Not true, e.g. 7 is safe but not Sophie Germain.

Good point; I missed reading this sentence originally. I've marked this one as well.
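The counterexample is easy to check by brute force. Below is a small sketch using the standard definitions (p is a Sophie Germain prime if 2p + 1 is also prime; p is a safe prime if (p - 1)/2 is also prime). The function names are my own, and trial division is only meant for small inputs.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_sophie_germain(p: int) -> bool:
    """p is a Sophie Germain prime if both p and 2p + 1 are prime."""
    return is_prime(p) and is_prime(2 * p + 1)

def is_safe(p: int) -> bool:
    """p is a safe prime if p is prime and (p - 1) / 2 is also prime."""
    return is_prime(p) and (p - 1) % 2 == 0 and is_prime((p - 1) // 2)

# Safe primes below 50 that are not Sophie Germain primes.
safe_not_sophie_germain = [
    p for p in range(2, 50) if is_safe(p) and not is_sophie_germain(p)
]
print(safe_not_sophie_germain)  # [7, 47]
```

So 7 is safe (since 3 is prime) but not Sophie Germain (since 15 is not prime), which is enough to show that safe primes are not a subset of Sophie Germain primes.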

ChatGPT 4 solved all the gotcha problems I posed that tripped ChatGPT 3.5

VipulNaik1y20

Thanks, fixed now! Sorry I missed that.

ChatGPT 4 solved all the gotcha problems I posed that tripped ChatGPT 3.5

VipulNaik1y40

Good question!

First, my original post didn't provide the correct answers for most questions, only what was wrong with ChatGPT. Going from knowing what was wrong to actually giving correct answers seems like quite a leap. Further, ChatGPT changed its answers to better ones (including more rigorous explanations) even in cases where its original answers were correct.

Second, ChatGPT's self-reported training data cutoff date at the time it was asked these questions was September 2021 or January 2022. To my knowledge, Issa didn't ask it this question, but sources like https://www.reddit.com/r/ChatGPT/comments/16m6yc7/gpt4_training_cutoff_date_is_now_january_2022/ suggest that it was September 2021 at the time of his sessions, then became January 2022. So, the blog post, published in December 2022, should not have been part of its training data.

With that said, the sessions themselves (not the blog post about them) might have been part of the feedback loop, but in most cases I did not tell ChatGPT what was wrong with its answers within the session itself.

ChatGPT 4 solved all the gotcha problems I posed that tripped ChatGPT 3.5

VipulNaik1y20

Thanks, and sorry I missed that error. I've updated the post by bolding the error, and also HT'ed your contribution.

riceissa's Shortform

VipulNaik1y60

Regarding the topic of batteries getting better while also being harder to remove/replace, I found these interesting comments by FeRD (https://www.ifixit.com/Wiki/What_to_do_with_a_swollen_battery?permalink=comment-911590#comment-911590 + the next two comments) that I quote in full below:

(1/2) I doubt it's entirely fair to blame Apple for this; they may not have even been the first manufacturer to use a non-removable battery pack. They were certainly the first biggest company to make that switch, but plenty of others did the same, and in some cases way too quickly for it to have been a case of them "copying" Apple. They simply came to the same conclusion: Removable batteries, like physical keyboards, hurt sales more than they helped. So, out they went.

Because, the frustration of it is that device manufacturers genuinely have really "good" reasons why batteries are no longer removable. Actually, at least two good reasons: Water-resistance and miniaturization.

Today's phones are surprisingly watertight, to the point where many can survive a brief dunk with no immediate ill effects. (Though I personally suspect that water infiltration is a trigger for subsequent battery swelling.) My Galaxy Note8 once survived a complete submersion lasting 2-3 seconds. Don't try that with your Nokia 3310!

FeRD - Sep 21, 2023

(2/2) All battery-powered devices are also continually getting smaller and smaller. (Or, failing that, they're packing more and more stuff into roughly the same amount of space.) And the fact is, removable batteries require MUCH more space than non-removable.

If a battery is removable, it has to have an outer, protective case of its own due to the dangerous chemicals inside. The phone would then also have to have mechanisms to align and secure the battery, a latch and release mechanism, and electrical contacts between what are now (effectively) two completely separate devices. That all takes up space.

A removable battery makes a device significantly larger (in particular, thicker), or else it has half the capacity of the non-removable design. Either way, 999 out of 1000 consumers will choose the smaller, thinner, lighter fixed-battery device with twice the runtime between charges, over a bigger, thicker, doesn't-last-as-long alternative with a removable battery.

FeRD - Sep 21, 2023

Actually, now that I think about it there's a third reason that's even more damning:

Removable batteries were never about extending device lifetime.

Manufacturers will tell you, and they can provide reams of consumer data to back it up: The percentage of consumers who keep a device long enough to wear out the first battery is TINY. Laughably tiny. The overwhelming majority of mobile-device owners want to replace their device with a newer, faster one every 2 years or less — long before the battery is even starting to degrade. (After all, until very recently the technology was advancing so quickly, a 2-year-old phone was nigh-unusable, given its limitations compared to newer models.)

Removable batteries were always intended for power-users who needed more runtime than they could get from a single battery. They'd own two+, and swap them out as needed (charging externally). In the end, rapid charging, larger capacities, and improved power-management software provided a better solution to that problem.

FeRD - Sep 21, 2023

riceissa's Shortform

VipulNaik1y70

A few thoughts:

In at least one area, namely cars, durability relative to actual usage has improved a lot over the past 50 years or so. See for instance https://en.wikipedia.org/wiki/Car_longevity "According to the New York Times, in the 1960s and 1970s, the typical car reached its end of life around 100,000 miles (160,000 km), but due to manufacturing improvements in the 2000s, such as tighter tolerances and better anti-corrosion coatings, the typical car lasts closer to 200,000 miles (320,000 km)."
The area where I'm most aware of claims of reduced durability is home appliances, e.g., https://ryanfinlay.medium.com/they-used-to-last-50-years-c3383ff28a8e but I think a couple of factors make the comparison a little tricky: (a) low cost: there's a much wider selection of home appliances now, and the low-end ones, though quite cheap, still last several years. That is obviously less than the high end and the great older devices, but it is probably good enough, and good value for the cost. Low-end refrigerators, for instance, cost only a bit more than phones, which is remarkable considering the size difference! (b) energy use as a major component of cost over the long term: for appliances like refrigerators, electricity becomes a major cost component if the appliance lasts a long time, so a long-lasting refrigerator that doesn't benefit from energy efficiency improvements may ultimately cost more.
In the realm of electronics, quality improvements in hardware even over the last 5-10 years have been very impressive; for instance, batteries have gotten better, and design/form factors have gotten better. Repairability in particular has gotten worse, though this seems tied to the trend toward miniaturization and portability. If people anyway plan to replace their devices every few years, then portability probably wins over repairability.

Why have exposure notification apps been (mostly) discontinued?

VipulNaik1y20

Hmm, aren't exposure notifications an opt-in program? I was never forced to get them -- I chose to download and install the app and keep it on. The same way I choose to allow Google Maps to keep a record of my physical location.
