Optimization Process's Shortform

Optimization Process

14th Mar 2020

This is a special post for quick takes by Optimization Process. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.


Optimization Process (5y ago, 8 points)

Some wagers have the problem that their outcome correlates with the value of what's promised. For example, "I bet $90 against your $10 that the dollar will not undergo >1000% inflation in the next ten years": the apparent odds of 9:1 (an implied 10% chance of hyperinflation) don't match the probability of hyperinflation at which you'd be indifferent to this bet, because the $90 you'd win would be paid in dollars that have lost most of their value.
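To make the gap concrete, here is a minimal sketch of the break-even calculation for the person taking the hyperinflation side of the bet. It assumes a risk-neutral bettor, ignores ordinary inflation and time discounting, and treats ">1000% inflation" as "a dollar keeps at most 1/11 of its purchasing power"; the function and parameter names are illustrative, not from the post.

```python
def indifference_probability(win_nominal: float,
                             loss: float,
                             real_value_if_event: float = 1.0) -> float:
    """Event probability at which the event-backer's expected real gain is zero.

    win_nominal: nominal dollars won if the event (hyperinflation) happens.
    loss: dollars lost if the event does not happen.
    real_value_if_event: purchasing power of one dollar in the event branch
        (an assumption of this sketch; 1.0 means money keeps its value).
    """
    real_win = win_nominal * real_value_if_event
    # Solve p * real_win = (1 - p) * loss for p.
    return loss / (loss + real_win)


# Taking the stakes at face value: 9:1 odds imply a 10% break-even probability.
naive = indifference_probability(win_nominal=90, loss=10)

# If the winnings arrive in dollars worth 1/11 as much, the break-even
# probability rises to 55%.
adjusted = indifference_probability(win_nominal=90, loss=10,
                                    real_value_if_event=1 / 11)

print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}")
```

Under these assumptions the bettor would need to believe hyperinflation is more likely than not before the "9:1" bet looks attractive, so the stated odds say little about anyone's actual credence.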

For some (all?) of these problematic bets, you can mitigate the problem by making the money change hands in only one arm of the bet, reframing it as e.g. "For $90, I will sell you an IOU that pays out $100 in ten years if the dollar hasn't seen >1000% inflation." (Okay, you'll still need to tweak the numbers for time-discounting purposes, but it seems simpler now that we're conditioning on lack-of-hyperinflation.)
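A rough sketch of why the IOU framing is easier to read, under the same risk-neutral assumption: the payout only occurs in the no-hyperinflation branch, so its value does not depend on what money is worth after hyperinflation. The `discount_factor` parameter is a hypothetical stand-in for the time-discounting tweak mentioned above.

```python
def iou_breakeven_probability(price: float,
                              payout: float,
                              discount_factor: float = 1.0) -> float:
    """Event probability at which buying the IOU has zero expected gain.

    price: dollars paid today for the IOU.
    payout: dollars received later, only if the event does NOT happen.
    discount_factor: present value of one future dollar in the no-event
        branch (an assumption of this sketch; 1.0 means no discounting).

    A negative result means the IOU is overpriced even if the event is
    certain not to happen.
    """
    present_value_if_paid = payout * discount_factor
    if present_value_if_paid <= 0:
        raise ValueError("payout and discount_factor must be positive")
    # Solve price = (1 - p) * present_value_if_paid for p.
    return 1.0 - price / present_value_if_paid


# With no discounting, the $90-for-$100 IOU breaks even at a 10% chance
# of hyperinflation, matching the face-value odds.
undiscounted = iou_breakeven_probability(price=90, payout=100)

# With a hypothetical discount factor of 0.95, the break-even probability
# drops to roughly 5.3%.
discounted = iou_breakeven_probability(price=90, payout=100,
                                       discount_factor=0.95)

print(f"undiscounted: {undiscounted:.3f}, discounted: {discounted:.3f}")
```

Here the only adjustment left is the discount factor, which is the sense in which conditioning on lack-of-hyperinflation simplifies the numbers.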

Does this seem correct in the weak case? ("some")

Does this seem correct in the strong case? ("all")

Dagon (5y ago, 2 points)

Clearly not all - the extreme version of this is betting on human extinction. It's hard to imagine a payout that has any value after that comes to pass. In some cases you can find conditional wagers that work; in others you can find a better resource or measurement to wager in (one gram of gold, or one day's average wage as reported by X government). In many, though, there just is no wager possible, as the parties' utility diverges too much from the resources the wager can account for.

Optimization Process (5y ago, 1 point)

"Clearly not all - the extreme version of this is betting on human extinction. It's hard to imagine the payout that has any value after that comes to pass."

Agreed that post-extinction payouts are essentially worthless -- but doesn't the contract "For $90, I will sell you an IOU that pays out $100 in one year if humans aren't extinct" avoid that problem?

Dagon (5y ago, 2 points)

Small amounts and near-even-money ($90 for $100) are bad intuition pumps - this is in the range where other considerations dominate the outcome estimates. In fact, you probably can't find many people to accept only 11% for a one-year unsecured loan.
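For reference, the 11% figure is the simple return implied by paying $90 now for $100 in a year. A one-line sketch of that arithmetic (the function name is illustrative):

```python
def implied_simple_return(price: float, payout: float) -> float:
    """Simple return earned by paying `price` now and receiving `payout` later."""
    return payout / price - 1.0


# $90 now for $100 in one year works out to about 11.1%.
rate = implied_simple_return(price=90, payout=100)
print(f"{rate:.1%}")
```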

Donald Hobson (5y ago, 1 point)

This is exactly equivalent to a bond that pays out in one year "unconditionally", i.e. this is a loan with interest. (There are a few contrived scenarios where humans are extinct and money isn't worthless, depending on the definitions of those words. Would this bond pay out in a society of uploaded minds?)

Optimization Process (4y ago, 2 points)

Consider AI-generated art (e.g. TWDNE, GPT-3 does Seinfeld, reverse captioning, Jukebox, AI Dungeon). Currently, it's at the "heh, that's kinda neat" stage; a median person might spend 5-30 minutes enjoying it before the novelty wears off.

(I'm about to speculate a lot, so I'll tag it with my domain knowledge level: I've dabbled in ML, I can build toy models and follow papers pretty well, but I've never done anything serious.)

Now, suppose that, in some limited domain, AI art gets good enough that normal people will happily consume large amounts of its output. It seems like this might cause a phase change where human-labeled training data becomes cheap and plentiful (including human labels for the model's output, a more valuable reward signal than e.g. a GAN's discriminator); this makes better training feasible, which makes the output better, which makes more people consume and rate the output, in a virtuous cycle that probably ends with a significant chunk of that domain getting automated.

I expect that this, like all my most interesting ideas, is fundamentally flawed and will never work! I'd love to hear a Real ML Person's take on why, if there's an obvious reason.

Optimization Process (4y ago, edited, 3 points)

Trying to spin this into a plausible story: OpenAI trains Jukebox-2, and finds that, though it struggles with lyrics, it can produce instrumental pieces in certain genres that people enjoy about as much as human-produced music, for about $100 a track. Pandora notices that it would only need to play each track about 75k times ($100 / $0.00133 per play ≈ 75k) to break even with the royalties it wouldn't have to pay. Pandora leases the model from OpenAI, throws $100k at this experiment to produce 1k tracks in popular genres, plays each track 100k times, gets ~1M thumbs-up/thumbs-down ratings (plus ~100M "no rating" reactions, for whatever those are worth), and fine-tunes the model using that reward signal to produce a new crop of tracks people will like slightly more.
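The numbers in this story can be checked with a few lines of arithmetic. The sketch below uses only the hypothetical figures from the paragraph above (none of them are real Pandora or OpenAI data); the variable names are illustrative.

```python
# All inputs are the story's hypothetical figures, not real-world data.
COST_PER_TRACK = 100.0        # dollars to generate one track
ROYALTY_PER_PLAY = 0.00133    # dollars of royalties avoided per play
NUM_TRACKS = 1_000
PLAYS_PER_TRACK = 100_000
EXPLICIT_RATINGS = 1_000_000  # thumbs-up/down ratings collected

# Plays needed before a generated track pays for itself (~75k).
breakeven_plays = COST_PER_TRACK / ROYALTY_PER_PLAY

# Total spend on generation ($100k) and total plays (100M).
budget = COST_PER_TRACK * NUM_TRACKS
total_plays = NUM_TRACKS * PLAYS_PER_TRACK

# Fraction of plays that yield an explicit rating (1%).
rating_rate = EXPLICIT_RATINGS / total_plays

# Royalties avoided across all plays, and the net after generation costs.
royalties_avoided = total_plays * ROYALTY_PER_PLAY
net_savings = royalties_avoided - budget

print(f"break-even plays per track: {breakeven_plays:,.0f}")
print(f"budget: ${budget:,.0f}, total plays: {total_plays:,}")
print(f"rating rate: {rating_rate:.1%}")
print(f"net savings: ${net_savings:,.0f}")
```

Since 100k plays per track exceeds the roughly 75k break-even point, the experiment comes out ahead on royalties under these assumptions, which is the sense in which the ratings are "free (on net)".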

Hmm. I'm not sure if this would work: sure, from one point of view, Pandora gets ~1M data points for free (on net), but from another reasonable point of view, each data point (a track) costs $100 -- definitely not cheaper than getting 100 ratings off Mechanical Turk, which is probably about as good a signal. This cycle might only work for less-expensive-to-synthesize art forms.
