Bad restaurants are more likely to have open tables than good restaurants.
That seems dependent on it being difficult to scale the specific skill that went into putting together the experience at the good restaurant. Things that are more scalable, like small consumer products, can be selected to be especially good trades (the bad ones don't get popular and inexpensive).
Bruh. Banana Laffy Taffy is the best. Happy to trade away non-banana to receive banana, 1:1.
The point of the essay is to describe the context that would make one want a hyperphone, so that
1. one can be motivated by the possibility of a hyperphone, and
2. one could get hold of the criteria that would direct developing a good hyperphone.
The phrase "the ability to branch in conversations" doesn't do either of those.
Quoting another comment I made:
Make a hyperphone. A majority of my alignment research conversations would be enhanced by having a hyperphone, to a degree somewhere between a lot and extremely; and this is heavily weighted on the most hopeworthy conversations. (Also sometimes when I explain what a hyperphone is well enough for the other person to get it, and then we have a complex conversation, they agree that it would be good. But very small N, like 3 to 5.)
IDK if this is a crux for me thinking this is very relevant to stuff from my perspective, but:
The training procedure you propose doesn't seem to actually incentivize indifference. First, a toy model where I agree it does incentivize that:
So this agent is indeed incentivized to pick k uniformly at random from 1–N. Now consider:
Does this incentivize random choice at time N? No. It incentivizes the agent to randomly choose End or Continue at the very beginning of the episode, and then carefully plan and execute behavior that achieves the most reward assuming a run of length N or N+10 respectively.
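A hypothetical numeric sketch of this point (the payoffs here are made up for illustration, since the quoted training setup is omitted above): a plan specialized to one episode length outperforms a plan that has to stay robust to both, so an agent that flips its End/Continue coin at t=0 and commits can out-earn one that genuinely stays undecided until step N, while looking just as "indifferent" statistically.

```python
import random

# Made-up payoffs: a plan specialized for one episode length scores high
# only if the episode actually has that length; a hedged plan scores mid
# either way.
SPECIALIZED = {
    "plan_N":   {"N": 10, "N+10": 0},
    "plan_N10": {"N": 0,  "N+10": 10},
}
ROBUST = {"N": 6, "N+10": 6}  # one plan that works okay for either length

def randomize_at_time_N():
    # Genuinely undecided until step N, so it can only run the robust plan.
    length = random.choice(["N", "N+10"])
    return ROBUST[length]

def commit_at_start():
    # Flips the coin at t=0, then executes the plan specialized to the
    # resulting length.
    length = random.choice(["N", "N+10"])
    plan = "plan_N" if length == "N" else "plan_N10"
    return SPECIALIZED[plan][length]

trials = 100_000
print(sum(randomize_at_time_N() for _ in range(trials)) / trials)  # -> 6.0
print(sum(commit_at_start() for _ in range(trials)) / trials)      # -> 10.0
```

Both policies End with probability 1/2, so both pass a statistical "no length preference" check; but commit-at-start earns strictly more reward, so training pressure selects for it.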
Wait, but isn't this success? Didn't we make the agent have no trajectory length preference?
No. Suppose:
Do we kill the guy? Yes, we certainly do; he will mess up our careful plans.