Comments

simon11d30

So I had some results I didn't feel were complete enough to make a comment on (in the sense that subjectively I kept feeling there was some follow-on thing I should check to verify them or make sense of them), then got sidetracked by various stuff, including planning and now going on a sacred pilgrimage to see the eclipse. Anyway:

all of these results relate to the "main group" (non-fanged, 7-or-more segment turtles):

Everything seems to have some independent relation with weight (except nostril size afaik, but I didn't particularly test nostril size). When you control for other stuff, wrinkles and scars (especially scars) become less important relative to segments. 

The effect of abnormalities seems suspiciously close to 1 lb on average per abnormality (so, subjectively, I think it might be exactly 1). Adding abnormalities has an effect that looks like smoothing (in a biased manner so as to increase the average weight): the weight distribution peak gets spread out, but the outliers don't get proportionately spread out. However, I had trouble finding a smoothing function* that I was satisfied replicated the effect on the weight distribution exactly. This could be due to it not being a smoothing function, me not guessing the correct form, or me guessing the correct form and getting fooled by randomness into thinking it doesn't quite fit.

For green turtles with zero miscellaneous abnormalities, the distribution of scars looked somewhat close to a Poisson distribution. For the same turtles, the distribution of wrinkles on the other hand looked similar but kind of spread out a bit...like the effect of a smoothing function. And they both get spread out more with different colours. Hmm. Same spreading happens to some extent with segments as the colours change.

On the other hand, segment distribution seemed narrower than Poisson, even one with a shifted axis, and the abnormality distribution definitely looks nothing like Poisson (peaks at 0, diminishes far slower than a 0-peak Poisson).
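(For concreteness, the kind of check I mean is roughly the sketch below - the file name and column names are just my guesses at whatever the actual dataset uses.)

```python
# Sketch: compare scar counts of green, zero-abnormality turtles to a
# Poisson fit, and check whether wrinkles look "spread out" relative to it.
# "turtles.csv", "Color", "Scars", "Wrinkles", "Abnormalities" are guesses
# at the real file/column names.
import numpy as np
import pandas as pd
from scipy import stats

df = pd.read_csv("turtles.csv")
sub = df[(df["Color"] == "green") & (df["Abnormalities"] == 0)]

for col in ("Scars", "Wrinkles"):
    counts = sub[col].to_numpy()
    lam = counts.mean()  # Poisson maximum-likelihood rate
    observed = np.bincount(counts)
    expected = stats.poisson.pmf(np.arange(len(observed)), lam) * len(counts)
    # a distribution "spread out" relative to Poisson has variance > mean
    print(f"{col}: mean={lam:.2f}, variance={counts.var():.2f}")
    for k, (o, e) in enumerate(zip(observed, expected)):
        print(f"  {k}: observed {o}, Poisson-expected {e:.1f}")
```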

Anyway, on the basis of not very much clear evidence but on seeming plausibility, some wild speculation:

I speculate there is a hidden variable, age. The effects of wrinkles and greyer colour (among non-fanged turtles) could be proxies for age rather than direct effects (the names of those characteristics are also suggestive). Scars are likely a weaker proxy for age, also with no direct effect. I guess segments likely do have some direct effect, while also being a (weak, like scars) proxy for age. Abnormalities clearly have a direct effect. I have not properly tested interactions between these supposed direct effects (age, segments, abnormalities), but if the abnormality effect doesn't stack additively with the other effects, it would be harder for the 1-lb-per-abnormality size of the abnormality effect to be a non-coincidence.

So, further wild speculation: the age effect on weight could also be a smoothing function (though the high-weight tail looks thicker for greenish-gray - does that suggest it is not a smoothing function?).

Unknown: is there an inherent uncertainty in the weight given the characteristics, or does there merely appear to be because the age proxies are unreliable indicators of age? Is that even distinguishable?

* By "smoothing function" I think I mean another random variable, taking values within a relatively narrow range, that you add to the first one (e.g. a uniform distribution from 0.0 to 2.0, or e.g. a 50% chance of 0.2 and a 50% chance of 1.8).
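To illustrate the footnote with made-up numbers (nothing here is fitted to the actual data), adding such a narrow-range variable flattens the peak while shifting the average up:

```python
# Toy illustration of the "smoothing function" idea: add a narrow-range
# random variable to a base weight distribution and compare summaries.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

base = rng.normal(loc=20.0, scale=1.0, size=n)    # stand-in base weights
smoother = rng.choice([0.2, 1.8], size=n)         # 50/50 two-point smoother
smoothed = base + smoother                        # raises the mean by 1.0

bins = np.arange(15.0, 26.0, 0.5)
for name, x in (("base", base), ("smoothed", smoothed)):
    hist, _ = np.histogram(x, bins=bins)
    print(f"{name:9s} mean={x.mean():5.2f}  std={x.std():.2f}  "
          f"tallest bin holds {hist.max() / n:.1%} of turtles")
```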

Anyway, this all feels figure-outable even though I haven't figured it out yet. Here are some guesses where I throw out most of the above information (apart from the prioritization of characteristics), because I haven't organized it into an estimator, and just guess ad hoc based on similar datapoints, plus Flint and Harold copied from above:

Abigail 21.6, Bertrand 19.3, Chartreuse 27.7, Dontanien 20.5, Espera 17.6, Flint 7.3, Gunther 28.9, Harold 20.4, Irene 26.1, Jacqueline 19.7

simon19d10

Well, as you may see it's also is not helpful

My reasoning explicitly puts instrumental rationality ahead of epistemic. I hold this view precisely to the degree to which I do in fact think it is helpful.

The extra category of a "fair bet" just adds another semantic disagreement between halfers and thirders. 

It's just a criterion by which to assess disagreements, not adding something more complicated to a model.

Regarding your remarks on these particular experiments:

If someone thinks the typical reward structure is some reward structure, then they'll by default guess that a proposed experiment has that reward structure.

This reasonably can be expected to apply to halfers or thirders. 

If you convince me that the halfer reward structure is typical, I go halfer (as previously stated, since I favour the typical reward structure). To the extent that it's not what I would guess by default, that's precisely because I don't intuitively feel that it's typical, and feel more that you are presenting a weird, atypical reward structure!

And thirder utilities are modified during the experiment. They are not just specified by a betting scheme, they go back and forth based on the knowledge state of the participant - behave the way probabilities are supposed to behave. And that's because they are partially probabilities - a result of incorrect factorization of E(X).

Probability is a mathematical concept with very specific properties. In my previous post I talk about it specifically and show that thirder probabilities for Sleeping Beauty are ill-defined.

I've previously shown that some of your previous posts incorrectly model the Thirder perspective, but I haven't carefully reviewed and critiqued all of your posts. Can you specify exactly what model of the Thirder viewpoint you are referencing here? That will not only help me critique it but also help me determine what exactly you mean by the utilities changing in the first place - i.e. do you count Thirders evaluating the total utility of a possibility branch more highly when there are more awakenings in it as a "modification" or not? (I would not consider this a "modification".)

simon19d10

updates:

In the fanged subset:

I didn't find anything that affects the weight of fanged turtles independently of shell segment number. The apparent effect from wrinkles and scars appears to be mediated by shell segment number. Any non-shell-segment-number effects on weight are either subtle or confusingly change directions so as to mostly cancel out in the large-scale statistics.

Using linear regression, if you force intercept = 0, then you get a slope close to 0.5 (i.e. avg weight = 0.5*(number of shell segments), as suggested by qwertyasdef), and that's tempting to go for, for the round number; but if you don't force intercept = 0, then a 0 intercept is well outside the error bars for the intercept (though it's still low: 0.376-0.545 at 95% confidence), and the slope is more like 0.45 than 0.5. There is also a decent amount of variation, which increases in a manner that could plausibly be linear with the number of shell segments (not really that great-looking a fit to a straight line with intercept 0, but plausibly close enough; I didn't do the math). Plausibly this could be modeled by each shell segment having a weight drawn from a distribution (average 0.45) and the total weight being the sum of the weights for each segment. If we assume some distribution in discrete 0.1 lb increments, the per-segment variance looks to be roughly the amount supplied by a d4.
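(For anyone who wants to reproduce the comparison, the two fits I mean are roughly as in the sketch below; `segments` and `weight` stand in for the real columns, and the data there is simulated from my guessed model just so the sketch runs.)

```python
# Sketch of the two linear fits described above for the fanged subset:
# ordinary least squares with a free intercept vs. a fit forced through 0.
# `segments` and `weight` are stand-ins for the real columns; the data
# below is simulated just so this runs end to end.
import numpy as np

def fit_free_intercept(segments, weight):
    slope, intercept = np.polyfit(segments, weight, 1)
    return slope, intercept

def fit_through_origin(segments, weight):
    # least squares for weight ~ slope * segments, no intercept term
    return segments @ weight / (segments @ segments)

rng = np.random.default_rng(1)
segments = rng.integers(3, 12, size=3000)
# simulate my guessed model: 0.5 lb base + 0.1*(1d4+2) per segment
weight = np.array([0.5 + 0.1 * (rng.integers(1, 5, size=s) + 2).sum()
                   for s in segments])

print("free intercept (slope, intercept):", fit_free_intercept(segments, weight))
print("forced through origin (slope):    ", fit_through_origin(segments, weight))
```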

So, I am now modeling fanged turtle weight as 0.5 base weight plus a contribution of 0.1*(1d4+2) for each segment. And no, I am not very confident that's anything to do with the real answer, but it seems plausible at least and seems to fit pretty well.

The sole fanged turtle among the Tyrant's pets, Flint, has a massive 14 shell segments; at that number of segments, the cumulative probability of the weight being at or below a given value passes the 8/9 threshold at 7.3 lbs, so that's my estimate for Flint.
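(For anyone who wants to check that threshold: under my guessed model the distribution can be convolved exactly, along these lines.)

```python
# Check the 8/9 cumulative-probability threshold for a 14-segment fanged
# turtle under the guessed model: weight = 0.5 + sum of 0.1*(1d4+2) per segment.
import numpy as np

n_segments = 14

# per-segment weight in 0.1 lb units: 3, 4, 5 or 6, each with probability 1/4
per_segment = np.zeros(7)
per_segment[3:7] = 0.25

total = np.array([1.0])                 # distribution of the summed units
for _ in range(n_segments):
    total = np.convolve(total, per_segment)

weights = 0.5 + 0.1 * np.arange(len(total))   # add the 0.5 lb base weight
cdf = np.cumsum(total)
estimate = weights[np.searchsorted(cdf, 8 / 9)]
print(f"smallest weight with cumulative probability >= 8/9: {estimate:.1f} lb")
# should reproduce the 7.3 lb figure above
```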

In the non-fanged, more than 6 segment main subset:

Shell segment number doesn't seem to be the dominant contributor here; all the numerical characteristics correlate with weight. Will investigate further.

Abnormalities don't seem to affect or be affected by anything but weight. This is not only useful to know for separating abnormality-related and other effects on weight, but also implies (I think) that nothing is downstream of weight causally, since that would make weight act as a link for correlations with other things. 

This doesn't rule out the possibility of some other variable (e.g. age) that other weight-related characteristics might be downstream of. More investigation to come. I'm now holding off on reading others' comments (beyond what I read at the time of my initial comment) until I have a more complete answer myself.

simon20d30

Thanks abstractapplic! Initial observations:

There are multiple subpopulations, and at least some that are clearly disjoint.

The 3167 fanged turtles are all gray, and only fanged turtles are gray. Fanged turtles always weigh 8.6 lb or less. Within the fanged turtles, shell segment number seems pretty decently correlated with weight. Wrinkles and scars have weaker correlations with weight, but they also correlate with shell segment number, so I'm not sure they have an independent effect; will have to disentangle.

Non-fanged turtles always weigh 13.0 lbs or more. There are no turtles weighing between 8.6lb and 13.0lb.

The 5404 turtles with exactly 6 shell segments all have 0 wrinkles and 0 abnormalities, are green, have no fangs, have normal-sized nostrils, and weigh exactly 20.4 lb. None of that is unique to 6-shell-segment turtles, but that last bit makes guessing Harold's weight pretty easy.

Among the 21460 turtles that don't belong to either of those groups, all of the numerical characteristics correlate with weight; notably, the number of abnormalities doesn't seem to correlate with the other numerical characteristics, so it likely has some independent effect. Grayer colours tend to have higher weight, but also correlate with other things that seem to affect weight, so I will have to disentangle.
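(The disentangling I have in mind starts with something like the sketch below; the file name, column names, and the fang encoding are just my guesses at how the dataset is actually labelled.)

```python
# Sketch: correlations of the numeric characteristics with weight (and
# with each other) in the main group, plus mean weight by colour.
import pandas as pd

df = pd.read_csv("turtles.csv")
main = df[(df["Fangs"] == "No") & (df["Shell Segments"] > 6)]

numeric = ["Shell Segments", "Wrinkles", "Scars", "Abnormalities", "Weight"]
print(main[numeric].corr().round(2))

# grayer colours vs. weight, before controlling for anything else
print(main.groupby("Color")["Weight"].mean().round(1))
```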

edit: both qwertyasdef and Malentropic Gizmo identified these groups before my comment, including the 6-segment weight, and qwertyasdef also remarked on the correlation of shell segment number with weight among fanged turtles.

simon22d10

Throughout your comment you've been saying a phrase "thirders odds", apparently meaning odds 1:2, not specifying whether per awakening or per experiment. This is underspecified and confusing category which we should taboo. 

Yeah, that was sloppy language, though I do like to think more in terms of bets than you do. One of my ways of thinking about these sorts of issues is in terms of "fair bets" - each person thinks a bet with payoffs that align with their assumptions about utility is "fair", and a bet with payoffs that align with different assumptions about utility is "unfair". Edit: to be clear, a "fair" bet for a person is one where the payoffs are such that the betting odds at which they break even match the probabilities that that person would assign.

I do not claim that. I say that in order to justify not betting differently, thirders have to retroactively change the utility of a bet already made:

I critique thirdism not for making different bets - as the first part of the post explains, the bets are the same, but for their utilities not actually behaving like utilities - constantly shifting back and forth during the experiment, including shifts backwards in time, in order to compensate for the fact that their probabilities are not behaving as probabilities - because they are not sound probabilities as explained in the previous post.

Wait, are you claiming that thirder Sleeping Beauty is supposed to always decline the initial per experiment bet - before the coin was tossed at 1:1 odds? This is wrong - both halfers and thirders are neutral towards such bets, though they appeal to different reasoning why.

OK, I was also being sloppy in the parts you are responding to.

Scenario 1: bet about a coin toss, nothing depending on the outcome (so payoff equal per coin toss outcome)

  • 1:1

Scenario 2: bet about a Sleeping Beauty coin toss, payoff equal per awakening

  • 2:1 

Scenario 3: bet about a Sleeping Beauty coin toss, payoff equal per coin toss outcome 

  • 1:1

It doesn't matter if it's agreed to before or after the experiment, as long as the payoffs work out that way. Betting within the experiment is one way for the payoffs to more naturally line up on a per-awakening basis, but it's only relevant (to bet choices) to the extent that it affects the payoffs.
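A quick simulation sketch of that point (this assumes the standard setup where Tails means two awakenings, and the stake-s-to-win-(1-s) convention is just one I picked for illustration): it's the payoff structure, not the timing of the agreement, that determines the break-even odds.

```python
# Simulate Scenario 2 (payoff per awakening) and Scenario 3 (payoff per
# coin toss outcome) for a bet on Tails: stake s to win 1 - s.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
tails = rng.random(n) < 0.5          # one fair coin toss per experiment
awakenings = np.where(tails, 2, 1)   # standard setup: Tails -> two awakenings

def per_awakening_profit(s):
    # Scenario 2: the bet is settled at every awakening
    return np.where(tails, awakenings * (1 - s), -awakenings * s).mean()

def per_experiment_profit(s):
    # Scenario 3: settled once per experiment, however many awakenings
    return np.where(tails, 1 - s, -s).mean()

for s in (1 / 2, 2 / 3):
    print(f"stake {s:.2f}: per-awakening {per_awakening_profit(s):+.3f}, "
          f"per-experiment {per_experiment_profit(s):+.3f}")
# per-awakening payoffs break even at stake 2/3 (2:1 odds on Tails);
# per-experiment payoffs break even at stake 1/2 (1:1 odds).
```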

Now, the conventional Thirder position (as I understand it) consistently applies equal utilities per awakening when considered from a position within the experiment.

I don't actually know what the Thirder position is supposed to be from a standpoint from before the experiment, but I see no contradiction in assigning equal utilities per awakening from the before-experiment perspective as well. 

As I see it, Thirders will only regret a bet (in the sense of considering it a bad choice to enter into ex ante given their current utilities) if you do some kind of bait and switch where you don't make it clear what the payoffs were going to be up front.

 But what I'm pointing at, is that thirdism naturally fails to develop an optimal strategy for per experiment bet in technicolor problem, falsly assuming that it's isomorphic to regular sleeping beauty.

Speculation: have you actually asked Thirders and Halfers to solve the problem, while making the reward structure clear? Note that if you don't make clear what the reward structure is, Thirders are more likely to misunderstand the question asked if, as in this case, the reward structure is "fair" from the Halfer perspective and "unfair" from the Thirder perspective.

Technicolor and Rare Event problems highlight the issue that I explain in Utility Instability under Thirdism - in order to make optimal bets thirders need to constantly keep track of not only probability changes but also utility changes, because their model keeps shifting both of them back and forth and this can be very confusing. Halfers, on the other hand, just need to keep track of probability changes, because their utility are stable. Basically thirdism is strictly more complicated without any benefits and we can discard it on the grounds of Occam's razor, if we haven't already discarded it because of its theoretical unsoundness, explained in the previous post.

A Halfer has to discount their utility based on how many of them there are; a Thirder doesn't. It seems to me, contrary to your perspective, that Thirder utility is more stable.

Halfer model correctly highlights the rule how to determine which cases these are and how to develop the correct strategy for betting. Thirder model just keeps answering 1/3 as a broken clock.

... and in my hasty reading and response I misread the conditions of the experiment (it's a "Halfer" reward structure again). (As I've mentioned before in a comment on another of your posts, I think Sleeping Beauty is unusually ambiguous, so both Halfer and Thirder perspectives are viable. But I lean toward the general perspectives of Thirders on other problems (e.g. SIA seems much more sensible (edit: in most situations) to me than SSA), so Thirderism seems more intuitive to me.)

Thirders can adapt to different reward structures but need to actually notice what the reward structure is! 

What do you still feel that is unresolved?

The things mentioned in this comment chain. Which actually doesn't feel like all that much; it feels like there are maybe one or two differences in philosophical assumptions that are creating this disagreement (though maybe we aren't getting at the key assumptions).

Edited to add: The criterion I mainly use to evaluate probability/utility splits is typical reward structure - you should assign probabilities/utilities such that a typical reward structure seems "fair", so you don't wind up having to adjust for different utilities when the rewards have the typical structure (you do have to adjust if the reward structure is atypical, and thus seems "unfair"). 

This results in me agreeing with SIA in a lot of cases. An example of an exception is Boltzmann brains. A typical reward structure would give no reward for correctly believing that you are a Boltzmann brain. So you should always bet in realistic bets as if you aren't a Boltzmann brain, and for this to be "fair", I set P=0 instead of SIA's U=0. I find people believing silly things about Boltzmann brains, like taking it to be evidence against a theory if that theory proposes that there exist a lot of Boltzmann brains. I think more acceptance of setting P=0 instead of U=0 here would cut that nonsense off. To be clear, normal SIA does handle this case fine (a theory predicting Boltzmann brains is not evidence against it), but setting P=0 would make it more obvious to people's intuitions.

In the case of Sleeping Beauty, this is a highly artificial situation that has been pared down of context to the point that it's ambiguous what would be a typical reward structure, which is why I consider it ambiguous.

simon23d31

The central point of the first half or so of this post  - that for E(X) = P(X)U(X) you could choose different P and U for the same E so bets can be decoupled from probabilities - is a good one.

I would put it this way: choices and consequences are in the territory*; probabilities and utilities are in the map.

Now, it could be that some probability/utility breakdowns are more sensible than others based on practical or aesthetic criteria, and in the next part of this post ("Utility Instability under Thirdism") you make an argument against thirderism based on one such criterion.

However, your claim that Thirder Sleeping Beauty would bet differently before and after the coin toss is not correct. If Sleeping Beauty is asked before the coin toss to bet based on the same reward structure as after the toss, she will bet the same way in each case - i.e. Thirder Sleeping Beauty will bet Thirder odds even before the experiment starts, if the coin toss being bet on is specifically the one in this experiment and the reward structure is such that she will be rewarded equally (as assessed by her utility function) for correctness in each awakening.

Now, maybe you find this dependence on what the coin will be used for counterintuitive, but that depends on your own particular taste.

Then, the "technicolor sleeping beauty" part seems to assume a reward structure where it only matters whether you bet or not in a particular universe, not how many times you bet. This is a very "Halfer" assumption on reward structure, even though you are accepting Thirder odds in this case! Also, Thirders can adapt to such a reward structure as well, and follow the same strategy.

Finally, on Rare Event Sleeping beauty, it seems to me that you are biting the bullet here to some extent to argue that this is not a reason to favour thirderism.

I think, we are fully justified to discard thirdism all together and simply move on, as we have resolved all the actual disagreements.

uh....no. But I do look forward to your next post anyway.

*edit: to be more correct, they're less far up the map stack than probabilities and utilities. Making this clarification just in case someone might think from that statement that I believe in free will (I don't).

simon2mo20

I think there's a (kind of) loophole here, where we use an "abstract hypothetical" model of a hypothetical future, and optimize for consequences our actions for that hypothetical. Is this what you mean by "understood in abstract terms"? 

More or less, yes (in the case of engineering problems specifically, which I think is more real-world-oriented than most science AI).

The part I don't understand is why you're saying that this is "simpler"? It seems equally complex in kolmogorov complexity and computational complexity.

What I'm saying is "simpler" is this: given a problem that doesn't need to depend on the actual effects of the outputs on the future of the real world (operating in a simulation is an example, though one that could become riskily close to the real world depending on the information taken into account by the simulation - it might not be a good idea to include highly detailed political risks of other humans thwarting construction in a fusion reactor construction simulation, for example), it is simpler for the AI to solve that problem without taking into consideration the effects of the output on the future of the real world than it is to take those effects into account anyway.

simon2mo81

Doomimir: But you claim to understand that LLMs that emit plausibly human-written text aren't human. Thus, the AI is not the character it's playing. Similarly, being able to predict the conversation in a bar, doesn't make you drunk. What's there not to get, even for you?

 

So what?

You seem to have an intuition that if you don't understand all the mechanisms for how something works, then it is likely to have some hidden goal and be doing its observed behaviour for instrumental reasons. E.g. the "Alien Actress".

And that makes sense from an evolutionary perspective, where you encounter some strange intelligent creature doing some mysterious actions on the savannah. I do not think it makes sense if you specifically trained the system to have that particular behaviour by gradient descent.

I think, if you trained something by gradient descent to have some particular behaviour, the most likely thing that resulted from that training is a system tightly tuned to have that particular behaviour, with the simplest arrangement that leads to the trained behaviour.

And if the behaviour you are training something to do is something that doesn't necessarily involve actually trying to pursue some long-range goal, it would be very strange, in my view, for it to turn out that the simplest arrangement to provide that behaviour calculates the effects of the output on the long-range future in order to determine what output to select.

Moreover even if you tried to train it to want to have some effect on the future, I expect you would find it more difficult than expected, since it would learn various heuristics and shortcuts long before actually learning the very complicated algorithm of generating a world model, projecting it forward given the system's outputs, and selecting the output that steers the future to the particular goal. (To others: This is not an invitation to try that. Please don't).

That doesn't mean that an AI trained by gradient descent on a task that usually doesn't involve trying to pursue a long range goal can never be dangerous, or that it can never have goals.

But it does mean that the danger and the goals of such a usually-non-long-range-task-trained AI, if it has them, are downstream of its behaviour.

For example, an extremely advanced text predictor might predict the text output of a dangerous agent through an advanced simulation that is itself a dangerous agent.

And if someone actually manages to train a system by gradient descent to do real-world long range tasks (which probably is a lot easier than making a text predictor that advanced), well then...

BTW all the above is specific to gradient descent. I do expect self-modifying agents, for example, to be much more likely to be dangerous, because actual goals lead to wanting to enhance one's ability and inclination to pursue those goals, whereas non-goal-oriented behaviour will not be self-preserving in general.

simon2mo10

And in Sleeping Beauty case, as I'm going to show in my next post, indeed there are troubles justifying thirders sampling assumption with other conditions of the setting

I look forward to seeing your argument.

I'm giving you a strong upvote for this. It's rare to find a person who notices that Sleeping Beauty is quite different from other "antropic problems" such as incubator problems.

Thanks! But I can't help but wonder if one of your examples of someone who doesn't notice is my past self making the following comment (in a thread for one of your previous posts) which I still endorse:

https://www.lesswrong.com/posts/HQFpRWGbJxjHvTjnw/anthropical-motte-and-bailey-in-two-versions-of-sleeping?commentId=dkosP3hk3QAHr2D3b

I certainly agree that one can have philosophical assumptions such that you sample differently for Sleeping Beauty and Incubator problems, and indeed I would not consider the halfer position particularly tenable in Incubator, whereas I do consider it tenable in Sleeping Beauty.

But ... I did argue in that comment that it is still possible to take a consistent thirder position on both. (In the comment I take the thirder position for sleeping beauty for granted, and argue for it still being possible to apply to Incubator (rather than the other way around, despite being more pro-thirder for Incubator), specifically to rebut an argument in that earlier post of yours that the classic thirder position for Sleeping Beauty didn't apply to Incubator).

Some clarification of my actual view here (rather than my defense of conventional thirderism):

In my view, sampling is not something that occurs in reality when the "sampling" in question includes sampling between multiple entities that all exist. Each of the entities that actually exists actually exists, and any "sampling" between multiple such entities occurs (only) in the mind of the observer. (However, it can still mix with conventional sampling in the mind of the observer.) Which sampling assumption you use in such cases is in principle arbitrary, but in practice should probably be based on how much you care about the correctness of the beliefs of each of the possible entities you are uncertain about being.

Halferism or thirderism for Sleeping Beauty are both viable, in my view, because one could argue for caring equally about being correct at each awakening (resulting in thirderism) or one could argue for caring equally about being correct collectively in the awakenings for each of the coin results (resulting in halferism). There isn't any particular "skin in the game" to really force a person to make a commitment here.

simon2mo1-1

You seem to be assuming that the ability of the system to find out whether the security assumptions are false affects whether the falsity of the assumptions has a bad effect. Which is clearly the case for some assumptions - "This AI box I am using is inescapable" - but it doesn't seem immediately obvious to me that this is generally the case.

Generally speaking, a system can have bad effects if made under bad assumptions (think a nuclear reactor or aircraft control system) even if it doesn't understand what it's doing. Perhaps that's less likely for AI, of course.

And on the other hand, an intelligent system could be aware that an assumption would break down in circumstances that haven't arrived yet, and not do anything about it (or even tell humans about it).
