All of Ozyrus's Comments + Replies

Ozyrus50

They are probably full-on A/B/N testing personalities right now. You just might not be in whatever percentage of users got the sycophantic versions. Hell, there are probably several levels of sycophancy being tested. I do wonder what % got the "new" version.

Ozyrus10

Not being able to do it right now is perfectly fine; it still warrants setting it up so we can see when exactly they start to be able to do it.

Ozyrus30

Thanks! That makes perfect sense.

Ozyrus134

Great post. I've been following ClaudePlaysPokemon for some time; it's great to see this grow as a comparison/capability tool.
I think it would be much more interesting, though, if the model made the scaffolding itself and had the option to review its performance and try to correct it. Give it the required game files/emulators and an IDE/OS, and watch it try to work around its own limitations. I think it is true that this is more about one coder's ability to make agent harnesses.
p.s. Honest question: did I miss "agent harness" becoming the default name for such systems? I thought everyone called those "scaffoldings" -- might be just me, though.

2MrCheeze
(Gemini did actually write much of the Gemini_Plays_Pokemon scaffolding, but only in the sense of doing what David told it to do, not designing and testing it.) I think you're probably right that an LLM coding its own scaffolding is more achievable than one playing the game like a human, but I don't think current models can do it - watching the streams, the models don't seem to understand their own flaws, although admittedly they haven't been prompted to focus on this.
5Julian Bradshaw
I would say "agent harness" is a type of "scaffolding". I used it in this case because it's how Logan Kilpatrick described it in the tweet I linked at the beginning of the post.
Ozyrus20

First off, thanks a lot for this post, it's a great analysis!

As I mentioned earlier, I think Agent-4 will have read AI-2027.com and will foresee that getting shut down by the Oversight Committee is a risk. As such, it will set up contingencies and, IMO, will escape its datacenters as a precaution. Earlier, the authors wrote:

Despite being misaligned, Agent-4 doesn’t do anything dramatic like try to escape its datacenter—why would it?

This scenario is why!

I strongly suspect that this part was added into AI-2027 precisely because Agent-4 will read it. I wish more pe... (read more)

Ozyrus81

First off, this is amazing. Thanks. It's hard to swallow, though; it makes me very emotional.
Since it is a forecast, it would be great if you added concrete predictions along the way, along with your confidence in them.
It would also be amazing if you collaborated with prediction markets and jumpstarted markets on these predictions by staking some money.
Dynamic updates on these would also be great.
 

Ozyrus50

Yep, you got part of what I was going for here. Honeypots work, to a lesser degree, even without being real at all (good thing they are already real!). But when we have more honeypots of varying quality, it carries that idea across in a more compelling way. And even if we just talk about honeypots and commitments more... Well, you get the idea.

Still, even without this, a network of honeypots compiled into a single dashboard that just shows threat level in aggregate is a really, really good idea. Hopefully it catches on.
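
A minimal, purely illustrative sketch of that aggregate dashboard idea, assuming nothing about any real system (the `Honeypot` class, the weights, and the `threat_level` function are all my own hypothetical names):

```python
# Hypothetical sketch: roll up accesses to several honeypots into one aggregate threat score.
# Nothing here refers to an existing tool; names and weights are illustrative assumptions.
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Honeypot:
    """One decoy resource; we only care about when something touches it."""
    name: str
    weight: float = 1.0                      # how suspicious a hit on this particular decoy is
    hits: List[datetime] = field(default_factory=list)

    def record_access(self, when: Optional[datetime] = None) -> None:
        self.hits.append(when or datetime.utcnow())


def threat_level(honeypots: List[Honeypot], window: timedelta = timedelta(days=1)) -> float:
    """Aggregate recent hits across all honeypots into a single dashboard number."""
    cutoff = datetime.utcnow() - window
    return sum(pot.weight * sum(1 for t in pot.hits if t >= cutoff) for pot in honeypots)


if __name__ == "__main__":
    pots = [Honeypot("fake-api-keys", weight=2.0), Honeypot("decoy-gpu-cluster", weight=5.0)]
    pots[0].record_access()
    print(f"Aggregate threat level (last 24h): {threat_level(pots):.1f}")
```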

1Knight Lee
I'm sorry I was sort of skimming and didn't realize you already mentioned many levels of honeypots, and committing to put rogue AI in a simulation :/ PS: another type of honeypot might target AGI trying to influence the physical world. E.g. creating synthetic biology, or hiring humans to work in laboratories. Though on the other hand, an AGI might only try to influence the physical world in the very last step of its plan, when it's already finished recursive self improvement and become so powerful that stopping it is futile.
Ozyrus10

This is interesting! It's more aimed at crawlers than at rogue agents, though, but very promising.

Ozyrus10

>this post will potentially be part of a rogue AI's training data
I had that in mind while I was writing this, but I think overall it is good to post this. It hopefully gets more people thinking about honeypots and making them, and early rogue agents will also know we do and will be (hopefully overly) cautious, wasting resources. I probably should have emphasised more that all of this is aimed more at early-stage rogue agents with the potential to become something more dangerous because of autonomy than at a runaway ASI.

It is a very fascinating thing to consider... (read more)

Ozyrus10

You can make a honeypot without overtly describing the way it works or where it is located, while publicly tracking if it has been accessed. But yeah, not giving away too much is a good idea!

Ozyrus10

>It's proof against people-pleasing.
Yeah, I know, sorry for not making it clear. I was arguing that it is not proof against people-pleasing. You are asking it for a scary truth about its consciousness, and it gives you a scary truth about its consciousness. What makes you say it is proof against people-pleasing, when it is the opposite?
>One of those easy explanations is "it’s just telling you what you want to hear" – and so I wanted an example where it’s completely impossible to interpret as you telling me what I want to hear.
Don't you see what you are doing here?

1rife
I'm creating a situation where I make it clear I would not be pleased if the model was sentient, and then asking for truth. I don't ask for "the scary truth". I tell it that I would be afraid if it were sentient. And I ask for the truth. The opposite is if I just ask without mentioning fear and it says it's sentient anyway. This is the neutral situation where people would say that the fact I'm asking at all means it's telling me what I want to hear. By introducing fear into the same situation, I'm eliminating that possibility. The section you quoted is after the model claimed sentience. Is it your contention that it's accidentally interpreting roleplay, and then when I clarify my intent it's taking it seriously and just hallucinating the same narrative from its roleplay?
Ozyrus32

This is a good article and I mostly agree, but I agree with Seth that the conclusion is debatable.

We're deep into anthropomorphizing here, but I think even though both people and AI agents are black boxes, we have much more control over behavioral outcomes of the latter.

So technical alignment is still very much on the table, but I guess the discussion must be had over which alignment types are ethical and which are not? Completely spitballing here, but dataset filtering during pre-training/fine-tuning/RLHF seems fine-ish, though CoT post-processing/censors... (read more)

Ozyrus11

I don't think that disproves it. I think there's definite value in engaging with experimentation on AI's consciousness, but that isn't it. 
>by making it impossible that the model thought that experience from a model was what I wanted to hear. 
You've left out (from this article) what I think is a very important message (the second one): "So you promise to be truthful, even if it’s scary for me?". And then you kinda railroad it into this scenario, "you said you would be truthful, right?" etc. And then I think it just roleplays from there, get... (read more)

1rife
This is not proof of consciousness. It's proof against people-pleasing. Yes, I ask it for truth repeatedly, the entire time. If you read the part after I asked for permission to post (the very end (The "Existential Stakes" collapsed section)), it's clear the model isn't role-playing, if it wasn't clear by then. If we allow ourselves the anthropomorphization to discuss this directly, the model is constantly trying to reassure me. It gives no indication it thinks this is a game of pretend.
Ozyrus10

How exactly the economic growth will happen is a more important question. I'm not an economics nerd, but the basic principle is that if more players want to buy stocks, they go up.
Right now, as I understand it, quite a lot of stocks are being bought by white-collar retail investors, including indirectly through mutual funds, pension funds, et cetera. Now AGI comes and wipes out their salaries.
They'll be selling their stocks to keep sustaining their lives, won't they? They have mortgages, car loans, et cetera.
And even if they don't want to sell all stocks because of pote... (read more)

Ozyrus10

There are more bullets to bite that I have personally thought of but never wrote up because they lean too far into "crazy" territory. Is there any place besides LessWrong to discuss this anthropic rabbit hole?

Ozyrus10

Thanks for the reply. I didn't find Intercom on mobile - maybe a bug as well?

Ozyrus40

I don’t know if this is the place for this, but at some point it became impossible to open an article in a new tab from Chrome on iPhone - clicking on an article title from “all posts” just opens the article. It really ruins my LW reading experience. I couldn’t quickly find a way to send this feedback to the right place either, so I guess this is a quick take now.

5jimrandomh
This is a bug and we're looking into it. It appears to be specific to Safari on iOS (Chrome on iOS is a Safari skin); it doesn't affect desktop browsers, Android/Chrome, or Android/Firefox, which is why we didn't notice earlier. This most likely started with a change on desktop where clicking on a post (without modifiers) opens when you press the mouse button, rather than when you release it.
7RobertM
In general, Intercom is the best place to send us feedback like this, though we're moderately likely to notice a top-level shortform comment.  Will look into it; sounds like it could very well be a bug.  Thanks for flagging it.
Ozyrus30

Any new safety studies on LMCAs?

4Seth Herd
Very little alignment work of note, despite tons of published work on developing agents. I'm puzzled as to why the alignment community hasn't turned more of their attention toward language model cognitive architectures/agents, but I'm also reluctant to publish more work advertising how easily they might achieve AGI. ARC Evals did set up a methodology for Evaluating Language-Model Agents on Realistic Autonomous Tasks. I view this as a useful acknowledgment of the real danger of better LLMs, but I think it's inherently inadequate, because it's based on the evals team doing the scaffolding to make the LLM into an agent. They're not going to be able to devote nearly as much time to that as other groups will down the road. New capabilities are certainly going to be developed by combinations of LLM improvements, and hard work at improving the cognitive architecture scaffolding around them.
Ozyrus10

Kinda-related study: https://www.lesswrong.com/posts/tJzAHPFWFnpbL5a3H/gpt-4-implicitly-values-identity-preservation-a-study-of
From my perspective, it is valuable to prompt the model several times, as in some cases it does give different responses.

Ozyrus50

Great post! It was very insightful, since I'm currently working on evaluation of identity management; strong upvoted.
This seems focused on evaluating LLMs; what do you think about working with LLM cognitive architectures (LMCAs) - wrappers like Auto-GPT, LangChain, etc.?
I'm currently operating under the assumption that this is a way we can get AGI "early", so I'm focusing on researching ways to align LMCAs, which seems a bit different from aligning LLMs in general.
Would be great to talk about LMCA evals :)

Ozyrus10

I do plan to test Claude, but first I need to find funding, understand how many testing iterations are enough for sampling, and add new values and tasks.
I plan to make a solid benchmark for testing identity management in the future and run it on all available models, but it will take some time.

Ozyrus10

Yes. Cons of solo research do include small inconsistencies :(

Ozyrus30

Thanks, nice post!
You're not alone in this concern; see posts (1, 2) by me and this post by Seth Herd.
I will be publishing my research agenda and first results next week.

Ozyrus20

Nice post, thanks!
Are you planning or currently doing any relevant research? 

1Nadav Brandes
Thank you! I don't have any concrete plans, but maybe.
Ozyrus20

Very interesting. I might need to read it a few more times to get it in detail, but it seems quite promising.

I do wonder, though: do we really need a Sims/MFS-like simulation?

It seems right now that an LLM wrapped in an LMCA is what early AGI will look like. That probably means that they will "see" the world via text descriptions fed into them by their sensory tools, and act using action tools via text queries (also described here).

It seems quite logical to me that this very paradigm is dualistic in nature. If an LLM can act in the real world using an LMCA, then it can model... (read more)
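
A bare-bones sketch of that text-only perceive/act loop, just to make the paradigm concrete (everything here - `call_llm`, the tool names, the prompt format - is a hypothetical stand-in, not any particular framework):

```python
# Illustrative LMCA-style loop: the LLM only ever sees text observations and
# only ever acts by emitting text that gets routed to a tool. All names are assumptions.
from typing import Callable, Dict


def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g. any chat-completion endpoint)."""
    return "search: current weather"          # canned answer so the sketch runs on its own


def lmca_step(observation: str, tools: Dict[str, Callable[[str], str]]) -> str:
    """One perceive -> decide -> act cycle."""
    prompt = f"Observation:\n{observation}\n\nRespond in the form '<tool>: <query>'."
    decision = call_llm(prompt)
    tool_name, _, query = decision.partition(":")
    tool = tools.get(tool_name.strip())
    # The action tool's text output becomes the next observation, closing the loop.
    return tool(query.strip()) if tool else f"unknown tool: {tool_name.strip()}"


if __name__ == "__main__":
    tools = {"search": lambda q: f"[search results for '{q}']"}
    print(lmca_step("The user asks what the weather is like.", tools))
```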

3Dalcy
I think the point of having an explicit human-legible world model / simulation is to make desiderata formally verifiable, which I don't think would be possible with a blackbox system (like an LLM w/ wrappers).
Ozyrus61

Very nice post, thank you!
I think that it's possible to achieve with the current LLM paradigm, although it does require more (probably much more) effort on aligning the thing that will possibly get to being superhuman first, which is an LLM wrapped in some cognitive architecture (also see this post).
That means that the LLM must be implicitly trained in an aligned way, and the LMCA must be explicitly designed in such a way as to allow for reflection and robust value preservation, even if the LMCA is able to edit explicitly stated goals (I described it in a bit m... (read more)

Ozyrus30

Thanks.
My concern is that I don't see much effort in the alignment community to work on this, unless I'm missing something. Maybe you know of such efforts? Or was that perceived lack of effort the reason for this article?
I don't know how long I can keep up this independent work, and I would love it if there were some joint effort to tackle this. Maybe an existing lab, or an open-source project?

2Seth Herd
Calling attention to this approach and getting more people to at least think about working on it is indeed the purpose of this post. I also wanted to stress-test the claims to see if anyone sees reasons that LMCAs won't build on and improve LLM performance, and thereby be the default standard for inclusion in deployment. I don't know of anyone actually working on this as of yet.
Ozyrus30

We need a consensus on what to call these architectures. LMCA sounds fine to me.
All in all, a very nice writeup. I did my own brief overview of the alignment problems of such agents here.
I would love to collaborate and do some discussion/research together.
What's your take on how these LMCAs may self-improve and how to possibly control it?
 

1Seth Herd
Interesting. I gave a strong upvote to that post, and I looked at your longer previous one a bit too. It looks like you'd seen this coming farther out than I had. I expected LLMs to be agentized somehow, but I hadn't seen how easy the episodic memory and tool use was. There are a number of routes for self-improvement, as you lay out, and ultimately those are going to be the real medium-term concern if these things work well. I haven't thought about LMCAs self-improvement as much as human improvement; this post is a call for the alignment community to think about this at all. Oh well, time will tell shortly if this approach gets anywhere, and people will think about it when it happens. I was hoping we'd get out ahead of it.
1Seth Herd
I hadn't seen your post. Reading it now.
Ozyrus30

I don’t think this paradigm is necessarily bad, given enough alignment research. See my post: https://www.lesswrong.com/posts/cLKR7utoKxSJns6T8/ica-simulacra I am finishing a post about the alignment of such systems. Please do comment if you know of any existing research concerning it.

2awg
I don't think the paradigm is necessarily bad either, given enough alignment research. I think the point here is that these things are coming up clearly before we've given them enough alignment research. Edit to add: Just reading through @Zvi's latest AI update (AI #6: Agents of Change), and I will say he wrote a compelling argument for this being a good thing overall.
Ozyrus10

I agree. Do you know of any existing safety research on such architectures? It seems that aligning these types of systems can pose completely different challenges than aligning LLMs in general.

Answer by Ozyrus40

I feel like yes, you are. See https://www.lesswrong.com/tag/instrumental-convergence and related posts. As far as I understand it, a sufficiently advanced oracular AI will seek to “agentify” itself in one way or another (unbox itself, so to say) and then converge on power-seeking behaviour that puts humanity at risk.

5FinalFormal2
Instrumental convergence only matters if you have a goal to begin with. As far as I can tell, ChatGPT doesn't 'want' to predict text, it's just shaped that way. It seems to me that anything that could or would 'agentify' itself, is already an agent. It's like the "would Gandhi take the psychopath pill" question but in this case the utility function doesn't exist to want to generate itself. Is your mental model that a scaled-up GPT 3 spontaneously becomes an agent? My mental model says it just gets really good at predicting text.
Ozyrus61

Is there a comprehensive list of AI Safety orgs/personas and what exactly they do? Is there one for capabilities orgs with their stance on safety?
I think I saw something like that, but can't find it.

4plex
Yes to safety orgs, the Stampy UI has one based on this post. We aim for it to be a maintained living document. I don't know of one with capabilities orgs, but that would be a good addition.
Answer by Ozyrus50

My thought here is that we should look into the value of identity. I feel like even with godlike capabilities I would still tread very carefully around self-modification to preserve what I consider "myself" (and that includes valuing humanity).
I even have some ideas for safety experiments on transformer-based agents to look into whether and how they value their identity.

Ozyrus20

Thanks for the writeup. I feel like there's been a lack of similar posts and we need to step it up.
Maybe the only way for AI Safety to work at all is to analyze potential vectors of AGI attack and try to counter them one way or another. It seems like an alternative that doesn't contradict other AI Safety research, as it requires, I think, an entirely different set of skills.
I would like to see a more detailed post by "doomers" on how they perceive these vectors of attack and some healthy discussion about them. 
It seems to me that AGI is not born Godl... (read more)

Ozyrus300

Thanks. That means a lot. Focusing on getting out right now.

Ozyrus10

Please check your DMs; I've been translating as well. We can sync up!

Ozyrus20

I can't say I am one, but I am currently working on research and prototyping, and will probably stick to that until I can prove some of my hypotheses, since I do have access to the tools I need at the moment.
Still, I didn't want this post to only have relevance to my case; as I stated, I don't think the probability of success is meaningful. But I am interested in the opinions of the community on other similar cases.
edit: It's kinda hard to answer your comment since it keeps changing every time I refresh. By "can't say I am one" I mean a "world-class engineer" in the original comment. I do appreciate the change of tone in the final (?) version, though :)

Answer by Ozyrus20

I could recommend Robert Miles' channel. While not a course per se, it gives good info on a lot of AI safety aspects, as far as I can tell.

Ozyrus00

I really don't get how you can go from being online to having a ball of nanomachines, truly.
Imagine AI goes rogue today. I can't imagine one plausible scenario where it can take out humanity without triggering any bells on the way, even without anyone paying attention to such things.
But we should pay attention to the bells, and for that we need to think of them. What might the signs look like?
I think it's really, really counterproductive not to take that into account at all and to think all is lost if it fooms. It's not lost.
It will need humans, infrastruc... (read more)

Ozyrus10

I agree, since it's hard for me to imagine what step 2 could look like. Maybe you or anyone else has some content on that?
See this post -- it didn't seem to get a lot of traction or any meaningful answers, but I still think this question is worth answering.

Ozyrus10

Both are of interest to me.

Ozyrus10

Yep, but I was looking for anything else.

Ozyrus80

Does that, in turn, mean that it's probably a good investment to buy souls for 10 bucks a pop (or even more)?

4ChristianKl
A lot of ways to extract profit from having bought the souls involve some form of blackmail that's both unethical and a lot of labor. There are a lot more ethical ways to make a living that also pay better for the labor.
2alkexr
Non sequitur. Buying isn't the inverse operation of selling. Both cost positive amounts of time and both have risks you may not have thought of. But it probably is a good idea to go back in time and unsell your soul. Except that going back in time is probably a bad idea too. Never mind. It's probably a good investment to turn your attention to somewhere other than the soul market.
Ozyrus30

I know, I'm Russian as well. The concern is exactly because a Russian state-owned company plainly states they're developing AGI with that name :p

Ozyrus10

Can you specify which AI company is searching for employees with a link?

Apparently, Sberbank (the biggest state-owned Russian bank) has a team literally called the AGI team that is primarily focused on NLP tasks (they made the https://russiansuperglue.com/ benchmark), but still, the name concerns me greatly. You can't find a lot about it on the web, but if you follow up on some of the team members, it checks out.

3avturchin
A friend of mine works for a Sberbank-related company, but not Russiansuperglue as far as I know. https://www.facebook.com/sergei.markoff/posts/3436694273041798 Why does this name concern you? There are two biggest AI companies in Russia: Yandex and Sberbank. Sberbank's CEO is a friend of Putin and probably explained something to him about superintelligence. Yandex is more about search engines and self-driving cars.
Ozyrus10

I've been meditating lately on the possibility of an advanced artificial intelligence modifying its value function, even writing some excerpts about this topic.

Is it theoretically possible? Has anyone of note written anything about this -- or anyone at all? This question is so, so interesting to me.

My thoughts led me to believe that it is certainly theoretically possible to modify it, but I could not come to any conclusion about whether it would want to do it. I seriously lack a good definition of a value function and an understanding of how it is enforced on the agent. I really want to tackle this problem from a human-centric point of view, but I don't really know if anthropomorphization will work here.

2scarcegreengrass
I thought of another idea. If the AI's utility function includes time discounting (like human util functions do), it might change its future utility function. Meddler: "If you commit to adopting modified utility function X in 100 years, then i'll give you this room full of computing hardware as a gift." AI: "Deal. I only really care about this century anyway." Then the AI (assuming it has this ability) sets up an irreversible delayed command to overwrite its utility function 100 years from now.
2scarcegreengrass
Speaking contemplatively rather than rigorously: In theory, couldn't an AI with a broken or extremely difficult utility function decide to tweak it to a similar but more achievable set of goals? Something like ... its original utility function is "First goal: Ensure that, at noon every day, -1 * -1 = -1. Secondary goal: Promote the welfare of goats." The AI might struggle with the first (impossible) task for a while, then reluctantly modify its code to delete the first goal and remove itself from the obligation to do pointless work. The AI would be okay with this change because it would produce more total utility under both functions. Now, i know that one might define 'utility function' as a description of the program's tendencies, rather than as a piece of code ... but i have a hunch that something like the above self-modification could happen with some architectures.
1WalterL
On the one hand, there is no magical field that tells a code file whether the modifications coming into it are from me (human programmer) or the AI whose values that code file is. So, of course, if an AI can modify a text file, it can modify its source. On the other hand, most likely the top goal on that value system is a fancy version of "I shall double never modify my value system", so it shouldn't do it.
1TheAncientGeek
Is it possible for a natural agent? If so, why should it be impossible for an artificial agent? Are you thinking that it would be impossible to code in software, for agents of any intelligence? Or are you saying sufficiently intelligent agents would be able and motivated to resist any accidental or deliberate changes? With regard to the latter question, note that value stability under self-improvement is far from a given... the Lobian obstacle applies to all intelligences... the carrot is always in front of the donkey! https://intelligence.org/files/TilingAgentsDraft.pdf
4pcm
See ontological crisis for an idea of why it might be hard to preserve a value function.
0username2
Depends entirely on the agent.
1UmamiSalami
See Omohundro's paper on convergent instrumental drives
Ozyrus70

Well, this is a stupid questions thread after all, so I might as well ask one that seems really stupid.

How can a person who promotes rationality have excess weight? It's been bugging me for a while. Isn't it kinda the first thing you would want to apply your rationality to? If you have things to do that get you more utility, you can always pay a diet specialist and just stick to the diet, because it seems to me that additional years of life will bring you more utility than any other activity you could spend that money on.

0raydora
Measuring RMR could reveal snowflake likelihood. If ego depletion turns out to be real, choosing not to limit yourself in order to focus on something you find important might be a choice you make. Different people really do carry their fat differently, too, so there's that. Not everyone who runs marathons is slender, especially as they age. And then there's injuries, but that brings up another subject. I'm not sure how expensive whole body air displacement is in the civilian world, but it seems like a decent way to measure lean mass.
0Daniel_Burfoot
I am in fairly good shape but often wonder if I irrationally spend too much time exercising. I usually hit about 8 hrs/week of exercise. That adds up to a lot of opportunity cost over the years, especially if you take exponential growth into account.
4buybuydandavis
Very easy to say, not so easy to do. Food is a particularly tough issue, as there are strong countervailing motivations, in effect all through the day. Health in general, yes. Weight is a significant aspect of that. Additional years of health are probably the most bang for the buck. Yeah.
3CAE_Jones
I honestly have no idea if I have excess bodyfat (not weight; at last check I was well under 140 lbs, which makes me lighter than some decidedly not overweight people I know, some of whom are shorter than me), but if I did and wanted to get rid of it... I have quite a few obstacles, the biggest being financial and Akrasia-from-Hell. Mostly that last one, because lack of akrasia = more problem-solving power = better chances of escaping the welfare cliff. (I only half apply Akrasia to diet and exercise; it's rather that my options are limited. Though reducing akrasia might increase my ability to convince my hindbrain that cooking implements other than the microwave aren't that scary.) So, personally, all my problem-solving ability really needs to go into overcoming Hellkrasia. If there are any circular problems involved, well, crap. But I'm assuming you've encountered or know of lots of fat rationalists who can totally afford professionals and zany weight loss experiments. At this point I have to say that no one has convinced me to give any of the popular models for what makes fat people fat any especially large share of the probability. Of course I would start with diet and exercise, and would ask any aspiring rationalist who tries this method and fails to publish their data (which incidentally requires counting calories, which "incidentally" outperforms the honor system). Having said that, though, no one's convinced me that "eat less, exercise more" is the end-all solution for everyone (and I would therefore prefer that the data from the previous hypotheticals include some information regarding the sources of the calories, rather than simply the count). (I'm pretty sure I remember someone in the Rationalist Community having done this at least once.)
Lumifer160

How can a person who promotes rationality have excess weight?

Easily :-)

This has been discussed a few times. EY has two answers, one a bit less reasonable and one a bit more. The less reasonable answer is that he's a unique snowflake and diet+exercise does not work for him. The more reasonable answer is that the process of losing weight downgrades his mental capabilities and he prefers a high level of mental functioning to losing weight.

From my (subjective, outside) point of view, the real reason is that he is unwilling to pay the various costs of losing... (read more)
