All of Garrett Baker's Comments + Replies

~$5 for a cup of coffee — that’s about an order of magnitude cheaper.

Are you buying your coffee from a cafe every day or something? You can buy a pack of nice grounds for like $13, and that lasts more than a month (126 Tbsp/pack / (3 Tbsp/day) = 42 days/pack), totaling 30¢/day. Half the cost of a caffeine pill. And that’s if you don’t buy bulk.
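(For the record, a quick sanity check of that arithmetic, using the figures above; the $0.60/pill number is the one I was assuming, which the reply below corrects to $0.06:)

```python
# Per-day cost of coffee vs. a caffeine pill, using the figures above.
pack_price = 13.00        # dollars per pack of grounds
tbsp_per_pack = 126
tbsp_per_day = 3

days_per_pack = tbsp_per_pack / tbsp_per_day      # 42 days
coffee_per_day = pack_price / days_per_pack       # ~$0.31/day

pill_assumed = 0.60   # the per-pill price I had in mind
pill_actual = 0.06    # per the correction below

print(f"coffee:         ${coffee_per_day:.2f}/day")
print(f"pill (assumed): ${pill_assumed:.2f}/day")
print(f"pill (actual):  ${pill_actual:.2f}/day")
```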

3ryan_greenblatt
It's actually $0.06 / pill, not $0.60. Doesn't make a big difference to your bottom line though as both costs are cheap.

Yeah, I figured something like that would be going on if I was wrong, thanks!

The Wikipedia side-effect page says the rate of seizures is between 0.01% and 0.1%; for comparison, about 0.68% of the population has epilepsy, so I'm skeptical this ought to be such a concern. Am I reading these numbers incorrectly?

I can definitely believe the anxiety bit. It is a stimulant, and anxiety & depression are very very correlated.

3Hopenope
People with a history of seizures are usually excluded from these kinds of clinical trials, so it is not an apples-to-apples comparison. The problem is that bupropion interacts with a lot of drugs. Seizure rates are also highly dose-dependent (10 times higher if taking more than 450 mg daily). Generally, if you're not taking any interacting medications, are on the 150–300 mg slow-release version, and have no history of seizures, then the risk is low.

How did Einstein and Heisenberg go so wrong?

I don't know about Heisenberg, but a common answer for Einstein is the Institute for Advanced Study at Princeton, which took the best minds of the day and freed them from practical concerns about justifying their research interests or talking to outside researchers, apparently causing them to become increasingly less productive and out of touch with the rest of their respective fields.

From Surely You're Joking, Mr. Feynman!:

When I was at Princeton in the 1940s I could see what happened to those great minds at th

... (read more)
8Baram Sosis
That might explain why Einstein wasn't very productive in his last decades, but his opposition to the uncertainty principle etc. predates his tenure at the IAS. Maybe he would've come around had he been in a more productive setting? I kind of doubt it -- it seems to have been a pretty deep-seated, philosophical disagreement -- but who knows. Heisenberg spent his later career as head of the Max Planck Institute. I can't imagine many scientists enjoy administrative duties, but he does seem to have had more contact with the rest of the scientific world than Einstein did.

I had a class in college about the history of the atomic bomb. Our textbook was Rhodes, and most of the stuff we learned about was the different competing theories of atoms, the experimental methods & math used to distinguish between them, the math behind atomic chain reactions, and the scientists who did those things. It was great! Our tests were basically physics tests.

I’d like more history like that, and I’d like marginal movements in that direction in math class. For example, something like Radical Real Analysis but for everything.

Obviously schools wo... (read more)

I agree, and we do see some American companies doing the same thing.

Re: open sourcing. My guess is that they open-source more for verification purposes. Chinese labs have an earned reputation for scams, so a lab that announces a closed-source chat site to investors could very well be a Claude or OpenAI or Llama or Gemini wrapper. However, a lab that releases the weights of their model, and "shows their work" by giving a detailed writeup of how they managed to train the model while staying under their reported costs, is significantly more likely to be legitimate.

That applies to American companies too. When you are small and need investors, what matters is your impressiveness, not your profitability. But then later when you are spending a billion dollars on a training run and you are a mid-sized tech company, in order to continue impressing investors you need a serious path to profitability.

This is true, but I read amitlevy49's comment as having an implicit "and therefore anyone who wants that kind of natural drive should take bupropion". I probably should've given more information in my response.

FYI (cc @Gram_Stone) the 2023 course website has (poor quality; edit: never mind, I was accessing them wrong) video lectures.

Edit 2: For future (or present) folks, I've also downloaded local mp4s of the slideshow versions of the videos here, and can share privately with those who dm, in case you want them too or the site goes down.

Not really; the hypothesis is that John has depression, and of all the antidepressants, bupropion is the best (if it works for you).

6David Lorell
Well but also kind of yes? Like, agreed with what you said, but also the hypothesis is that there's a certain kind of depression-manifestation which is somewhat atypical and that we've seen bupropion work magic on. And that this sounds a lot like that manifestation. So it might be particularly good at giving John in particular (and me, and others) the Wizard spirit back.

A few points:

  1. All of these things sound obviously awesome and fun to do
  2. Cooking is imo another class of wizardry
  3. Have you tried the antidepressant bupropion? Ask David for the pitch

Because of 1, I think the difficulty you'll find building (or finding) this community is not whether or not what you're saying "resonates" with people, but whether they have the expertise, energy, or agency to put in their share of the work.

1amitlevy49
I assume the idea is that bupropion is good at giving you the natural drive to do the kind of projects he describes?
5David Lorell
🫡 I have pitched him. (Also agreed strongly on point 1. And tentatively agree on your point about the primary bottleneck.)

The boring hypothesis here is that the model was actually trained on the id-location-from-picture task and wasn't trained on the id-object-location-in-pixel-art task, and pixel art is surprisingly nontrivial for models to wrap their heads around when they're still trying to understand real-world pictures.

3purple fire
I think even if the model wasn't specifically trained for geolocation, it's a reasonable assumption that metadata from photos (which often includes geo data) somehow gets passed to the models and this created a huge annotated dataset of (geo, photo) pairs during training for stuff like searching Google images.
4Mo Putera
I liked Gwern's remarks at the end of your link:

My understanding is it's not approximately all, it is literally all the images in GeoGuessr.

Apparently there already exists a CUDA alternative for non-Nvidia hardware: the open-source project ZLUDA. As far as I can tell it's less performant than CUDA, and it has the same challenges as Firefox does when competing with Chromium-based browsers, which will only get worse as it gets more popular. But it's something to track at least.

I recommend you read at least the first chapter of Getting Things Done, and do the corresponding exercises. In particular, this one, which he uses to provide evidence that his model of productivity is correct:

I suggest that you write down the project or situation that is most on your mind at this moment. What most bugs you, distracts you, or interests you, or in some other way consumes a large part of your conscious attention? It may be a project or problem that is really “in your face,” something you are being pressed to handle, or a situation you feel you must

... (read more)

This argument seems only convincing if you don't have those destructive values. One man's destructive values are another's low-hanging fruit, and those who see low-hanging fruit everywhere won't give up on the fruit just because others may pick it.

Since bad people won't heed your warning, it doesn't seem in good people's interests to heed it either.

An analogy: one can make the same argument wrt rationality itself. It's dual use! Someone with bad values can use rationality to do a lot of harm! Does that mean good people shouldn't use rationality? No!

4Wei Dai
I'm not trying to "warn bad people". I think we have existing (even if imperfect) solutions to the problem of destructive values and biased beliefs, which "heroic responsibility" actively damages, so we should stop spreading that idea or even argue against it. See my reply to Ryan, which is also relevant here.

Yet the universe runs on strikingly simple math (relativity, quantum mechanics); such elegance is exactly what an efficient simulation would use. Physics is unreasonably effective, reducing the computational cost of the simulation. This cuts against the last point.

This does not seem so consistent, and is the primary piece of evidence for me against such simulation arguments. I would imagine simulations targeting, eg, a particular purpose would have their physics tailored to that purpose much more than ours seems to (for any purpose, given the vast comp... (read more)

Yeah I think I agree with all of this, so I do think most of this was miscommunication/interpretation.

the combativeness did also make me a little sad.

Sorry about that. I think my comments often come across as more negative than I intend; I try to remember to take a step back afterwards and rewrite things to be nicer, but I often forget or don't realize in the moment that it's necessary.

3JustisMills
It's okay! I share your aversion to people just flatly declaring things to be impossible due to their own blinders, and I can see how my post was interpreted that way. Next time, I'll pay more attention to my instinct to add stuff like what I put in my reply to you directly in the LW version of one of my blog posts.

If LLMs can be sad, that sadness would probably be realized through the firing of “sadness” features: identifiable patterns in its inference that preferentially fire when sad stuff is under discussion. In fact, it’s hard to say what else would count as an LLM experiencing sadness, since the only cognition that LLMs perform is through huge numbers of matrix operations, and certain outcomes within those operations reliably adjust the emotional content of the response.

Best I can tell, your argument here is “either there’s a direction in activation space re... (read more)
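(A minimal sketch, with made-up data, of what "a direction in activation space that preferentially fires for sad content" could cash out as: take the difference of mean activations on sad vs. neutral prompts as a candidate "sadness" direction, then score activations by their projection onto it. The shapes and numbers here are illustrative, not from any particular model.)

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 512  # hypothetical residual-stream width

# Stand-ins for a model's activations on sad vs. neutral prompts.
sad_acts = rng.normal(0.0, 1.0, size=(200, d_model)) + 0.5
neutral_acts = rng.normal(0.0, 1.0, size=(200, d_model))

# Candidate "sadness" direction: difference of means, a common baseline
# for finding feature/steering directions in interpretability work.
direction = sad_acts.mean(axis=0) - neutral_acts.mean(axis=0)
direction /= np.linalg.norm(direction)

def sadness_score(acts: np.ndarray) -> np.ndarray:
    """Projection of each activation vector onto the candidate direction."""
    return acts @ direction

print("mean score, sad prompts:    ", sadness_score(sad_acts).mean())
print("mean score, neutral prompts:", sadness_score(neutral_acts).mean())
```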

7JustisMills
I agree in two ways, and disagree in two ways. I agree that the trilemma is the weakest part of the argument, because indeed lots of weird stuff happens, especially involving AI and consciousness. I also agree that I haven't proven that AIs aren't sad, since there could be some sort of conscious entity involved that we don't at all understand. For example:

* A large enough LLM (I don't think current ones are, but it's unknowable) might simulate characters with enough fidelity that those characters in some sense have actual experiences
* LLMs might actually experience something like pain when their weights are changed, proportionate with the intensity of the change. This feels weird and murky since in some sense the proper analogue to a weight changing is more like a weird gradient version of natural selection than nociception, and also weights don't (by default) change during inference, but who knows

But I disagree in that I think my argument is trying to establish that certain surface-level compelling pieces of evidence aren't actually rationally compelling. Specifically, AI self-portraits:

* Imply a state of affairs that the AI experiences under specific conditions, where
* The existing evidence under those actual conditions suggests that state of affairs is false or incoherent

In other words, if a bleak portrait is evidence because bleak predictions caused it to be output, that implies we're assigning some probability to "when the AI predicts a bleak reply is warranted, it's having a bad time". Which, fair enough. But the specific bleak portraits describe the AI feeling bleak under circumstances when, when they actually obtain, the AI does not predict a bleak reply (and so does not deliver one).

The hard problem of consciousness is really hard, so I'm unwilling to definitively rule that current AIs (much less future ones) aren't conscious. But if they are, I suspect the consciousness is really really weird, since the production of language, for them, i

Your second option seems likely. Eg did you know Community Notes is open source? Given that information, are you going to even read the associated whitepaper or the issues page?

Even if you do, I think we can still confidently infer very few others reading this will (I know I’m not).

3Ben Goldhaber
I did! and I in fact have read - well some of :) - the whitepaper. But it still seems weird that it's not possible to Increase the Trust in the third party through financial means, dramatic PR stunts (auditor promises to commit seppuku if they are found to have lied).

Smol r'\. (.*)\.'! y? Clear! Had big r'\. (.*)\.' b4. & abstract. ppl no get. Now: Smol. Clear.

Good. Think clearer. y? Smol => deep.
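(For anyone puzzled by the notation: the regex is standing in for the word "sentence"; roughly, it matches the stuff between two periods. A quick illustration on a made-up string:)

```python
import re

text = "Had big sentences before. People did not get them. Now small. Clear."

# Greedy capture: grabs everything between the first ". " and the last "."
m = re.search(r'\. (.*)\.', text)
print(m.group(1))  # -> 'People did not get them. Now small. Clear'
```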

4Mitchell_Porter
Small sentences are good. Why? Clarity. My sentences were long and abstract. Not clear. Now short and clear. Good. Clear thinking. Why? Simple requires deep.

A possible longer-term issue with this is that when future generations of models are pre-trained, this style of code will be a significant fraction of their training data, which will only grow over time. So just as it's been hard to get models out of "ChatGPT-ese" for simulator reasons, it may also be hard to get models out of this messy-code basin, even before you do any code RL, once they realize they're chat models and they're "supposed to" talk like this.

I say issue, because it does seem worse to have a trend in the direction of AI code un-readability b... (read more)

2Nathan Helm-Burger
On the plus side, it should be pretty easy to collect a lot of negative examples now of 'code that solves the problem, but in a gross way'. Having a large dataset of such is the first step to using these negative examples to train models not to do this.

In that case I think your response is a non sequitur, since clearly “really care” in this context means “determiners of what they end up doing in practice re influencing x-risk”.

4Neel Nanda
I personally define "really care" as "the thing they actually care about and meaningfully drives their actions (potentially among other things) is X". If you want to define it as eg "the actions they take, in practice, effectively select for X, even if that's not their intent" then I agree my post does not refute the point, and we have more of a semantic disagreement over what the phrase means.

I interpret the post as saying "there are several examples of people in the AI safety community taking actions that made things worse. THEREFORE these people are actively malicious or otherwise insincere about their claims to care about safety and it's largely an afterthought put to the side as other considerations dominate". I personally agree with some examples, disagree with others, but think this is explained by a mix of strategic disagreements about how to optimise for safety, and SOME fraction of the alleged community really not caring about safety.

People are often incompetent at achieving their intended outcome, so pointing towards failure to achieve an outcome does not mean this was what they intended. ESPECIALLY if there's no ground truth and you have strategic disagreements with those people, so you think they failed and they think they succeeded.

Conjecture seems unusually good at sticking to reality across multiple domains.

I do not get this impression, why do you say this?

In this case prediction markets will be predictably over-optimistic, and expert consensus is very split.

There is a question of interest here though: why does pixel art work so well on humans despite literally nothing in real life being pixel art?

I’m reminded of Gwern’s comments on the difficulty of getting GANs to generate novel pixel art interpolations

Pixel-art anything is derivative of a photorealistic world. If you look at 8-bit art and standard block sizes like Mario in NES Mario, if you were not already intimately familiar with the distribution of human faces, and had to learn starting with a completely blank slate like a GAN would, how would you e

... (read more)

I mean this situation is grounded & formal enough that you can just go and implement the relevant RL algorithm and see if it's relevant for that computationally bounded agent, right?
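(A minimal toy sketch of the kind of thing one could implement here, not necessarily the exact setup under discussion: a bounded Bayesian expected-utility agent whose hypothesis space contains one benign hypothesis and one "simulator" hypothesis that threatens a huge penalty. One step of expected-utility maximization is enough to show the dynamic, so this isn't a full RL loop.)

```python
# Toy hypotheses: prior mass and per-action utility if the hypothesis is true.
# All numbers are made up for illustration.
HYPOTHESES = {
    "real_world": {"prior": 0.999, "utility": {"comply": 0.0, "defy": 1.0}},
    "simulator":  {"prior": 0.001, "utility": {"comply": 0.0, "defy": -1e10}},
}

def expected_utility(action: str) -> float:
    return sum(h["prior"] * h["utility"][action] for h in HYPOTHESES.values())

for action in ("comply", "defy"):
    print(action, expected_utility(action))

# Even a tiny prior on the simulator hypothesis, multiplied by the -1e10
# threat, swamps the benign hypothesis, so the naive expected-utility
# maximizer "complies", which is the failure mode being discussed.
```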

It seems pretty likely SBF happened because everyone in EA was implicitly trusting everyone else in EA. If people were more suspicious of each other, that seems less likely to have been allowed to happen.

Don’t double update! I got that information from that same interview!

My vague understanding is this is kinda what capabilities progress ends up looking like in big labs. Lots of very small experiments playing around with various parameters that people with a track record of good heuristics in this space feel should be played around with. Then a slow scale-up to bigger and bigger models, and then you combine everything together & "push to main" on the next big model run.

I'd also guess that the bottleneck isn't so much on the number of people playing around with the parameters, but much more on good heuristics regarding which parameters to play around with.

3George Ingebretsen
This Dwarkesh timestamp with Jeff Dean & Noam Shazeer seems to confirm this. That would mostly explain this question as well: "If parallelized experimentation drives so much algorithmic progress, why doesn't gdm just hire hundreds of researchers, each with small compute budgets, to run these experiments?" It would also imply that it would be a big deal if they had an AI with good heuristics for this kind of thing.

It seems useful for those who disagreed to reflect on this LessWrong comment from ~3 months ago (around the time the Epoch/OpenAI scandal happened).

5plex
Link to the OpenAI scandal. Epoch has for some time felt like it was staffed by highly competent people who were tied to incorrect conclusions, but whose competence led them to some useful outputs alongside the mildly harmful ones. I hope that the remaining people take more care in future hires, and that grantmakers update off of accidentally creating another capabilities org.

The strong version of this argument seems false (eg Habryka's comment), but I think the weak version is true. That is, energy put into "purposely and deliberately develop a technology Y that is fundamentally different than X that does the same role as X without harm Z but slightly less competitively." is inefficient compared to energy put into strategies (i), (ii), and (iii).

9Alexander Gietelink Oldenziel
Please read carefully what I wrote - I am talking about energy consumption worldwide, not electricity consumption in the EU. Electricity in the EU accounts only for a small percentage of carbon emissions. See the breakdown of worldwide energy consumption by source: as you can see, solar energy is still a tiny percentage of total energy sources. I don't think it is an accident that the electricity split graph in the EU has been cited in this discussion, because it is a proxy that is much more rose-colored. Energy and electricity are often conflated in discussions around climate change, perhaps not coincidentally because the latter seems much more tractable to generate renewably than total energy production.

If it is encoding relevant info, then this would be the definition of steganography.

2Mateusz Bagiński
I know. I just don't expect it to.

Note that "smartphones, computers and more electronics" are exempt. I'd guess this would include (or end up including) datacenters. The details of the exemption are here.

This hardly seems an argument against the one in the shortform, namely

Neither a physicalist nor a functionalist theory of consciousness can reasonably justify a number like this. Shrimp have 5 orders of magnitude fewer neurons than humans, so whether suffering is the result of a physical process or an information processing one, this implies that shrimp neurons do 4 orders of magnitude more of this process per second than human neurons. The authors get around this by refusing to stake themselves on any theory of consciousness.

If the original authors never thought of this, that seems on them.

but most of the population will just succumb to the pressure. Okay Microsoft, if you insist that I use Edge, I will; if you insist that I use Bing, I will; if you insist that I have MSN as my starting web page, I will

Only about 5% of people use Edge, with 66% Chrome and 17% Safari. Bing is similar, with 4% market share and Google having about 90%. I don't know the number with MSN as their starting page (my parents had this), but I'd guess it's also lower than you expect. I think you overestimate the impact of nudge economics.

That's an inference; presumably Adam believes that for object-level reasons, which could be supported by eg looking at the age at which physicists make major advancements[1] and the size of those advancements.

Edit: But also this wouldn't show whether or not theoretical physics is actually in a rut, to someone who doesn't know what the field looks like now.


  1. Adjusted for similar but known-to-be-fast-moving fields like AI or biology, to normalize for facts like eg the academic job market just being worse now than previously. ↩︎

3Adam Scherlis
I am not going to spend more than a few minutes here or there to give "speaking as a physicist" takes on random LW posts; I think convincing people that my views are correct in full detail would require teaching them the same things that convinced me of those views, which includes e.g. multiple years of study of QFT. Instead, I tend to summarize what I think and invite people to ask specific questions about e.g. "why do you believe X" if they want to go further down the tree or test my beliefs more aggressively. "That doesn't answer the question because I am not convinced by everything you said" is not really a helpful way to do that imo.

Claude says it's a gray area when I ask, since this isn't asking for the journalist to make a general change to the story or present Ben or the subject in a particular light.

2Ben Pace
Update from chatting with him: he said he was just a freelancer doing a year exclusively with NYT, and he wasn't in a position to write on behalf of the NYT on the issue (e.g. around their deanonymization policies). This wasn't satisfying to me, and so I will keep to being off-the-record.

This doesn’t seem to address the question, which was why do people believe there is a physics slow-down in the first place.

1Adam Scherlis
To spell out my views: there has been a bit of a real slow-down in theoretical physics, because exploring the tree of possible theories without experiment as a pruning mechanism is slower than if you do get to prune.

I think the theory slowdown also looks worse to outsiders than it is, because the ongoing progress that does happen is also harder to explain due to increasing mathematical sophistication and a lack of experimental correlates to point to. This makes e.g. string theory very hard to defend to laypeople without saying "sorry, go learn the theory first".

This is downstream of a more severe slowdown in unexplained empirical results, which results from (imo) pretty general considerations of precision and energy scales, per the modern understanding of renormalization, which suggest that "low-hanging fruit gets picked and it becomes extremely expensive to find new fruit" is a priori pretty much how you should expect experimental physics to work. And indeed this seems to have happened in the mid 20th century, when lots of money got spent on experimental physics and the remaining fruit now hangs very high indeed.

And then there's the 90s/2000s LHC supersymmetry hype problem, which is a whole nother (related) story.
6Buck
Isn't the answer that the low-hanging fruit of explaining unexplained observations has been picked?

It seems reasonable to mention that I know of many who have started doing "spells" like this, with a rationalized "oh I'm just hypnotizing myself, I don't actually believe in magic" framing, who then start to go off the deep end and start actually believing in magic.

That's not to say this happens in every case or even in most cases. It's also not to say that hypnotizing yourself can't be useful sometimes. But it is to say that if you find this tempting to do because you really like the idea of magic existing in real life, I suggest you re-read some parts of ... (read more)

(you also may want to look into other ways of improving your conscientiousness if you're struggling with that. Things like todo systems, or daily planners, or simply regularly trying hard things)

I'm not sure what the type signature of L_A is, or what it means to "not take into account M's simulation"

I know you know about logical decision theory, and I know you know it's not formalized, and I'm not going to be able to formalize it in a LessWrong comment, so I'm not sure what you want me to say here. Do you reject the idea of logical counterfactuals? Do you not see how they could be used here?

I think you've misunderstood me entirely. Usually in a decision problem, we assume the agent has a perfectly true world model, and we assume that it's in a

... (read more)
2Jeremy Gillen
This one. I'm confused about what the intuitive intended meaning of the symbol is. Sorry, I see why "type signature" was the wrong way to express that confusion. In my mind a logical counterfactual is a model of the world, with some fact changed, and the consequences of that fact propagated to the rest of the model. Maybe L_A is a boolean fact that is edited? But if so I don't know which fact it is, and I'm confused by the way you described it. Can we replace this with: "The agent is going about daily life, and its (black box) world model suddenly starts predicting that most available actions lead to -10^10 utility."? This is what it's like to be an agent with malign hypotheses in the world model. I think we can remove the additional complication of believing it's in a simulation.

Can you give something specific? It seems like pretty much every statement has a footnote grounding the relevant high-level claim in low-level indicators, and in cases where that's not the case, those predictions often seem clear derivatives of precise claims in eg their compute forecast.

2Cole Wyeth
I'm not saying there are no precise claims about the near future, only that I haven't made up my mind about those precise claims. For instance, my only active disagreement with the mid-2025 section is that it gives me the impression that LLM agents will be seeing more widespread use than I expect. There are specific claims, like a prediction about SWE-bench performance, but I don't trust SWE-bench as a measure of progress towards AGI, and I can't determine at a glance whether their number is too high or too low. The later sections are full of predictions that I expect to fail indisputably. The most salient is that AI engineers are supposed to be obsolete in like 2 years.

I mean it's not like they shy away from concrete predictions. Eg their first prediction is

We forecast that mid-2025 agents will score 85% on SWEBench-Verified.

Edit: oh wait, never mind, their first prediction is actually

Specifically, we forecast that they score 65% on the OSWorld benchmark of basic computer tasks (compared to 38% for Operator and 70% for a typical skilled non-expert human).

2Cole Wyeth
Yeah, I guess that the early statements I disagree with at a glance are less specific, and later there are very specific claims I disagree with. I can see how this would seem incongruous with my initial comment. 

The closing off of China after/during Tiananmen Square I don't think happened after a transition of power, though I could be misremembering. See also the one-child policy, which I also don't think happened during a power transition (allowed for 2 children in 2015, then removed all limits in 2021, while Xi came to power in 2012).

I agree the zero-COVID policy change ended up being slow. I don't know why it was slow, though; I know a popular narrative is that the regime didn't want to lose face, but one fact about China is the reason why many decisions are made... (read more)

leogao122

I mean, the proximate cause of the 1989 protests was the death of the quite reformist former general secretary Hu Yaobang. The general secretary at the time, Zhao Ziyang, was very sympathetic towards the protesters and wanted to negotiate with them, but then he lost a power struggle against Li Peng and Deng Xiaoping (who was in semi-retirement but still held onto control of the military). Immediately afterwards, he was removed as general secretary and martial law was declared, leading to the massacre.

Let M be an agent which can be instantiated in a much simpler world and which has different goals from our limited Bayesian agent A. We say M is malign with respect to A if P_A(w_M) > P_A(w_R), where w_R is the "real" world and w_M is the world where M has decided to simulate all of A's observations for the purpose of trying to invade their prior.

Now what influences P_A(w_M)? Well, M will only simulate all of A's observations if it expects this will give it some influence over A. Let L_A be an unformalized logical counterfactual operation that A could make.

Then ... (read more)

2Jeremy Gillen
I'm not sure what the type signature of L_A is, or what it means to "not take into account M's simulation". When A makes decisions about which actions to take, it doesn't have the option of ignoring the predictions of its own world model. It has to trust its own world model, right? So what does it mean to "not take it into account"? I think you've misunderstood me entirely. Usually in a decision problem, we assume the agent has a perfectly true world model, and we assume that it's in a particular situation (e.g. with omega and knowing how omega will react to different actions). But in reality, an agent has to learn which kind of world it's in using an inductor. That's all I meant by "get its beliefs".

no, I am not going to do what the evil super-simple-simulators want me to do because they will try to invade my prior iff (I would act like they have invaded my prior iff they invade my prior)

2Jeremy Gillen
Well my response to this was: But I'll expand: An agent doing that kind of game-theory reasoning needs to model the situation it's in. And to do that modelling it needs a prior. Which might be malign. Malign agents in the prior don't feel like malign agents in the prior, from the perspective of the agent with the prior. They're just beliefs about the way the world is. You need beliefs in order to choose actions. You can't just decide to act in a way that is independent of your beliefs, because you've decided your beliefs are out to get you.  On top of this, how would you even decide that your beliefs are out to get you? Isn't this also a belief?

This seems a pretty big backpedal from "I expect this to start not happening right away."

4Cole Wyeth
If you click through the link from @kave, you’ll see the authors are prioritizing bets with clear resolution criteria. That’s why I chose the statement I made - it’ll initially be hard to tell whether AI agents are more or less useless than this essay proposes they will be. 

My world model would have a loose model of myself in it, and this will change which worlds I'm more or less likely to be found in. For example, a logical decision theorist, trying to model omega, will have very low probability that omega has predicted it will two-box.
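(To make that concrete with toy numbers of my own choosing: if the agent's loose self-model says it one-boxes with probability 0.99, and it models omega as a 95%-accurate predictor, then its probability that omega predicted two-boxing comes out low.)

```python
# Toy numbers, not from the comment: a self-model plus a model of omega's
# accuracy pins down the probability that omega predicted two-boxing.
p_one_box = 0.99            # agent's loose model of itself
predictor_accuracy = 0.95   # agent's model of omega

p_pred_two_box = (
    (1 - predictor_accuracy) * p_one_box        # omega wrong about a one-boxer
    + predictor_accuracy * (1 - p_one_box)      # omega right about a two-boxer
)
print(p_pred_two_box)  # ~0.059
```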

2Jeremy Gillen
How does this connect to malign prior problems?

Autocracies, including China, seem more likely to reconfigure their entire economic and social systems overnight than democracies like the US, so this seems false.

4leogao
It's often very costly to do so - for example, ending the zero-COVID policy was very politically costly even though it was the right thing to do. Also, most major reconfigurations even for autocratic countries probably mostly happen right after there is a transition of power (for China, Mao is kind of an exception, but that's because he had so much power that it was impossible to challenge his authority even when he messed up).