All of Rafael Harth's Comments + Replies

Another quick way to get at my skepticism about LLM capability: the ability to differentiate good from bad ideas. ImE the only way LLMs can do this (or at least seem to do this) is if the prevailing views differ depending on context -- like if there's an answer popular among lay people and a better answer popular among academics, and your tone makes it clear that you belong in the second category, then it will give you the second one, and that can make it seem like it can tell good and bad takes apart. But this clearly doesn't count. If the most popular ... (read more)

2Seth Herd
I had somehow missed your linked post (≤10-year Timelines Remain Unlikely Despite DeepSeek and o3) when you posted it a few months ago. It's great! There's too much to cover here; it touches on a lot of important issues.

I think you're pointing to real gaps that probably will slow things down somewhat; those are "thought assessment", which has also been called taste or evaluation, and having adequate skill with sequential thinking or System 2 thinking. Unfortunately, the better term for that is Type 2 thinking, because it's not a separate system. Similarly, I think assessment also doesn't require a separate system, just a different use of the same system. For complex reasons centering on that, I'm afraid there are shortcuts that might work all too well. The recent successes of scaffolding for that type of thinking indicate that those gaps might be filled all too easily without breakthroughs.

Here are the concerning examples of progress along those lines with no breakthroughs or even new model training, just careful scaffolding of the sort I envisioned in 2023 that hasn't yet really been seen so far, outside of these focused use cases. Perplexity nearly matched OpenAI's o3-powered Deep Research in two weeks of work on scaffolding a lesser model. More elaborately and impressively, Google's AI co-scientist also used a last-gen model to match cutting-edge medical research hypothesis generation and pruning (summarized near the start of this excellent podcast).

This addresses your "assessor" function. LLMs are indeed bad at it, until they're scaffolded with structured prompts to think carefully, iteratively, and critically - and then apparently they can tell good ideas from bad. But they still might have a hard time matching human performance in this area; this is one thing that's giving me hope for longer timelines.

I have more thoughts on that excellent article. I've been thinking more about the gaps you focus on; this expands my timelines almost out to ten years, sti
2cubefox
A related take I heard a while ago: LLMs have strongly superhuman declarative knowledge across countless different subject areas. Any human with that much knowledge would be able to come up with many new theories or insights from combining knowledge from different fields. But LLMs apparently can't do this. They don't seem to synthesize, integrate and systematize their knowledge much. Though maybe they have some latent ability to do this, and they only need some special sort of fine-tuning to unlock it, similar to how reasoning training seems to elicit abilities the base models already have.

No, I definitely think thought assessment has more to it than just attention. In fact, I think you could argue that LLMs' attention equivalent is already more powerful/accurate than human attention.

The only evidence I can provide at this point is the similarity of LLMs to humans who don't pay attention (as first observed in Sarah's post that I linked in the text). If you want to reject the post based on the lack of evidence for this claim, I think that's fair.

2Eli Tyre
You mean that the human attention mechanism is the assessor? 

I think we have specialized architectures for consciously assessing thoughts, whereas LLMs do the equivalent of rattling off the first thing that comes to mind, and reasoning models do the equivalent of repeatedly feeding back what comes to mind into the input (and rattling off the first thing that comes to mind for that input).

2Eli Tyre
Do you have a pointer for why you think that?  My (admittedly weak) understanding of the neuroscience doesn't suggest that there's a specialized mechanism for critique of prior thoughts.

Trump says a lot of stuff that he doesn't do; the set of specific things that presidents don't do is larger than the set of things they do; and tariffs didn't even seem like they'd be super popular with his base if they were in fact implemented. So "~nothing is gonna happen wrt tariffs" seemed like the default outcome, with not enough evidence to assume otherwise.

I was also not paying a lot of attention to what he was saying. After the election ended, I made a conscious decision to tune out of politics to protect my mental health. So it was a low informatio... (read more)

I've also noticed this assumption. I myself don't have it, at all. My first thought has always been something like "If we actually get AGI then preventing terrible outcomes will probably require drastic actions and if anything I have less faith in the US government to take those". Which is a pretty different approach from just assuming that AGI being developed by government will automatically lead to a world with that government's values. But this is a very uncertain take and it wouldn't surprise me if someone smart could change my mind pretty quickly.

These are very poor odds, to the point that they seem to indicate a bullish rather than a bearish position on AI.

If you think the odds of something are p, but lots of other people think they are q with q ≠ p, then the rational action is not to offer bets at a point close to p; it's to find the closest number to q possible. Why would you bet at 1:5 odds if you have reason to believe that some people would be happy to bet at 1:7 odds?
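To spell out the arithmetic (a minimal sketch, reading "1:X odds" as risking $1 to win $X, with the $1 stake purely illustrative): if your subjective probability of winning the bet is p, then

$$\mathrm{EV}_{1:5} = 5p - (1-p) = 6p - 1, \qquad \mathrm{EV}_{1:7} = 7p - (1-p) = 8p - 1,$$

so the 1:7 terms beat the 1:5 terms by 2p for any p > 0. Your own probability only determines whether the bet is worth making at all; given that it is, you want terms as close as possible to what counterparties will accept, not terms that merely match your own belief.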

You could make an argument that this type of thinking is too mercenary/materialistic or whatever, but then critique sh... (read more)

3DAL
This only works if you're the only bookmaker in town. Even if your potential counterparties place their own subjective odds at 1:7, they won't book action with you at 1:7 if they can get 1:5 somewhere else.

Perhaps I misread OP's motivations, but presumably if you're looking to make money on these kinds of forecasts, you'd just trade stocks. Sure, you can't trade OpenAI per se, but there are a lot of closely related assets, and then you're not stuck in the position of trying to collect on a bet you made with a stranger over the internet.

So, the function of offering such a "bet" is more as a signaling device about your beliefs. In which case, the signal being sent here is not really a bearish one.
1Remmelt
This is a neat and specific explanation of how I approached it. I tried to be transparent about it though.

I'm glad METR did this work, and I think their approach is sane and we should keep adding data points to this plot.

It sounds like you also think the current points on the plot are accurate? I would strongly dispute this, for all the reasons discussed here and here. I think you can find sets of tasks where the points fit on an exponential curve, but I don't think AI can do 1 hour worth of thinking on all, or even most, practically relevant questions.

7Cole Wyeth
I remember enjoying that post (perhaps I even linked it somewhere?) and I think it’s probably the case that the inefficiency in task length scaling has to do with LLMs having only a subset of cognitive abilities available. I’m not really committed to a view on that here though. The links don’t seem to prove that the points are “inaccurate.” 

In the last few months, GPT models have undergone a clear shift toward more casual language. They now often close a post by asking a question. I strongly dislike this from both a 'what will this do to the public's perception of LLMs' and 'how is my personal experience as a customer' perspective. Maybe this is the reason to finally take Gemini seriously.

I unfortunately don't think this proves anything relevant. The example just shows that there was one question where the market was very uncertain. This neither tells us how certain the market is in general (that depends on its confidence on other policy questions), nor how good this particular estimate was (that, I would argue, depends on how far along the information chart it was, which is not measurable -- but even putting my pet framework aside, it seems intuitively clear "it was 56% and then it happened" doesn't tell you how much information the market... (read more)

2Ape in the coat
Just curious, how come? Were you simply not paying attention to what he was saying? Or were you not believing in his promises?

Thanks. I've submitted my own post on the 'change our mind form', though I'm not expecting a bounty. I'd instead be interested in making a much bigger bet (bigger than Cole's 100 USD), gonna think about what resolution criterion is best.

I might be misunderstanding how this works, but I don't think I'm gonna win the virtue of The Void anytime soon. Or at all.

9Richard_Kennaway
I got the Void once, just from spinning the wheels, but it doesn't show up on my display of virtues. Apparently I now have a weak upvote strength of 19 and a strong upvote of 103. Similarly for downvotes. But I shall use my powers (short-lived, I'm sure) only for good.
7lc
test
3Sodium
I don't think you're supposed to get the virtue of Void; if you got it, it wouldn't be void anymore, would it?
1williawa
How do you get it? Apparently you can't get it from spinning the boxes.
gustaf*184

Yes, that's what the source code says. There are even two explicit guards against it: "make sure you don't land on an impossible reward" and .filter(r => r.weight > 0).
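For illustration, here is roughly the shape of logic those two guards suggest (a hypothetical sketch, not the actual LessWrong source; the Reward interface and pickVirtue function are made-up names):

```typescript
// Hypothetical sketch: a weighted wheel spin that can never land on a
// zero-weight ("impossible") reward such as the Void virtue.
interface Reward {
  name: string;
  weight: number; // Void would presumably carry weight: 0
}

function pickVirtue(rewards: Reward[]): Reward {
  // Guard 1: drop impossible rewards before sampling.
  const eligible = rewards.filter(r => r.weight > 0);
  if (eligible.length === 0) throw new Error("no eligible rewards");

  // Guard 2: the weighted draw only ever walks over eligible entries,
  // so it cannot "land on an impossible reward".
  const total = eligible.reduce((sum, r) => sum + r.weight, 0);
  let roll = Math.random() * total;
  for (const r of eligible) {
    roll -= r.weight;
    if (roll <= 0) return r;
  }
  return eligible[eligible.length - 1]; // fallback for floating-point edge cases
}
```

Under that structure, a weight of 0 means the Void can never come out of a spin, which matches the behavior described above.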

@habryka has the void virtue, though is missing empiricism. [Image: LessWrong user habryka displaying badges for the virtues of void, argument, evenness, ghiblify, humility, curiosity, lightness, precision, simplicity, scholarship, perfectionism, and relinquishment.]

Maybe he got it prior to the commit introducing the weight: 0 or via direct access to the database.

I didn't find another way of gaining the virtue.

aphyer352

To get the virtue of the Void you need to turn off the gacha game and go touch grass.  If you fail to achieve that, it is futile to protest that you acted with propriety.

Neil 150

no one's getting a million dollars and an invitation to the beisutsukai

If people downvoted because they thought the argument wasn’t useful, fine - but then why did no one say that? Why not critique the focus or offer a counter? What actually happened was silence, followed by downvotes. That’s not rational filtering. That’s emotional rejection.

Yeah, I do not endorse the reaction. The situation pattern-matches to other cases where someone new writes things that are so confusing and all over the place that making them ditch the community (which is often the result of excessive downvoting) is arguably a good thing. But I don'... (read more)

3funnyfranco
I appreciate your response, and I'm sorry about the downvotes you got for seeming supportive. I take your point about getting people to read, but I guess the issue is that the only way you can reliably do that is by being an accepted/popular member of the community. And, as a new member, that would be impossible for me. This would be fine on a high school cheerleading forum, but it seems out of place on a forum that claims to value ideas and reason.

I will still be leaving, but, as a result of this post, I actually have one more post to make. A final final post. And it will not be popular but it will be eye opening. Due to my karma score I can't post it until next Monday, so keep an eye out for it if you're interested.

I think you probably don't have the right model of what motivated the reception. "AGI will lead to human extinction and will be built because of capitalism" seems to me like a pretty mainstream position on LessWrong. In fact I strongly suspect this is exactly what Eliezer Yudkowsky believes. The extinction part has been well-articulated, and the capitalism part is what I would have assumed is the unspoken background assumption. Like, yeah, if we didn't have a capitalist system, then the entire point about profit motives, pride, and race dynamics wouldn't a... (read more)

0funnyfranco
My idea is not mainstream, although I've heard that claim a few times. But whenever I ask people to show me where this argument - that AGI extinction is structurally inevitable due to capitalist competition - has been laid out before, no one can point to anything. What I get instead is vague hand-waving and references to ideas that aren't what I'm arguing. Most people say capitalism makes alignment harder. I'm saying it makes alignment structurally impossible. That's a different claim. And as far as I can tell, a novel one.

If people downvoted because they thought the argument wasn't useful, fine - but then why did no one say that? Why not critique the focus or offer a counter? What actually happened was silence, followed by downvotes. That's not rational filtering. That's emotional rejection.

And if you had read the essay, you'd know it isn't political. I don't blame capitalism in a moral sense. I describe a system, and then I show the consequences that follow from its incentives. Socialism or communism could've built AGI too - just probably slower. The point isn't to attack capitalism. It's to explain how a system optimised for competition inevitably builds the thing that kills us.

So if I understand you correctly: you didn't read the essay, and you're explaining that other people who also didn't read the essay dismissed it as "political" because they didn't read it. Yes. That's exactly my point. Thank you.

if we didn't have a capitalist system, then the entire point about profit motives, pride, and race dynamics wouldn't apply

Presence of many nations without a central authority still contributes to race dynamics.

Sorry, but isn't this written by an LLM? Especially since milan's other comments ([1], [2], [3]) are clearly in a different style, the emotional component goes from 9/10 to 0/10 with no middle ground.

I find this extremely offensive (and I'm kinda hard to offend I think), especially since I've 'cooperated' with milan's wish to point to specific sections in the other comment. LLMs in posts is one thing, but in comments, yuck. It's like, you're not worthy of me even taking the time to respond to you.

The guidelines don't differentiate between posts and comment... (read more)

0milanrosko
QED

The sentence you quoted is a typo; it's meant to say that formal languages are extremely impractical.

1milanrosko
well this is also not true. because "practical" as a predicate... is incomplete.... meaning it's practical depending on who you ask. Talking over "Formal" or "Natural" languages in a general way is very hard... The rule is this: Any reasoning or method is acceptable in mathematics as long as it leads to sound results.

Here's one section that strikes me as very bad

At its heart, we face a dilemma that captures the paradox of a universe so intricately composed, so profoundly mesmerizing, that the very medium on which its poem is written—matter itself—appears to have absorbed the essence of the verse it bears. And that poem, unmistakably, is you—or more precisely, every version of you that has ever been, or ever will be.

I know what this is trying to do but invoking mythical language when discussing consciousness is very bad practice since it appeals to an emotional resp... (read more)

-4milanrosko
I'm actually amused that you criticized the first paragraph of an essay for being written in prose — it says so much about the internet today.  
0milanrosko
"natural languages are extremely impractical, which is why mathematicians don't write any real proofs in them." I have never seen such a blatant disqualifaction of one's self. Why do you think you are able to talk to these subjects if you are not versed in Proof theory? Just type it into chat gpt: Research proof theory, type theory, and Zermelo–Fraenkel set theory with the axiom of choice (ZFC) before making statements here. At the very least, try not to be miserable. Someone who mistakes prose for an argument should not have the privilege of indulging in misery.

I agree that this sounds not very valuable; sounds like a repackaging of illusionism without adding anything. I'm surprised about the votes (didn't vote myself).

-2milanrosko
Illusionism often takes a functionalist or behavioral route: it says that consciousness is not what it seems, and explains it in terms of cognitive architecture or evolved heuristics. That’s valuable, but EN goes further — or perhaps deeper — by grounding the illusion not just in evolutionary utility, but in formal constraints on self-referential systems. In other words: This brings tools like Gödel’s incompleteness, semantic closure, and regulator theory into the discussion in a way that directly addresses why subjective experience feels indubitable even if it's structurally ungrounded. So yes, it may sound like illusionism — but it tries to explain why illusionism is inevitable, not just assert it. That said, I’d genuinely welcome criticism or counterexamples. If it’s just a rebranding, let’s make that explicit. But if there’s a deeper structure here worth exploring, I hope it earns the scrutiny.

The One True Form of Moral Progress (according to me) is using careful philosophical reasoning to figure out what our values should be, what morality consists of, where our current moral beliefs are wrong, or generally, the contents of normativity (what we should and shouldn't do)

Are you interested in hearing other people's answers to these questions (if they think they have them)?

I agree with various comments that the post doesn't represent all the tradeoffs, but I strong-upvoted this because I think the question is legit interesting. It may be that the answer is no for almost everyone, but it's not obvious.

For those who work on Windows, a nice little quality of life improvement for me was just to hide desktop icons and do everything by searching in the task bar. (Would be even better if the search function wasn't so odd.) Been doing this for about two years and like it much more.

Maybe for others, using the desktop is actually worth it, but for me, it was always cluttering up over time, and the annoyance over it not looking the way I want always outweighed the benefits. It really takes barely longer to go CTRL+ESC+"firef"+ENTER than to double click an icon.

1Morpheus
In that case also consider installing PowerToys and pressing Alt+Space to open applications or files (to avoid unhelpful internet searches etc.).
1Dana
I keep some folders (and often some other transient files) on my desktop and pin my main apps to the taskbar. With apps pinned to your taskbar, you can open a new instance with Windows+shift+num (or just Windows+num if the app isn't open yet). I do the same as you and search for any other apps that I don't want to pin.
4Mateusz Bagiński
I have Ubuntu and I also find myself opening apps mostly by searching. I think the only reason I put anything on desktop is to be reminded that these are the things I'm doing/reading at the moment (?).

I don't think I get it. If I read this graph correctly, it seems to say that if you let a human play chess against an engine and want it to achieve equal performance, then the amount of time the human needs to think grows exponentially (as the engine gets stronger). This doesn't make sense if extrapolated downward, but upward it's about what I would expect. You can compensate for skill by applying more brute force, but it becomes exponentially costly, which fits the exponential graph.
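One way to see why the curve should look exponential (a sketch under an assumed regularity, not something the plot itself establishes): suppose each doubling of thinking time buys a roughly constant Elo gain k -- a rule of thumb often quoted for engines at something like 50-100 Elo per doubling, shrinking at long time controls -- and assume something similar holds for a human thinking longer. Then compensating for an Elo deficit D requires scaling thinking time by roughly

$$\frac{t}{t_0} \approx 2^{D/k},$$

which is exponential in the skill gap, i.e., exactly the shape of the graph.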

It's probably not perfect -- I'd worry a lot about strategic mistakes in the opening -- but it seems pretty good. So I don't get how this is an argument against the metric.

2Gunnar_Zarncke
It is a decent metric for chess but a) it doesn't generalize to other tasks (as people seem to interpret the METR paper), and less importantly, b) I'm quite confident that people wouldn't beat the chess engines by thinking for years.
Answer by Rafael Harth2-3

Not answerable because METR is a flawed measure, imho.

Should I not have begun by talking about background information & explaining my beliefs? Should I have assumed the audience had contextual awareness and gone right into talking about solutions? Or was the problem more along the lines of writing quality, tone, or style?

  • What type of post do you like reading?
  • Would it be alright if I asked for an example so that I could read it?

This is a completely wrong way to think about it, imo. A post isn't this thing with inherent terminal value that you can optimize for regardless of content.

If you think you have an i... (read more)

1Oxidize
Sounds like you're speaking from a set of fundamentally different beliefs than I'm used to. I've trained myself to write assuming that the audience is uninformed about the topic I'm writing about. But it sounds like you're writing from the perspective of the LW community being more informed than I can properly understand or conceptualize. How can I gain more information on the flow of information in the LessWrong community? I assumed any insights I've arrived at as a consequence of my own thinking & conclusions I've reached from various unconnected sources would likely be insights specific to me, but maybe I'm wrong.

But yeah, I agree with you that just wanting to write something does not sound like a good place to start to be value-additive to this community. I'll remember to only post when I believe I have valuable and unique insights to share.
Rafael HarthΩ6146

I really don't think this is a reasonable measure for ability to do long term tasks, but I don't have the time or energy to fight this battle, so I'll just register my prediction that this paper is not going to age well.

To I guess offer another data point, I've had an obsessive nail-removing[1] habit for about 20 years. I concur that it can happen unconsciously; however, noticing it seems to me like 10-20% of the problem; the remaining 80-90% is resisting the urge to follow the habit when you do notice. (As for enjoying it, I think technically yeah, but it's for such a short amount of time that it's never worth it. Maybe if you just gave in and were constantly biting instead of trying to resist for as long as possible, it'd be different.) I also think I've solved the notici... (read more)

Oh, nice! The fact that you didn't make the time explicit in the post made me suspect that it was probably much shorter. But yeah, six months is long enough, imo.

3Rafka
I edited the intro to make this clearer, thanks. 

I would highly caution declaring victory too early. I don't know for how long you think you've overcome the habit, but unless it's at least three months, I think you're being premature.

Rafka230

That’s why I waited six months before publishing the post :)

A larger number of people, I think, desperately desperately want LLMs to be a smaller deal than what they are.

Can confirm that I'm one of these people (and yes, I worry a lot about this clouding my judgment).

Again, those are theories of consciousness, not definitions of consciousness.

I would agree that people who use consciousness to denote the computational process vs. the fundamental aspect generally have different theories of consciousness, but they're also using the term to denote two different things.

(I think this is bc consciousness is notably different from other phenomena -- e.g., fiber decreasing risk of heart disease -- where the phenomenon is relatively uncontroversial and only the theory about how the phenomenon is explained is up for debate. With ... (read more)

2TAG
Even if they are using the term to denote two different things, they can agree on connotation. Meaning isn't exhausted by denotation (AKA reference, extension). Semantic differences are ubiquitous, but so are differences in background assumptions, in presumed ontology.
2TAG
But that doesn't imply that they disagree about (all of) the meaning of the term "qualia"... since denotation (extension, reference) doesn't exhaust meaning. The other thing is connotation, AKA intension, AKA sense. https://en.m.wikipedia.org/wiki/Sense_and_reference

Everyone can understand that the qualia are, minimally, things like the-way-a-tomato-seems-to-you, so that's agreement on sense, and the disagreement on whether the referent is "physical property", "nonphysical property", "information processing", etc, arises from different theoretical stances.

That's an odd use of "phenomenon"... the physical nature of a heart attack is uncontroversial, and the controversy is about the physical cause. Whereas with qualia, they are phenomenal properly speaking... they are appearances... and yet lack a prima facie interpretation in physical (or information theoretic) terms. Since qualia do present themselves immediately as phenomenal, outright denial... feigning anaesthesia or zombiehood... is a particularly poor response to the problem. And the problem is different to "how does one physical event cause another one that is subsequent in time"... it's more like "how or whether qualia, phenomenal consciousness, supervene synchronously on brain states".

If you don't like the terminology, you can invent better terminology. Throughout this exchange, you have been talking in terms of "consciousness", and I have been replying in terms of "qualia", because "qualia" is a term that was invented to hone in on the problem, on the aspects of consciousness where it isn't obviously just information processing. (I'm personally OK with using information theoretic explanations, such as global workspace theory, to address Easy Problem issues, such as Access Consciousness.)

There's a lot to be said for addressing terminological issues, but it's not an easy win for camp #1.

I think the ability to autonomously find novel problems to solve will emerge as reasoning models scale up. It will emerge because it is instrumental to solving difficult problems.

This of course is not a sufficient reason. (Demonstration: telepathy will emerge [as evolution improves organisms] because it is instrumental to navigating social situations.) It being instrumental means that there is an incentive -- or to be more precise, a downward slope in the loss function toward areas of model space with that property -- which is one required piece, but it... (read more)

Rafael HarthΩ392

Instead of "have LLMs generated novel insights", how about "have LLMs demonstrated the ability to identify which views about a non-formal topic make more or less sense?" This question seems easier to operationalize and I suspect points at a highly related ability.

Fwiw this is the kind of question that has definitely been answered in the training data, so I would not count this as an example of reasoning.

2Yair Halberstadt
I expected so, which is why I was surprised they didn't get it.

I'm just not sure the central claim, that rationalists underestimate the role of luck in intelligence, is true. I've never gotten that impression. At least my assumption going into reading this was already that intelligence was probably 80-90% unearned.

Humans must have gotten this ability from somewhere and it's unlikely the brain has tons of specialized architecture for it.

This is probably a crux; I think the brain does have tons of specialized architecture for it, and if I didn't believe that, I probably wouldn't think thought assessment was as difficult.

The thought generator seems more impressive/fancy/magic-like to me.

Notably people's intuitions about what is impressive/difficult tend to be inversely correlated with reality. The stereotype is (or at least used to be) that AI will be good at ra... (read more)

5Noosphere89
I think this is also a crux. IMO, I think the brain is mostly cortically uniform, ala Steven Byrnes, and in particular I think that the specialized architecture for thought assessment was pretty minimal. The big driver of human success is basically something like the bitter lesson applied to biological brains, combined with humans being very well optimized for tool use, such that they can over time develop technology that is used to dominate the world (it's also helpful that humans can cooperate reasonably below 100 people, which is more than almost all social groups, though I've become much more convinced that cultural learning is way less powerful than Henrich et al have said). (There are papers which show that humans are better at scaling neurons than basically everyone else, but I can't find them right now).

Whether or not every interpretation needs a way to connect measurements to conscious experiences, or whether they need extra machinery?

If we're being extremely pedantic, then KC is about predicting conscious experience (or sensory input data, if you're an illusionist; one can debate what the right data type is). But this only matters for discussing things like Boltzmann brains. As soon as you assume that there exists an external universe, you can forget about your personal experience and just try to estimate the length of the program that runs the univ... (read more)

3Pekka Puupaa
Thank you, this has been a very interesting conversation so far. I originally started writing a much longer reply explaining my position on the interpretation of QM in full, but realized that the explanation would grow so long that it would really need to be its own post. So instead, I'll just make a few shorter remarks. Sorry if these sound a bit snappy. And if one assumes an external universe evolving according to classical laws, the Bohmian interpretation has the lowest KC. If you're going to be baking extra assumptions into your theory, why not go all the way? An interpretation is still a program. All programs have a KC (although it is usually ill-defined). Ultimately I don't think it matters whether we call these objects we're studying theories or interpretations. Has nothing to do with how the universe operates, as I see it. If you'd like, I think we can cast Copenhagen into a more Many Worlds -like framework by considering Many Imaginary Worlds. This is an interpretation, in my opinion functionally equivalent to Copenhagen, where the worlds of MWI are assumed to represent imaginary possibilities rather than real universes. The collapse postulate, then, corresponds to observing that you inhabit a particular imaginary world -- observing that that world is real for you at the moment. By contrast, in ordinary MWI, all worlds are real, and observation simply reduces your uncertainty as to which observer (and in which world) you are. If we accept the functional equivalence between Copenhagen and MIWI, this gives us an upper bound on the KC of Copenhagen. It is at most as complex as MWI. I would argue less. I think we need to distinguish between "playing skill" and "positional evaluation skill". It could be said that DeepBlue is dumber than Kasparov in the sense of being worse at evaluating any given board position than him, while at the same time being a vastly better player than Kasparov simply because it evaluates exponentially more positions. If you know

The reason we can expect Copenhagen-y interpretations to be simpler than other interpretations is because every other interpretation also needs a function to connect measurements to conscious experiences, but usually requires some extra machinery in addition to that.

I don't believe this is correct. But I separately think that it being correct would not make DeepSeek's answer any better. Because that's not what it said, at all. A bad argument does not improve because there exists a different argument that shares the same conclusion.

3Pekka Puupaa
Which part do you disagree with? Whether or not every interpretation needs a way to connect measurements to conscious experiences, or whether they need extra machinery? If the former: you need some way to connect the formalism to conscious experiences, since that's what an interpretation is largely for. It needs to explain how the classical world of your conscious experience is connected to the mathematical formalism. This is true for any interpretation. If you're saying that many worlds does not actually need any extra machinery, I guess the most reasonable way to interpret that in my framework is to say that the branching function is a part of the experience function. I suppose this might correspond to what I've heard termed the Many Minds interpretation, but I don't understand that one in enough detail to say.   Let an argument A be called "steelmannable" if there exists a better argument S with a similar structure and similar assumptions (according to some metric of similarity) that proves the same conclusion as the original argument A. Then S is called a "steelman" of A. It is clear that not all bad arguments are steelmannable. I think it is reasonable to say that steelmannable bad arguments are less nonsensical than bad arguments that are not steelmannable. So the question becomes: can my argument be viewed as a steelman of DeepSeek's argument? I think so. You probably don't. However, since everybody understands their own arguments quite well, ceteris paribus it should be expected that I am more likely to be correct about the relationship between my argument and DeepSeek's in this case. ... Or at least, that would be so if I didn't have an admitted tendency to be too lenient in interpreting AI outputs. Nonetheless, I am not objecting to the claim that DeepSeek's argument is weak, but to the claim that it is nonsense. We can both agree that DeepSeek's argument is not great. But I see glimmers of intelligence in it. And I fully expect that soon we will ha

Here's my take; not a physicist.

So in general, what DeepSeek says here might align better with intuitive complexity, but the point of asking about Kolmogorov Complexity rather than just Occam's Razor is that we're specifically trying to look at formal description length and not intuitive complexity.
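For concreteness, the standard definition being appealed to (stated up to the usual additive, machine-dependent constant):

$$K_U(x) = \min\{\, |p| : U(p) = x \,\},$$

the length of the shortest program p that makes a fixed universal machine U output x. So "simpler" here means "shorter program", not "fewer or more familiar concepts".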

Many Worlds does not need extra complexity to explain the branching. The branching happens due to the part of the math that all theories agree on. (In fact, I think a more accurate statement is that the branching is a description of what the math does.)

Then ther... (read more)

5Pekka Puupaa
I am also not a physicist, so perhaps I've misunderstood. I'll outline my reasoning.

An interpretation of quantum mechanics does two things: (1) defines what parts of our theory, if any, are ontically "real" and (2) explains how our conscious observations of measurement results are related to the mathematical formalism of QM. The Kolmogorov complexity of different interpretations cannot be defined completely objectively, as DeepSeek also notes. But broadly speaking, defining KC "sanely", it ought to be correlated with a kind of "Occam's razor for conceptual entities", or more precisely, "Occam's razor over defined terms and equations".

I think Many Worlds is more conceptually complex than Copenhagen. But I view Copenhagen as a catchall term for a category of interpretations that also includes QBism and Rovelli's RQM. Basically, these are "observer-dependent" interpretations. I myself subscribe to QBism, but I view it as a more rigorous formulation of Copenhagen.

So, why should we think Many Worlds is more conceptually complex? Copenhagen is the closest we can come to a "shut up and calculate" interpretation. Pseudomathematically, we can say

Copenhagen ~= QM + "simple function connecting measurements to conscious experiences"

The reason we can expect Copenhagen-y interpretations to be simpler than other interpretations is because every other interpretation *also* needs a function to connect measurements to conscious experiences, but usually requires some extra machinery in addition to that.

Now I maybe don't understand MWI correctly. But as I understand it, what QM mathematically gives you is more like a chaotic flux of possibilities, rather than the kind of branching tree of self-consistent worldlines that MWI requires. The way you split up the quantum state into branches constitutes extra structure on top of QM. Thus:

Many Worlds ~= QM + "branching function" + "simple function connecting measurements to conscious experiences"

So it seems that MWI ought to

[...] I personally wouldn’t use the word ‘sequential’ for that—I prefer a more vertical metaphor like ‘things building upon other things’—but that’s a matter of taste I guess. Anyway, whatever we want to call it, humans can reliably do a great many steps, although that process unfolds over a long period of time.

…And not just smart humans. Just getting around in the world, using tools, etc., requires giant towers of concepts relying on other previously-learned concepts.

As a clarification for anyone wondering why I didn't use a framing more like this i... (read more)

It's not clear to me that a human, using their brain and a go board for reasoning, could beat AlphaZero even if you give them infinite time.

I agree but I dispute that this example is relevant. I don't think there is any step in between "start walking on two legs" to "build a spaceship" that requires as much strictly-type-A reasoning as beating AlphaZero at go or chess. This particular kind of capability class doesn't seem to me to be very relevant.

Also, to the extent that it is relevant, a smart human with infinite time could outperform AlphaGo by progr... (read more)

I do think the human brain uses two very different algorithms/architectures for thought generation and assessment. But this falls within the "things I'm not trying to justify in this post" category. I think if you reject the conclusion based on this, that's completely fair. (I acknowledged in the post that the central claim has a shaky foundation. I think the model should get some points because it does a good job retroactively predicting LLM performance -- like, why LLMs aren't already superhuman -- but probably not enough points to convince anyone.)

I don't think a doubling every 4 or 6 months is plausible. I don't think a doubling over any fixed time period is plausible, because I don't think overall progress will be exponential. I think you could have exponential progress on thought generation, but this won't yield exponential progress on performance. That's what I was trying to get at with this paragraph:

My hot take is that the graphics I opened the post with were basically correct in modeling thought generation. Perhaps you could argue that progress wasn't quite as fast as the most extreme versions predic

... (read more)
3Asta7k
Are you aware of the recent METR paper, which measured AI Ability to Complete Long Tasks and found that it doubles every 7 months?

I expect difficulty to grow exponentially with argument length. (Based on stuff like it constantly having to go back and double-check even when it got something right.)

Training of DeepSeek-R1 doesn't seem to do anything at all to incentivize shorter reasoning traces, so it's just rechecking again and again because why not. Like if you are taking an important 3 hour written test, and you are done in 1 hour, it's prudent to spend the remaining 2 hours obsessively verifying everything.

This is true but I don't think it really matters for eventual performance. If someone thinks about a problem for a month, the number of times they went wrong on reasoning steps during the process barely influences the eventual output. Maybe they take a little longer. But essentially performance is relatively insensitive to errors if the error-correcting mechanism is reliable.

I think this is actually a reason why most benchmarks are misleading (humans make mistakes there, and they influence the rating).

If thought assessment is as hard as thought generation and you need a thought assessor to get AGI (two non-obvious conditionals), then how do you estimate the time to develop a thought assessor? From which point on do you start to measure the amount of time it took to come up with the transformer architecture?

The snappy answer would be "1956 because that's when AI started; it took 61 years to invent the transformer architecture that led to thought generation, so the equivalent insight for thought assessment will take about 61 years". I don't think that's the correct answer, but neither is "2019 because that's when AI first kinda resembled AGI".

5Dirichlet-to-Neumann
The transformer architecture was basically developed as soon as we got the computational power to make it useful. If a thought assessor is required and we are aware of the problem, and we have literally billions in funding to make it happen, I don't expect this to be that hard. 
AnthonyC145

Keep in mind that we're now at the stage of "Leading AI labs can raise tens to hundreds of billions of dollars to fund continued development of their technology and infrastructure." AKA in the next couple of years we'll see AI investment comparable to or exceeding the total that has ever been invested in the field. Calendar time is not the primary metric, when effort is scaling this fast.

A lot of that next wave of funding will go to physical infrastructure, but if there is an identified research bottleneck, with a plausible claim to being the major bottlen... (read more)

7Davidmanheim
Transformers work for many other tasks, and it seems incredibly likely to me that the expressiveness includes not only game playing, vision, and language, but also other things the brain does. And to bolster this point, the human brain doesn't use two completely different architectures! So I'll reverse the question; why do you think the thought assessor is fundamentally different from other neural functions that we know transformers can do? 

I generally think that [autonomous actions due to misalignment] and [human misuse] are distinct categories with pretty different properties. The part you quoted addresses the former (as does most of the post). I agree that there are scenarios where the second is feasible and the first isn't. I think you could sort of argue that this falls under AIs enhancing human intelligence.

So, I agree that there has been substantial progress in the past year, hence the post title. But I think if you naively extrapolate that rate of progress, you get around 15 years.

The problem with the three examples you've mentioned is again that they're all comparing human cognitive work across a short amount of time with AI performance. I think the relevant scale doesn't go from 5th grade performance over 8th grade performance to university-level performance or whatever, but from "what a smart human can do in 5 minutes" over "what a smart human can do in ... (read more)

4Davidmanheim
As I said in my top level comment, I don't see a reason to think that once the issue is identified as the key barrier, work on addressing it would be so slow.

I think if you look at "horizon length"---at what task duration (in terms of human completion time) do the AIs get the task right 50% of the time---the trends will indicate doubling times of maybe 4 months (though 6 months is plausible). Let's say 6 months more conservatively. I think AIs are at like 30 minutes on math? And 1 hour on software engineering. It's a bit unclear, but let's go with that. Then, to get to 64 hours on math, we'd need 7 doublings = 3.5 years. So, I think the naive trend extrapolation is much faster than you think? (And this estimate strikes me as conservative at least for math IMO.)
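Spelling out the arithmetic in that extrapolation (same numbers as in the comment above):

$$30\ \text{min} \times 2^{7} = 3840\ \text{min} = 64\ \text{h}, \qquad 7 \times 6\ \text{months} = 42\ \text{months} = 3.5\ \text{years}.$$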

I don't think the experience of no-self contradicts any of the above.

In general, I think you could probably make some factual statements about the nature of consciousness that's true and that you learn from attaining no-self, if you phrased it very carefully, but I don't think that's the point.

The way I'd phrase what happens would be mostly in terms of attachment. You don't feel as implicated by things that affect you anymore, you have less anxiety, that kind of thing. I think a really good analogy is just that regular consciousness starts to resemble consciousness during a flow state.

I would have been shocked if twin sisters cared equally about nieces and kids. Genetic similarity is one factor, not the entire story.

3Ustice
I agree. I'm not a twin, but I am a parent, and I have a nephew, and my son has a stepsister who has called me Uncle Jason since she could talk. I don't feel closer to my nephew than I am with my "niece." I normally wouldn't make a distinction based on genetics, except that it is relevant here. I'm not closer with my sister's kids than I am with the other two.

Also, I'm not sure closeness is really even a good distinction. I'm not generally responsible for my niece or nephew, but if they or my son needed me to travel across the country to rescue them from some bad situation, I'd do it. I love those kids.

Being responsible for a child may present as being closer to them. So does spending a lot of time with a child. One could argue that these are two aspects of closeness. Neither of those things has anything to do with genetics. Personality can be a huge factor in closeness too, and there is a huge variation in personality, even amongst identical twins.

Genetics seems only tangentially related to closeness, and mostly because the vast majority of children are genetically related to their parents. Family is complex, and often has more to do with shared history than anything else.