No, I definitely think thought assessment has more to it than just attention. In fact, I think you could argue that LLMs' attention equivalent is already more powerful/accurate than human attention.
The only evidence I can provide at this point is the similarity of LLMs to humans who don't pay attention (as first observed in Sarah's post that I linked in the text). If you want to reject the post based on the lack of evidence for this claim, I think that's fair.
I think we have specialized architectures for consciously assessing thoughts, whereas LLMs do the equivalent of rattling off the first thing that comes to mind, and reasoning models do the equivalent of repeatedly feeding back what comes to mind into the input (and rattling off the first thing that comes to mind for that input).
Trump says a lot of stuff that he doesn't do, the set of specific things that presidents don't do is larger than the set of things they do, and tariffs didn't even seem like they'd be super popular with his base if in fact they were implemented. So "~nothing is gonna happen wrt tariffs" seemed like the default outcome with not enough evidence to assume otherwise.
I was also not paying a lot of attention to what he was saying. After the election ended, I made a conscious decision to tune out of politics to protect my mental health. So it was a low informatio...
I've also noticed this assumption. I myself don't have it, at all. My first thought has always been something like "If we actually get AGI then preventing terrible outcomes will probably require drastic actions and if anything I have less faith in the US government to take those". Which is a pretty different approach from just assuming that AGI being developed by the government will automatically lead to a world with that government's values. But this is a very uncertain take and it wouldn't surprise me if someone smart could change my mind pretty quickly.
These are very poor odds, to the point that they seem to indicate a bullish rather than a bearish position on AI.
If you think the odds of something are $p$, but lots of other people think they are $q$ with $q \neq p$, then the rational action is not to offer bets at a point close to $p$; it's to find the closest number to $q$ possible. Why would you bet at 1:5 odds if you have reason to believe that some people would be happy to bet at 1:7 odds?
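To make that concrete with illustrative numbers (mine, not from the original comment): suppose you put the probability of the event at $p = 0.5$, and counterparties are happy to take the other side at either set of odds. Betting one unit gives

$$\mathbb{E}[\text{payoff at } 1{:}7] = 0.5 \cdot 7 - 0.5 \cdot 1 = 3, \qquad \mathbb{E}[\text{payoff at } 1{:}5] = 0.5 \cdot 5 - 0.5 \cdot 1 = 2,$$

so the odds closest to the counterparty's beliefs (and furthest from yours) are the ones worth offering.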
You could make an argument that this type of thinking is too mercenary/materialistic or whatever, but then critique sh...
I'm glad METR did this work, and I think their approach is sane and we should keep adding data points to this plot.
It sounds like you also think the current points on the plot are accurate? I would strongly dispute this, for all the reasons discussed here and here. I think you can find sets of tasks where the points fit on an exponential curve, but I don't think AI can do 1 hour worth of thinking on all, or even most, practically relevant questions.
In the last few months, GPT models have undergone a clear shift toward more casual language. They now often close a post by asking a question. I strongly dislike this, both from a 'what will this do to the public's perception of LLMs' perspective and from a 'how does this affect my personal experience as a customer' perspective. Maybe this is the reason to finally take Gemini seriously.
I unfortunately don't think this proves anything relevant. The example just shows that there was one question where the market was very uncertain. This neither tells us how certain the market is in general (that depends on its confidence on other policy questions), nor how good this particular estimate was (that, I would argue, depends on how far along the information chart it was, which is not measurable -- but even putting my pet framework aside, it seems intuitively clear that "it was 56% and then it happened" doesn't tell you how much information the market...
Thanks. I've submitted my own post on the 'change our mind form', though I'm not expecting a bounty. I'd instead be interested in making a much bigger bet (bigger than Cole's 100 USD); I'm gonna think about what resolution criterion is best.
Yes, that's what the source code says. There are even two explicit guards against it: "make sure you don't land on an impossible reward" and .filter(r => r.weight > 0).
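A minimal sketch of the kind of guard in question, assuming a standard weighted-draw setup (hypothetical code; the Reward shape and drawReward helper are made up, this is not the actual LessWrong source):

```typescript
// Hypothetical sketch, not the actual LessWrong source. Zero-weight rewards
// are dropped before the weighted draw, so the draw can never land on an
// "impossible" reward.
interface Reward {
  name: string;
  weight: number;
}

function drawReward(rewards: Reward[], rng: () => number = Math.random): Reward {
  // Guard: make sure you don't land on an impossible reward.
  const eligible = rewards.filter(r => r.weight > 0);
  if (eligible.length === 0) throw new Error("no eligible rewards");

  // Standard weighted sampling over the remaining rewards.
  const total = eligible.reduce((sum, r) => sum + r.weight, 0);
  let roll = rng() * total;
  for (const r of eligible) {
    roll -= r.weight;
    if (roll <= 0) return r;
  }
  return eligible[eligible.length - 1]; // guard against floating-point drift
}
```

With that structure, the draw itself can never hand out a zero-weight virtue; it would have to come from somewhere else.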
@habryka has the void virtue, though he is missing empiricism.
Maybe he got it prior to the commit introducing the weight: 0 or via direct access to the database.
I didn't find another way of gaining the virtue.
To get the virtue of the Void you need to turn off the gacha game and go touch grass. If you fail to achieve that, it is futile to protest that you acted with propriety.
no one's getting a million dollars and an invitation to the beisutsukai
Yeah, valid correction.
If people downvoted because they thought the argument wasn’t useful, fine - but then why did no one say that? Why not critique the focus or offer a counter? What actually happened was silence, followed by downvotes. That’s not rational filtering. That’s emotional rejection.
Yeah, I do not endorse the reaction. The situation pattern-matches to other cases where someone new writes things that are so confusing and all over the place that making them ditch the community (which is often the result of excessive downvoting) is arguably a good thing. But I don'...
I think you probably don't have the right model of what motivated the reception. "AGI will lead to human extinction and will be built because of capitalism" seems to me like a pretty mainstream position on LessWrong. In fact I strongly suspect this is exactly what Eliezer Yudkowsky believes. The extinction part has been well-articulated, and the capitalism part is what I would have assumed is the unspoken background assumption. Like, yeah, if we didn't have a capitalist system, then the entire point about profit motives, pride, and race dynamics wouldn't a...
if we didn't have a capitalist system, then the entire point about profit motives, pride, and race dynamics wouldn't apply
Presence of many nations without a central authority still contributes to race dynamics.
Sorry, but isn't this written by an LLM? Especially since milan's other comments ([1], [2], [3]) are clearly in a different style: the emotional component goes from 9/10 to 0/10 with no middle ground.
I find this extremely offensive (and I'm kinda hard to offend I think), especially since I've 'cooperated' with milan's wish to point to specific sections in the other comment. LLMs in posts is one thing, but in comments, yuck. It's like, you're not worthy of me even taking the time to respond to you.
The guidelines don't differentiate between posts and comment...
The sentence you quoted contains a typo; it's meant to say that formal languages are extremely impractical.
Here's one section that strikes me as very bad
At its heart, we face a dilemma that captures the paradox of a universe so intricately composed, so profoundly mesmerizing, that the very medium on which its poem is written—matter itself—appears to have absorbed the essence of the verse it bears. And that poem, unmistakably, is you—or more precisely, every version of you that has ever been, or ever will be.
I know what this is trying to do but invoking mythical language when discussing consciousness is very bad practice since it appeals to an emotional resp...
I agree that this sounds not very valuable; sounds like a repackaging of illusionism without adding anything. I'm surprised about the votes (didn't vote myself).
The One True Form of Moral Progress (according to me) is using careful philosophical reasoning to figure out what our values should be, what morality consists of, where our current moral beliefs are wrong, or generally, the contents of normativity (what we should and shouldn't do)
Are you interested in hearing other people's answers to these questions (if they think they have them)?
I agree with various comments that the post doesn't represent all the tradeoffs, but I strong-upvoted this because I think the question is legit interesting. It may be that the answer is no for almost everyone, but it's not obvious.
For those who work on Windows, a nice little quality of life improvement for me was just to hide desktop icons and do everything by searching in the task bar. (Would be even better if the search function wasn't so odd.) Been doing this for about two years and like it much more.
Maybe for others, using the desktop is actually worth it, but for me, it was always cluttering up over time, and the annoyance over it not looking the way I want always outweighed the benefits. It really takes barely longer to go CTRL+ESC+"firef"+ENTER than to double click an icon.
I don't think I get it. If I read this graph correctly, it seems to say that if you let a human play chess against an engine and want it to achieve equal performance, then the amount of time the human needs to think grows exponentially (as the engine gets stronger). This doesn't make sense if extrapolated downward, but upward it's about what I would expect. You can compensate for skill by applying more brute force, but it becomes exponentially costly, which fits the exponential graph.
It's probably not perfect -- I'd worry a lot about strategic mistakes in the opening -- but it seems pretty good. So I don't get how this is an argument against the metric.
Not answerable because METR is a flawed measure, imho.
Should I not have begun by talking about background information & explaining my beliefs? Should I have assumed the audience had contextual awareness and gone right into talking about solutions? Or was the problem more along the lines of writing quality, tone, or style?
- What type of post do you like reading?
- Would it be alright if I asked for an example so that I could read it?
This is a completely wrong way to think about it, imo. A post isn't this thing with inherent terminal value that you can optimize for regardless of content.
If you think you have an i...
I really don't think this is a reasonable measure for ability to do long term tasks, but I don't have the time or energy to fight this battle, so I'll just register my prediction that this paper is not going to age well.
To I guess offer another data point, I've had an obsessive nail-removing[1] habit for about 20 years. I concur that it can happen unconsciously; however, noticing it seems to me like 10-20% of the problem; the remaining 80-90% is resisting the urge to follow the habit when you do notice. (As for enjoying it, I think technically yeah, but it's for such a short amount of time that it's never worth it. Maybe if you just gave in and were constantly biting instead of trying to resist for as long as possible, it'd be different.) I also think I've solved the notici...
Oh, nice! The fact that you didn't make the time explicit in the post made me suspect that it was probably much shorter. But yeah, six months is long enough, imo.
I would highly caution declaring victory too early. I don't know for how long you think you've overcome the habit, but unless it's at least three months, I think you're being premature.
That’s why I waited six months before publishing the post :)
A larger number of people, I think, desperately desperately want LLMs to be a smaller deal than what they are.
Can confirm that I'm one of these people (and yes, I worry a lot about this clouding my judgment).
Again, those are theories of consciousness, not definitions of consciousness.
I would agree that people who use consciousness to denote the computational process vs. the fundamental aspect generally have different theories of consciousness, but they're also using the term to denote two different things.
(I think this is because consciousness is notably different from other phenomena -- e.g., fiber decreasing risk of heart disease -- where the phenomenon is relatively uncontroversial and only the theory about how the phenomenon is explained is up for debate. With ...
I think the ability to autonomously find novel problems to solve will emerge as reasoning models scale up. It will emerge because it is instrumental to solving difficult problems.
This of course is not a sufficient reason. (Demonstration: telepathy will emerge [as evolution improves organisms] because it is instrumental to navigating social situations.) It being instrumental means that there is an incentive -- or to be more precise, a downward slope in the loss function toward areas of model space with that property -- which is one required piece, but it...
Instead of "have LLMs generated novel insights", how about "have LLMs demonstrated the ability to identify which views about a non-formal topic make more or less sense?" This question seems easier to operationalize and I suspect points at a highly related ability.
Fwiw this is the kind of question that has definitely been answered in the training data, so I would not count this as an example of reasoning.
I'm just not sure the central claim, that rationalists underestimate the role of luck in intelligence, is true. I've never gotten that impression. At least my assumption going into reading this was already that intelligence was probably 80-90% unearned.
Humans must have gotten this ability from somewhere and it's unlikely the brain has tons of specialized architecture for it.
This is probably a crux; I think the brain does have tons of specialized architecture for it, and if I didn't believe that, I probably wouldn't think thought assessment was as difficult.
The thought generator seems more impressive/fancy/magic-like to me.
Notably, people's intuitions about what is impressive/difficult tend to be inversely correlated with reality. The stereotype is (or at least used to be) that AI will be good at ra...
Whether or not every interpretation needs a way to connect measurements to conscious experiences, or whether they need extra machinery?
If we're being extremely pedantic, then KC is about predicting conscious experience (or sensory input data, if you're an illusionist; one can debate what the right data type is). But this only matters for discussing things like Boltzmann brains. As soon as you assume that there exists an external universe, you can forget about your personal experience and just try to estimate the length of the program that runs the univ...
The reason we can expect Copenhagen-y interpretations to be simpler than other interpretations is because every other interpretation also needs a function to connect measurements to conscious experiences, but usually requires some extra machinery in addition to that.
I don't believe this is correct. But I separately think that it being correct would not make DeepSeek's answer any better. Because that's not what it said, at all. A bad argument does not improve because there exists a different argument that shares the same conclusion.
Here's my take; not a physicist.
So in general, what DeepSeek says here might align better with intuitive complexity, but the point of asking about Kolmogorov Complexity rather than just Occam's Razor is that we're specifically trying to look at formal description length and not intuitive complexity.
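For reference, the formal quantity in play is just minimal program length: for a fixed universal machine $U$,

$$K_U(x) = \min\{\, |p| : U(p) = x \,\},$$

i.e., the length in bits of the shortest program that outputs $x$. 'Intuitive complexity' doesn't appear anywhere in the definition.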
Many Worlds does not need extra complexity to explain the branching. The branching happens due to the part of the math that all theories agree on. (In fact, I think a more accurate statement is that the branching is a description of what the math does.)
Then ther...
[...] I personally wouldn’t use the word ‘sequential’ for that—I prefer a more vertical metaphor like ‘things building upon other things’—but that’s a matter of taste I guess. Anyway, whatever we want to call it, humans can reliably do a great many steps, although that process unfolds over a long period of time.
…And not just smart humans. Just getting around in the world, using tools, etc., requires giant towers of concepts relying on other previously-learned concepts.
As a clarification for anyone wondering why I didn't use a framing more like this i...
It's not clear to me that a human, using their brain and a go board for reasoning, could beat AlphaZero even if you give them infinite time.
I agree but I dispute that this example is relevant. I don't think there is any step in between "start walking on two legs" to "build a spaceship" that requires as much strictly-type-A reasoning as beating AlphaZero at go or chess. This particular kind of capability class doesn't seem to me to be very relevant.
Also, to the extent that it is relevant, a smart human with infinite time could outperform AlphaGo by progr...
I do think the human brain uses two very different algorithms/architectures for thought generation and assessment. But this falls within the "things I'm not trying to justify in this post" category. I think if you reject the conclusion based on this, that's completely fair. (I acknowledged in the post that the central claim has a shaky foundation. I think the model should get some points because it does a good job retroactively predicting LLM performance -- like, why LLMs aren't already superhuman -- but probably not enough points to convince anyone.)
I don't think a doubling every 4 or 6 months is plausible. I don't think a doubling on any fixed timescale is plausible, because I don't think overall progress will be exponential. I think you could have exponential progress on thought generation, but this won't yield exponential progress on performance. That's what I was trying to get at with this paragraph:
...My hot take is that the graphics I opened the post with were basically correct in modeling thought generation. Perhaps you could argue that progress wasn't quite as fast as the most extreme versions predic
I expect difficulty to grow exponentially with argument length. (Based on stuff like it constantly having to go back and double-check even when it got something right.)
Training of DeepSeek-R1 doesn't seem to do anything at all to incentivize shorter reasoning traces, so it's just rechecking again and again because why not. Like if you are taking an important 3 hour written test, and you are done in 1 hour, it's prudent to spend the remaining 2 hours obsessively verifying everything.
This is true but I don't think it really matters for eventual performance. If someone thinks about a problem for a month, the number of times they went wrong on reasoning steps during the process barely influences the eventual output. Maybe they take a little longer. But essentially performance is relatively insensitive to errors if the error-correcting mechanism is reliable.
I think this is actually a reason why most benchmarks are misleading (humans make mistakes there, and they influence the rating).
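A toy version of the error-correction point (my stylization, not something from the original exchange): if each reasoning step goes wrong with probability $p$ and nothing catches the errors, an $n$-step argument survives with probability $(1-p)^n$, which decays exponentially in $n$. If errors are reliably caught and the step is simply redone, each step costs $\tfrac{1}{1-p}$ attempts on average, so unreliable steps only add a constant factor in time while the final answer stays correct.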
If thought assessment is as hard as thought generation and you need a thought assessor to get AGI (two non-obvious conditionals), then how do you estimate the time to develop a thought assessor? From which point on do you start to measure the amount of time it took to come up with the transformer architecture?
The snappy answer would be "1956 because that's when AI started; it took 61 years to invent the transformer architecture that led to thought generation, so the equivalent insight for thought assessment will take about 61 years". I don't think that's the correct answer, but neither is "2019 because that's when AI first kinda resembled AGI".
Keep in mind that we're now at the stage of "Leading AI labs can raise tens to hundreds of billions of dollars to fund continued development of their technology and infrastructure." AKA in the next couple of years we'll see AI investment comparable to or exceeding the total that has ever been invested in the field. Calendar time is not the primary metric, when effort is scaling this fast.
A lot of that next wave of funding will go to physical infrastructure, but if there is an identified research bottleneck, with a plausible claim to being the major bottlen...
I generally think that [autonomous actions due to misalignment] and [human misuse] are distinct categories with pretty different properties. The part you quoted addresses the former (as does most of the post). I agree that there are scenarios where the second is feasible and the first isn't. I think you could sort of argue that this falls under AIs enhancing human intelligence.
So, I agree that there has been substantial progress in the past year, hence the post title. But I think if you naively extrapolate that rate of progress, you get around 15 years.
The problem with the three examples you've mentioned is again that they're all comparing human cognitive work across a short amount of time with AI performance. I think the relevant scale doesn't go from 5th grade performance over 8th grade performance to university-level performance or whatever, but from "what a smart human can do in 5 minutes" over "what a smart human can do in ...
I think if you look at "horizon length"---at what task duration (in terms of human completion time) do the AIs get the task right 50% of the time---the trends will indicate doubling times of maybe 4 months (though 6 months is plausible). Let's say 6 months more conservatively. I think AIs are at like 30 minutes on math? And 1 hour on software engineering. It's a bit unclear, but let's go with that. Then, to get to 64 hours on math, we'd need 7 doublings = 3.5 years. So, I think the naive trend extrapolation is much faster than you think? (And this estimate strikes me as conservative at least for math IMO.)
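Spelling out the arithmetic in that estimate: starting from 30 minutes with a 6-month doubling time,

$$0.5\ \text{h} \times 2^{7} = 64\ \text{h}, \qquad 7 \text{ doublings} \times 6 \text{ months} = 42 \text{ months} = 3.5 \text{ years}.$$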
I don't think the experience of no-self contradicts any of the above.
In general, I think you could probably make some factual statements about the nature of consciousness that are true and that you learn from attaining no-self, if you phrased them very carefully, but I don't think that's the point.
The way I'd phrase what happens would be mostly in terms of attachment. You don't feel as implicated by things that affect you anymore, you have less anxiety, that kind of thing. I think a really good analogy is just that regular consciousness starts to resemble consciousness during a flow state.
I would have been shocked if twin sisters cared equally about nieces and kids. Genetic similarity is one factor, not the entire story.
Another quick way to get at my skepticism about LLM capability: the ability to differentiate good from bad ideas. ImE the only way LLMs can do this (or at least seem to do this) is if the prevailing views are different depending on context -- like if there's an answer popular among lay people and a better answer popular among academics, and your tone makes it clear that you belong in the second category, then it will give you the second one, and that can make it seem like it can tell good and bad takes apart. But this clearly doesn't count. If the most popular ...