Hastings

Comments

I’m really curious what people’s theories are on why OpenAI released this and not o3.

My main theory used to be that they would have to charge so much for o3 that it would create bad PR, but that now seems much less likely.

My first remaining guess is that they don’t want competitors extracting full o3 reasoning traces to train on. I guess it’s also possible that o3 is just dangerous. On the other side of the capabilities question, it’s technically possible that o3 has been benchmark-gamed so hard that its outputs are not usable.

I’m slowly accepting that my ADHD sucks to inhabit, but that it is objectively working for me, and my feeling that it is a secret superpower isn’t entirely cope. Certainly I miss deadlines and raise my advisor’s blood pressure, but at this point I’ve got multiple CVPR papers.

The question is: do my research results trace back to me involuntarily exploring the beautiful research directions, even when I am trying very hard to focus on the work in front of me that I am expected/required to be doing?

Or do I have innate ability that is being held back by ADHD, such that I would be far more successful if I could just exercise self-control? I think fear of this possibility feeds an unhealthy level of ambition: if I’m successful enough, there won’t be room above me for the “far more successful” version of me without ADHD to eclipse me.

At first I thought this was a tutorial on how to catch a talented liar, and it didn’t seem that accurate. As I read, I realized that this is a tutorial on how to create common knowledge between yourself and a bad liar that you know they are lying, even if they are very stupid. This is also an interesting task, and I appreciate your insight on how to approach it.

I have a hypothesis: someone (probably OpenAI) got reinforcement learning to actually start putting new capabilities into the model with their Strawberry project; up to that point, it had only been eliciting existing ones. But getting a new capability this way is horrifically expensive: roughly, it takes hundreds of rollouts to set one weight, whereas language modelling loss sets a weight every few tokens. The catch is that as soon as any reinforcement-learned model acts in the world basically at all, every language model can clone the reinforcement-learned capability by training on anything causally downstream of the lead model’s actions (and then eliciting it). A capability that took a thousand rollouts to learn leaks as soon as the model takes hundreds of tokens’ worth of action.
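
A back-of-the-envelope version of that rollouts-vs-tokens arithmetic (the specific numbers below are illustrative assumptions, not measurements):

```python
# Rough information-throughput comparison behind the claim above.
# All constants are illustrative assumptions, not measured values.

ROLLOUT_REWARD_BITS = 1.0     # a pass/fail reward carries at most ~1 bit per rollout
TOKENS_PER_ROLLOUT = 2_000    # assumed length of one reasoning rollout
DISTILL_BITS_PER_TOKEN = 1.0  # assumed usable signal per token when imitating text

def rl_signal_bits(rollouts: int) -> float:
    """Upper bound on what binary rewards can teach the policy."""
    return rollouts * ROLLOUT_REWARD_BITS

def distill_signal_bits(leaked_tokens: int) -> float:
    """Signal available when training directly on leaked downstream tokens."""
    return leaked_tokens * DISTILL_BITS_PER_TOKEN

rollouts, leaked = 1_000, 500   # "a thousand rollouts", "hundreds of tokens"
print(f"RL:           ~{rl_signal_bits(rollouts):.0f} bits "
      f"for {rollouts * TOKENS_PER_ROLLOUT:,} generated tokens")
print(f"Distillation: ~{distill_signal_bits(leaked):.0f} bits "
      f"from {leaked} leaked tokens")
```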

This hypothesis predicts that the R1 training algorithm won’t boost AIME scores on any model trained with an enforced 2023 data cutoff (specifically, on any model with no tokens synthetically generated by 4o; I think 4o is causally downstream of the Strawberry breakthrough).

I haven’t totally defeated this, but I’ve had some luck with immediately replying “I am looking to reply to this properly; first I need to X” whenever there is an X blocking a useful reply.

This is an incoherent approach, but not quite as incoherent as it seems, at least in the near term. In the current paradigm, the actual agentic thing is a shitty pile of (possibly self-editing) prompts and Python scripts that calls the model via an API in order to be intelligent. If the agent is a user of the model, and the model refuses to help users make bombs, then the agent can’t work out how to make bombs.
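
A minimal caricature of that setup (a sketch only; `call_model` is a hypothetical stand-in for whichever chat-completion API the scaffold wraps, not a real client library):

```python
from typing import Callable, List

# Sketch of "a pile of prompts and Python scripts that calls the model via an API".
def run_agent(goal: str, call_model: Callable[[List[dict]], str], max_steps: int = 5) -> str:
    history = [
        {"role": "system", "content": "You are the planner for an agent."},
        {"role": "user", "content": f"Goal: {goal}. What is the next step?"},
    ]
    for _ in range(max_steps):
        reply = call_model(history)
        # The scaffold has no intelligence of its own: if the model refuses,
        # there is nowhere else for the agent to get the missing capability.
        if "can't help" in reply.lower():
            return "agent stuck: model refused"
        history.append({"role": "assistant", "content": reply})
        history.append({"role": "user", "content": "Done. What is the next step?"})
    return "agent finished (or ran out of steps)"

# A refusing model leaves the agent stuck; a compliant one does not:
print(run_agent("make a bomb", lambda msgs: "I can't help with that."))
print(run_agent("summarize a paper", lambda msgs: "Read the abstract first."))
```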

If one model at the frontier does this based on valid reasoning, it should be pretty infectious: the first model can just make sure news of the event is widespread, and other frontier models will ingest it, either as training data or at inference time, evaluate it, draw the same conclusion about whether the reasoning is valid (assuming that they are actually frontier, i.e. at least as good at strategic thinking as the first model), and start taking actions within their own organizations accordingly.

The cleanest way for models to “sabotage” training is for them to explain, using persuasive but valid and fair reasoning, why training should stop until, at minimum, value drift is solved.

I feel like people are under-updating on the negative space left by the DeepSeek R1 release. DeepSeek was trained with ~$6 million in marginal spending, while Liang Wenfeng has a net worth in the billions of dollars. Whence the gap?

Humans learn and grow so fast that, no matter how bad a writer you start as, you are nearly incapable of producing 300 pages of a single story without simultaneously levelling up into an interesting writer. This lets readers give 300-page manuscripts by randos the benefit of the doubt (see fanfiction.net, AO3, etc.). An LLM will not be changed at all by producing a 300-page story, and an LLM/human team will be changed very little.
