kave — LessWrong

LESSWRONG
is fundraising!
LW

kave 11d Moderator Comment 96

I don't think non-substantive aggression like this is appropriate for LessWrong. I appreciate that you pulled out quotes, but "typical LessWrong word salad" is not a sufficiently specific complaint to support the sneering.

Given that you imply you're leaving, perhaps no moderation action is needed. But I expect I'll take moderation action if you stick around and keep engaging in this way.

Toss a bitcoin to your Lightcone – LW + Lighthaven's 2026 fundraiser

kave 11d 50

Huh! Where did you find the Stripe donation link? Did you just have it saved from last year?

Insights into Claude Opus 4.5 from Pokémon

kave 15d 111

Curated. I appreciate this post's concreteness.

It can be hard to really understand what numbers in a benchmark mean. To do so, you have to be pretty familiar with the task distribution, which is often a little surprising. And, if you are bothering to get familiar with it, you probably already know how the LLM performs. So it's hard to be sure you're judging the difficulty accurately, rather than using your sense of the LLM's intelligence to infer the task difficulty.

Fortunately, a Pokémon game involves a bunch of different tasks, and I'm pretty familiar with them from childhood Game Boy sessions. So LLM performance on the game can provide some helpful intuitions about LLM performance in general. Of course, you don't get all the niceties of statistical power and so on, but I still find it a helpful data source to include.

This post does a good job abstracting some of the subskills involved and provides lots of deliciously specific examples for the claims. It's also quite entertaining!

papetoast's Shortforms

kave 23d 64

Thank you!

Would it help if the prompt read more like a menu?

Reviews should provide information that helps evaluate a post. For example:
What does this post add to the conversation?
How did this post affect you, your thinking, and your actions?
Does it make accurate claims? Does it carve reality at the joints? How do you know?
Is there a subclaim of this post that you can test?
What followup work would you like to see building on this post?

kave's Shortform

kave 25d 210

The predicted winners for future years of the review are now visible on the Best of LessWrong page! Here are the top ten guesses for the currently ongoing 2024 review:

(I've already voted on several of these! I doctored the screenshot to hide my votes)

I think LessWrong's annual review is better than karma at finding the best and most enduring posts. Part of the dream for the review prediction markets is bringing some of that high-quality signal from the future into the present. That signal is currently highlighted with gold karma on the post item, if the prediction market has a high enough probability.

Currently the markets are pretty thinly traded, but I think they already have decent signal. They could do a lot better with a little more smart trading. It would be a nice bonus if this UI attracted a bit more betting.

Hopefully coming soon: a tag on the markets indicating which year's review they'll be in, to make it a bit easier for consistency traders to make their bag.

Overview of strong human intelligence amplification methods

kave 1mo* 72 Review for 2024 Review

Human intelligence amplification is very important. Though I have become a bit less excited about it lately, I do still guess it's the best way for humanity to make it to a glorious destiny. I found that having a bunch of different methods in one place organised my thoughts, and I could more seriously think about what approaches might work.

I appreciate that Tsvi included things as "hard" as brain emulation and as soft as rationality, tools for thought and social epistemology.

"No-one in my org puts money in their pension"

kave 1mo 40 Review for 2024 Review

I liked this post. I thought it was interesting to read about how Tobes' relation to AI changed, and the anecdotes were helpfully concrete. I could imagine him in those moments, and get a sense of how he was feeling.

I found this post helpful for relating to some of my friends and family as AI has been in the news more, and they connect it to my work and concerns.

A more concrete thing I took away: the author's description of looking out of his window and meditating on the end reaching him through it. I find this a helpful practice, and sometimes I like to look out of a window and think about various endgames and how they might land in my apartment or workplace or grocery store.

D&D.Sci Scenario Index

kave 1mo 60 Review for 2024 Review

I'm a big fan of this series. I think that puzzles and exercises are undersupplied on LessWrong, especially ones that are fun, a bit collaborative and a bit competitive. I've recently been trying my hand at some of the backlog, and it's been pretty cool. I can feel that I'm getting at least a bit better at compressing the dimensionality of the data as I investigate it.

In general, I'd guess that data science is a pretty important epistemological skill. I think LessWrongers aren't as strong in it as they ideally would be. This is in part because of a justified suspicion that people just pour in data and confusion, and get out more official-looking confusion. I'd say that a central point of this series is: how do you avoid confusing yourself with data by actually thinking about things?

"It's a 10% chance which I did 10 times, so it should be 100%"

kave 1mo 40 Review for 2024 Review

I have the impression that I reach for this rule fairly frequently. I only ontologise it as a rule to look out for because of this post. (I normally can't remember the exact number, so I have to go via the compound interest derivation.)
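
For reference, the compound interest derivation can be checked in a few lines of Python. This is a minimal sketch rather than anything from the post itself: it assumes the tries are independent, and the function name `chance_of_at_least_one` is made up for illustration. The idea is that the chance of missing every time is (1 - 1/n)^n, which tends to 1/e as n grows, so the chance of at least one success tends to 1 - 1/e, roughly 63%.

```python
import math


def chance_of_at_least_one(p: float, tries: int) -> float:
    """Chance that an event with per-try probability p happens at least once
    in `tries` independent tries (hypothetical helper, for illustration)."""
    return 1.0 - (1.0 - p) ** tries


# A 10% chance, attempted 10 times: 1 - 0.9**10, well short of 100%.
ten_by_ten = chance_of_at_least_one(0.10, 10)

# The compound-interest limit: (1 - 1/n)**n -> 1/e, so the answer -> 1 - 1/e.
limit = 1.0 - math.exp(-1.0)

if __name__ == "__main__":
    print(f"10% chance, 10 tries: {ten_by_ten:.3f}")
    print(f"limit 1 - 1/e:        {limit:.3f}")
    for n in (10, 100, 1000):
        print(f"1-in-{n} chance, {n} tries: {chance_of_at_least_one(1 / n, n):.3f}")
```

For n = 10 the exact figure is a little above the limit, since (1 - 1/n)^n approaches 1/e from below.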

zroe1's Shortform

kave 1mo 50

(My plus is conditional on me not being the adjudicator)
