Seems like Unicode officially added a "person being paperclipped" emoji: Here's how it looks in your browser: 🙂‍↕️ Whether they did this as a joke or to raise awareness of AI risk, I like it! Source: https://emojipedia.org/emoji-15.1
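(For the curious: what renders above is not a single new code point but the Emoji 15.1 ZWJ sequence "head shaking vertically". A minimal Python sketch, assuming a terminal and font with emoji support, to inspect the pieces:)

```python
# Decompose the emoji: it's a zero-width-joiner (ZWJ) sequence, not one code point.
# Expected output (the Emoji 15.1 "head shaking vertically" sequence):
#   U+1F642 SLIGHTLY SMILING FACE
#   U+200D  ZERO WIDTH JOINER
#   U+2195  UP DOWN ARROW
#   U+FE0F  VARIATION SELECTOR-16 (requests emoji presentation)
import unicodedata

emoji = "🙂\u200d\u2195\ufe0f"
for ch in emoji:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}")
```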
lc
My strong upvotes are now giving +1 and my regular upvotes give +2.
RobertM
Pico-lightcone purchases are back up, now that we think we've ruled out any obvious remaining bugs.  (But do let us know if you buy any and don't get credited within a few minutes.)
keltan
I feel a deep love and appreciation for this place, and the people who inhabit it.

Popular Comments

Recent Discussion

Introduction

Crosspost from my blog

A beetle lay crushed in the dirt,
Its carapace cracked, torn, and hurt.
It twitched in despair,
Gasping for air —
A world unaware of its hurt.

(I think this might be one of my most important articles, so I'd really appreciate it if you could like, share, and restack it—thanks! Also, when I scheduled this article to be released, I did not know it was April 1. Really, seriously, this is not an April Fools' Day post.)


Imagine we discovered that the world was filled with tiny worlds that we were constantly destroying, akin to the world in Horton Hears a Who. Every second, normal actions had extreme impacts on millions of beings just as intelligent and sentient as us. In such a world, given the sheer numerosity of...

Humans display a bias called scope neglect. Because we can't intuitively grok how much larger some big numbers are than others, we have a tendency to treat all big numbers the same. People will pay about as much to save 2,000 birds as to save 20,000 or 200,000 birds.

This is a deeply misleading characterization of that study.

Jiro
But you pulled the number 600000 out of thin air. People, when asked to name a small number or a large number, will usually name numbers within a certain range, and think "well, that number sounds good to me". That doesn't mean that the number really is small or large enough. It may be in normal situations--$600000 can buy a lot--but if you try to do calculations with it, the fact that people name numbers in certain ranges lets you manipulate the result by starting from a "conservative" number and coming to an absurd conclusion. If it was, oh, 10000000000000000000000 instead, your conclusion would be very different. The fact that not many people will pick 10000000000000000000000, and that you can conclude insect suffering is important based on 600000, says more about how people pick numbers than it does about insect suffering.

When you ask the question "what would you pay to save 2000 birds", the fact that your question contains the number 2000 is information about how many birds it is important and/or possible to save. If you ask the question with different numbers, each version of the question provides different information, and therefore should produce inconsistent answers (unless it's a poll question specifically designed to test different numbers, but most people won't take that into account).
Shankar Sivarajan
Someone (unclear who) made a whole bunch of these in the same vein: https://kennaway.org.uk/writings/Insanity-Wolf-Sanity-Check.html
dr_s
Define "not human". If someone is, say, completely acephalous, I feel justified in not worrying much about their suffering. Suffering requires a certain degree of sentience to be appreciated and be called, well, suffering. In humans I also think that our unique ability to conceptualise ourselves in space and time heightens the weight of suffering significantly. We don't just suffer at a time. We suffer, we remember not suffering in the past, we dread more future suffering, and so on and so forth. Animals don't all necessarily live in the present (well, it's hard to tell, but many behaviours don't seem to lean that way), but they do seem to have a smaller and less complex time horizon than ours.

The problem is the distinction between suffering as "harmful thing you react to" and the qualia of suffering. Learning behaviours that lead you to avoid things associated with negative feedback isn't hard; any reinforcement learning system can do that just fine. If I spin up trillions of instances of a chess engine that is always condemned to lose no matter how it plays, am I creating the new worst thing in the world? Obviously what feels to us like it's worth worrying about is "there is negative feedback, and there is something that it feels like to experience that feedback in a much more raw way than just a rational understanding that you shouldn't do that again". And it's not obvious when that line is crossed in information-processing systems. We know it's crossed for us. Similarity to us does matter because it means similarity in brain structure and thus a higher prior that something works kind of in the same way with respect to this specific matter. Insects are about as different as it gets from us while still counting as having a nervous system that actually does a decent amount of processing. Insects barely have brains. We probably aren't that far off from being able to decently simulate an EM of an insect. I am not saying insects can't possibly be suffering, but they're the

Hey Everyone,

It is with a sense of... considerable cognitive dissonance that I am letting you all know about a significant development for the future trajectory of LessWrong. After extensive internal deliberation, projections of financial runways, and what I can only describe as a series of profoundly unexpected coordination challenges, the Lightcone Infrastructure team has agreed in principle to the acquisition of LessWrong by EA.

I assure you, nothing about how LessWrong operates on a day to day level will change. I have always cared deeply about the robustness and integrity of our institutions, and I am fully aligned with our stakeholders at EA. 

To be honest, the key thing that EA brings to the table is money and talent. While the recent layoffs in EA's broader industry have been...

leogao
the intent is to provide the user with a sense of pride and accomplishment for unlocking different rationality methods.
habryka

Absolutely, that is our sole motivation.

Maloew
Are the new songs going to be posted to YouTube/Spotify, or should I be downloading them?
Zane
I have 5/12 virtues but one of them is the virtue of perfectionism so I can't stop thinking about the other 7.
Thomas Kwa
Some versions of the METR time horizon paper from alternate universes:

Measuring AI Ability to Take Over Small Countries (idea by Caleb Parikh)

Abstract: Many are worried that AI will take over the world, but extrapolation from existing benchmarks suffers from a large distributional shift that makes it difficult to forecast the date of world takeover. We rectify this by constructing a suite of 193 realistic, diverse countries with territory sizes from 0.44 to 17 million km^2. Taking over most countries requires acting over a long time horizon, with the exception of France. Over the last 6 years, the land area that AI can successfully take over with 50% success rate has increased from 0 to 0 km^2, doubling 0 times per year (95% CI 0.0-0.0 yearly doublings); extrapolation suggests that AI world takeover is unlikely to occur in the near future. To address concerns about the narrowness of our distribution, we also study AI ability to take over small planets and asteroids, and find similar trends.

When Will Worrying About AI Be Automated?

Abstract: Since 2019, the amount of time LW has spent worrying about AI has doubled every seven months, and now constitutes the primary bottleneck to AI safety research. Automation of worrying would be transformative to the research landscape, but worrying includes several complex behaviors, ranging from simple fretting to concern, anxiety, perseveration, and existential dread, and so is difficult to measure. We benchmark the ability of frontier AIs to worry about common topics like disease, romantic rejection, and job security, and find that current frontier models such as Claude 3.7 Sonnet already outperform top humans, especially in existential dread. If these results generalize to worrying about AI risk, AI systems will be capable of autonomously worrying about their own capabilities by the end of this year, allowing us to outsource all our AI concerns to the systems themselves.

Estimating Time Since The Singularity

Early work
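(The trend arithmetic being parodied here is a log-scale fit: under exponential growth, the doubling rate follows from the log-ratio of two measurements. A minimal sketch, with an illustrative helper name that is not from the paper:)

```python
import math

def doublings_per_year(start: float, end: float, years: float) -> float:
    """Doubling rate implied by two measurements under exponential growth.

    Illustrative helper in the spirit of the METR trend fit:
    end = start * 2 ** (rate * years), solved for rate.
    """
    return math.log2(end / start) / years

# A working example: 1 km^2 growing to 64 km^2 over 6 years is 1.0 doubling/year.
print(doublings_per_year(1.0, 64.0, 6.0))

# The abstract's 0-to-0 km^2 trend would put log2(0/0) in the numerator,
# which is undefined; hence the satirically tight "0.0-0.0 yearly doublings" CI.
```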
Buck

A few months ago, I accidentally used France as an example of a small country that it wouldn't be that catastrophic for AIs to take over, while giving a talk in France 😬

Epistemic status: Using UDT as a case study for the tools developed in my meta-theory of rationality sequence so far, which means all previous posts are prerequisites. This post is the result of conversations with many people at the CMU agent foundations conference, including particularly Daniel A. Herrmann, Ayden Mohensi, Scott Garrabrant, and Abram Demski. I am a bit of an outsider to the development of UDT and logical induction, though I've worked on pretty closely related things.

I'd like to discuss the limits of consistency as an optimality standard for rational agents. A lot of fascinating discourse and useful techniques have been built around it, but I think that it can be in tension with learning at the extremes. Updateless decision theory (UDT) is one of those...

Wei Dai

At this point, someone sufficiently MIRI-brained might start to think about (something equivalent to) Tegmark's level 4 mathematical multiverse, where such agents might theoretically outperform others. Personally, I see no direct reason to believe in the mathematical multiverse as a real object, and I think this might be a case of the mind projection fallacy - computational multiverses are something that agents reason about in order to succeed in the real universe[3]. Even if a mathematical multiverse does exist (I can't rule it out) and we can somehow le

...
Cole Wyeth
It looks like I have many points of agreement with Martin. 
Cole Wyeth
Very interesting! I have been enjoying reading up on Seidenfeld's work. 

After ~3 years as the ACX Meetup Czar, I've decided to resign from my position, and I intend to scale back my work with the LessWrong community as well. While this transition is not without some sadness, I'm excited for my next project.

I'm the Meetup Czar of the new Fewerstupidmistakesity community.

We're calling it Fewerstupidmistakesity because people get confused about what "Rationality" means, and this would create less confusion. It would be a stupid mistake to name your philosophical movement something very similar to an existing movement that's somewhat related but not quite the same thing. You'd spend years with people confusing the two. 

What's Fewerstupidmistakesity about? It's about making fewer stupid mistakes, ideally down to zero such stupid mistakes. Turns out, human brains have lots of scientifically proven...

I think that rationalists should consider taking more showers.

As Eliezer Yudkowsky once said, boredom makes us human. The childhoods of exceptional people often include excessive boredom as a trait that helped cultivate their genius:

A common theme in the biographies is that the area of study which would eventually give them fame came to them almost like a wild hallucination induced by overdosing on boredom. They would be overcome by an obsession arising from within.

Unfortunately, most people don't like boredom, and we now have little metal boxes and big metal boxes filled with bright displays that help distract us all the time, but there is still an effective way to induce boredom in a modern population: showering.

When you shower (or bathe, that also works), you usually are cut...

Aella

Strong disagree. This is an ineffective way to create boredom. Showers are overly stimulating, with horrible changes in temperature, the sensation of water assaulting you nonstop, and requiring laborious motions to do the bare minimum of scrubbing required to make society not mad at you. A much better way to be bored is to go on a walk outside or lift weights at the gym or listen to me talk about my data cleaning issues


Introduction

Decision theory is about how to behave rationally under conditions of uncertainty, especially if this uncertainty involves being acausally blackmailed and/or gaslit by alien superintelligent basilisks.

Decision theory has found numerous practical applications, including proving the existence of God and generating endless LessWrong comments since the beginning of time.

However, despite the apparent simplicity of "just choose the best action", no comprehensive decision theory that resolves all decision theory dilemmas has yet been formalized. This paper at long last resolves this dilemma by introducing a new decision theory: VDT.

Decision theory problems and existing theories

Some common existing decision theories are:

  • Causal Decision Theory (CDT): select the action that *causes* the best outcome.
  • Evidential Decision Theory (EDT): select the action that you would be happiest to learn that you had taken.
  • Functional Decision Theory
...
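For concreteness, a toy sketch of how the first two come apart on Newcomb's problem (the 99% predictor accuracy and variable names are illustrative assumptions, not from the post):

```python
# Toy Newcomb's problem: a predictor fills the opaque box with $1M iff it
# predicted you would one-box. ACCURACY = P(prediction matches your choice).
ACCURACY = 0.99
MILLION, THOUSAND = 1_000_000, 1_000

# EDT treats the action as evidence about the prediction: it evaluates
# E[payout | action], so one-boxing makes the full box likely.
edt_one_box = ACCURACY * MILLION
edt_two_box = (1 - ACCURACY) * (MILLION + THOUSAND) + ACCURACY * THOUSAND

# CDT holds the already-made prediction fixed: for any credence p_full that
# the box is full, two-boxing causally adds $1000, so CDT always two-boxes.
p_full = 0.5  # any fixed value gives the same ranking
cdt_one_box = p_full * MILLION
cdt_two_box = p_full * MILLION + THOUSAND

print(f"EDT: one-box ${edt_one_box:,.0f} vs two-box ${edt_two_box:,.0f}")
print(f"CDT: one-box ${cdt_one_box:,.0f} vs two-box ${cdt_two_box:,.0f}")
```

With these numbers EDT prefers one-boxing ($990,000 vs $11,000) while CDT prefers two-boxing ($501,000 vs $500,000), which is exactly the divergence the dilemma is built to expose.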

I find this hilarious, but also a little scary. As in, I don't base my choices/morality off of what an AI says, but I see in this article a possibility that I could be convinced to do so. It also makes me wonder, since LLMs are basically curated repositories of most everything that humans have written, if the true decision theory is just "do what most humans would do in this situation".

Jon Garcia
I think VDT scales extremely well, and we can generalize it to say: "Do whatever our current ASI overlord tells us has the best vibes." This works for any possible future scenario:

1. ASI is aligned with human values: ASI knows best! We'll be much happier following its advice.
2. ASI is not aligned but also not actively malicious: ASI will most likely just want us out of its way so it can get on with its universe-conquering plans. The more we tend to do what it says, the less inclined it will be to exterminate all life.
3. ASI is actively malicious: Just do whatever it says. Might as well get this farce of existence over with as soon as possible.

Great post! (Caution: The validity of this comment may expire on April 2.)
Gurkenglas
Well, what does it say about the trolley problem?
satchlj
Claude says the vibes are 'inherently cursed', but then it chooses not to pull the lever because it's 'less karmically disruptive'.

PDF version. berkeleygenomics.org. Twitter thread. (Bluesky copy.)

Summary

The world will soon use human germline genomic engineering technology. The benefits will be enormous: Our children will be long-lived, will have strong and diverse capacities, and will be halfway to the end of all illness.

To quickly bring about this world and make it a good one, it has to be a world that is beneficial, or at least acceptable, to a great majority of people. What laws would make this world beneficial to most, and acceptable to approximately all? We'll have to chew on this question ongoingly.

Genomic Liberty is a proposal for one overarching principle, among others, to guide public policy and legislation around germline engineering. It asserts:

Parents have the right to freely choose the genomes of their children.

If upheld,...

River

I think the frames in which you are looking at this are just completely wrong. We aren't really talking about "decisions about an individual's reproduction". We are talking about how a parent can treat their child. This is something that is already highly regulated by the state; CPS is a thing, and it is good that it is a thing. There may be debates to be had about whether CPS has gone too far on certain issues, but there is a core sort of evil that CPS exists to address, and that it is good for the state to address. And blinding your child is a very core ...

Double

In addition to money, education, careers, and internal organs, citizens of wealthy countries have an additional valuable resource they could direct to effective causes: their hands in marriage, which can be effectively allocated in one of two ways.

For one, professionals are usually much more impactful doing their work in wealthy countries. Otherwise promising EAs in South Sudan have little chance to make a significant impact on existential risks, animal welfare, or even global poverty. The immigration process is difficult and often rejects or holds up good...
