Thomas Kwa
Some versions of the METR time horizon paper from alternate universes:

Measuring AI Ability to Take Over Small Countries (idea by Caleb Parikh)

Abstract: Many are worried that AI will take over the world, but extrapolation from existing benchmarks suffers from a large distributional shift that makes it difficult to forecast the date of world takeover. We rectify this by constructing a suite of 193 realistic, diverse countries with territory sizes from 0.44 to 17 million km^2. Taking over most countries requires acting over a long time horizon, with the exception of France. Over the last 6 years, the land area that AI can successfully take over with a 50% success rate has increased from 0 to 0 km^2, doubling 0 times per year (95% CI 0.0-0.0 yearly doublings); extrapolation suggests that AI world takeover is unlikely to occur in the near future. To address concerns about the narrowness of our distribution, we also study AI ability to take over small planets and asteroids, and find similar trends.

When Will Worrying About AI Be Automated?

Abstract: Since 2019, the amount of time LW has spent worrying about AI has doubled every seven months, and now constitutes the primary bottleneck to AI safety research. Automation of worrying would be transformative to the research landscape, but worrying includes several complex behaviors, ranging from simple fretting to concern, anxiety, perseveration, and existential dread, and so is difficult to measure. We benchmark the ability of frontier AIs to worry about common topics like disease, romantic rejection, and job security, and find that current frontier models such as Claude 3.7 Sonnet already outperform top humans, especially in existential dread. If these results generalize to worrying about AI risk, AI systems will be capable of autonomously worrying about their own capabilities by the end of this year, allowing us to outsource all our AI concerns to the systems themselves.

Estimating Time Since The Singularity

Early work
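For context on the joke: the real METR paper fits a log-linear trend to task time horizons and reads off a doubling time. A minimal sketch of that style of extrapolation, with entirely made-up numbers (nothing here is METR's data):

```python
# Sketch of a METR-style doubling-time fit. The data below are invented
# for illustration only; the slope of the log2 fit is doublings per year.
import numpy as np

years = np.array([2019, 2020, 2021, 2022, 2023, 2024], dtype=float)
horizon_minutes = np.array([0.1, 0.3, 1.0, 4.0, 15.0, 60.0])  # hypothetical 50%-success horizons

# Fit log2(horizon) ~ slope * year + intercept.
slope, intercept = np.polyfit(years, np.log2(horizon_minutes), 1)
print(f"{slope:.2f} doublings/year; doubling time ~ {12.0 / slope:.1f} months")

# Extrapolate to a one-month (~160 work-hour) horizon.
target = np.log2(160 * 60)
print(f"Trend hits a one-month horizon around {(target - intercept) / slope:.0f}")
```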
Seems like Unicode officially added a "person being paperclipped" emoji: Here's how it looks in your browser: 🙂‍↕️ Whether they did this as a joke or to raise awareness of AI risk, I like it! Source: https://emojipedia.org/emoji-15.1
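In case it renders differently on your platform: that emoji isn't a single codepoint but an Emoji 15.1 ZWJ sequence (officially "head shaking vertically"). A quick Python sketch to decompose it:

```python
# Decompose the ZWJ sequence behind the emoji above (Python 3).
import unicodedata

emoji = "\U0001F642\u200D\u2195\uFE0F"  # 🙂‍↕️
for ch in emoji:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch, '<unnamed>')}")
# U+1F642  SLIGHTLY SMILING FACE
# U+200D   ZERO WIDTH JOINER
# U+2195   UP DOWN ARROW
# U+FE0F   VARIATION SELECTOR-16
```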
lc
My strong upvotes are now giving +1 and my regular upvotes give +2.
keltan
I feel a deep love and appreciation for this place, and the people who inhabit it.
RobertM
Pico-lightcone purchases are back up, now that we think we've ruled out any obvious remaining bugs.  (But do let us know if you buy any and don't get credited within a few minutes.)

Popular Comments

Recent Discussion

PDF version. berkeleygenomics.org. Twitter thread. (Bluesky copy.)

Summary

The world will soon use human germline genomic engineering technology. The benefits will be enormous: Our children will be long-lived, will have strong and diverse capacities, and will be halfway to the end of all illness.

To quickly bring about this world and make it a good one, it has to be a world that is beneficial, or at least acceptable, to a great majority of people. What laws would make this world beneficial to most, and acceptable to approximately all? We'll have to keep chewing on this question.

Genomic Liberty is a proposal for one overarching principle, among others, to guide public policy and legislation around germline engineering. It asserts:

Parents have the right to freely choose the genomes of their children.

If upheld,...

River

I think the frames in which you are looking at this are just completely wrong. We aren't really talking about "decisions about an individual's reproduction". We are talking about how a parent can treat their child. This is something that is already highly regulated by the state: CPS is a thing, and it is good that it is a thing. There may be debates to be had about whether CPS has gone too far on certain issues, but there is a core sort of evil that CPS exists to address, and that it is good for the state to address. And blinding your child is a very core ... (read more)

Introduction

Decision theory is about how to behave rationally under conditions of uncertainty, especially if this uncertainty involves being acausally blackmailed and/or gaslit by alien superintelligent basilisks.

Decision theory has found numerous practical applications, including proving the existence of God and generating endless LessWrong comments since the beginning of time.

However, despite the apparent simplicity of "just choose the best action", no comprehensive decision theory that resolves all decision theory dilemmas has yet been formalized. This paper at long last resolves this dilemma, by introducing a new decision theory: VDT.

Decision theory problems and existing theories

Some common existing decision theories are:

  • Causal Decision Theory (CDT): select the action that *causes* the best outcome.
  • Evidential Decision Theory (EDT): select the action that you would be happiest to learn that you had taken. (CDT and EDT come apart on Newcomb's problem; see the toy sketch after this list.)
  • Functional Decision Theory
...
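Not part of the post: a toy numerical illustration of how the first two theories diverge on Newcomb's problem, with an assumed 99%-accurate predictor and standard payouts.

```python
# Toy illustration (not from the post): CDT vs EDT on Newcomb's problem.
# All numbers are assumptions chosen for the example.
ACC = 0.99               # assumed predictor accuracy
M, K = 1_000_000, 1_000  # opaque-box and transparent-box payouts

# EDT: treat your action as evidence about what was predicted.
edt = {
    "one-box": ACC * M,            # P(opaque box full | one-box) = ACC
    "two-box": (1 - ACC) * M + K,  # P(opaque box full | two-box) = 1 - ACC
}

# CDT: the prediction is already fixed; use an unconditional prior p instead.
p = 0.5  # assumed prior that the opaque box is full
cdt = {
    "one-box": p * M,
    "two-box": p * M + K,          # two-boxing causally dominates
}

print("EDT recommends:", max(edt, key=edt.get))  # -> one-box
print("CDT recommends:", max(cdt, key=cdt.get))  # -> two-box
```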
Jon Garcia
I think VDT scales extremely well, and we can generalize it to say: "Do whatever our current ASI overlord tells us has the best vibes." This works for any possible future scenario:

1. ASI is aligned with human values: ASI knows best! We'll be much happier following its advice.
2. ASI is not aligned but also not actively malicious: ASI will most likely just want us out of its way so it can get on with its universe-conquering plans. The more we tend to do what it says, the less inclined it will be to exterminate all life.
3. ASI is actively malicious: Just do whatever it says. Might as well get this farce of existence over with as soon as possible.

Great post! (Caution: The validity of this comment may expire on April 2.)
Gurkenglas
Well, what does it say about the trolley problem?

Claude says the vibes are 'inherently cursed'

But then it chooses not to pull the lever because it's 'less karmically disruptive'

Hey Everyone,

It is with a sense of... considerable cognitive dissonance that I am letting you all know about a significant development for the future trajectory of LessWrong. After extensive internal deliberation, projections of financial runways, and what I can only describe as a series of profoundly unexpected coordination challenges, the Lightcone Infrastructure team has agreed in principle to the acquisition of LessWrong by EA.

I assure you, nothing about how LessWrong operates on a day to day level will change. I have always cared deeply about the robustness and integrity of our institutions, and I am fully aligned with our stakeholders at EA. 

To be honest, the key thing that EA brings to the table is money and talent. While the recent layoffs in EA's broader industry have been...

Maloew

Are the new songs going to be posted to YouTube/Spotify, or should I be downloading them?

Zane
I have 5/12 virtues but one of them is the virtue of perfectionism so I can't stop thinking about the other 7.
lsusr
You're too late. Lightcone converted LW karma into USD at a rate of $1 USD per karma on April 1, 2022.
RobertM
Sorry, there was a temporary bug where we were returning mismatched reward indicators to the client.  It's since been patched!  I don't believe anybody actually rolled The Void during this period.
Double

In addition to money, education, careers, and internal organs, citizens of wealthy countries have one more valuable resource they could direct to effective causes: their hands in marriage, which can be effectively allocated in one of two ways.

For one, professionals are usually much more impactful doing their work in wealthy countries. Otherwise promising EAs in South Sudan have little chance to make a significant impact on existential risks, animal welfare, or even global poverty. The immigration process is difficult and often rejects or holds up good... (read more)

I think that rationalists should consider taking more showers.

As Eliezer Yudkowsky once said, boredom makes us human. The childhoods of exceptional people often include excessive boredom as a trait that helped cultivate their genius:

A common theme in the biographies is that the area of study which would eventually give them fame came to them almost like a wild hallucination induced by overdosing on boredom. They would be overcome by an obsession arising from within.

Unfortunately, most people don't like boredom, and we now have little metal boxes and big metal boxes filled with bright displays that help distract us all the time, but there is still an effective way to induce boredom in a modern population: showering.

When you shower (or bathe, that also works), you usually are cut...

I think Instruction-following AGI is easier and more likely than value aligned AGI, and that this accounts for one major crux of disagreement on alignment difficulty. I got several responses to that piece that didn't dispute that intent alignment is easier, but argued we shouldn't give up on value alignment. I think that's right. Here's another way to frame the value of personal intent alignment: we can use a superintelligent instruction-following AGI to solve full value alignment.

This is different from automated alignment research; it's not hoping tool AI can help with our homework, it's making an AGI smarter than us in every way do our homework for us. It's a longer-term plan. Having a superintelligent, largely autonomous entity that just really likes taking instructions from puny...

I agree that a potential route to get there is personal intent alignment.

What are your thoughts on using a survey like the World Values Survey to get value alignment?


Written as part of the AIXI agent foundations sequence, underlying research supported by the LTFF.

Epistemic status: In order to construct a centralized defense of AIXI I have given some criticisms less consideration here than they merit. Many arguments will be (or already are) expanded on in greater depth throughout the sequence. In hindsight, I think it may have been better to explore each objection in its own post and then write this post as a summary/centralized reference, rather than writing it in the middle of that process. Some of my takes have already become more nuanced. This should be treated as a living document.

With the possible exception of the learning-theoretic agenda, most major approaches to agent foundations research construct their own paradigm and mathematical tools which are...
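For readers meeting AIXI here for the first time, the standard action-selection rule (Hutter's definition, not specific to this post) is:

$$ a_t \;=\; \arg\max_{a_t} \sum_{o_t r_t} \cdots \max_{a_m} \sum_{o_m r_m} \big[\, r_t + \cdots + r_m \,\big] \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)} $$

where $U$ is a universal monotone Turing machine, $\ell(q)$ is the length of program $q$, and $m$ is the horizon.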

Wei Dai

> My objection to this argument is that it not only assumes that Predictoria accepts it is plausibly being simulated by Adversaria, which seems like a pure complexity penalty over the baseline physics it would infer otherwise unless that helps to explain observations,

Let's assume for simplicity that both Predictoria and Adversaria are deterministic and nonbranching universes with the same laws of physics but potentially different starting conditions. Adversaria has colonized its universe and can run a trillion simulations of Predictoria in parallel. Again... (read more)
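A hedged gloss, not Wei Dai's math: under a Solomonoff-style prior, the "complexity penalty" in the quoted passage is often cashed out by comparing program lengths,

$$ w(\text{baseline physics}) \approx 2^{-K(\text{Predictoria})}, \qquad w(\text{simulated by Adversaria}) \approx 2^{-K(\text{Adversaria}) - c}, $$

where $c$ is the extra description length needed to locate the simulations inside Adversaria; the simulation hypothesis starts out disfavored by roughly $K(\text{Adversaria}) + c - K(\text{Predictoria})$ bits unless it buys better predictions.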


In the debate over AI development, two movements stand as opposites: PauseAI calls for slowing down AI progress, and e/acc (effective accelerationism) calls for rapid advancement.  But what if both sides are working against their own stated interests?  What if the most rational strategy for each would be to adopt the other's tactics—if not their ultimate goals?

AI development speed ultimately comes down to policy decisions, which are themselves downstream of public opinion.  No matter how compelling technical arguments might be on either side, widespread sentiment will determine what regulations are politically viable.

Public opinion is most powerfully mobilized against technologies following visible disasters.  Consider nuclear power: despite being statistically safer than fossil fuels, its development has been stagnant for decades.  Why?  Not because of environmental activists, but because...
