All of jaan's Comments + Replies

jaan32

correct! i’ve tried to use this symmetry argument (“how do you know you’re not the clone?”) over the years to explain the multiverse: https://youtu.be/29AgSo6KOtI?t=869

jaan40

interesting! still, aestivation seems to easily trump the black hole heat dumping, no?

Wei Dai124

From Bennett et al.'s reply to the aestivation paper:

Thus we come to our first conclusion: a civilization can freely erase bits without forgoing larger future rewards up until the point when all accessible bounded resources are jointly thermalized.

They don't mention black holes specifically, but my interpretation of this is that a civilization can first dump waste heat into a large black hole, and then later when the CMB temperature drops below that of the black hole, reverse course to use Hawking radiation of the black hole as energy source and CMB as ... (read more)
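A back-of-the-envelope sketch of the comparison being described here (standard Hawking-temperature and Landauer formulas; the framing is illustrative and not taken from the quoted reply):

```latex
% Hawking temperature of a black hole of mass M:
T_H = \frac{\hbar c^{3}}{8\pi G M k_B}
    \approx 6\times10^{-8}\,\mathrm{K}\;\frac{M_\odot}{M}
% Landauer bound: erasing one bit at ambient temperature T costs at least
E_{\min} = k_B T \ln 2
% "Reverse course" condition: once cosmic expansion cools the CMB below the
% hole's temperature,
T_{\mathrm{CMB}}(t) < T_H(M)
% the hole becomes the hotter reservoir, so its Hawking radiation can power
% computation (and cheap bit erasure) against the colder CMB sky.
```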

jaan126

dyson spheres are for newbs; real men (and ASIs, i strongly suspect) starlift.

8Wei Dai
Yes, advanced civilizations should convert stellar matter 100% into energy using something like the Hawking radiation of small black holes, then dump waste heat into large black holes.
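A rough sketch of why "small for conversion, large for heat dumping" follows from how Hawking emission scales with mass (standard formulas, rough numbers, purely illustrative):

```latex
% Hawking power and evaporation time scale steeply with mass:
P \propto \frac{1}{M^{2}}, \qquad
t_{\mathrm{evap}} \approx 5120\,\pi\,\frac{G^{2}M^{3}}{\hbar c^{4}}
                  \sim 10^{67}\,\mathrm{yr}\,\left(\frac{M}{M_\odot}\right)^{3}
% A hole of order 10^{11} kg radiates its whole mass-energy away within
% billions of years (a fast matter-to-energy converter), while a stellar-mass
% or larger hole stays cold (T_H \propto 1/M) and serves as a waste-heat sink.
```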
jaan2312

thank you for continuing to stretch the overton window! note that, luckily, the “off-switch” is now inside the window (though just barely so, and i hear that big tech is actively - and very myopically - lobbying against on-chip governance). i just got back from a UN AIAB meeting and our interim report does include the sentence “Develop and collectively maintain an emergency response capacity, off-switches and other stabilization measures” (while the rest of the report assumes that AI will not be a big deal any time soon).

jaan40

thanks! basically, i think that the top priority should be to (quickly!) slow down the extinction race. if that’s successful, we’ll have time for more deliberate interventions — and the one you propose sounds confidently net positive to me! (with sign uncertainties being so common, confident net positive interventions are surprisingly rare).

jaan30

i might be confused about this but “witnessing a super-early universe” seems to support “a typical universe moment is not generating observer moments for your reference class”. but, yeah, anthropics is very confusing, so i’m not confident in this.

owencb100

OK hmm I think I understand what you mean.

I would have thought about it like this:

  • "our reference class" includes roughly the observations we make before observing that we're very early in the universe
    • This includes stuff like being a pre-singularity civilization
  • The anthropics here suggest there won't be lots of civs later arising and being in our reference class and then finding that they're much later in universe histories
  • It doesn't speak to the existence or otherwise of future human-observer moments in a post-singularity civilization

... but as you say anthropics is confusing, so I might be getting this wrong.

2plex
By my models of anthropics, I think this goes through.
jaan127

the three most convincing arguments i know for OP’s thesis are:

  1. atoms on earth are “close by” and thus much more valuable to a fast-running ASI than the atoms elsewhere.

  2. (somewhat contrary to the previous argument), an ASI will be interested in quickly reaching the edge of the hubble volume, as that’s slipping behind the cosmic horizon — so it will starlift the sun for its initial energy budget.

  3. robin hanson’s “grabby aliens” argument: witnessing a super-young universe (as we do) is strong evidence against it remaining compatible with biological life for l

... (read more)
4ryan_greenblatt
I've thought a bit about actions to reduce the probability that AI takeover involves violent conflict. I don't think there are any amazing-looking options. If governments were generally more competent, that would help. Having some sort of apparatus for negotiating with rogue AIs could also help, but I expect this is politically infeasible and not that leveraged to advocate for on the margin.
2Mitchell_Porter
In preparation for what?
6owencb
I think point 2 is plausible but doesn't super support the idea that it would eliminate the biosphere; if it cared a little, it could be fairly cheap to take some actions to preserve at least a version of it (including humans), even if starlifting the sun. Point 1 is the argument which I most see as supporting the thesis that misaligned AI would eliminate humanity and the biosphere. And then I'm not sure how robust it is (it seems premised partly on translating our evolved intuitions about discount rates over to imagining the scenario from the perspective of the AI system).
2owencb
Wait, how does the grabby aliens argument support this? I understand that it points to "the universe will be carved up between expansive spacefaring civilizations" (without reference to whether those are biological or not), and also to "the universe will cease to be a place where new biological civilizations can emerge" (without reference to what will happen to existing civilizations). But am I missing an inferential step?
jaan135

i would love to see competing RSPs (or, better yet, RTDPs, as @Joe_Collman pointed out in a cousin comment).

jaanΩ81812

Sure, but I guess I would say that we're back to nebulous territory then—how much longer than six months? When, if ever, does the pause end?

i agree that, if hashed out, the end criteria may very well resemble RSPs. still, i would strongly advocate for a scaling moratorium until widely (internationally) acceptable RSPs are put in place.

I'd be very surprised if there was substantial x-risk from the next model generation.

i share the intuition that the current and next LLM generations are unlikely to be an xrisk. however, i don't trust my (or anyone else's) intuitions stron... (read more)

jaanΩ184226

the FLI letter asked for “pause for at least 6 months the training of AI systems more powerful than GPT-4” and i’m very much willing to defend that!

my own worry with RSPs is that they bake in (and legitimise) the assumptions that a) near term (eval-less) scaling poses trivial xrisk, and b) there is a substantial period during which models trigger evals but are existentially safe. you must have thought about them, so i’m curious what you think.

that said, thank you for the post, it’s a very valuable discussion to have! upvoted.

4evhub
Sure, but I guess I would say that we're back to nebulous territory then—how much longer than six months? When, if ever, does the pause end? I agree that this is mostly baked in, but I think I'm pretty happy to accept it. I'd be very surprised if there was substantial x-risk from the next model generation. But also I would argue that, if the next generation of models do pose an x-risk, we've mostly already lost—we just don't yet have anything close to the sort of regulatory regime we'd need to deal with that in place. So instead I would argue that we should be planning a bit further ahead than that, and trying to get something actually workable in place further out—which should also be easier to do because of the dynamic where organizations are more willing to sacrifice potential future value than current realized value. Yeah, I agree that this is tricky. Theoretically, since we can set the eval bar at any capability level, there should exist capability levels that you can eval for and that are safe but scaling beyond them is not. The problem, of course, is whether we can effectively identify the right capability levels to evaluate in advance. The fact that different capabilities are highly correlated with each other makes this easier in some ways—lots of different early warning signs will all be correlated—but harder in other ways—the dangerous capabilities will also be correlated, so they could all come at you at once. Probably the most important intervention here is to keep applying your evals while you're training your next model generation, so they trigger as soon as possible. As long as there's some continuity in capabilities, that should get you pretty far. Another thing you can do is put strict limits on how much labs are allowed to scale their next model generation relative to the models that have been definitively evaluated to be safe. And furthermore, my sense is that at least in the current scaling paradigm, the capabilities of the next model generation
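A minimal sketch of the "keep applying your evals while you're training the next model generation" idea described above; every name below is hypothetical, and this is not any lab's actual pipeline:

```python
def train_with_continuous_evals(model, train_step, eval_suite,
                                total_steps, eval_every=1_000):
    """Run dangerous-capability evals at fixed checkpoint intervals during
    training and halt the run as soon as any threshold triggers.

    eval_suite: dict mapping eval name -> callable(model) -> bool,
    where True means the capability threshold was crossed.
    (Illustrative sketch only; none of these are real APIs.)
    """
    for step in range(1, total_steps + 1):
        train_step(model)
        if step % eval_every == 0:
            triggered = [name for name, check in eval_suite.items() if check(model)]
            if triggered:
                # Stop scaling immediately and hand off to whatever response
                # process the RSP (or moratorium regime) prescribes.
                return {"status": "paused", "step": step, "triggered": triggered}
    return {"status": "completed", "step": total_steps, "triggered": []}
```

The same skeleton also accommodates the other suggestion in the comment: capping total_steps (or compute) relative to the last generation that was definitively evaluated as safe.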
jaan42

the werewolf vs villager strategy heuristic is brilliant. thank you!

2jimrandomh
Credit to Benquo's writing for giving me the idea.
jaan30

if i understand it correctly (i may not!), scott aaronson argues that hidden variable theories (such as bohmian / pilot wave) imply hypercomputation (which should count as evidence against them): https://www.scottaaronson.com/papers/npcomplete.pdf

6Mitchell_Porter
If hypercomputation is defined as computing the uncomputable, then that's not his idea. It's just a quantum speedup better than the usual quantum speedup (defining a quantum complexity class DQP that is a little bigger than BQP). Also, Scott's Bohmian speedup requires access to what the hidden variables were doing at arbitrary times. But in Bohmian mechanics, measuring an observable perturbs complementary observables (i.e. observables that are in some kind of "uncertainty relation" to the first) in exactly the same way as in ordinary quantum mechanics.  There is a way (in both Bohmian mechanics and standard quantum mechanics) to get at this kind of trajectory information, without overly perturbing the system evolution - "weak measurements". But weak measurements only provide weak information about the measured observable - that's the price of not violating the uncertainty principle. A weak measuring device is correlated with the physical property it is measuring, but only weakly.  I mention this because someone ought to see how it affects Scott's Bohmian speedup, if you get the history information using weak measurements. (Also because weak measurements may have an obscure yet fundamental relationship to Bohmian mechanics.) Is the resulting complexity class DQP, BQP, P, something else? I do not know. 
jaan142

interesting, i’ve had bewelltuned.com in my reading queue for a few years now -- i take your comment as an upvote!

myself i swear by FDT (somewhat abstract, sure, but seems to work well) and freestyle dancing (the opposite of abstract, but also seems to work well). also coding (eg, just spent several days using pandas to combine and clean up my philanthropy data) -- code grounds one in reality.
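For what it's worth, the combine-and-clean pass mentioned above typically looks something like this in pandas (the file and column names here are made up, not the actual data):

```python
import pandas as pd

# Hypothetical input files / columns, purely for illustration.
grants_2021 = pd.read_csv("grants_2021.csv")
grants_2022 = pd.read_csv("grants_2022.csv")

combined = pd.concat([grants_2021, grants_2022], ignore_index=True)

# Typical cleanup: normalize text fields, parse dates, drop exact duplicates.
combined["organization"] = combined["organization"].str.strip().str.lower()
combined["date"] = pd.to_datetime(combined["date"], errors="coerce")
combined = combined.drop_duplicates(subset=["organization", "date", "amount"])

combined.to_csv("grants_combined_clean.csv", index=False)
```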

jaan113

having seen the “kitchen side” of the letter effort, i endorse almost all of zvi’s points here. one thing i’d add is that one of my hopes in urging the letter along was to create common knowledge that a lot of people (we’re going to get to 100k signatures it looks like) are afraid of the thing that comes after GPT4. like i am.

thanks, everyone, who signed.

EDIT: basically this: https://twitter.com/andreas212nyc/status/1641795173972672512

jaan249

while it’s easy to agree with some abstract version of “upgrade” (as in try to channel AI capability gains into our ability to align them), the main bottleneck to physical upgrading is the speed difference between silicon and wet carbon: https://www.lesswrong.com/posts/Ccsx339LE9Jhoii9K/slow-motion-videos-as-ai-risk-intuition-pumps
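To make the speed gap concrete, here is the arithmetic for an illustrative 10^6x subjective speedup (the exact factor is my assumption for the example, not a figure from the linked post):

```latex
% One human second, seen from the fast side:
1\ \mathrm{s} \times 10^{6} = 10^{6}\ \text{subjective seconds}
                            \approx 11.6\ \text{subjective days}
% One hour-long human meeting:
3600\ \mathrm{s} \times 10^{6} \approx 3.6\times10^{9}\ \mathrm{s}
                               \approx 114\ \text{subjective years}
```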

9Jed McCaleb
Yeah to be clear I don't think "upgrading" is easy. It might not even be possible in a way that makes it relevant. But I do think it offers some hope in an otherwise pretty bleak landscape.
jaan32

yup, i tried invoking church-turing once, too. worked about as well as you’d expect :)

jaan51

looks great, thanks for doing this!

one question i get every once in a while and wish i had a canonical answer to is (probably can be worded more pithily):

"humans have always thought their minds are equivalent to whatever's their latest technological achievement -- eg, see the steam engines. computers are just the latest fad that we currently compare our minds to, so it's silly to think they somehow pose a threat. move on, nothing to see here."

note that the canonical answer has to work for people whose ontology does not include the concepts of "computation"... (read more)

2Richard_Kennaway
Most of the threat comes from the space of possible super-capable minds that are not human. (This does not mean that human-like AIs would be less dangerous, only that they are a small part of the space of possibilities.)
2Ben Livengood
Agents are the real problem. Intelligent goal-directed adversarial behavior is something almost everyone understands whether it is other humans or ants or crop-destroying pests. We're close to being able to create new, faster, more intelligent agents out of computers.
2Lone Pine
I think the technical answer comes down to the Church-Turing thesis and the computability of the physical universe, but obviously that's not a great answer for the compscidegreeless among us.
jaan42

the potentially enormous speed difference (https://www.lesswrong.com/posts/Ccsx339LE9Jhoii9K/slow-motion-videos-as-ai-risk-intuition-pumps) will almost certainly be an effective communications barrier between humans and AI. there’s a wonderful scene of AIs vs humans negotiation in william hertling’s “A.I. apocalypse” that highlights this.

jaan223

i agree that there's the 3rd alternative future that the post does not consider (unless i missed it!):

3. markets remain in an inadequate equilibrium until the end of times, because those participants (like myself!) who expect short timelines remain in too small a minority to "call the bluff".

see the big short for a dramatic depiction of such a situation.

great post otherwise. upvoted.

 

4soth02
Coincidentally, that scene in The Big Short takes place on January 11 (2007) :D
jaan*10

yeah, this seems to be the crux: what CEV will prescribe for spending the altruistic (reciprocal cooperation) budget on. my intuition continues to insist that purchasing the original star systems from UFAIs is pretty high on the shopping list, but i can see arguments (including a few you gave above) against that.

oh, btw, one sad failure mode would be getting clipped by a proto-UFAI that’s too stupid to realise it’s in a multi-agent environment or something.

ETA: and, tbc, just like interstice points out below, my “us/me” label casts a wider net than “us in this particular everett branch where things look particularly bleak”.

jaan73

roger. i think (and my model of you agrees) that this discussion bottoms out in speculating what CEV (or equivalent) would prescribe.

my own intuition (as somewhat supported by the moral progress/moral circle expansion in our culture) is that it will have a nonzero component of “try to help out the fellow humans/biologicals/evolved minds/conscious minds/agents with diminishing utility function if not too expensive, and especially if they would do the same in your position”.

So8res107

tbc, i also suspect & hope that our moral circle will expand to include all fellow sentients. (but it doesn't follow from that that paying paperclippers to unkill their creators is a good use of limited resources. for instance, those are resources that could perhaps be more efficiently spent purchasing and instantiating the stored mindstates of killed aliens that the surviving-branch humans meet at the edge of their own expansion.)

but also, yeah, i agree it's all guesswork. we have friends out there in the multiverse who will be willing to give us some... (read more)

jaan266

yeah, as far as i can currently tell (and influence), we’re totally going to use a sizeable fraction of FAI-worlds to help out the less fortunate ones. or perhaps implement a more general strategy, like a mutual insurance pact of evolved minds (MIPEM).

this, indeed, assumes that human CEV has diminishing returns to resources, but (unlike nate in the sibling comment!) i’d be shocked if that wasn’t true.

So8res115

one thing that makes this tricky is that, even if you think there's a 20% chance we make it, that's not the same as thinking that 20% of Everett branches starting in this position make it. my guess is that whether we win or lose from the current board position is grossly overdetermined, and what we're fighting for (and uncertain about) is which way it's overdetermined. (like how we probably have more than one in a billion odds that the light speed limit can be broken, but that doesn't mean that we think that one in every billion photons breaks the limit.) ... (read more)

jaan70

sure, this is always a consideration. i'd even claim that the "wait.. what about the negative side effects?" question is a potential expected value spoiler for pretty much all longtermist interventions (because they often aim for effects that are multiple causal steps down the road), and as such not really specific to software.

jaan80

great idea! since my metamed days i’ve been wishing there was a prediction market for personal medical outcomes — it feels like the manifold mechanism might be a good fit for this (eg, at the extreme end, consider the “will this be my last market if i undertake the surgery X at Y?” question). should you decide to develop such an aspect at some point, i’d be very interested in supporting/subsidising.

3Austin Chen
Yes, that's absolutely the kind of prediction market we'd love to enable at Manifold! I'd love to chat more about specifically the personal medical use case, and we'd already been considering applying to SFF -- let's get in touch (I'm akrolsmir@gmail.com).
jaan50

actually, the premise of david brin’s existence is a close match to moravec’s paragraph (not a coincidence, i bet, given that david hung around similar circles).

jaan250

confirmed. as far as i can tell (i’ve talked to him for about 2h in total) yi really seems to care, and i’m really impressed by his ability to influence such official documents.

jaan150

indeed, i even gave a talk almost a decade ago about the evolution:humans :: humans:AGI symmetry (see below)!

what confuses me though is that the "is general reasoner" and "can support cultural evolution" properties seemed to emerge pretty much simultaneously in humans -- a coincidence that requires its own explanation (or dissolution). furthermore, eliezer seems to think that the former property is much more important / discontinuity-causing than the latter. and, indeed, outsized progress being made by individual human reasoners (scientists/inventors/etc.) see... (read more)

1Gram Stone
If information is 'transmitted' by modified environments and conspecifics biasing individual search, marginal fitness returns on individual learning ability increase, while from the outside it looks just like 'cultural evolution.'
4Vanessa Kosoy
I think that these properties encourage each other's evolution. When you're a more general reasoner, you have a bigger hypothesis space, specifying a hypothesis requires more information, so you also benefit more from transmitting information. Conversely, once you can transmit information, general reasoning becomes much more useful since you effectively have access to much bigger datasets.
9Vaniver
David Deutsch (in The Beginning of Infinity) argues, as I recall, that they're basically the same faculty. In order to copy someone else / "carry on a tradition", you need to model what they're doing (so that you can copy it), and similarly for originators to tell whether students are correctly carrying on the tradition. The main thing that's interesting about his explanation is how he explains the development of general reasoning capacity, which we now think of as a tradition-breaking faculty, in the midst of tradition-promoting selection. If you buy that story, it ends up being another example of treacherous turn from human history (where individual thinkers, operating faster than cultural evolution, started pursuing their own values).
jaan230

amazing post! scaling up the community of independent alignment researchers sounds like one of the most robust ways to convert money into relevant insights.

jaan30

indeed they are now. retrocausality in action? :)

1AnthonyC
Obligatory: https://xkcd.com/2480/
jaan70

well, i've always considered human life extension as less important than "civilisation's life extension" (ie, xrisk reduction). still, they're both very important causes, and i'm happy to support both, especially given that they don't compete much for talent. as for the LRI specifically, i believe they simply haven't applied to more recent SFF grant rounds.