(The existence of exceptions is why I said "most anyone" instead of "anyone".)
To be clear, my recommendation for SB-1047 was not "be basically the same bill but talk about extinction risks and levy a few more restrictions on the labs", but rather "focus very explicitly on the extinction threat; say 'this bill is trying to address a looming danger described by a variety of scientists and industry leaders' or suchlike, and shape the bill differently to actually address the extinction threat straightforwardly".
I don't have a strong take on whether SB-1047 would have been more likely to pass in that world. My recollection is that, back when...
I don't think most anyone who's studied the issues at hand thinks the chance of danger is "really small", even among people who disagree with me quite a lot (see e.g. here). I think folks who retreat to arguments like "you should pay attention to this even if you think there's a really small chance of it happening" are doing a bunch of damage, and this is one of many problems I attribute to a lack of this "courage" stuff I'm trying to describe.
When I speak of "finding a position you have courage in", I do not mean "find a position that you think should be ...
A few claims from the post (made at varying levels of explicitness) are:
1. Often people are themselves motivated by concern X (ex: "the race to superintelligence is reckless and highly dangerous") and decide to talk about concern Y instead (ex: "AI-enabled biorisks"), perhaps because they think it is more palatable.
2. Focusing on the "palatable" concerns is a pretty grave mistake.
2a. The claims Y are often not in fact more palatable; people are often pretty willing to talk about the concerns that actually motivate you.
2b. When people try talking about th...
Ok. I don't think your original post is clear about which of these many different theses it has, or which points it thinks are evidence for other points, or how strongly you think any of them.
I don't know how to understand your thesis other than "in politics you should always pitch people by saying how the issue looks to you, Overton window or personalized persuasion style be damned". I think the strong version of this claim is obviously false. Though maybe it's good advice for you (because it matches your personality profile) and perhaps it's good advice ...
Huh! I've been in various conversations with elected officials and have had the sense that most people speak without the courage of their convictions (which is not quite the same thing as "confidence", but which is more what the post is about, and which is the property I'm more interested in discussing in this comment section; one factor of the lack of courage is broadcasting uncertainty about things like "25% vs 90+%" when they could instead be broadcasting confidence about "this is ridiculous and should stop"). In my experience, it's common to the po...
I don't think the weeds/local turf wars really cause the problems here, why do you think that?
The hypothesized effect is: people who have been engaged in the weeds/turf wars think of themselves as "uncertain" (between e.g. the 25%ers and the 90+%ers) and forget that they're actually quite confident about some proposition like "this whole situation is reckless and crazy and Earth would be way better off if we stopped". And then there's a disconnect where (e.g.) an elected official asks a local how bad things look, and they answer while mentally inhabiting...
I agree that it's usually helpful and kind to model your conversation-partner's belief-state (and act accordingly).
And for the avoidance of doubt: I am not advocating that anyone pretend they think something is obvious when they in fact do not.
By "share your concerns as if they’re obvious and sensible", I was primarily attempting to communicate something more like: I think it's easy for LessWrong locals to get lost in arguments like whether AI might go fine because we're all in a simulation anyway, or confused by turf wars about whether AI has a 90+% chanc...
I do think that this community is generally dramatically failing to make the argument "humanity is building machine superintelligence while having very little idea of what it's doing, and that's just pretty crazy on its face" because it keeps getting lost in the weeds (or in local turf wars).
I don't think the weeds/local turf wars really cause the problems here, why do you think that?
The weeds/local turf wars seem like way smaller problems for AI-safety-concerned people communicating that the situation seems crazy than e.g. the fact that a bunch of the AI s...
That doesn't spark any memories (and people who know me rarely describe my conversational style as "soft and ever-so-polite"). My best guess is nevertheless that this tweet is based on a real event (albeit filtered through some misunderstandings, e.g. perhaps my tendency to talk with a tone of confidence was misinterpreted as a status game; or perhaps I made some hamfisted attempt to signal "I don't actually like talking about work on dates" and accidentally signaled "I think you're dumb if you don't already believe these conclusions I'm reciting in respon...
I think you should leave the comments.
"Here is an example of Nate's passion for AI Safety not working" seems like a reasonably relevant comment, albeit entirely anecdotal and low effort.
Your comment is almost guaranteed to "ratio" theirs. It seems unlikely that the thread will be massively derailed if you don't delete.
Plus deleting the comment looks bad and will add to the story. Your comment feels like it is already close to the optimal response.
I don't quite see how any of this relates to the topic at hand,
It relates to the topic because it's one piece of anecdotal evidence about the empirical results of your messaging strategy (much as the post mentions a number of other pieces of anecdotal evidence): negative polarization is a possible outcome, not just support or lack-of-support.
perhaps my tendency to talk with a tone of confidence was misinterpreted as a status game
Um, yes, confidence and status are related. You're familiar with emotive conjugation, right? "I talk with a tone of confid...
Just commenting narrowly on how it relates to the topic at hand: I read it as anecdotal evidence about how things might go if you speak with someone and you "share your concerns as if they’re obvious and sensible", which is that people might perceive you as thinking they're dumb for not understanding something so obvious, which can backfire if it's in fact not obvious to them.
Are you claiming that this is mistaken, or rather that this is correct but it's not a problem?
mistaken.
But if you like money, you’ll pay more for a contract on coin B.
this is an invalid step. it's true in some cases but not others, depending on how the act of paying for a contract on coin B (with no additional knowledge of whether it's double-headed) affects the chance that the market tosses coin B.
short version: the analogy between a conditional prediction market and the laser-scanner-simulation setup only holds for bids that don't push the contract into execution. (similarly: i agree that, in conditional prediction markets, you sometimes wish to pay more for a contract that is less valuable in counterfactual expectation; but again, this happens only insofar as your bids do not cause the relevant condition to become true.)
longer version:
suppose there's a coin that you're pretty sure is biased such that it comes up heads 40% of the time, and a contra...
the trick is that the argument stops working for conditions that start to look like they might trigger. so the argument doesn't disrupt the idea that conditional prediction markets put the highest price on the best choice, but it does disrupt the idea that the pricings for unlikely conditions are counterfactually accurate.
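here's a minimal monte carlo sketch of that effect, with invented numbers (coin B is double-headed 10% of the time, otherwise ~40% heads; the market only ends up tossing B in the worlds where informed traders know it's double-headed and bid it up): the conditional contract only resolves in worlds where B actually gets tossed, so its price converges to E[heads | B tossed] rather than to the counterfactual E[heads | you force a toss of B].

```python
import random

# Illustrative sketch with invented numbers: coin B is double-headed with
# probability 0.1, otherwise it comes up heads 40% of the time. The market
# only tosses B in the worlds where traders have bid it up (i.e. where it's
# known to be double-headed), and refunds the contract otherwise.

random.seed(0)
N = 1_000_000

P_DOUBLE_HEADED = 0.1
P_HEADS_OTHERWISE = 0.4

resolved_payoffs = []        # payoffs in worlds where the market tosses B
counterfactual_payoffs = []  # payoffs if B were force-tossed in every world

for _ in range(N):
    double_headed = random.random() < P_DOUBLE_HEADED
    heads = True if double_headed else (random.random() < P_HEADS_OTHERWISE)
    counterfactual_payoffs.append(heads)
    if double_headed:        # the only worlds in which the market tosses B
        resolved_payoffs.append(heads)

print("price the conditional contract converges to:",
      sum(resolved_payoffs) / len(resolved_payoffs))              # 1.0
print("counterfactual value of forcing a toss of B:",
      sum(counterfactual_payoffs) / len(counterfactual_payoffs))  # ~0.46
```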
for intuition, suppose there's a conditional prediction market for medical treatments for cancer. one of the treatments is "cut off the left leg." if certain scans and tests come back just the right way (1% likely) then cutting off...
A variety of translations are lined up.
We have an advertising campaign planned, and we'll be working with professional publicists. We have a healthy budget for it already :-)
I'm told that Australians will be able to purchase the UK e-book, and that it'll be ready to go in a week or so.
There's not a short answer; subtitles and cover art are over-constrained and the choices have many stakeholders (and authors rarely have final say over artwork). The differences reflect different input from different publishing-houses in different territories, who hopefully have decent intuitions about their markets.
The US and UK versions will have different covers and subtitles. I'm not sure why the US version shows up on the .co.uk website. We've asked the publishers to take a look.
We're still in the final proofreading stages for the English version, so the translators haven't started translating yet. But they're queued up.
Something I've done in the past is to take text that I intended to have translated, run it through machine translation and back with low latency, and gain confidence in the semantic stability of the process.
Rewrite English, click, click.
Rewrite English, click, click.
Rewrite English... click, click... oh! Now it round-trips with high fidelity. Excellent. Ship that!
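In case it's useful, here's a minimal sketch of that loop in code; `translate` is just a placeholder for whatever machine-translation service you have access to (not a real API), and the similarity check is deliberately crude:

```python
def translate(text: str, src: str, dst: str) -> str:
    """Placeholder: plug in whatever machine-translation service you use."""
    raise NotImplementedError

def round_trip_similarity(english: str, target_lang: str = "zh") -> float:
    """Translate English -> target language -> English and measure how much survives."""
    forward = translate(english, "en", target_lang)
    back = translate(forward, target_lang, "en")
    original_words = set(english.lower().split())
    returned_words = set(back.lower().split())
    union = original_words | returned_words
    # Crude Jaccard overlap between the original and the round-tripped English.
    return len(original_words & returned_words) / len(union) if union else 1.0

def needs_rewording(english: str, threshold: float = 0.8) -> bool:
    # "Rewrite English, click, click" until this stops returning True.
    return round_trip_similarity(english) < threshold
```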
Given the potentially massive importance of a Chinese version, it may be worth burning $8,000 to start the translation before proofreading is done, particularly if your translators come back with questions that are better clarified in the English text. I'd pay money to help speed this up if that's the bottleneck[1]. When I was in China I didn't have a good way of explaining what I was doing and why.
I'm working mostly off savings and wouldn't especially want to, but I would, to make it happen.
We're targeting a broad audience, and so our focus groups have been more like completely uninformed folks than like informed skeptics. (We've spent plenty of time honing arguments with informed skeptics, but that sort of content will appear in the accompanying online resources, rather than in the book itself.) I think that the quotes the post leads with speak to our ability to engage with our intended audience.
I put in the quote from Rob solely for the purpose of answering the question of whether regular LW readers would have anything to gain personally fr...
My guess is that "I'm excited and want a few for my friends and family!" is fine if it's happening naturally, and that "I'll buy a large number to pump up the sales" just gets filtered out. But it's hard to say; the people who compile best-seller lists are presumably intentionally opaque about this. I wouldn't sweat it too much as long as you're not trying to game it.
I donated $25k. Thanks for doing what you do.
I agree that in real life the entropy argument is an argument in favor of it being actually pretty hard to fool a superintelligence into thinking it might be early in Tegmark III when it's not (even if you yourself are a superintelligence, unless you're doing a huge amount of intercepting its internal sanity checks (which puts significant strain on the trade possibilities and which flirts with being a technical-threat)). And I agree that if you can't fool a superintelligence into thinking it might be early in Tegmark III when it's not, then the purchasing ...
Dávid graciously proposed a bet, and while we were attempting to bang out details, he convinced me of two points:
The entropy of the simulators’ distribution need not be more than the entropy of the (square of the) wave function in any relevant sense. Despite the fact that subjective entropy may be huge, physical entropy is still low (because the simulations happen on a high-amplitude ridge of the wave function, after all). Furthermore, in the limit, simulators could probably just keep an eye out for local evolved life forms in their domain and wait until o...
Thanks to Nate for conceding this point.
I still think that other than just buying freedom for doomed aliens, we should run some non-evolved simulations of our own with inhabitants that are preferably p-zombies or animated by outside actors. If we can do this in such a way that the AI doesn't notice it's in a simulation (I think this should be doable), this will provide evidence to the AI that civilizations do this simulation game (and not just the alien-buying) in general, and this buys us some safety in worlds where the AI eventually notices there are n...
I'm happy to stake $100 that, conditional on us agreeing on three judges and banging out the terms, a majority will agree with me about the contents of the spoilered comment.
If the simulators have only one simulation to run, sure. The trouble is that the simulators have 2^K simulations they could run, and so the "other case" requires K additional bits (where K is the crossent between the simulators' distribution over UFAIs and physics' distribution over UFAIs).
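Spelling out the bit-accounting I have in mind (a sketch of my reading; one could define the quantity a few different ways):

$$K \;=\; H\big(P_{\text{physics}},\, Q_{\text{sim}}\big) \;=\; -\sum_{u} P_{\text{physics}}(u)\,\log_2 Q_{\text{sim}}(u),$$

where $u$ ranges over possible UFAIs, $P_{\text{physics}}$ is physics' distribution over which UFAI actually gets built, and $Q_{\text{sim}}$ is the distribution the simulators sample their simulations from; $K$ is then roughly the number of extra bits it costs them, on average, to land on the UFAI that physics actually produced.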
If necessary, we can let physical biological life emerge on the faraway planet and develop AI while we are observing them from space.
Consider the gas example again.
If you have gas that was compressed into the corner a long time ago and has long since expanded to f...
I basically endorse @dxu here.
Fleshing out the argument a bit more: the part where the AI looks around this universe and concludes it's almost certainly either in basement reality or in some simulation (rather than in the void between branches) is doing quite a lot of heavy lifting.
You might protest that neither we nor the AI have the power to verify that our branch actually has high amplitude inherited from some very low-entropy state such as the big bang, as a Solomonoff inductor would. What's the justification for inferring from the observation that we ...
seems to me to have all the components of a right answer! ...and some of a wrong answer. (we can safely assume that the future civ discards all the AIs that can tell they're simulated a priori; that's an easy tell.)
I'm heartened somewhat by your parenthetical pointing out that the AI's prior on simulation is low on account of there being too many AIs for simulators to simulate, which I see as the crux of the matter.
My answer is in spoilers, in case anyone else wants to answer and tell me (on their honor) that their answer is independent from mine, which will hopefully erode my belief that most folk outside MIRI have a really difficult time fielding wacky decision theory Qs correctly.
The sleight of hand is at the point where God tells both AIs that they're the only AIs (and insinuates that they have comparable degree).
Consider an AI that looks around and sees that it sure seems to be somewhere in Tegmark III. The hypothesis "I am in the basement of some branch that
The only thing we need there is that the AI can't distinguish sims from base reality, so it thinks it's more likely to be in a sim, as there are more sims.
I don't think this part does any work, as I touched on elsewhere. An AI that cares about the outer world doesn't care how many instances are in sims versus reality (and considers this fact to be under its control much moreso than yours, to boot). An AI that cares about instantiation-weighted experience considers your offer to be a technical-threat and ignores you. (Your reasons to make the offer would...
One complication that I mentioned in another thread but not this one (IIRC) is the question of how much more entropy there is in a distant trade partner's model of Tegmark III (after spending whatever resources they allocate) than there is entropy in the actual (squared) wave function, or at least how much more entropy there is in the parts of the model that pertain to which civilizations fall.
In other words: how hard is it for distant trade partners to figure out that it was us who died, rather than some other plausible-looking human civilization that doe...
Starting from now? I agree that that's true in some worlds that I consider plausible, at least, and I agree that worlds whose survival-probabilities are sensitive to my choices are the ones that render my choices meaningful (regardless of how deterministic they are).
Conditional on Earth being utterly doomed, are we (today) fewer than 75 qbitflips from being in a good state? I'm not sure, it probably varies across the doomed worlds where I have decent amounts of subjective probability. It depends how much time we have on the clock, depends where the points o...
What are you trying to argue? (I don't currently know what position y'all think I have or what position you're arguing for. Taking a shot in the dark: I agree that quantum bitflips have loads more influence on the outcome the earlier in time they are.)
You often claim that conditional on us failing in alignment, alignment was so unlikely that among branches that had roughly the same people (genetically) during the Singularity, only a 2^-75 fraction survives.
My first claim is not "fewer than 1 in 2^75 of the possible configurations of human populations navigate the problem successfully".
My first claim is more like "given a population of humans that doesn't even come close to navigating the problem successfully (given some unoptimized configuration of the background particles), probably you'd need to spend quite ...
the "you can't save us by flipping 75 bits" thing seems much more likely to me on a timescale of years than a timescale of decades; I'm fairly confident that quantum fluctuations can cause different people to be born, and so if you're looking 50 years back you can reroll the population dice.
This point feels like a technicality, but I want to debate it because I think a fair number of your other claims depend on it.
You often claim that conditional on us failing in alignment, alignment was so unlikely that among branches that had roughly the same people (genetically) during the Singularity, only a 2^-75 fraction survives. This is important, because then we can't rely on other versions of ourselves "selfishly" entering an insurance contract with us, and we need to rely on the charity of Dath Ilan that branched off long ago. I agree that's a big diff...
Summarizing my stance into a top-level comment (after some discussion, mostly with Ryan):
I was responding to David saying
Otherwise, I largely agree with your comment, except that I think that us deciding to pay if we win is entangled with/evidence for a general willingness to pay among the gods, and in that sense it's partially "our" decision doing the work of saving us.
and was insinuating that we deserve extremely little credit for such a choice, in the same way that a child deserves extremely little credit for a fireman saving someone that the child could not (even if it's true that the child and the fireman share some aspects of a decis...
Attempting to summarize your argument as I currently understand it, perhaps something like:
...Suppose humanity wants to be insured against death, and is willing to spend 1/million of its resources in worlds where it lives in exchange for 1/trillion of those resources in worlds where it would otherwise die.
It suffices, then, for humanity to be the sort of civilization that, if it matures, would comb through the multiverse looking for [other civilizations in this set], and find ones that died, and verify that they would have acted as follows if they'd survived, and then
Thanks for the cool discussion Ryan and Nate! This thread seemed pretty insightful to me. Here’s some thoughts / things I’d like to clarify (mostly responding to Nate's comments).[1]
Who’s doing this trade?
In places it sounds like Ryan and Nate are talking about predecessor civilisations like humanity agreeing to the mutual insurance scheme? But humans aren’t currently capable of making our decisions logically dependent on those of aliens, or capable of rescuing them. So to be precise the entity engaging in this scheme or other acausal interactions on our b...
What does degree of determination have to do with it? If you lived in a fully deterministic universe, and you were uncertain whether it was going to live or die, would you give up on it on the mere grounds that the answer is deterministic (despite your own uncertainty about which answer is physically determined)?
I think I'm confused why you work on AI safety then, if you believe the end-state is already 2^75 level overdetermined.
It's probably physically overdetermined one way or another, but we're not sure which way yet. We're still unsure about things like "how sensitive the population is to argument" and "how sensibly governments respond if the population shifts".
But this uncertainty -- about which way things are overdetermined by the laws of physics -- does not bear all that much relationship to the expected ratio of (squared) quantum amplitude between bra...
Background: I think there's a common local misconception of logical decision theory that it has something to do with making "commitments" including while you "lack knowledge". That's not my view.
I pay the driver in Parfit's hitchhiker not because I "committed to do so", but because when I'm standing at the ATM and imagine not paying, I imagine dying in the desert. Because that's what my counterfactuals say to imagine. To someone with a more broken method of evaluating counterfactuals, I might pseudo-justify my reasoning by saying "I am acting as you would ...
"last minute" was intended to reference whatever timescale David would think was the relevant point of branch-off. (I don't know where he'd think it goes; there's a tradeoff where the later you push it the more that the people on the surviving branch care about you rather than about some other doomed population, and the earlier you push it the more that the people on the surviving branch have loads and loads of doomed populations to care after.)
I chose the phrase "last minute" because it is an idiom that is ambiguous over timescales (unlike, say, "last thr...
Do you buy that in this case, the aliens would like to make the deal and thus UDT from this epistemic perspective would pay out?
If they had literally no other options on offer, sure. But trouble arises when the competent ones can refine P(takeover) for the various planets by thinking a little further.
maybe your objection is that aliens would prefer to make the deal with beings more similar to them
It's more like: people don't enter into insurance pools against cancer with the dude who smoked his whole life and has a tumor the size of a grapefruit in ...
I largely agree with your comment, except that I think that us deciding to pay if we win is entangled with/evidence for a general willingness to pay among the gods, and in that sense it's partially "our" decision doing the work of saving us.
Sure, like how when a child sees a fireman pull a woman out of a burning building and says "if I were that big and strong, I would also pull people out of burning buildings", in a sense it's partially the child's decision that does the work of saving the woman. (There's maybe a little overlap in how they run the same...
There's a question of how thick the Everett branches are, where someone is willing to pay for us. Towards one extreme, you have the literal people who literally died, before they have branched much; these branches need to happen close to the last minute. Towards the other extreme, you have all evolved life, some fraction of which you might imagine might care to pay for any other evolved species.
The problem with expecting folks at the first extreme to pay for you is that they're almost all dead (like dead). The problem with expecting folks at the ...
Conditional on the civilization around us flubbing the alignment problem, I'm skeptical that humanity has anything like a 1% survival rate (across any branches since, say, 12 Kya). (Haven't thought about it a ton, but doom looks pretty overdetermined to me, in a way that's intertwined with how recorded history has played out.)
My guess is that the doomed/poor branches of humanity vastly outweigh the rich branches, such that the rich branches of humanity lack the resources to pay for everyone. (My rough mental estimate for this is something like: you've prob...
Taking a second stab at naming the top reasons I expect this to fail (after Ryan pointed out that my first stab was based on a failure of reading comprehension on my part, thanks Ryan):
This proposal seems to me to have the form "the fragments of humanity that survive offer to spend a (larger) fraction of their universe on the AI's goals so long as the AI spends a (smaller) fraction of its universe on their goals, with the ratio in accordance to the degree of magical-reality-fluid-or-whatever that reality allots to each".
(Note that I think this is not at al...
I agree that in scenarios where humanity survives in 1/X portion of reality and offers to put 1/Y portion of their universe towards the AI's goals, then so long as 1/(X * Y) of a universe is more than it'd cost the AI to preserve the humans, the humans are at least making an offer that is worth more than nothing.
(There is then a question of how many better offers get made. My top guess is that X and Y are such that there's not much effect here, and that the question of what happens to saved copies of human minds is more about whether aliens buy copies than about which dead or distant trade-partners make offers that the AI considers juiciest.)
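To put purely illustrative numbers on that (invented for concreteness, not estimates from this thread): with $X = 10^3$ and $Y = 10^6$,

$$\frac{1}{X}\cdot\frac{1}{Y} \;=\; \frac{1}{10^{3}}\cdot\frac{1}{10^{6}} \;=\; 10^{-9}\ \text{universes},$$

so the offer is worth taking from the AI's perspective iff preserving the humans costs it less than about a billionth of a universe.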
Here's how I think this choice looks from the perspective of a correctly-reasoning AI (having skimmed only your proposal for humanity in the future):
(EDIT: as Ryan points out, the deal he's proposing is one where the AI is allowed to grab the universe in front of it so long as it gives humans one planet's worth of resources, which does change the calculus and makes the deal worthwhile. I had not read the proposal carefully enough to notice this, oops, sorry. The critique below is invalid if humanity asks only for one planet. Note however that this is sti...
Note that we don't want the AI to hand over the reins to humans, just to avoid killing humans when possible/cheap and we're willing to pay quite a bit for this (in the proposal, IDK if I personally think we should pay). So, it should look like a good offer for AIs who care about Tegmark IV (with a measure etc).
So, if humans execute this scheme, the AI's options should look something like:
This is an excerpt from a comment I wrote on the EA forum, extracted and crossposted here by request:
There's a phenomenon where a gambler places their money on 32, and then the roulette wheel comes up 23, and they say "I'm such a fool; I should have bet 23".
More useful would be to say "I'm such a fool; I should have noticed that the EV of this gamble is negative." Now at least you aren't asking for magic lottery powers.
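(For concreteness: on a single-zero wheel with the standard 35-to-1 single-number payout, that EV works out to

$$\mathbb{E}[\text{payoff per \$1 bet}] \;=\; \tfrac{1}{37}(+35) + \tfrac{36}{37}(-1) \;=\; -\tfrac{1}{37} \;\approx\; -2.7\%,$$

negative no matter which number you pick.)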
Even more useful would be to say "I'm such a fool; I had three chances to notice that this bet was bad: when my partner was trying to ex...
my original 100:1 was a typo, where i meant 2^-100:1.
this number was in reference to ronny's 2^-10000:1.
when ronny said:
I’m like look, I used to think the chances of alignment by default were like 2^-10000:1
i interpreted him to mean "i expect it takes 10k bits of description to nail down human values, and so if one is literally randomly sampling programs, they should naively expect 1:2^10000 odds against alignment".
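(spelled out, that naive calculation is just

$$P(\text{alignment by default}) \;\approx\; 2^{-K}, \qquad K \approx 10^{4} \;\Rightarrow\; \text{odds} \approx 1 : 2^{10000},$$

with K the number of bits needed to pin down human values.)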
i personally think this is wrong, for reasons brought up later in the convo--namely, the relevant question is not how many bits it takes to...
(From a moderation perspective:
- I consider the following question-cluster to be squarely topical: "Suppose one believes it is evil to advance AI capabilities towards superintelligence, on the grounds that such a superintelligence would be quite likely to kill us all. Suppose further that one fails to unapologetically name this perceived evil as 'evil', e.g. out of a sense of social discomfort. Is that a failure of courage, in the sense of this post?"
- I consider the following question-cluster to be a tangent: "Suppose person X is contributing to a project that I
...