All of RobinHanson's Comments + Replies

Seems to me I spent a big % of my post arguing against the rapid growth claim. 

4PeterMcCluskey
I'd say your post focused on convincing the average techie or academic that Eliezer is wrong, but didn't try to focus on what Eliezer would see as cruxes. That might be a reasonable choice of where to focus, given the results of prior attempts to address Eliezer's cruxes. You gave a better summary of why Eliezer's cruxes are controversial in this section of Age of Em.

I'll make another attempt to focus on Eliezer's cruxes. Intelligence Explosion Microeconomics seems to be the main place where Eliezer attempts to do more than say "my model is better than your reference class". Here's a summary of what I consider his most interesting evidence:

At the time that was written, there may have been substantial expert support for those ideas. But more recent books have convinced me Eliezer is very wrong here. Henrich's book The Secret of Our Success presents fairly strong evidence that humans did not stumble on any algorithmic improvement that could be confused with a core of general intelligence. Human uniqueness derives mainly from better transmission of knowledge. Herculano-Houzel's book The Human Advantage presents clear evidence that large primates are pushing the limits of how big their brains can be. Getting enough calories takes more than 8 hours per day of foraging and feeding. That suggests strong evolutionary pressures for bigger brains, enough to reach a balance with the pressure from starvation risk. The four-fold increase in human brain size was likely less important than culture, but I see plenty of inconclusive hints that it was more important than new algorithms.

Come on, most every business tracks revenue in great detail. If customers were getting unhappy with the firm's services and rapidly switching en masse, the firm would quickly become very aware of it, and would look into the problem in great detail. 

3RobertM
I don't understand what part of my comment this is meant to be replying to. Is the claim that modern consumer software isn't extremely buggy because customers have a preference for less buggy software, and therefore will strongly prefer providers of less buggy software? This model doesn't capture much of the relevant detail:

* revenue attribution is extremely difficult
* switching costs are often high
* there are very rarely more than a few providers of comparable software
* customers value things about software other than it being bug-free

But also, you could just check whether software has bugs in real life, instead of attempting to derive it from that model (which would give you bad results anyways). Having both used and written quite a lot of software, I am sorry to tell you that it has a lot of bugs across nearly all domains, and that decisions about whether to fix bugs are only ever driven by revenue considerations to the extent that the company can measure the impact of any given bug in a straightforward enough manner. Tech companies are more likely to catch bugs in payment and user registration flows, because those tend to be closely monitored, but coverage elsewhere can be extremely spotty (and bugs definitely slip through in payment and user registration flows too).

But, ultimately, this seems irrelevant to the point I was making, since I don't really expect an unaligned superintelligence to, what, cause company revenues to dip by behaving badly before it's succeeded in its takeover attempt?

You complain that my estimating rates from historical trends is arbitrary, but you offer no other basis for estimating such rates. You only appeal to uncertainty. But there are several other assumptions required for this doomsday scenario. If all you have is logical possibility to argue for piling on several a priori unlikely assumptions, it gets hard to take that seriously. 

1Liron
My reasoning stems from believing that AI-space contains designs that can easily plan effective strategies to get the universe into virtually any configuration. And they’re going to be low-complexity designs. Because engineering stuff in the universe isn’t a hard problem from a complexity theory perspective. Why should the path from today to the first instantiation of such an algorithm be long? So I think we can state properties of an unprecedented future that first-principles computer science can constrain, and historical trends can’t.

You keep invoking the scenario of a single dominant AI that is extremely intelligent. But that only happens AFTER a single AI fooms to be much better than all other AIs. You can't invoke its superintelligence to explain why its owners fail to notice and control its early growth. 

4JNS
You don't have to invoke it per se. External observables on what the current racers are doing lead me to be fairly confident that they say some of the right things, but the reality is that they move as fast as possible, basically "ship now, fix later". Then we have the fact that interpretability is in its infancy; currently we don't know what happens inside SOTA models. Likely nothing exotic, but we can't tell, and if you can't tell on current narrow systems, how are we going to fare on powerful systems[1]?

In that world, I think this would be very probable. Without any metrics on the system, outside of the output it generates, how do you tell? And then we have the fact that once somebody gets there, they will be compelled to move into the "useful but we cannot do" regime very quickly. Not necessarily by the people who built it, but by the C-suite and board of whatever company got there first. At that point, it seems to come down to luck.

Let's assume that I am wrong, that my entire ontology[2] is wrong, which means all my thinking is wrong and all my conclusions are bunk. So what does the ontology look like in a world where that does not happen? I should add that this is a genuine question. I have an ontology that seems to be approximately the same as EY's, which basically means that whatever he says or writes, I am not confused or surprised. But I don't know what Robin's looks like, and maybe I am just dumb and it's coherently extractable from his writing and talks and I failed to do so (likely). In any case, I really would like to have that understanding, to the point where I can steelman whatever Robin writes or says. That's a big ask, and unreasonable, but maybe understanding the above would get me going.

1. ^ I avoid the usual 2- and 3-letter acronyms. They are memetic attractors, and they are so powerful that most people can't get unstuck, which leads to all talk being sucked into irrelevant things. They are systems, mechanistic, nothing more.
3Zachary Pruckowski
Most of the tools we use end up cartelized. There are 3-5 major OS kernels, browser engines, office suites, smartphone families, search engines, web servers, and databases. I’d suspect the odds are pretty high that we have one AI with 40%+ market share and a real chance we’ll have an AI market where the market leader has 80%+ market share (and the attendant huge fraction of development resources).
RobertM145

We don't need superintelligence to explain why a person or organization training a model on some new architecture would either fail to notice its growth in capabilities, or stop it if they did notice:

  1. We don't currently have a good operationalization for measuring the qualities of a model that might be dangerous.
  2. Organizations don't currently have anything resembling circuit-breakers in their training setups to stop the training run if a model hits some threshold measurement on a proxy of those dangerous qualities (a proxy we don't even have yet!  ARC e
... (read more)
8Liron
I agree that rapid capability gain is a key part of the AI doom scenario. During the Manhattan project, Feynman prevented an accident by pointing out that labs were storing too much uranium too close together. We’re not just lucky that the accident was prevented; we’re also lucky that if the accident had happened, the nuclear chain reaction wouldn’t have fed on the atmosphere. We similarly depend on luck whenever a new AI capability gain such as LLM general-topic chatting emerges. We’re lucky that it’s not a capability that can feed on itself rapidly. Maybe we’ll keep being lucky when new AI advances happen, and each time it’ll keep being more like past human economic progress or like past human software development. But there’s also a significant chance that it could instead be more like a slightly-worse-than-nuclear-weapon scenario. We just keep taking next steps of unknown magnitude into an attractor of superintelligent AI. At some point our steps will trigger a rapid positive-feedback slide where each step is dealing with very powerful and complex things that we’re far from being able to understand. I just don’t see why there’s more than 90% chance that this will proceed at a survivable pace.

I comment on this paper here: https://www.overcomingbias.com/2022/07/cooks-critique-of-our-earliness-argument.html

4Tristan Cook
Thanks for your response Robin! I've written a reply to you on the EA Forum here

That's an exponential with mean 0.7, or mean 1/0.7?

1Tristan Cook
Mean 0.7

"My prior on  is distributed 

I don't understand this notation. It reads to me like "103+ 5 Gy";  how is that a distribution? 

3Tristan Cook
Sorry, this is very unclear notation. The Exp(0.7) is meant to be a random variable exponentially distributed with parameter 0.7.
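For reference (not part of the exchange), the ambiguity being asked about comes from the two standard parameterizations of an exponential distribution, which imply different means for "Exp(0.7)":

```latex
% Rate parameterization:  X ~ Exp(\lambda), density f(x) = \lambda e^{-\lambda x}
%   => E[X] = 1/\lambda, so Exp(0.7) would have mean 1/0.7 \approx 1.43
% Scale parameterization: X ~ Exp(\beta), density f(x) = (1/\beta) e^{-x/\beta}
%   => E[X] = \beta, so Exp(0.7) would have mean 0.7
f_{\text{rate}}(x) = \lambda e^{-\lambda x}, \quad \mathbb{E}[X] = \frac{1}{\lambda}
\qquad\qquad
f_{\text{scale}}(x) = \frac{1}{\beta}\, e^{-x/\beta}, \quad \mathbb{E}[X] = \beta
```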

It seems the key feature of this remaining story is the "coalition of AIs" part. I can believe that AIs would get powerful, what I'm skeptical about is the claim that they naturally form a coalition of them against us. Which is also what I object to in your prior comments. Horses are terrible at coordination compared to humans, and humans weren't built by horses and integrated into a horse society, with each human originally in the service of a particular horse.  

7Daniel Kokotajlo
Yes, horses are terrible at coordination compared to humans, and that's a big part of why they lost. At some point in prehistory the horses could have coordinated to crush the humans, and didn't. Even in the 1900's horses could have gone on strike, gone to war, etc. and negotiated better working conditions etc. but didn't because they weren't smart enough.

Similarly, humans are terrible at coordination compared to AIs. I agree that the fact that humans build the AIs is a huge advantage. But it's only a huge advantage insofar as we solve alignment or otherwise limit AI capabilities in ways that serve our interests; if instead we YOLO it or half-ass it, and end up with super capable unaligned agentic AGIs... then we squandered our advantage.

I don't think the integration into society or service relationship thing matters much. Imagine the following fictional case: The island of Atlantis is in the mid-Atlantic. It's an agrarian empire at the tech level of the ancient Greeks. Atlantis also contains huge oil, coal, gold, and rare earth deposits. In 1500 Atlantis is discovered by European explorers, who set up trading posts along the coast. For some religious reason, Atlanteans decide to implement the following strict laws: (a) No Atlantean can learn any language other than Atlantean. (b) It is heresy, thoughtcrime, for any Atlantean to understand European ideas. You can use European technology (guns, etc.) but you can't know how it works and you certainly can't know how to build it yourself, because that would mean you've been infected by foreign ideas. (The point of these restrictions is to mimic how in humans vs. AI, the humans won't be able to keep up with AIs in capabilities no matter how hard they try. They can use the fancy gadgets and apps the AIs build, but they can't contribute to frontier R&D. By contrast in historical cases of colonialism and conquest, the technologically weaker side can in principle learn the scientific and economic methods and then cat

It's not enough that AI might appear in a few decades; you also need something useful you can do about it now, compared to investing your money to have more to spend later when concrete problems appear.

I just read through your "what 2026 looks like" post, but didn't see how it is a problematic scenario. Why should we want to work ahead of time to prepare for that scenario?

9Daniel Kokotajlo
Thanks for reading! No existential catastrophe has happened yet in that scenario. The catastrophe will probably happen in 2027-2028 or so if the scenario continues the way I think is most plausible. I'm sorry that I never got around to writing that part, if I had, this would be a much more efficient conversation! Basically to find our disagreement (assuming you agree my story up till 2026 is plausible) we need to discuss what the plausible continuations to the story are. Surprise surprise, I think that there are plausible (indeed most-plausible) continuations of the story in which the bigass foundation model chatbots (which, in the story, are misaligned) as they scale up become strategically aware, agentic planners with superhuman persuasion and coding ability by around 2027. From there (absent clever intervention from prepared people) they gradually accumulate power and influence, slowly at first and then quickly as the R&D acceleration gets going. Maybe singularity happens around 2030ish, but well before that point the long-term trajectory of the world is firmly in the grip of some coalition of AIs. Do you think (a) that the What 2026 Looks Like story is not plausible, (b) that the continuation of it I just sketched is not plausible, or (c) that it's not worth worrying about or preparing for? I'm guessing (b). Again it's unfortunate that I haven't finished the story, if I had we'd have more detailed things to disagree on. ETA: Since I doubt you have time or interest to engage with me at length about the 2026 scenario, what do you think about the bulk of my criticism above? About coordination ability? Both of us like to analogize AI stuff to historical stuff, it seems we differ in which historical examples we draw from -- for you, the steam engine and other bits of machinery, for me, colonial conquests and revolutions.

In our simulations, we find it overwhelmingly likely that any such spherical volume of an alien civ would be much larger than the full moon in the sky. So no need to study distant galaxies in fine detail; look for huge spheres in the sky. 
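As a rough illustration of the geometry behind that claim (generic numbers, not taken from the simulations): a sphere of radius R at distance D subtends an angular diameter of about 2 arcsin(R/D), and the full moon subtends roughly half a degree, so any civ-sphere whose radius exceeds roughly 0.4% of its distance from us already appears larger than the full moon.

```latex
% Angular diameter of a sphere of radius R seen from distance D:
%   \theta = 2 \arcsin(R/D) \approx 2R/D  for R \ll D
% Full moon: \theta_{\text{moon}} \approx 0.5^{\circ} \approx 0.0087 \text{ rad}
% Larger-than-the-moon condition:
2 \arcsin\!\left(\frac{R}{D}\right) > 0.0087 \ \text{rad}
\;\Longleftrightarrow\;
\frac{R}{D} \gtrsim 0.0044
```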

"or more likely we are an early civilization in the universe (according to Robin Hanson’s “Grabby Aliens” model) so, 2) quite possibly there are no grabby aliens populating the universe with S-Risks yet" 

But our model implies that there are in fact many aliens out there right now. Just not in our backward light cone.

Aw,  I still don't know which face goes with the TGGP name.

1teageegeepea
I was wearing a shirt designed by one of your colleagues.

Wow, it seems that EVERYONE here has this counter argument: "You say humans look weird according to this calculation, but here are other ways we are weird that you don't explain." But there is NO WAY to explain all ways we are weird, because we are in fact weird in some ways. For each way that we are weird, we should be looking for some other way to see the situation that makes us look less weird. But there is no guarantee of finding that; we may just actually be weird. https://www.overcomingbias.com/2021/07/why-are-we-weird.html

2rodarmor
What do you think about my counter-argument here, which I think is close, but not exactly of the form you describe: https://www.lesswrong.com/posts/RrG8F9SsfpEk9P8yi/robin-hanson-s-grabby-aliens-model-explained-part-1?commentId=zTn8t2kWFHoXqA3Zc To sum up "You say humans look weird according to this calculation and come up with a model which explains that, however that model makes humans look even more weird, so if the argument for the model being correct is that it reduces weirdness, that argument isn't very strong."

You have the date of the great filter paper wrong; it was 1998, not 1996.

9Writer
Robin Hanson, you know nothing about Robin Hanson. You first wrote the paper in 1996 and then last updated it in 1998. ... or so says Wikipedia, that's why I wrote 1996. I just made this clear in the video description anyway, tell me if Wikipedia got this wrong.  Btw, views have nicely snowballed from your endorsement on Twitter, so thanks a lot for it.

Yes, a zoo hypothesis is much like a simulation hypothesis, and the data we use cannot exclude it. (Nor can they exclude a simulation hypothesis.) We choose to assume that grabby aliens change their volumes in some clearly visible way, exactly to exclude zoo hypotheses. 

4avturchin
If we are inside the colonisation volume, its change will be isotropic and will not look strange for us. For example, if aliens completely eliminated stars of X class as they are the best source of energy, we will not observe it, as there will be no X stars in any direction. 

Your point #1 misses the whole norm violation element. The reason it hurts if others are told about an affair is that others disapprove. That isn't why loud music hurts.

2ESRogs
Suppose it were easy to split potential blackmail scenarios into whistleblower scenarios (where the value of the information to society is quite positive) and embarrassing-but-useless scenarios (where it is not). Would you support legalizing blackmail in both classes, or just the first class? EDIT: I ask because, I think (at least part of) your argument is that if we legalize paying off whistleblowers, then that's okay, because would-be-whistleblowers still have an incentive to find wrongdoing, and the perpetrators still have an incentive to avoid that wrongdoing (or at least hide it, but hiding has costs, so on the margin it should mean doing less). (This reminds me a bit of white hat hackers claiming bug bounties.) Meanwhile, the anti-blackmail people argue that you don't want people to be incentivized to find ways to harm each other. So, if you could cleanly separate out the public benefit from the harm, on a case-by-case basis (rather than having to go with simple heuristics like "gossip is usually net beneficial"), it seems like you might be able to get to a synthesis of the two views.

Imagine there's a law against tattoos, and I say "Yes some gang members wear them but so do many others. Maybe just outlaw gang tattoos?" You could then respond that I'm messing with edge cases, so we should just leave the rule alone.

6Vaniver
A realistic example of this is that many onsen ban tattoos as an implicit ban on yakuza, which also ends up hitting foreign tourists with tattoos. It feels to me like there's a plausible deniability point that's important here ("oh, it's not that we have anything against yakuza, we just think tattoos are inappropriate for mysterious reasons") and a simplicity point that's important here (rather than a subjective judgment of whether or not a tattoo is a yakuza tattoo, there's the objective judgment of whether or not a tattoo is present). I can see it going both ways, where sometimes the more complex rule doesn't pay for itself, and sometimes it does, but I think it's important to take into account the costs of rule complexity.

You will allow harmful gossip, but not blackmail, because the first might be pursuing your "values", but the second is seeking to harm. Yet the second can have many motives, and is most commonly to get money. And you are focused too much on motives, rather than on outcomes.

1curi
If I threaten to do X unless you pay me, then the motive for making that threat is getting money. However, I don't get money for doing X. There are separate things involved (threat and action) with different motives.
2Ericf
Ok. I don't think that's the central example of what people, including Zvi, are picturing when you say "legalize blackmail." In fact, de-criminalizing that specific interaction, but leaving alone laws & norms against uncapped extraction, threats, etc. might find few opponents.

The sensible approach is to demand a stream of payments over time. If you reveal it to others who also demand streams, that will cut how much of a stream they are willing to pay you.

You are very much in the minority if you want to abolish norms in general.

5jimmy
There's a parallel here with the fifth amendment's protection from self incrimination making it harder to enforce laws and laws being good on average. This isn't paradoxical because the fifth amendment doesn't make it equally difficult to enforce all laws. Actions that harm other people tend to have other ways of leaving evidence that can be used to convict. If you murder someone, the body is proof that someone has been harmed and the DNA in your van points towards you being the culprit. If you steal someone's bike, you don't have to confess in order to be caught with the stolen bike. On the other hand, things that stay in the privacy of your own home with consenting adults are *much* harder to acquire evidence for if you aren't allowed to force people to testify against themselves. They're also much less likely to be things that actually need to be sought out and punished. If it were the case that one coherent agent were picking all the rules with good intent, then it wouldn't make sense to create rules that make enforcement of other rules harder. There isn't one coherent agent picking all the rules and intent isn't always good, so it's important to fight for meta rules that make it selectively hard to enforce any bad rules that get through. You can try to argue that preventing blackmail isn't selective *enough* (or that it selects in the wrong direction), but you can't just equate blackmail with "norm enforcement [applied evenly across the board]".
3Dan B
I'm not arguing for abolishing norms. You are arguing for dramatically increasing the rate of norm enforcement, and I'm arguing for keeping norm enforcement at the current level. Above, I've provided several examples of ways that I think that increasing the rate of norm enforcement could have bad effects. Do you have some examples of ways that you think that increasing the rate of norm enforcement could have good effects? Note that, for this purpose, we are only counting norm enforcements that are so severe that people would be willing to pay a blackmail fee to escape them. You can't say "there's a norm against littering, so increasing the rate of enforcing that norm would decrease littering" unless you have a plausible scenario in which people would get blackmailed for littering.

NDAs are also legal in the case where info was known before the agreement. For example, Trump using NDAs to keep affairs secret.

1Sweetgum
Then perhaps we should ban this form of NDAs, rather than legalizing blackmail. They seem to have a pretty negative reputation already, and the NDAs that are necessary for business are the other type (signed before info is known).
3Richard_Kennaway
In that case, a key difference between an NDA and blackmail is that the former fulfils the requirements of a contract, while the latter does not (and not merely by being a currently illegal act). With an NDA where the information is already shared, the party who would prefer that it go no further proactively offers something in return for the other's continued silence. Each party is offering a consideration to the other. If the other party had initiated the matter by threatening to reveal the information unless paid off, there is no contract. Threatening harm and offering to refrain is not a valid consideration. On the contrary, it is the very definition of extortion. Compare cases where it is not information that is at issue. If a housing developer threatens to build an eyesore next to your property unless you pay him off, that is extortion. If you discover that he is planning to build something you would prefer not to be built, you might offer to buy the land from him. That would be a legal agreement. I don't know if you would favour legalising all forms of extortion, but that would be a different argument.
2purge
But the typical use of NDAs is notably different from the typical use of blackmail, isn't it? Even though in principle they could be used in all the same situations, they aren't used that way in practice. Doesn't that make it reasonable to treat them differently?

"models are brittle" and "models are limited" ARE the generic complaints I pointed to.

We have lots of models that are useful even when the conclusions follow pretty directly. Such as supply and demand. The question is whether such models are useful, not if they are simple.

There are THOUSANDS of critiques out there of the form "Economic theory can't be trusted because economic theory analyses make assumptions that can't be proven and are often wrong, and conclusions are often sensitive to assumptions." Really, this is a very standard and generic critique, and of course it is quite wrong, as such a critique can be equally made against any area of theory whatsoever, in any field.

7TAG
But of course, it can't be used against them all equally. Physics is so good you can send a probe to a planet millions of miles away. But trying to achieve a practical result in economics is largely guesswork.
6apc
Aside from the arguments we made about modelling unawareness, I don't think we were claiming that econ theory wouldn't be useful; we argue that new agency models could tell us about the levels of rents extracted by AI agents. Our claims were just that i) we can't infer much from existing models because they model different situations and are brittle, and ii) models won't shed light on phenomena beyond what they are trying to model.

The agency literature is there to model real agency relations in the world. Those real relations no doubt contain plenty of "unawareness". If models without unawareness were failing to capture and explain a big fraction of real agency problems, there would be plenty of scope for people to try to fill that gap via models that include it. The claim that this couldn't work because such models are limited seems just arbitrary and wrong to me. So either one must claim that AI-related unawareness is of a very different type or scale from ordinary... (read more)

2Tom Davidson
I agree that the Bostrom/Yudkowsky scenario implies AI-related unawareness is of a very different scale from ordinary human cases. From an outside view perspective, this is a strike against the scenario. However, this deviation from past trends does follow fairly naturally (though not necessarily) from the hypothesis of a sudden and massive intelligence gap
9apc
The economists I spoke to seemed to think that in agency unawareness models conclusions follow pretty immediately from the assumptions and so don't teach you much. It's not that they can't model real agency problems, just that you don't learn much from the model. Perhaps if we'd spoken to more economists there would have been more disagreement on this point.

"Hanson believes that the principal-agent literature (PAL) provides strong evidence against rents being this high."

I didn't say that. This is what I actually said:

"surely the burden of 'proof' (really argument) should lie on those say this case is radically different from most found in our large and robust agency literatures."

Uh, we are talking about holding people to MUCH higher rationality standards than the ability to parse Phil arguments.

"At its worst, there might be pressure to carve out the parts of ourselves that make us human, like Hanson discusses in Age of Em."

To be clear, while some people do claim that such things might happen in an Age of Em, I'm not one of them. Of course I can't exclude such things in the long run; few things can be excluded in the long run. But that doesn't seem at all likely to me in the short run.

4Richard_Ngo
Apologies for the mischaracterisation. I've changed this to refer to Scott Alexander's post which predicts this pressure.

You are a bit too quick to allow the reader the presumption that they have more algorithmic faith than the other folks they talk to. Yes if you are super rational and they are not, you can ignore them. But how did you come to be confident in that description of the situation?

8jessicata
Being able to parse philosophical arguments is evidence of being rational. When you make philosophical arguments, you should think of yourself as only conveying content to those who are rationally parsing things, and conveying only appearance/gloss/style to those who aren't rationally parsing things.

Everything I'm saying is definitely symmetric across persons, even if, as an author, I prefer to phrase it in the second person. (A previous post included a clarifying parenthetical to this effect at the end, but this one did not.)

That is, if someone who trusted your rationality noticed that you seemed visibly unmoved by their strongest arguments, they might think that the lack of agreement implies that they should update towards your position, but another possibility is that their trust has been misplaced! If they find themselves living a world of painted

... (read more)

Seems like you guys might have (or be able to create) a dataset on who makes what kind of forecasts, and who tends to be accurate or hyped re them. Would be great if you could publish some simple stats from such a dataset.

1AnthonyC
I probably could, but could not share such without permission from marketing, which creates a high risk of bias.

To be clear, Foresight asked each speaker to offer a topic for participants to forecast on, related to our talks. This was the topic I offered. That is NOT the same as my making a prediction on that topic. Instead, that is to say that the chance on this question seemed an unusual combination of verifiable in a year and relevant to the chances on other topics I talked about.

Foresight asked us to offer topics for participants to forecast on, related to our talks. This was the topic I offered. That is NOT the same as my making a prediction on that topic. Instead, that is to say that the chance on this question is an unusual combination of verifiable in a year and relevant to the chances on other topics I talked about.

2Ben Pace
Ah, that makes sense, thanks.
habryka120

Note that all three of the linked papers are about "boundedly rational agents with perfectly rational principals" or about "equally boundedly rational agents and principals". I have been so far unable to find any papers that follow the described pattern of "boundedly rational principals and perfectly rational agents".

The % of world income that goes to computer hardware & software, and the % of useful tasks that are done by them.

Most models have an agent who is fully rational, but I'm not sure what you mean by "principal is very limited".

I'd also want to know that ratio X for each of the previous booms. There isn't a discrete threshold, because analogies go on a continuum from more to less relevant. An unusually high X would be noteworthy and relevant, but not make prior analogies irrelevant.

Wei Dai290

Robin, I'm very confused by your response. The question I asked was for references to the specific models you talked about (with boundedly rational principals and perfectly rational agents), not how to find academic papers with the words "principal" and "agent" in them.

Did you misunderstand my question, or is this your way of saying "look it up yourself"? I have searched through the 5 review papers you cited in your blog post for mentions of models of this kind, and also searched on Google Scholar, with negative results. I can try to do more extensive sear

... (read more)

My understanding is that this progress looks much less of a trend deviation when you scale it against the hardware and other resources devoted to these tasks. And of course in any larger area there are always subareas which happen to progress faster. So we have to judge how large is a subarea that is going faster, and is that size unusually large.

Life extension also suffers from the 100,000 fans hype problem.

I'll respond to comments here, at least for a few days.

3Matthew Barnett
Other than, say looking at our computers and comparing them to insects, what other signposts should we look for, if we want to calibrate progress towards domain-general artificial intelligence?
Wei Dai140

You previously wrote:

We do have some models of [boundedly] rational principals with perfectly rational agents, and those models don’t display huge added agency rents. If you want to claim that relative intelligence creates large agency problems, you should offer concrete models that show such an effect.

The conclusions of those models seem very counterintuitive to me. I think the most likely explanation is that they make some assumptions that I do not expect to apply to the default scenarios involving humans and AGI. To check this, can you please refere

... (read more)
5Ofer
It seems you consider previous AI booms to be a useful reference class for today's progress in AI. Suppose we will learn that the fraction of global GDP that currently goes into AI research is at least X times higher than in any previous AI boom. What is roughly the smallest X for which you'll change your mind (i.e. no longer consider previous AI booms to be a useful reference class for today's progress in AI)? [EDIT: added "at least"]
1[anonymous]
Mostly unrelated to your point about AI, your comments about the 100,000 fans having the potential to cause harm rang true to me. Are there other areas in which you think the many non-expert fans problem is especially bad (as opposed to computer security, which you view as healthy in this respect)?

----------------------------------------

Would you consider progress on image recognition and machine translation as outside view evidence for lumpiness? Error rates on ImageNet, an image classification benchmark, dropped by >10% over a 4-year period, mostly due to the successful scaling up of a type of neural network. This also seems relevant to your point about AI researchers who have been in the field for a long time being more skeptical. My understanding is that most AI researchers would not have predicted such rapid progress on this benchmark before it happened. That said, I can see how you still might argue this is an example of over-emphasizing a simple form of perception, which in reality is much more complicated and involves a bunch of different interlocking pieces.

Markets can work fine with only a few participants. But they do need sufficient incentives to participate.

"of all the hidden factors which caused the market consensus to reach this point, which, if any of them, do we have any power to affect?" A prediction market can only answer the question you ask it. You can use a conditional market to ask if a particular factor has an effect on an outcome. Yes of course it will cost more to ask more questions. If there were a lot of possible factors, you might offer a prize to whomever proposes a factor that turns out to have a big effect. Yes it would cost to offer such a prize, because it could be work to find such factors.

1SebastianG
Good point. But it is not just a cost problem. My conjecture in the above comment is that conditional markets are more prone to market failure because the structure of conditional questions decreases the pool of people who can participate. I need more examples of conditional markets in action to figure out what the greatest causes of market failure are for conditional markets.

I was once that young and naive. But I'd never heard of this book Moral Mazes. Seems great, and I intend to read it. https://twitter.com/robinhanson/status/1136260917644185606

The CEO proposal is to fire them at the end of the quarter if the prices just before then so indicate. This solves the problem of the market traders expecting later traders to have more info than they do. And it doesn't mean that the board can't fire them at other times for other reasons.
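A minimal sketch of that end-of-quarter decision rule, with hypothetical prices and a hypothetical margin parameter (nothing here is specified in the comment itself): the board pre-commits to comparing the two conditional market prices just before quarter end and firing only if the "fired" branch is priced sufficiently higher.

```python
# Sketch of an end-of-quarter "fire the CEO" decision-market check.
# Prices are hypothetical; in a real decision market they would come from
# called-off bets (trades on the "fired" branch are refunded if the CEO is
# retained, and vice versa), so each price estimates the stock's value
# conditional on that decision.

def should_fire_ceo(price_if_fired: float, price_if_retained: float,
                    margin: float = 0.02) -> bool:
    """True if the market expects the stock to do better, by at least
    `margin` (fractional), conditional on firing rather than retaining."""
    return price_if_fired > price_if_retained * (1.0 + margin)

# Made-up conditional prices read just before the end of the quarter:
price_fired = 104.0     # expected stock price conditional on firing
price_retained = 100.0  # expected stock price conditional on retaining

decision = "fire" if should_fire_ceo(price_fired, price_retained) else "retain"
print(f"End-of-quarter rule says: {decision}")
```

Because the rule only triggers on prices read just before quarter end, traders at that moment need not worry that better-informed later traders will move the relevant prices after them, which is the point made above; and the board remains free to fire at other times for other reasons.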

Zvi140

We could expect prices prior to end of quarter to be strange, then, and potentially containing very strange information, but can also argue it shouldn't matter. So this is like the two-stage proposal. In stage 1 the board decides whether to fire or not anyway; in stage 2 the prediction market decides whether to fire him anyway with a burst of activity, which has the advantage that you get your money back fast if it doesn't happen, and if it does happen you can just be long/short the stock. Then if the board decides to fire him because it was 'to... (read more)

The claim that AI is vastly better at coordination seems to me implausible on its face. I'm open to argument, but will remain skeptical until I hear good arguments.

1mako yass
I'd expect a designed thing to have much cleaner, much more comprehensible internals. If you gave a human a compromise utility function and told them that it was a perfect average of their desires (or their tribe's desires) and their opponents' desires, they would not be able to verify this, they wouldn't recognise their utility function, they might not even individually possess it (again, human values seem to be a bit distributed), and they would be inclined to reject a fair deal; humans tend to see their other only in extreme shades, as more foreign than they really are. Do you not believe that an AGI is likely to be self-comprehending? I wonder, sir, do you still not anticipate foom? Is it connected to that disagreement?
Wei Dai100

Have you considered the specific mechanism that I proposed, and if so what do you find implausible about it? (If not, see this longer post or this shorter comment.)

I did manage to find a quote from you that perhaps explains most of our disagreement on this specific mechanism:

There are many other factors that influence coordination, after all; even perfect value matching is consistent with quite poor coordination.

Can you elaborate on what these other factors are? It seems to me that most coordination costs in the real world come from value differences,

... (read more)
2Dagon
As a subset of the claim that AI is vastly better at everything, being vastly better at coordination is plausible. The specific arguments that AI somehow has (unlike any intelligence or optimization process we know of today) introspection into its "utility function" or can provide non-behavioral evidence of its intent to similarly-powerful AIs seem pretty weak. I haven't seen anyone attempting to model shifting equilibria and negotiation/conflict among AIs (and coalitions of AIs and of AIs + humans) with differing goals and levels of computational power, so it seems pretty unfounded to speculate on how "coordination" as a general topic will play out.
5ryan_b
It seems to me that computers don't suffer from most of the constraints humans do. For example, AI can expose its source code and its error-less memory. Humans have no such option, and our very best approximations are made of stories and error-prone memory. They can provide guarantees which humans cannot, simulate one another within precise boundaries in a way humans cannot, calculate risk and confidence levels in a way humans cannot, communicate their preferences precisely in a way humans cannot. All of this seems to point in the direction of increased clarity and accuracy of trust. On the other hand, I see no reason to believe AI will have the strong bias in favor of coordination or trust that we have, so it is possible that clear and accurate trust levels will make coordination a rare event. That seems off to me though, because it feels like saying they would be better off working alone in a world filled with potential competitors. That statement flatly disagrees with my reading of history.
4habryka
What evidence would convince you otherwise? Would superhuman performance in games that require difficult coordination be compelling? Deepmind has outlined Hanabi as one of the next games to tackle: https://arxiv.org/abs/1902.00506

Secrecy CAN have private value. But it isn't at all clear that we are typically together better off with secrets. There are some cases, to be sure, where that is true. But there are also so many cases where it is not.

6cousin_it
It seems to me that removing privacy would mostly help religions, political movements and other movements that feed on conformity of their members. That doesn't seem like a small thing - I'm not sure what benefit could counterbalance that.
2Dagon
Quite agree - depending on how you aggregate individual values and weigh the adversarial motives, it's quite possible that "we" are often worse off with secrets. It's not clear whether or when that's the case from the "simple model" argument, though. And certainly there are cases where unilateral revelations while others retain privacy are harmful. Anytime you'd like to play poker where your cards are face-up and mine are known only to me, let me know. I would love to explore whether private information is similar to other capital, where overall welfare can be improved by redistribution, but only under certain assumptions of growth, aggregation and individual benefits canceling out others' harms.

My guess is that the reason is close to why security is so bad: it's hard to add security to an architecture that didn't consider it up front, and most projects are in too much of a rush to take time to do that. Similarly, it takes time to think about what parts of a system should own what and be trusted to judge what. It's easier/faster to just make a system that does things, without attending to this, even if that is very costly in the long run. When the long run arrives, the earlier players are usually gone.

We have to imagine that we have some influence over the allocation of something, or there's nothing to debate here. Call it "resources" or "talent" or whatever, if there's nothing to move, there's nothing to discuss.

I'm skeptical solving hard philosophical problems will be of much use here. Once we see the actual form of relevant systems then we can do lots of useful work on concrete variations.

I'd call "human labor being obsolete within 10 years … 15%, and within 20 years … 35%" crazy extreme predi... (read more)

8Wei Dai
Let me rephrase my argument to be clearer. You suggested earlier, "and if resources today can be traded for a lot more resources later, the temptation to wait should be strong." This advice could be directed at either funders or researchers (or both). It doesn't seem to make sense for researchers, since they can't, by not working on AI alignment today, cause more AI alignment researchers to appear in the future. And I think a funder should think, "There will be plenty of funding for AI alignment research in the future when there are clearer warning signs. I could save and invest this money, and spend it in the future on alignment, but it will just be adding to the future pool of funding, and the marginal utility will be pretty low because at the margins, it will be hard to turn money into qualified alignment researchers in the future just as it is hard to do that today." So I'm saying this particular reallocation of resources that you suggested does not make sense, but the money/talent could still be reallocated some other way (for example to some other altruistic cause today). Do you have either a counterargument or another suggestion that you think is better than spending on AI alignment today? Have you seen my recent posts that argued for or supported this? If not I can link them: Three AI Safety Related Ideas, Two Neglected Problems in Human-AI Safety, Beyond Astronomical Waste, The Argument from Philosophical Difficulty. Sure, but why can't philosophical work be a complement to that? I won't defend these numbers because I haven't put much thought into this topic personally (since my own reasons don't depend on these numbers, and I doubt that I can do much better than deferring to others). But at what probabilities would you say that substantial work on alignment today would start to be worthwhile (assuming the philosophical difficulty argument doesn't apply)? What do you think a world where such probabilities are reasonable would look like?