AnthonyC

Posts
AnthonyC's Shortform (7 karma, 6mo, 2 comments)
Dependencies and conditional probabilities in weather forecasts [Question] (8 karma, 4y, 2 comments)
Money creation and debt [Question] (27 karma, 5y, 15 comments)
Superintelligence and physical law (19 karma, 9y, 1 comment)
Scope sensitivity? (1 karma, 10y, 3 comments)
Types of recursion (23 karma, 12y, 16 comments)
David Brooks from the NY Times writes on earning-to-give (14 karma, 12y, 3 comments)
Cryonics priors (9 karma, 13y, 22 comments)

Comments
How does the current AI paradigm give rise to the "superagency" that IABIED is concerned with?
AnthonyC · 3d

> But this is already presupposing the existence of the superintelligence whose feasibility we are trying to explain.

Strictly speaking, I only presupposed an AI that could reach close to the limits of human intelligence in terms of thinking ability, but with the inherent speed, parallelizability, and memory advantages of a digital mind.

> Do you have any examples handy of AI being successful at real-world goals?

In small ways (aka sized appropriately for current AI capabilities), this kind of thing shows up all the time in chains of thought in response to all kinds of prompts, to the point that no, I don't have specific examples, because I wouldn't know how to pick one. The one that first comes to mind, I guess, was using AI to help me develop a personalized nutrition/supplement/weight loss/training regimen.

> Stepping back, I should reiterate that I'm talking about "the current AI paradigm"

That's fair, and a reasonable thing to discuss. After all, the fundamental claim of the book's title is a conditional probability: IF it turns out that anything like our current methods scales to superintelligent agents, we'd all be screwed.
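To make the conditional explicit, one way to write it down (my notation, not the book's or the original question's):

$$P(\text{doom}) = P(\text{doom} \mid \text{scale}) \, P(\text{scale}) + P(\text{doom} \mid \neg \text{scale}) \, P(\neg \text{scale})$$

The title's claim concerns the first conditional term, P(doom | scale), not P(scale) itself.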

How does the current AI paradigm give rise to the "superagency" that IABIED is concerned with?
Answer by AnthonyC · Sep 29, 2025

I sincerely hope that if anyone has a concrete, actionable answer to this question, they're smart enough not to share it publicly, for what I hope are obvious reasons.

But aside from that caveat, I think you are making several incorrect assumptions.

  1. "There is no massive corpus of such strategies that can be used as training data"
    1. The AI has, at minimum, access in principle to everything that has ever been written or otherwise recorded, including all fiction, all historical records, and all analysis of both of those. This includes many, many, many examples and discussions of plans, successful and not, and detailed discussions of why humans believe they succeeded or failed.
  2. "(a) doing real-world experiments (whereby generating sufficient data would be far too slow and costly, or simply impossible)"
    1. People have already handed substantial amounts of crypto to at least one AI, which it can use to autonomously act in the real world by paying humans. What do you see as the upper bound on this, and why?
    2. I think most people greatly overestimate how much of this is actually needed for many kinds of goals. What do you see as the upper bound for what can, in principle, be done with a plan that an army of IQ-180 humans (aka no better qualitative thinking than what the smartest humans can do, so that this is a strict lower bound on ASI capabilities) came up with over subjective millennia with access to all recorded information that currently exists in the world? Assume the plan includes the capability to act in parallel, at scale, and the ability to branch its actions based on continued observation, just like groups of humans can, but with much better coordination within the group. (See the back-of-envelope sketch below this list for one way to cash out "subjective millennia.")
  3. "(b) a comprehensive world-model that is capable of predicting the results of proposed actions"
    1. See above - I'm not sure what you see as the upper bound for how good such a world model can or would likely be?
    2. One answer is "Because we're going to have long since handed it thousands to billions of bodies to operate in the world, and problems to come up with plans to solve, and compute to use to execute and revise those plans." Without the bodies, we're already doing this.
    3. Current non-superintelligent AIs already come up with hypotheses and plans to test them and means to revise them and checks against past data all the time with increasing success rates over a widening range of problems. This is synthetic data we're already paying to generate.
    4. Also, have you ever run a plan (or anything else) by an LLM and asked it to find flaws and suggest solutions and estimate probabilities of success? This is already very useful at improving on human success rates across many domains.
  4. "Plans for achieving such goals are not amenable to simulation because you can't easily predict or evaluate the outcome of any proposed action. "
    1. It's actually very easy to get current LLMs to generate hypothetical actions well outside a narrow domain if you explain to them that there are unusually high stakes. We're not talking about a traditional chess engine thinking outside the rules of chess. We're talking about systems whose currently-existing predecessors are increasingly broadly capable of finding solutions to open-ended problems using all available tools. This includes capabilities like deception, lying, cheating, stealing, giving synthesis instructions to make drugs, and explaining how to hire a hitman.
    2. Any plan a human can come up with without having personally conducted groundbreaking relevant experiments, is a plan that exists within or is implied by the combined corpus of training data available to an AI. This includes, for example, everything ever written by this community or anyone else, and everything anyone ever thought about upon reading everything ever written by this community or anyone else.
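As a rough illustration of what "subjective millennia" could mean here, a back-of-envelope sketch; the speedup and copy counts are hypothetical placeholders, not claims from the question or my answer:

```python
# Back-of-envelope: subjective thinking time available to a population of
# digital minds running faster than humans. All numbers are illustrative.

serial_speedup = 100     # assumed: each instance "thinks" ~100x human speed
num_instances = 10_000   # assumed: number of parallel copies
wall_clock_years = 1     # real time elapsed

subjective_years = wall_clock_years * serial_speedup * num_instances
print(f"{subjective_years:,} subjective person-years per wall-clock year")
# -> 1,000,000 subjective person-years, i.e. millennia of coordinated
#    IQ-180-level work per calendar year, under these assumptions.
```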
An N=1 observational study on interpretability of Natural General Intelligence (NGI)
AnthonyC · 5d

True, but I think in this case there's at least no risk of an infinite regress. At one end, yes, it bottoms out in an extremely vague and inefficient but general hyperprior. I would guess from the little I've read that in humans these are the layers that govern how we learn even before we're born. I would imagine an ASI would have at least one layer more fundamental than this, which would enable it to change various fixed-in-humans assumptions about things.

At the other end would be the most specific or most abstracted layer of priors that has proven useful to date. Somewhere in the stack are your current best processes for deciding whether particular priors or layers of priors are useful or worth keeping or if you need a new one.

I am actually not sure whether 'prior' is quite the right term here? Some of it feels like the distinction between thingspace and conceptspace, where the priors might be more about expectations of what things exist and where natural concept boundaries lie and how to evaluate and re-evaluate those?
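A minimal sketch of the kind of layered structure I have in mind, using a Beta-Bernoulli setup purely because the updates are easy to show; the layer names and numbers are hypothetical:

```python
# Sketch of a two-layer prior stack: a vague hyperprior supplies the starting
# prior for any new domain; each domain then updates its own layer from data.

hyperprior = (1.0, 1.0)  # Beta(1, 1): the vague, general layer at the bottom

def new_domain_prior():
    # A fresh domain starts from the hyperprior (no domain-specific knowledge yet).
    return hyperprior

def update(prior, successes, failures):
    # Conjugate Beta update: the layer gets more specific as evidence accumulates.
    a, b = prior
    return (a + successes, b + failures)

def mean(prior):
    a, b = prior
    return a / (a + b)

weather_prior = update(new_domain_prior(), successes=18, failures=2)
print(mean(weather_prior))  # ~0.86: a specific, learned layer
print(mean(hyperprior))     # 0.5: the vague layer everything bottoms out in
```

A fuller hierarchical model would also push updates back into the hyperprior from across domains, which is roughly where the "processes for deciding whether layers are worth keeping" would live.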

Ranking the endgames of AI development
AnthonyC · 5d

> I equally hope to write "life in the day of" posts for each category soon as a better visualisation of what each of these worlds entails.

I think this would be really interesting and useful! For me, just reading the flowchart and seeing the list laid out makes me assume most people would seriously underestimate how broad these categories could actually be. 

Exact placement would of course involve a number of value judgment calls. For example, I would probably characterize something like the outcome in Friendship is Optimal as an example of #7, but it could also be considered 8/10/11.

I'm also curious about your thoughts on the relative stability of each of these categories. To me, #6 seems metastable at best, for example, while #9 is an event, not a trajectory. AKA it is at least theoretically recoverable to some of the other states (or else declines into 10/11).

An N=1 observational study on interpretability of Natural General Intelligence (NGI)
AnthonyC · 5d

The ability to get to consciously decide when to discard or rewrite or call on the simple programs is a superpower evolution didn't give humans. One that, it seems, would be the obvious solution for an AI that gets to call on an external, updatable set of tools. Or an ASI that gets to rewrite the parts of itself that call the tools or notice (what it previously thought were) edge cases.

AKA, an ASI can go ahead and have a human-specific prior. It can choose to apply it until it meets entities that are alien, then stop applying it. Humans can't really do that, in the same way that we can't turn off our visual heuristics when encountering things we consciously know are weirdly constructed adversarial examples, even if we can sometimes override them with enough effort. The ASI, presumably, would further react to encountering aliens by reasoning from more basic principles (recurse as needed) as it learns enough to create 1) a new prior specific to those aliens, 2) a new prior specific to those aliens' species, culture, world, etc. 

Or at least, that's my <4 minute human-level single attempt at guessing a lower bound on an ASI's solution.
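A minimal sketch of that "apply the specific prior until it stops matching, then fall back" pattern; the applicability check and prior names are hypothetical, assuming the layers are things the system can switch between explicitly:

```python
# Sketch: pick the most specific prior whose applicability check passes,
# falling back toward more general layers. Humans mostly can't do this step
# consciously; the point is that a system with swappable layers could.

def looks_human(entity) -> bool:
    # Hypothetical applicability check; in practice this would itself be learned.
    return entity.get("species") == "human"

def choose_prior(entity, layers):
    # layers: list of (applicability_check, prior) ordered most- to least-specific.
    for applies, prior in layers:
        if applies(entity):
            return prior
    raise RuntimeError("no applicable layer: reason from first principles, then add one")

layers = [
    (looks_human, "human-specific prior"),
    (lambda e: True, "general agent hyperprior"),  # always-applicable bottom layer
]

print(choose_prior({"species": "human"}, layers))  # -> human-specific prior
print(choose_prior({"species": "alien"}, layers))  # -> general agent hyperprior
```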

Economics Roundup #6
AnthonyC · 5d

> America would have to pay the subsidies off.

This is not necessarily true, at least not on any currently-human-relevant timescale. Ballooning debt can be a problem, especially when the money is spent very poorly. But if a reasonable fraction of it is spent on productive assets and other forms of growth, debt can keep growing for a long time. Longer than the typical lifespan of a country or currency.
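The standard way to see this (my illustration, not the post's): if nominal growth outpaces the interest rate on the debt, the debt-to-GDP ratio can hold steady or fall even while the debt itself keeps rising. Illustrative numbers only:

```python
# Illustrative: debt compounds at the interest rate r while GDP grows at g.
# With g > r, debt/GDP falls even though the absolute debt never stops rising.
debt, gdp = 1.0, 1.0   # normalized starting levels
r, g = 0.03, 0.05      # assumed interest rate and nominal growth rate

for year in range(1, 51):
    debt *= 1 + r
    gdp *= 1 + g
    if year % 25 == 0:
        print(f"year {year}: debt={debt:.2f}, debt/GDP={debt/gdp:.2f}")
# year 25: debt=2.09, debt/GDP=0.62
# year 50: debt=4.38, debt/GDP=0.38
```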

AGI Companies Won't Profit From AGI
AnthonyC · 7d

The part of the reasoning where others use the AI to generate value does seem to underexplore the possibility that the AI companies themselves use the AI for that first.

AGI Companies Won't Profit From AGI
AnthonyC · 7d

What would you say to the idea that other kinds of capital retain value post-AGI? Like land, or mineral rights, or electricity generating capacity? I think those are also unlikely, but I do come across them once in a while.

AGI Companies Won't Profit From AGI
AnthonyC · 8d

Let's consider the set of worlds where developing AGI does not destroy or permanently disempower humanity. 

You have a good point: in many such scenarios the investors in AI labs, or the AI companies, may not be able to capture more than a tiny fraction of the value they generate.

Does this make the investment a mistake?

Personally, I don't think so. The people making these investments generally have a level of wealth where no amount of additional money can make more than a small additional improvement to their well-being. In contrast, AGI and ASI could plausibly (within their lifetimes) render the world as a whole many OOMs richer, with capabilities that seem centuries or millennia away in a non-AI future. Not being able to claim all of that wealth may cost them status/dominance/control, but they would also gain the status/prestige of having enabled such radical improvement. And in any case, they might (very reasonably, IMO) be willing to trade status for immortality + godlike technological powers.

Also, in the fraction of worlds where the AI labs, or those funding them, do manage to retain control of the wealth created or to obtain and hold other forms of power, that's quite a high payoff even at these valuations.
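To put toy numbers on that last point, a rough expected-value check; every figure below is an invented placeholder, not a claim from this thread:

```python
# Toy expected-value check on "quite a high payoff even at these valuations."
# All numbers are invented placeholders.
total_investment = 1e11        # assumed total invested across labs, ~$100B
p_retain_control = 0.05        # assumed chance investors keep the created wealth
wealth_if_retained = 1e15      # assumed wealth captured in those worlds

expected_payoff = p_retain_control * wealth_if_retained
print(f"expected payoff: ${expected_payoff:,.0f} vs. investment ${total_investment:,.0f}")
print(f"expected multiple: {expected_payoff / total_investment:.0f}x")
# -> 500x in expectation under these made-up numbers, before counting the
#    worlds where the payoff is mostly non-monetary.
```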

Notes on fatalities from AI takeover
AnthonyC · 9d

Yeah, my own instinct is to just see if the results are interesting in such a way that, if I believed them, it would meaningfully change what I thought was the best strategy. In this case, I don't think so. Even what I see as a very optimistic set of assumptions still results in what I see as an unacceptably high risk of very bad outcomes. I do find the exploration itself interesting, though.
