As I commented on your other post, there are two possibilities: either the universe is Murphy-like and pessimizing your outcome, in which case sure, you might be in a worst-case universe and there is a bound on how well you can do, or some agent sent the rock, in which case you are playing the game against that agent and would know that fact. Or, as I mentioned, yes, an ultrapowerful system can create situations where you lose because it can fool you, lying to you perfectly, and that is equivalent to a worst-case universe.
As I commented on another post, it seems Eliezer already addressed the specific claim you made here via probabilistic LDT solutions, as Mikhail explained there, and in a comment here. (And the quoted solution was written before you wrote this post.)
Is there a version that the modification explained there fails to address?
I think your post misses the point made here.
What about a rock with $9 painted on it? The LDT agent in the problem reasons that the best action is to choose $1, so the rock gets $9.
Thus, the $9 rock is more rational than LDT in this problem.
The solution above addresses this: if the LDT agent responds probabilistically, the rock gets a payoff somewhat less than $5 in expectation, so it does worse than an LDT agent would.
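For concreteness, here is a minimal sketch of the probabilistic-rejection idea, assuming the usual $10 ultimatum-game framing; the particular acceptance formula and the epsilon value are my own illustrative choices, not a quote of the exact scheme.

```python
PIE = 10.0      # total amount being split (assumed $10 ultimatum game)
FAIR = PIE / 2  # the fair share, $5
EPSILON = 0.05  # how far below $5 greedy demands are pushed (illustrative)

def acceptance_probability(demand: float) -> float:
    """Probability an LDT-style responder accepts a proposer demanding `demand`.

    Fair or generous demands are always accepted; greedy demands are accepted
    just often enough that the proposer's expected take stays below $5.
    """
    if demand <= FAIR:
        return 1.0
    return (FAIR - EPSILON) / demand

def expected_payoffs(demand: float) -> tuple[float, float]:
    """Expected (proposer, responder) payoffs under probabilistic rejection."""
    p = acceptance_probability(demand)
    return demand * p, (PIE - demand) * p

# A rock with "$9" painted on it demands $9:
rock_expected, responder_expected = expected_payoffs(9.0)
print(rock_expected)       # ~4.95: the rock gets less than $5 in expectation
print(responder_expected)  # ~0.55

# An agent demanding the fair split gets the full $5:
print(expected_payoffs(5.0))  # (5.0, 5.0)
```

Under a response policy like this, the $9 rock's expected take is about $4.95, less than it would get by demanding the fair split, so pre-committing to an unfair demand stops being an advantage.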
I'm a bit confused how this is a problem.
Either there is an agent that stands to benefit from my acceding to a threat, or there is not. If an agent "sufficiently" turns itself into a rock for a single interaction, but reaps the benefit as an agent, it's a full-fledged agent. Same if it sends a minion, where the relevant agent is the one who sent the rock, not the rock. And if we have uncertainty about the situation, that's part of the game.
If the question is whether other players can deceive you about the nature of the game or the probabilities, sure, that is a possibility, but it is not really a question about LDT. It's a question about whether we should expand every decision into a recursive web of uncertainties about all other possible agents - and, I suspect, come to the conclusion that smarter agents can likely fool you, and that you shouldn't allow others with misaligned incentives to manipulate your information environment, especially if they have more optimization power than you do. But as we all should know, once we make misaligned superintelligent systems, we stop being meaningful players anyway.
In this world, maybe you want to suppose the agent's terminal value is to cause me to pay some fixed cost, and it permanently disables itself to that end - but that makes it either a minion sent by something else, or a natural feature of a Murphy-like universe where you started out screwed, in which case you should treat the natural environment as an adversary. But that's not our situation, again, at least until ASI shows up.
cc: @Mikhail Samin - does that seem right to you?
The article seems good, but this seems like the wrong place to post it, given that readers here are generally already aware of these points.
8 hours of clock time for an expert seems likely to be enough to do anything humans can do; people rarely work productively in longer chunks than that, and as long as we assume models are capable of task breakdown and planning (which seems like a nontrivial issue, but an easier one than the scaling itself), that should allow them to parallelize and serialize chunks to do larger human-type tasks.
But it's unclear that alignment can be solved by humans at all, and even if it can, there is of course no reason to think these capabilities would scale as well for alignment as for capabilities and self-improvement, so this is not at all reassuring to me.
I mostly agree with you... but:
1. You created a false dichotomy where budgeting excludes investing, then said you can't ever make money with budgeting because, by definition, anything in your budget cannot make money - but spending on a house instead of renting, for example, clearly violates that assumption.
2. Budgeting can include time as well as money, and that matters because, among other things, income isn't a fixed quantity over time. Ways people can use time to drastically change their income include starting a business or taking night classes to get a more lucrative job.
3. The last point conflates two questions, because yes, inheriting money or having a trust fund is already being rich (though inheritances and trust funds are not the same thing!). However, most second- and third-generation nouveau riche folks do, in fact, spend down and waste the inherited fortune once they are old enough to actually inherit rather than living on a managed trust fund.
You interpreted this as defending boots theory, which wasn't my intent. I said that there was a real phenomenon, not that boots theory is correct.
And sure, renting can come out ahead in some cases, but that doesn't imply it's always better, or that upper-middle-class people generally come out ahead in practice - because even where you could end up ahead renting, the money saved is often spent rather than invested.
Also, I think the claimed non-sequitur isn't one - lots of passive income routes don't require much financial investment, and many that do, like starting a company, can be financed with loans. The point is that people choose to invest their limited time and money in ways that do not build wealth. (Which isn't a criticism - there are plenty of other, better goals in life.)
"More rational in a given case" isn't more rational! You might as well say it's more rational to buy a given lottery ticket because it's the winning ticket.