What then? One option is to never build superintelligence. But there’s also another option, namely: trying to get access to enhanced human labor, via the sorts of techniques I discussed in my post on waystations (e.g., whole brain emulation). In particular: unlike creating an alignment MVP, which plausibly requires at least some success in learning how to give AIs human-like values, available techniques for enhancing human labor might give you human-like values by default, while still resulting in better-than-human alignment research capabilities. Call this an “enhanced human labor” path.[12]
[12]: Though: note that if you thought that even an alignment MVP couldn't solve the alignment problem, you'd need some story about why your enhanced human labor would do better.
something that is imo important but discordant with the analysis you give here:
* humans/humanity could also just continue becoming more intelligent/capable (i mean: in some careful, self-conscious, deliberate fashion; so not like: spawning a random alien AI that outfooms humans; of course, what this means is unclear — it would imo need to be figured out ever-better as we proceed), like maybe forever
even if you're mediocre at coming up with ideas, as long as it's cheap and you can come up with thousands, one of them is bound to be promising
for ideas which are "big enough", this is just false, right? for example, so far, no LLM has generated a proof of an interesting conjecture in math
coming up with good ideas is very difficult as well
(and it also requires good judgment)
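To spell out the statistics behind the quoted claim (a back-of-the-envelope of mine, with $p$ an assumed per-idea hit rate and $n$ independent tries, neither taken from the thread):

$$P(\text{at least one promising idea}) = 1 - (1-p)^n$$

so a mediocre generator with $p = 10^{-3}$ making $n = 10^{4}$ cheap attempts gets $1 - (1 - 10^{-3})^{10^{4}} \approx 1 - e^{-10} \approx 0.99995$. The two replies above pick out the two ways this breaks: for "big enough" ideas, $p$ is not small-but-fixed but effectively zero at current capability, and then no affordable $n$ helps; and even when a hit exists, you still need the judgment to pick it out from the other $n - 1$ attempts.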
I've only skimmed the post so the present comment could be missing the mark (sorry if so), but I think you might find it worthwhile/interesting/fun to think (afresh, in this context) about how come humans often don't wirehead and probably wouldn't wirehead even with much much longer lives (in particular, much longer childhoods and research careers), and whether the kind of AI that would do hard math/philosophy/tech-development/science will also be like that.[1][2]
I'm not going to engage further on this here, but if you'd like to have a chat about this, feel free to dm me. ↩︎
I feel like clarifying that I'd inside-view say P( the future is profoundly non-human (in a bad sense) | AI (which is not pretty much a synthetic human) smarter than humans is created this century ) > 0.98 despite this. ↩︎
i agree that most people doing "technical analysis" are doing nonsense and any particular well-known simple method does not actually work. but also clearly a very good predictor could make a lot of money just looking at the past price time series anyway
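To make "just looking at the past price time series" concrete, here's a toy sketch (entirely my own construction, on made-up synthetic data, so nothing here is evidence about real markets): a price series with mild autocorrelation, plus the simplest momentum rule that trades on past prices alone. If the series has exploitable structure, even this trivial predictor profits; whether real price series leave such structure lying around is exactly the contested part.

```python
# Toy illustration (my sketch, not from the comment): a "predictor" that sees
# only the past price series and bets on the next step's direction.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical price series: a random walk with a small momentum component,
# standing in for whatever structure a very good predictor could exploit.
steps = 0.01 * rng.standard_normal(1000)
for t in range(1, len(steps)):
    steps[t] += 0.2 * steps[t - 1]  # autocorrelation a predictor could find
prices = 100 * np.exp(np.cumsum(steps))

returns = np.diff(np.log(prices))
# Strategy: go long after an up step, short after a down step
# (input is past prices only, nothing else).
positions = np.sign(returns[:-1])
pnl = positions * returns[1:]
print(f"mean step pnl: {pnl.mean():.5f}, hit rate: {(pnl > 0).mean():.2%}")
```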
it feels to me like you are talking about two non-equivalent kinds of things as if they were the same. like, imo, the following are very common in competent entities: resisting attempts on one's life, trying to become smarter, wanting to have resources (in particular, in our present context, being interested in eating the Sun), etc. but then whether some sort of vnm-coherence arises seems like a very different question. and indeed, even though i think these drives are legit, i think it's plausible that such coherence just doesn't arise, or even that construing valuing as the kind of thing for which a tendency toward "vnm-coherence" or "goal stability" could make sense as an option is pretty bad/confused[1].
(of course these two positions i've briefly stated on these two questions deserve a bunch of elaboration and justification that i have not provided here, but hopefully it is clear even without that that there are two pretty different questions here that are (at least a priori) not equivalent)
briefly and vaguely, i think this could involve mistakenly imagining a growing mind meeting a fixed world, when really we will have a growing mind meeting a growing world — indeed, a world which is approximately equal to the mind itself. slightly more concretely, i think things could be more like: eg humanity has many profound projects now, and we would have many profound but currently basically unimaginable projects later, with like the effective space of options just continuing to become larger, plausibly with no meaningful sense in which there is a uniform direction in which we're going throughout or whatever ↩︎
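To make the "two pretty different questions" concrete, here is a toy construction of mine (not a model anyone in the thread proposed): an agent whose binary choices robustly exhibit the drives listed above, in that it never chooses death and always takes more resources, but whose remaining preferences are cyclic, so no utility function represents them and nothing like vnm-coherence holds.

```python
# Toy agent (my construction, illustrating the distinction, not anyone's actual
# proposal): it has "convergent drives" -- it never chooses death and always
# takes more resources -- yet its remaining preferences are cyclic, so no
# utility function (and hence no vnm-coherent representation) fits them.
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    alive: bool
    resources: int
    flavor: str  # "A", "B", or "C"

# Cyclic taste: A beats B, B beats C, C beats A -- intransitive by design.
CYCLE = {("A", "B"), ("B", "C"), ("C", "A")}

def prefers(x: State, y: State) -> bool:
    """True iff the agent picks x over y in a binary choice."""
    if x.alive != y.alive:
        return x.alive                        # drive: resist attempts on one's life
    if x.resources != y.resources:
        return x.resources > y.resources      # drive: want resources (eat the Sun)
    return (x.flavor, y.flavor) in CYCLE      # but the tastes are cyclic, not vnm

a = State(True, 5, "A"); b = State(True, 5, "B"); c = State(True, 5, "C")
assert prefers(a, b) and prefers(b, c) and prefers(c, a)    # cycle: not coherent
assert prefers(State(True, 1, "A"), State(False, 9, "A"))   # still resists death
```

The cycle could of course be ironed out by further reflection; whether minds' valuing works such that it would be is the open question the comment points at.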
[Dialogue between Kaarel (K) and TK; the content of the turns is missing.]
I think it's plausible that a system which is smarter than humans/humanity (and distinct and separate from humans/humanity) should just never be created, and I'm inside-view almost certain it'd be profoundly bad if such a system were created any time soon. But I think I'll disagree with like basically anyone on a lot of important stuff around this matter, so it just seems really difficult for anyone to be such that I'd feel like really endorsing them on this matter?[1] That said, my guess is that PauseAI is net positive, tho I haven't thought about this that much :)
Thank you for the comment!
First, I'd like to clear up a few things:
I'll now proceed to potential disagreements.
But there’s something else, which is a very finite legible learning algorithm that can automatically find all those things—the object-level stuff and the thinking strategies at all levels. The genome builds such an algorithm into the human brain. And it seems to work! I don’t think there’s any math that is forever beyond humans, or if it is, it would be for humdrum reasons like “not enough neurons to hold that much complexity in your head at once”.
Some ways I disagree or think this is/involves a bad framing:
like, such that it is already there in a fetus/newborn and doesn't have to be gained/built ↩︎
I think present humans have much more to work with for doing math than what is "directly given" by evolution to present fetuses, but still. ↩︎
One attempt to counter this: "but humans could reprogram into basically anything, including whatever better system for doing math there is!". But conditional on this working out, the appeal of the claim that fetuses already have a load-bearing fixed "learning algorithm" is also defeated, so this counterargument wouldn't actually work in the present context even if this claim were true. ↩︎
let's assume this makes sense ↩︎
That said, I could see an argument for a good chunk of the learning that most current humans are doing being pretty close to gaining thinking-structures which other people already have, from other people that already have them, and there is definitely something finite in this vicinity — like, some kind of pure copying should be finite (though the things humans are doing in this vicinity are of course more complicated than pure copying, there are complications with making sense of "pure copying" in this context, and also humans suck immensely (compared to what's possible) even at "pure copying"). ↩︎
oki! in this scenario, i guess i'm imagining humans/humanity becoming ever-more-artificial (like, ever-more-[human/mind]-made) and ever-more-intelligent (like, eventually much more capable than anything that might be created by 2100), so this still seems like a somewhat unnatural framing to me