kaarelh AT gmail DOT com
coming up with good ideas is very difficult as well
(and it requires good judgment, also)
I've only skimmed the post so the present comment could be missing the mark (sorry if so), but I think you might find it worthwhile/interesting/fun to think (afresh, in this context) about how come humans often don't wirehead and probably wouldn't wirehead even with much much longer lives (in particular, much longer childhoods and research careers), and whether the kind of AI that would do hard math/philosophy/tech-development/science will also be like that.[1][2]
I'm not going to engage further on this here, but if you'd like to have a chat about this, feel free to dm me. ↩︎
I feel like clarifying that I'd inside-view say P( the future is profoundly non-human (in a bad sense) | AI (which is not pretty much a synthetic human) smarter than humans is created this century ) > 0.98 despite this. ↩︎
i agree that most people doing "technical analysis" are doing nonsense and that any particular well-known simple method does not actually work. but also, clearly, a very good predictor could make a lot of money just from looking at the past price time series anyway
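(a minimal sketch of how one could check any one such well-known simple method, in this case a 20/100-day moving-average crossover vs buy-and-hold: the price series below is a synthetic random walk and the window sizes are arbitrary, so this only shows the shape of the test, not anything about real markets)

```python
# sketch: backtest a 20/100-day moving-average crossover against buy-and-hold
# on a synthetic random-walk price series (no real data; windows are arbitrary)
import numpy as np

rng = np.random.default_rng(0)
log_returns = rng.normal(loc=0.0, scale=0.01, size=2000)  # synthetic daily log-returns
prices = 100.0 * np.exp(np.cumsum(log_returns))

def moving_average(x, window):
    return np.convolve(x, np.ones(window) / window, mode="valid")

fast = moving_average(prices, 20)
slow = moving_average(prices, 100)
n = min(len(fast), len(slow))
fast, slow, rets = fast[-n:], slow[-n:], log_returns[-n:]

# long when the fast average is above the slow one, flat otherwise;
# the position is lagged one day so we never trade on a return we haven't seen yet
position = (fast > slow).astype(float)
strategy_rets = position[:-1] * rets[1:]

print("buy-and-hold cumulative log-return:", rets[1:].sum())
print("crossover cumulative log-return:   ", strategy_rets.sum())
```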
it feels to me like you are talking about two non-equivalent types of things as if they were the same. like, imo, the following are very common in competent entities: resisting attempts on one's life, trying to become smarter, wanting to have resources (in particular, in our present context, being interested in eating the Sun), etc. but whether some sort of vnm-coherence arises seems like a very different question. and indeed, even though i think these drives are legit, i think it's plausible that such coherence just doesn't arise, or that thinking of valuing as the kind of thing for which a tendency toward "vnm-coherence" or "goal stability" could even make sense as an option is pretty bad/confused[1].
(of course these two positions i've briefly stated on these two questions deserve a bunch of elaboration and justification that i have not provided here, but hopefully it is clear even without that that there are two pretty different questions here that are (at least a priori) not equivalent)
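(for reference, the standard von neumann–morgenstern statement behind "vnm-coherence", included here just so it's clear which property is in question, not as a claim about what actual minds do or should do: if a preference relation $\succeq$ on lotteries over a finite outcome set $X$ is complete, transitive, continuous, and satisfies independence, then there is some $u : X \to \mathbb{R}$, unique up to positive affine transformation, such that
$$A \succeq B \iff \mathbb{E}_{x \sim A}[u(x)] \geq \mathbb{E}_{x \sim B}[u(x)],$$
i.e. the entity acts as if maximizing the expectation of a single fixed utility function)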
briefly and vaguely, i think this could involve mistakenly imagining a growing mind meeting a fixed world, when really we will have a growing mind meeting a growing world — indeed, a world which is approximately equal to the mind itself. slightly more concretely, i think things could be more like: eg humanity has many profound projects now, and we would have many profound but currently basically unimaginable projects later, with like the effective space of options just continuing to become larger, plausibly with no meaningful sense in which there is a uniform direction in which we're going throughout or whatever ↩︎
I think it's plausible that a system which is smarter than humans/humanity (and distinct and separate from humans/humanity) should just never be created, and I'm inside-view almost certain it'd be profoundly bad if such a system were created any time soon. But I think I'll disagree with like basically anyone on a lot of important stuff around this, so it just seems really difficult for anyone to be such that I'd feel like really endorsing them on this matter?[1] That said, my guess is that PauseAI is net positive, tho I haven't thought about this that much :)
Thank you for the comment!
First, I'd like to clear up a few things:
I'll now proceed to potential disagreements.
But there’s something else, which is a very finite legible learning algorithm that can automatically find all those things—the object-level stuff and the thinking strategies at all levels. The genome builds such an algorithm into the human brain. And it seems to work! I don’t think there’s any math that is forever beyond humans, or if it is, it would be for humdrum reasons like “not enough neurons to hold that much complexity in your head at once”.
Some ways I disagree or think this is/involves a bad framing:
like, such that it is already there in a fetus/newborn and doesn't have to be gained/built ↩︎
I think present humans have much more to work with for doing math than what is "directly given" by evolution to present fetuses, but still. ↩︎
One attempt to counter this: "but humans could reprogram into basically anything, including whatever better system for doing math there is!". But conditional on this working out, the appeal of the claim that fetuses already have a load-bearing fixed "learning algorithm" is also defeated, so this counterargument wouldn't actually work in the present context even if this claim were true. ↩︎
let's assume this makes sense ↩︎
That said, I could see an argument for a good chunk of the learning that most current humans are doing being pretty close to gaining thinking-structures which other people already have, from other people that already have them, and there is definitely something finite in this vicinity — like, some kind of pure copying should be finite (though the things humans are doing in this vicinity are of course more complicated than pure copying, there are complications with making sense of "pure copying" in this context, and also humans suck immensely (compared to what's possible) even at "pure copying"). ↩︎
Thank you for your comment!
What you're saying seems more galaxy-brained than what I was saying in my notes, and I'm probably not understanding it well. Maybe I'll try to just briefly (re)state some of my claims that seem most relevant to what you're saying here (with not much justification for my claims provided in my present comment, but there's some in the post), and then if it looks to you like I'm missing your point, feel very free to tell me that and I can then put some additional effort into understanding you.
(* Also, there's something further to be said about how [[doing math] and [thinking about how one should do math]] are not that separate.)
I'm at like inside-view p=0.93 that the above presents the right vibe to have about thinking (like, maybe genuinely about its potential development forever, but if it's like technically only the right vibe wrt the next years of thinking (at a 2024 rate) or something, then I'm still going to count that as thinking having this infinitary vibe for our purposes).[3]
However, the question of whether one can in principle make a math AI that is in some sense explicit/understandable anyway (one that in fact proves impressive theorems with a non-galactic amount of compute) is less clear. Making progress on this question might require us to clarify what we want to mean by "explicit/understandable". We could get criteria on this notion from thinking through what we want from it in the context of making an explicit/understandable AI that makes mind uploads (and "does nothing else"). I say some more stuff about this question in 4.4.
if one is an imo complete lunatic :), one is hopeful about getting this so that one can make an AI sovereign with "the right utility function" that "makes there be a good future spacetime block"; if one is an imo less complete lunatic :), one is hopeful about getting this so that one can make mind uploads and have the mind uploads take over the world or something ↩︎
to clarify: I actually tend to like researchers with this property much more than I like basically any other "researchers doing AI alignment" (even though researchers with this property are imo engaged in a contemporary form of alchemy), and I can feel the pull of this kind of direction pretty strongly myself (also, even if the direction is confused, it still seems like an excellent thing to work on to understand stuff better). I'm criticizing researchers with this property not because I consider them particularly confused/wrong compared to others, but in part because I instead consider them sufficiently reasonable/right to be worth engaging with (and because I wanted to think through these questions for myself)! ↩︎
I'm saying this because you ask me about my certainty in something vaguely like this — but I'm aware I might be answering the wrong question here. Feel free to try to clarify the question if so. ↩︎
not really an answer but i wanted to communicate that the vibe of this question feels off to me because: surely one's criteria on what to be up to are/[should be] rich and developing. that is, i think things are more like: currently i have some projects i'm working on and other things i'm up to, and then later i'd maybe decide to work on some new projects and be up to some new things, and i'd expect to encounter many choices on the way (in particular, having to do with whom to become) that i'd want to think about in part as they come up. should i study A or B? should i start job X? should i 2x my neuron count using such and such a future method? these questions call for a bunch of thought (of the kind given to them in usual circumstances, say), and i would usually not want to be making these decisions according to any criterion i could articulate ahead of time (though it could be helpful to tentatively state some general principles like "i should be learning" and "i shouldn't do psychedelics", but these obviously aren't supposed to add up to some ultimate self-contained criterion on a good life)
for ideas which are "big enough", this is just false, right? for example, so far, no LLM has generated a proof of an interesting conjecture in math