All of LWLW's Comments + Replies

LWLW7-3

This just boils down to “humans aren’t aligned,” and that fact is why this would never work, but I still think it’s worth bringing up. Why are you required to get a license to drive, but not to have children? I don’t mean this in a literal way; I’m just referring to how casually the decision to have children is treated by much of society. Bringing someone into existence is vastly higher stakes than driving a car.

I’m sure this isn’t implementable, but parents should at least be screened for personality disorders before they’re allowed to have children. And... (read more)

4cubefox
There is also the related problem of intelligence being negatively correlated with fertility, which leads to a dysgenic trend. Even if preventing people below a certain level of intelligence from having children were realistically possible, it would make another problem more severe: the fertility of smarter people is far below replacement, leading to quickly shrinking populations. Fertility is likely partially heritable, though, so it would go up again after some generations, once the descendants of the (currently rare) high-fertility people start to dominate.
6Garrett Baker
Historically, attempts to curtail this right have led to really, really dark places. Part of living in a society with rights and laws is that people will do bad things the legal system has no ability to prevent. And on net, that’s a good thing. See also.
LWLW20

How far along is the development of autonomous underwater drones in America? I’ve read statements by American military officials about wanting to turn the Taiwan Strait into a drone-infested death trap. And I read someone (not an expert) who said that China is racing against time to try and invade before autonomous underwater drones take off. Is that true? Are they on track?

LWLW10

MuZero doesn’t seem categorically different from AlphaZero. It has to do a little more work at the beginning, but if you don’t get any reward for breaking the rules, you will learn not to break the rules. If MuZero is continuously learning, then so is AlphaZero. Also, the games used were still computationally simple, OOMs simpler than an open-world game, let alone a true world model. AFAIK MuZero doesn’t work on open-ended, open-world games. And AlphaStar never got to superhuman performance at human speed either.

3Carl Feynman
I am in violent agreement.  Nowhere did I say that MuZero could learn a world model as complicated as those LLMs currently enjoy.  But it could learn continuously, and execute pretty complex strategies.  I don’t know how to combine that with the breadth of knowledge or cleverness of LLMs, but if we could, we’d be in trouble.
LWLW10

Hi, thank you! I guess I was thinking about claims that "AGI is imminent and therefore we're doomed." It seems like if you define AGI as "really good at STEM" then it is obviously imminent. But if you define it as "capable of continuous learning like a human or animal," that's not true. We don't know how to build it, and we can't even run a fruit-fly connectome on the most powerful computers we have for more than a couple of seconds without the instance breaking down: how would we expect to run something OOMs more complex and intelligent? "being good at STE... (read more)

For me, depression has been independent of the probability of doom.  I’ve definitely been depressed, but I’ve been pretty cheerful for the past few years, even as the apparent probability of near-term doom has been mounting steadily.  I did stop working on AI, and tried to talk my friends out of it, which was about all I could do.  I decided not to worry about things I can’t affect, which has clarified my mind immensely. 

The near-term future does indeed look very bright.

You shouldn’t worry about whether something “is AGI”; it’s an ill-defined concept.  I agree that current models are lacking the ability to accomplish long-term tasks in the real world, and this keeps them safe.  But I don’t think this is permanent, for two reasons.

Current large-language-model type AI is not capable of continuous learning, it is true.  But AIs which are capable of it have been built.  AlphaZero is perhaps the best example; it learns to play games to a superhuman level in a few hours.  It’s a topic of current resear... (read more)

LWLW20

Apologies in advance if this is a midwit take. Chess engines are “smarter” than humans at chess, but they aren’t automatically better at real-world strategizing as a result. They don’t take over the world. Why couldn’t the same be true for STEMlord LLM-based agents? 

It doesn’t seem like any of the companies are anywhere near AI that can “learn” or generalize in real time like a human or animal. Maybe a superintelligent STEMlord could hack their way around learning, but that still doesn’t seem the same as or as dangerous as fooming, and it also seems m... (read more)

6Carl Feynman
Welcome to Less Wrong.  Sometimes I like to go around engaging with new people, so that’s what I’m doing. On a sentence-by-sentence basis, your post is generally correct.  It seems like you’re disagreeing with something you’ve read or heard.  But I don’t know what you read, so I can’t understand what you’re arguing for or against.  I could guess, but it would be better if you just said.    
LWLW10

You’re probably right, but I guess my biggest concern is the first superhuman alignment researchers being aligned/dumb enough to explain to the companies how control works. It really depends on whether self-awareness is present as well.

LWLW140

What is the plan for making task-alignment go well? I am much more worried about the possibility of being at the mercy of some god-emperor with a task-aligned AGI slave than I am about having my atoms repurposed by an unaligned AGI. The incentives for blackmail and power-consolidation look awful.

2MondSemmel
Why? I figure all the AI labs worry mostly about how to get the loot, without ensuring that there's going to be any loot in the first place. Thus there won't be any loot, and we'll go extinct without any human getting to play god-emperor. It seems to me like trying to build an AGI tyranny is an alignment-complete challenge, and since we're not remotely on track to solving alignment, I don't worry about that particular bad ending.
LWLW160

I honestly think the EV of superhumans is lower than the EV of AI. Sadism and wills to power are baked into almost every human mind (with the exception of outliers, of course). Force-multiplying those instincts is much worse than an AI which simply decides to repurpose the atoms in a human for something else. I think people oftentimes act as if the risk ends at existential risks, which I strongly disagree with. I would argue that everyone dying is actually a pretty great ending compared to hyperexistential risks; it is effectively +inf relative utility.

with... (read more)

1kman
I don't think the result of intelligence enhancement would be "multiplying those instincts" for the vast majority of people; humans don't seem to end up more sadistic as they get smarter and have more options. I'm curious what value you assign to the ratio [U(paperclipped) - U(worst future)] / [U(best future) - U(paperclipped)]? It can't be literally infinity unless U(paperclipped) = U(best future). So your model is that we need to eradicate any last trace of sadism before superbabies is a good idea?
LWLW1-2

I think that people don’t consider the implications of something like this. This seems to imply that the mathematical object of a malevolent superintelligence exists, and that conscious victims of said superintelligence exist as well. Is that really desirable? Do people really prefer that to some sort of teleology?

LWLW10

Yeah, something like that: the ASI is an extension of their will.

2kman
So you think that, for >95% of currently living humans, the implementation of their CEV would constitute an S-risk in the sense of being worse than extinction in expectation? This is not at all obvious to me; in what way do you expect their CEVs to prefer net suffering?
LWLW40

This is just a definition for the sake of definition, but I think you could define a human as aligned if they could be given an ASI slave and not be an S-risk. I really think that under this definition, the absolute upper bound of “aligned” humans is 5%, and I think it’s probably a lot lower.

1kman
What would it mean for them to have an "ASI slave"? Like having an AI that implements their personal CEV?
4Noosphere89
I'm more optimistic, in that the upper bound could be as high as 50-60%, but yeah the people in power are unfortunately not part of this, and I'd only trust 25-30% of the population in practice if they had an ASI slave.
LWLW20

I should have clarified: I meant a small fraction, and that a small fraction is enough to worry about.

TsviBT2518

After I finish my methods article, I want to lay out a basic picture of genomic emancipation. Genomic emancipation means making genomic liberty a right and a practical option. In my vision, genomic liberty is quite broad: it would include, for example, that parents should be permitted and enabled to choose:

  • to enhance their children (e.g. supra-normal health; IQ at the outer edges of the human envelope); and/or
  • to propagate their own state even if others would object (e.g. blind people can choose to have blind children); and/or
  • to make their children more n
... (read more)
2kman
In that case I'd repeat GeneSmith's point from another comment: "I think we have a huge advantage with humans simply because there isn't the same potential for runaway self-improvement." If we have a whole bunch of super smart humans of roughly the same level who are aware of the problem, I don't expect the ruthless ones to get a big advantage. I mean I guess there is some sort of general concern here about how defense-offense imbalance changes as the population gets smarter. Like if there's some easy way to destroy the world that becomes accessible with IQ > X, and we make a bunch of people with IQ > X, and a small fraction of them want to destroy the world for some reason, are the rest able to prevent it? This is sort of already the situation we're in with AI: we look to be above the threshold of "ability to summon ASI", but not above the threshold of "ability to steer the outcome". In the case of AI, I expect making people smarter differentially speeds up alignment over capabilities: alignment is hard and we don't know how to do it, while hill-climbing on capabilities is relatively easy and we already know how to do it. I should also note that we have the option of concentrating early adoption among nice, sane, x-risk aware people (though I also find this kind of cringe in a way and predict this would be an unpopular move). I expect this to happen by default to some extent.
LWLW9-5

I agree. At least I can laugh if the AGI just decides it wants me as paperclips. There will be nothing to laugh about with ruthless power-seeking humans with godlike power.

LWLW70

That sounds very interesting! I always look forward to reading your posts. I don’t know if you know any policy people, but in this world, it would need to be punishable by jail time to genetically modify intelligence without selecting for pro-sociality. Any world where that is not the case seems much, much worse than just getting turned into paperclips.

LWLW85

I certainly wouldn’t sign up to do that, but the type of individual I’m concerned about likely wouldn’t mind sacrificing nannies if their lineage could “win” in some abstract sense. I think it’s great that you’re proposing a plan beyond “pray the sand gods/Sam Altman are benevolent.” But alignment is going to be an issue for superhuman agents, regardless of whether they’re human or not.

GeneSmith111

Agreed. I've actually had a post in draft for a couple of years that discusses some of the parallels between alignment of AI agents and alignment of genetically engineered humans.

I think we have a huge advantage with humans simply because there isn't the same potential for runaway self-improvement. But in the long term (multiple generations), it would be a concern.

LWLW86

I’m sure you’ve already thought about this, but it seems like the people who would be willing and able to jump through all of the hoops necessary would likely have a higher propensity toward power-seeking and dominance. So if you don’t edit the personality as well, what was it all for, besides creating a smarter god-emperor? I think that in the sane world you’ve outlined, where people deliberately avoid developing AGI, an additional level of sanity would be holding off on modifying intelligence until we have the capacity to perform the personality edits to ... (read more)

2kman
Not at all obvious to me this is true. Do you mean to say a lot of people would, or just some small fraction, and you think a small fraction is enough to worry?
9GeneSmith
It's a fair concern. But the problem of predicting personality can be solved! We just need more data. I also worry somewhat about brilliant psychopaths. But making your child a psychopath is not necessarily going to give them an advantage. Also can you imagine how unpleasant raising a psychopath would be? I don't think many parents would willingly sign up for that.
LWLW2512

How much do people know about the genetic components of personality traits like empathy? Editing personality traits might be almost as controversial as modifying “vanity” traits, or even more so. But in the sane world you sketched out, this could essentially be a very trivial and simple first step of alignment: “We are about to introduce agents more capable than any humans except for extreme outliers; let’s make them nice.” Also, curing personality disorders like NPD and BPD would do a lot of good for subjective wellbeing.

I guess I’m just thinking of a ... (read more)

2DanielLC
Imagine Star Trek if Khan were also engineered to be a superhumanly moral person.
1Roger Scott
Putting aside for the moment the fact that even "intelligence" is hardly a well-defined and easily quantified property, isn't it rather a giant leap to say we even know what a "better" personality is? I might agree that some disorders are reasonably well defined, and those might be candidates for trying to "fix", but if you're trying to match greater intelligence with "better" personality I think you first need a far better notion of what "better" personality actually means.
6David Gross
There are some promising but under-utilized interventions for improving personality traits / virtues in already-developed humans,* and a dearth of research about possible interventions for others. If we want more of that sort of thing, we might be better advised to fill in some of those gaps rather than waiting for a new technology and a new generation of megalopsychebabies.
GeneSmith173

Very little at the moment. Unlike intelligence and health, a lot of the variance in personality traits seems to be the result of combinations of genes rather than purely additive effects.

This is one of the few areas where AI could potentially make a big difference. You need more complex models to figure out the relationship between genes and personality.

But the actual limiting factor right now is not model complexity, but rather data. Even if you have more complex models, I don't think you're going to be able to actually train them until you have a lot mor... (read more)
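
As a rough, illustrative aside on the additive-vs-combinatorial distinction above: the sketch below uses simulated genotypes with entirely made-up effect sizes (it assumes numpy and scikit-learn are available, and makes no claim about the real genetic architecture of personality). It shows why a purely additive linear model recovers most of the variance of an additive trait but much less of a trait driven by gene-gene interactions.

```python
# Hypothetical simulation: additive vs. interaction-driven traits.
# All numbers are invented for illustration, not estimates from real data.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_people, n_snps = 5000, 200

# Simulated genotypes: 0/1/2 copies of each variant, frequency 0.1.
genotypes = rng.binomial(2, 0.1, size=(n_people, n_snps)).astype(float)

# Trait A: purely additive -- each variant contributes independently.
additive_weights = rng.normal(0.0, 1.0, n_snps)
trait_additive = genotypes @ additive_weights + rng.normal(0.0, 1.0, n_people)

# Trait B: driven by pairwise gene-gene interactions (a crude stand-in
# for the "combinations of genes" mentioned above).
pairs = rng.choice(n_snps, size=(100, 2), replace=False)
trait_interaction = sum(genotypes[:, i] * genotypes[:, j] for i, j in pairs)
trait_interaction = trait_interaction + rng.normal(0.0, 1.0, n_people)

# A plain additive (linear) model captures most of trait A's variance
# but much less of trait B's.
for name, y in [("additive trait", trait_additive),
                ("interaction trait", trait_interaction)]:
    r2 = LinearRegression().fit(genotypes, y).score(genotypes, y)
    print(f"{name}: R^2 of additive model = {r2:.2f}")
```

The gap between the two R^2 values is the portion of signal that only a more complex model with interaction terms, trained on far more data, could hope to recover.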

LWLW70

>be me, omnipotent creator

>decide to create

>meticulously craft laws of physics

>big bang

>pure chaos

>structure emerges

>galaxies form

>stars form

>planets form

>life

>one cell

>cell eats other cell, multicellular life

>fish

>animals emerge from the oceans

>numerous opportunities for life to disappear, but it continues

>mammals

>monkeys

>super smart monkeys

>make tools, control fire, tame other animals

>monkeys create science, philosophy, art

>the universe is beginning to understand itself

>AI

>Humans and AI... (read more)

LWLW*40

I think Noah Carl was coping with the “downsides” he listed. Loss of meaning and loss of status are complete jokes. They are the problems of people who don’t have problems. I would even argue that focusing on X-risks rather than S-risks is a bigger form of cope than denying AI is intelligent at all. I don’t see how you train a superintelligent military AI that doesn’t come to the conclusion that killing your enemies vastly limits the amount of suffering you can inflict upon them.

Edit: I think loss of actual meaning, like conclusive proof we're in a dystele... (read more)

5Mitchell_Porter
Victory is the aim of war, not suffering. 
LWLW131

Everything feels so low-stakes right now compared to future possibilities, and I am envious of people who don’t realize that. I need to spend less time thinking about it, but I still can’t wrap my head around people rolling a die which might have s-risks on it. It just seems like a -inf EV decision. I do not understand the thought process of people who see -inf and just go “yeah, I’ll gamble that.” It’s so fucking stupid.

4Thane Ruthenis
  • They are not necessarily "seeing" -inf in the way you or I are. They're just kinda not thinking about it, or think that 0 (death) is the lowest that utility can realistically go.
  • What looks like an S-risk to you or me may not count as -inf for some people.
LWLW30

Hi Steven! This is an old post, so you probably won't reply, but I'd appreciate it if you did! What do you think might be going on in the brains of schizophrenics with high intelligence? I know schizophrenia is typically associated with MRI abnormalities and lower intelligence, but this isn't always the case! At least for me, my MRI came back normal, and my cognitive abilities were sufficient to do well in upper level math courses at a competitive university: even during my prodromal period. I actually deal with hypersensitivity as well, so taking a very s... (read more)

3Steven Byrnes
Hmm. I don’t really know! But it’s fun to speculate… Possibility 1: Like you said, maybe strong short-range cortex-to-cortex communication + weak long-range cortex-to-cortex communication? I haven’t really thought about how that would manifest. Possibility 2: In terms of positive symptoms specifically, one can ask the question: “weak long-range cortex-to-cortex communication … compared to what?” And my answer is: “…compared to cortex output signals”. See Model of psychosis, take 2. …Which suggests a hypothesis: someone could have unusually trigger-happy cortex output signals. Then they would have positive schizophrenia symptoms without their long-range cortex-to-cortex communication being especially weak on an absolute scale, and therefore they would have less if any cognitive symptoms. (I’m not mentioning schizophrenia negative symptoms because I don’t understand those very well.) I guess Possibility 1 & 2 are not mutually exclusive. There could also be other possibilities I’m not thinking of. Hmm, “Unusually trigger-happy cortex output signals” theory might explain hypersensitivity too, or maybe not, I’m not sure, I think it depends on details of how it manifests.
LWLW21

I see no reason why any of these will be true at first. But the end-goal for many rational agents in this situation would be to make sure 2 and 3 are true.

1Milan W
Correct, those goals are instrumentally convergent.
LWLW10

That makes sense. This may just be wishful thinking on my part/trying to see a positive that doesn't exist, but psychotic tendencies might have higher representation among the population you're interested in than the trend you've described might suggest. Taking the very small, subjective sample that is "the best" mathematician of each of the previous four centuries (Newton, Euler, Gauss, and Grothendieck), 50% of them (Newton and Grothendieck) had some major psychotic experiences (admittedly vastly later in life than is typical for men).

Again, I'm probably... (read more)

LWLW30

"I don't think you'll need to worry about this stuff until you get really far out of distribution." I may sound like I'm just commenting for the sake of commenting but I think that's something you want to be crystal clear on. I'm pessimistic in general and this situation is probably unlikely but I guess one of my worst fears would be creating uberpsychosis. Sounding like every LWer, my relatively out of distribution capabilities made my psychotic delusions hyper-analytic/1000x more terrifying & elaborate than they would have been with worse working mem... (read more)

3GeneSmith
I can't really speak to your specific experience too well, other than to simply say I'm sorry you had to go through that. We actually see that, in general, the prevalence of mental health conditions declines with increasing IQ. The one exception to this is Asperger's. I do think it's going to be very important to address mental health issues as well. Many mental health conditions are reasonably editable; we could reduce the prevalence of some by 50%+ with editing.
LWLW10

This might be a dumb/not particularly nuanced question, but what are the ethics of creating what would effectively be BSI? Chickens have a lot of health problems due to their size; they weren't meant to be that big. Might something similar be true for BSI? How would a limbic system handle that much processing power? I'm not sure it would be able to. How deep a sense of existential despair and terror might that mind feel?

TLDR: Subjective experience would likely have a vastly higher ceiling and vastly lower floor, to the point where a BSI's (or ASI's, for that matter) subjective experience would look like +/-inf to current humans.

5GeneSmith
It's not a dumb question. It's a fair concern. I think the main issue with chickens is not that faster growth is inevitably correlated with health problems, but that chicken breeders are happy to trade off a good amount of health for growth so long as the chicken doesn't literally die of health problems. You can make different trade-offs! We could have fast-growing chickens with much better health if breeders prioritized chicken health more highly. I don't think you'll need to worry about this stuff until you get really far out of distribution. If you're staying anywhere near the normal human distribution you should be able to do pretty simple stuff like select against the risk of mental disorders while increasing IQ.
LWLW237

Making the (tenuous) assumption that humans remain in control of AGI, won't it just be an absolute shitshow of attempted power grabs over who gets to tell the AGI what to do? For example, supposing OpenAI is the first to AGI, is it really plausible that Sam Altman will be the one actually in charge when there will have been multiple researchers interacting with the model much earlier and much more frequently? I have a hard time believing every researcher will sit by and watch Sam Altman become more powerful than anyone ever dreamed of when there's a chance they're a prompt away from having that power for themselves.

Milan W115

You're assuming that:
- There is a single AGI instance running.
- There will be a single person telling that AGI what to do.
- The AGI's obedience to this person will be total.

I can see these assumptions holding approximately true if we get really really good at corrigibility and if at the same time running inference on some discontinuously-more-capable future model is absurdly expensive. I don't find that scenario very likely, though.