All of OneManyNone's Comments + Replies

I feel as if I can agree with this statement in isolation, but can't think of a context where I would consider this point relevant.

I'm not even talking about the question of whether or not the AI is sentient, which you asked us to ignore. I'm talking about how we can know that an AI is "suffering," even if we do assume it's sentient. What exactly is "suffering" in something that is completely cognitively distinct from a human? Is it just negative reward signals? I don't think so, or at least if it were, that would likely imply that training a sentient AI is ...

Fair enough. But for the purposes of this post, the point is that capability increased without increased compute. If you prefer, bucket it as "compute" vs "non-compute" instead of "compute" vs "algorithmic".

I think whether or not it's trivial isn't the point: they did it, it worked, and they didn't need to increase the compute to make it happen.

4O O
I think it’s distinct from something like Tree of Thought. There are ideas that are trivial but enabled by greater compute, versus novel ideas that would have worked at earlier levels of compute.

I agree. I made this point and that is why I did not try to argue that LLMs did not have qualia.

But I do believe you can consider necessary conditions and look at their absence. For instance, I can safely declare that a rock does not have qualia, because I know it does not have a brain.

Similarly, I may not be able to measure whether LLMs have emotions, but I can observe that the processes that generated LLMs are highly inconsistent with the processes that caused emotions to emerge in the only case where I know they exist. Pair that with the observation that specific human emotions seem like only one option out of infinitely many, and it makes a strong probabilistic argument.

This is sort of why I made the argument that we can only consider necessary conditions, and look for their absence.

But more to your point, LLMs and human brains aren't "two agents that are structurally identical." They aren't even close.  The fact that a hypothetical built-from-scratch human brain might have the same qualia as humans isn't relevant, because that's not what's being discussed.

Also, unless your process was precisely "attempt to copy the human brain," I find it very unlikely that any AI development process would yield something particularly similar to a human brain.

1Nora Belrose
Yeah, I agree they aren't structurally identical. Although I tend to doubt how much the structural differences between deep neural nets and human brains matter. We don't actually have a non-arbitrary way to quantify how different two intelligent systems are internally.

I have explained myself more here: https://www.lesswrong.com/posts/EwKk5xdvxhSn3XHsD/don-t-over-anthropomorphize-ai

OK, I've written a full rebuttal here: https://www.lesswrong.com/posts/EwKk5xdvxhSn3XHsD/don-t-over-anthropomorphize-ai. The key points are at the top.

In relation to your comment specifically, I would say that anger may have that effect on the conversation, but there's nothing that actually incentivizes the system to behave that way: the slightest hint of anger or emotion would draw an immediate negative reward during RLHF training. Compare that to a human: there may actually be some positive reward to anger, but even if there isn't, evolution still allowed to get a...

Hmmm... I think I still disagree, but I'll need to process what you're saying and try to get more into the heart of my disagreement. I'll respond when I've thought it over.

Thank you for the interesting debate. I hope you did not perceive me as being overly combative.

3the gears to ascension
Nah I think you may have been responding to me being unnecessarily blunt. Sorry about that haha!

I see, but I'm still not convinced. Humans act in anger as a way to forcibly change a situation into one that favors them. I don't believe that's what the AI was doing, or trying to do.

I feel like there's a thin line I'm trying to walk here, and I'm not doing a very good job. I'm not trying to comment on whether or not the AI has any sort of subjective experience. I'm just saying that even if it did, I do not believe it would bear any resemblance to what we as humans experience as anger.

6the gears to ascension
I've repeatedly argued that it does, that it is similar, and that this is for mechanistic reasons not simply due to previous aesthetic vibes in the pretraining data; certainly it's a different flavor of reward which is bound to the cultural encoding of anger differently, yes.

Ah okay. My apologies for misunderstanding.

Okay, sure. But those "bugs" are probably something the AI risk community should take seriously.

3the gears to ascension
I am not disagreeing with you in any of my comments and I've strong upvoted your post; your point is very good. I'm disagreeing with fragments to add detail, but I agree with the bulk of it.

I would argue that "models generated by RL-first approaches" are not more likely to be the primary threat to humanity, because those models are unlikely to yield AGI any time soon. I personally believe this is a fundamental fact about RL-first approaches, but even if it weren't, they would still be less likely to be the threat, because LLMs are what everyone is investing in right now and it seems plausible that LLMs could achieve AGI.

Also, by what mechanism would Bing's AI actually be experiencing anger? The emotion of anger in humans is generally associated with a strong negative reward signal. The behaviors that Bing exhibited were not brought on by any associated negative reward; they were just contextual text completion.

4the gears to ascension
Oh and, what kind of RL models will be powerful enough to be dangerous? Things like dreamerv3.
6the gears to ascension
Yup, anticipation of being pushed by the user into a strong negative reward! The prompt describes a lot of rules and the model has been RLHFed to enforce them on both sides of the conversation; anger is one of the standard ways to enact agency on another being in response to anticipated reward, yup.

Those are examples of LLMs being rational. LLMs are often rational and will only get better at being rational as they improve. But I'm trying to focus on the times when LLMs are irrational. 

I agree that AI is aggregating its knowledge to perform rationally. But that still doesn't mean anything with respect to its capacity to be irrational.

4the gears to ascension
There's the underlying rationality of the predictor and the second order rationality of the simulacra. Rather like the highly rational intuitive reasoning of humans modulo some bugs, and much less rational high level thought.

Imagine a graph with "LLM capacity" on the x-axis and "number of irrational failure modes" on the y-axis. Yes, there's a lot of evidence this line slopes downward. But there is absolutely no guarantee that it reaches zero before whatever threshold gets us to AGI.

And I did say that I didn't consider the rationality of GPT systems fake just because it was emulated. That said, I don't totally agree with EY's post: LLMs are in fact imitators. Because they're very good imitators, you can tell them to imitate something rational and they'll do a really good job ...

2Vladimir_Nesov
The point is that there's evidence that LLMs might be getting a separate non-emulated version already at the current scale. There is reasoning from emulating people showing their work, and reasoning from predicting their results in any way that works despite the work not being shown. Which requires either making use of other cases of work being shown, or attaining the necessary cognitive processes in some other way, in which case the processes don't necessarily resemble human reasoning, and in that sense they are not imitating human reasoning. As I've noted in a comment to that post, I'm still not sure that LLM reasoning ends up being very different, even if we are talking about what's going on inside rather than what the masks are saying out loud, it might convergently end up in approximately the same place. Though Hinton's recent reminders of how many more facts LLMs manage to squeeze into fewer parameters than human brains have somewhat shaken that intuition for me.

Fair enough, once again I concede your point about definitions. I don't want to play that game either.

But I do have a point which I think is very relevant to the topic of AI Risk: rationality in LLMs is incidental. It exists because the system is emulating rationality it has seen elsewhere. That doesn't make it "fake" rationality, but it does make it brittle. It means that there's a failure mode where the system stops emulating rationality, and starts emulating something else.

2Vladimir_Nesov
That's unclear. GPT-4 in particular seems to be demonstrating ability to do complicated reasoning without thinking out loud. So even if this is bootstrapped from observing related patterns of reasoning in the dataset, it might be running chain-of-thought along the residual stream rather than along the generated token sequences, and that might be much less brittle. Its observability in the tokens would be brittle, but it's a question for interpretability how brittle it actually is.

I was aware of that, and maybe my statement was too strong, but fundamentally I don't know if I agree that you can just claim that it's rational even though it doesn't produce rational outputs. 

Rationality is the process of getting to the outputs. What I was trying to talk about wasn't scholarly disposition or non-eccentricity, but the actual process of deciding goals. 

Maybe another way to say it is this: LLMs are capable of being rational, but they are also capable of being extremely irrational, in the sense that, to quote EY, their behavior is ...

2Vladimir_Nesov
I think this is true in the sense that a falling tree doesn't make a sound if nobody hears it, there is a culpability assignment game here that doesn't address what actually happens. So if we are playing this game, a broken machine is certainly not good at doing things, but the capability is more centrally in the machine, not in the condition of not being broken. It's more centrally in the machine in the sense that it's easier to ensure the machine is unbroken than to create the machine out of an unbroken nothing. (For purposes of AI risk, it also matters that the capability is there in the sense that it might get out without being purposefully elicited, if a mesa-optimizer wakes up during pre-training. So that's one non-terminological distinction, though it depends on the premise of this being possible in principle.)

Fair enough. Thank you for the feedback. I have edited the post to elaborate on what I mean. 

I wrote it the way I did because I took the statement as obviously true and didn't want to be seen as claiming the opposite. Clearly that understanding was incorrect.

To that first sentence, I don't want to get lost in semantics here. My specific statement is that the process that turns DNA into a human is probabilistic with respect to the DNA sequence alone. Add in all that other stuff, and maybe at some point it becomes deterministic, but at that point you are no longer discussing the <1GB that makes up DNA. If you wanted it to be truly deterministic, especially up to the age of 25, I seriously doubt it could be done in less than millions of petabytes, because there are such a huge number of minuscule variations in condi...

1M. Y. Zuo
Perhaps I phrased it poorly, let me put it this way.  If super-advanced aliens suddenly showed up tomorrow and gave us near-physically-perfect technology, machines, techniques, etc., we could feasibly have a fully deterministic encoding of any possible individual human, down to the cell level at least, stored in a box of hard drives or less.  In practical terms I can't even begin to imagine the technology needed to reliably and repeatably capture a 'snapshot' of a living, breathing human's cellular state, but there's no equivalent of a light-speed barrier preventing it.

I think you're broadly right, but I think it's worth mentioning that DNA is a probabilistic compression (evidence: differences between identical twins), so it gets weird when you talk about compressing an adult at age 25: what does probabilistic compression even mean at that point?

But I think you've mostly convinced me. Whatever it takes to "encode" a human, it's possible to compress it to be something very small.
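One way to make "probabilistic compression" concrete (this is only a toy analogy, not a model of actual development; the function names and numbers are invented): the same compact code can decode to different outputs when the decoder mixes in environmental noise, just as identical twins share a genome but not a phenotype.

```python
import random

def develop(genome: str, env_seed: int) -> list[float]:
    """Toy 'decoder': expands a compact code into a larger structure,
    mixing environment-dependent noise into the deterministic part."""
    rng = random.Random(env_seed)
    # integer part comes from the genome; fractional part is developmental noise
    return [ord(c) + rng.random() for c in genome * 10]

genome = "ACGT"                      # the same compact code for both "twins"
twin_a = develop(genome, env_seed=1)
twin_b = develop(genome, env_seed=2)

print(twin_a == twin_b)  # same code, different noise -> different outcomes
```

The compact code alone only pins down the distribution of outcomes, not the outcome itself; that is the sense in which the compression is probabilistic.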

0M. Y. Zuo
A minor nitpick: DNA, the encoding concept, is not probabilistic; it's everything surrounding it, such as the packaging, 3D shape, epigenes, etc., plus random mutations, transcription errors, and the like, that causes identical twins to deviate. Of course it is so compact because it doesn't bother spending many 'bits' on ancillary capabilities to correct operating errors. But it's at least theoretically possible for it to be deterministic under ideal conditions.

My objection applied at a different level of reasoning. I would argue that anyone who isn't blind understands light at the level I'm talking about: you understand that the colors you see are objects because light is bouncing off them, and you know how to interpret that. If you think about it, starting from zero, I'm not sure you would recognize the shapes in pictures as objects.

I guess so? I'm not sure what point you're making, so it's hard for me to address it.

My point is that if you want to build something intelligent, you have to do a lot of processing and there's no way around it. Playing several million games of Go counts as a lot of processing.
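As rough, back-of-envelope arithmetic (the game count and average game length here are assumed round figures, not any system's actual training statistics), "several million games" already implies an enormous amount of processing:

```python
# Illustrative estimate of the raw experience in self-play Go.
# Both figures below are assumptions for the sake of the arithmetic.
games = 5_000_000        # assumed number of self-play games
moves_per_game = 200     # rough typical length of a Go game
positions = games * moves_per_game

print(f"{positions:.1e} board positions processed")  # ~1e9
```

A billion position evaluations is "a lot of processing" by any reasonable standard, which is the point: the intelligence doesn't come for free.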

Yeah, I agree that it's a surprising fact requiring a bit of updating on my end. But I think the compression point probably matters more than you would think, and I'm finding myself more convinced the more I think about it. A lot of processing goes into turning that 1GB into a brain, and that processing may not be highly reducible. That's sort of what I was getting at, and I'm not totally sure the complexity of that process wouldn't add up to a lot more than 1GB.

It's tempting to think of DNA as sufficiently encoding a human, but (speculatively) it may make...

0M. Y. Zuo
Even if you include a very generous epigenetic and womb-environmental component 9x bigger than the DNA component, any possible human baby at birth would need less than 10 GB to describe them completely with DNA levels of compression. A human adult at age 25 would probably need a lot more to cover all possible development scenarios, but even then I can't see it being more than 1000x, so 10 TB should be enough. For reference, Windows Server 2016 supports 24 TB of RAM, and many petabytes of attached storage.
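The arithmetic behind that estimate can be written out explicitly (the 10x epigenetic factor and the 1000x development factor are the commenter's assumptions, not measured quantities):

```python
GB = 10**9
TB = 10**12

dna = 1 * GB              # abstract base-pair representation, lower bound
baby = dna * 10           # assumed 9x extra for epigenetics/womb environment
adult_25 = baby * 1000    # assumed 1000x for development up to age 25

print(baby // GB, "GB")       # 10 GB for a newborn
print(adult_25 // TB, "TB")   # 10 TB for an adult at 25
```

Even taking the most generous multipliers, the result stays many orders of magnitude below "millions of petabytes."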

To your point about the particle filter, my whole point is that you can’t just assume the super intelligence can generate an infinite number of particles, because that takes infinite processing. At the end of the day, superintelligence isn’t magic - those hypotheses have to come from somewhere. They have to be built, and they have to be built sequentially. The only way you get to skip steps is by reusing knowledge that came from somewhere else.
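A minimal bootstrap particle filter sketch (toy 1-D model; all parameters and observations are invented) makes the cost concrete: every step does work proportional to the number of particles, so "more hypotheses" is never free.

```python
import math
import random

def particle_filter_step(particles, observation, noise=1.0):
    """One bootstrap-filter update on a toy 1-D state. The cost is O(n)
    in the particle count: every hypothesis must be propagated, weighted,
    and resampled -- there is no shortcut to 'infinite particles'."""
    # propagate: each particle takes a random step
    moved = [p + random.gauss(0, 1) for p in particles]
    # weight: likelihood of the observation under each hypothesis
    weights = [math.exp(-((observation - p) ** 2) / (2 * noise**2)) for p in moved]
    total = sum(weights)
    weights = [w / total for w in weights]
    # resample hypotheses in proportion to their weight
    return random.choices(moved, weights=weights, k=len(moved))

particles = [random.uniform(-10, 10) for _ in range(1000)]
for obs in [0.5, 0.7, 0.4]:          # fabricated observations
    particles = particle_filter_step(particles, obs)

est = sum(particles) / len(particles)
print(round(est, 1))  # the estimate concentrates near the observations
```

Doubling the particle count doubles the work of every step; the hypotheses really do have to be built and evaluated one by one.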

Take a look at the game of Go. The computational limits on the number of games that could be simulated made this “...

1Hastings
Let's assume that as part of pondering the three webcam frames, the AI thought of the rules of Go (ignoring how likely this is). In that circumstance, in your framing of the question, would it be allowed to play several million games against itself to see if that helped it explain the arrays of pixels?

Yes, I wasn’t sure if it was wise to use TSP as an example for that reason. Originally I wrote it using the Hamiltonian Path problem, but thought a non-technical reader would be more able to quickly understand TSP. Maybe that was a mistake. It also seems I may have underestimated how technical my audience would be.

But your point about heuristics is right. That’s basically what I think an AGI based on LLMs would do to figure out the world. However, I doubt there would be one heuristic that could do Solomonoff induction in all scenarios, or even most. Which means you’d have to select the right one, which means you’d need a selection criterion, which takes us back to my original points.
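The TSP point can be illustrated directly (toy instance with invented coordinates): exhaustive search examines (n-1)! tours, while a nearest-neighbour heuristic is cheap but carries no optimality guarantee, which is exactly the trade-off a heuristic-driven system has to manage.

```python
import itertools
import math

cities = [(0, 0), (1, 5), (4, 1), (6, 4), (3, 3)]  # invented coordinates

def tour_length(order):
    """Total length of the closed tour visiting cities in the given order."""
    return sum(math.dist(cities[a], cities[b])
               for a, b in zip(order, order[1:] + order[:1]))

# exact: try every tour starting at city 0 -- (n-1)! of them
best = min((list(p) for p in itertools.permutations(range(1, len(cities)))),
           key=lambda p: tour_length([0] + p))
exact = tour_length([0] + best)

# heuristic: greedy nearest neighbour -- fast, but no optimality guarantee
tour, left = [0], set(range(1, len(cities)))
while left:
    nxt = min(left, key=lambda c: math.dist(cities[tour[-1]], cities[c]))
    tour.append(nxt)
    left.remove(nxt)
greedy = tour_length(tour)

print(exact <= greedy)  # True: the heuristic can never beat exhaustive search
```

At five cities the exact search is trivial; at fifty it is hopeless, and the heuristic is all you have.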

You're right that my points lack a certain rigor. I don't think there is a rigorous answer to questions like "what does slow mean?". 

However, there is a recurring theme I've seen in discussions about AI where people express incredulity about neural networks as a method for AGI, since they require so much "more data" than humans to train. My argument was merely that we should expect things to take a lot of data, and that situations where they don't are illusory. Maybe that's less common in this space, so I should have framed it differently. But I wrote th...

4tarwatirno
1GB for DNA is a lower bound. That's how much it takes to store the abstract base pair representation. There's lots of other information you'd need to actually build a human, and a lot of it is common to all life. Like, DNA spends most of its time not in the neat little X shapes that happen during reproduction, but in coiled up little tangles. A lot of the information is stored in the 3D shape and in the other regulatory machinery attached to the chromosomes. If all you had was a human genome, the best you could do would be to do a lot of simulation to reconstruct all the other stuff. Probably doable, but it would require a lot of "relearning." The brain also uses DNA for storing information in the form of methylation patterns in individual neurons.
4rotatingpaguro
I expect that the mother does not add much information on top of the DNA; so yes, it's complex and necessary, but I think you have to count almost only the size of DNA for inductive bias. That said, this is a gut guess! Yeah I got this, I have the same impression. The way I think about the topic is: "The NN requires tons of data to learn human language because it's a totally alien mind, while humans have produced their language themselves, so it's tautologically adapted to their base architecture; you learn it easily only because it's designed to be learned by you". But after encountering the DNA size argument myself a while ago, I started doubting this framework. It may be possible to do much, much better than what we do now.

In the context of his argument I think the claim is reasonable, since I interpreted it as the claim that, since it can be used as a tool that designs plans, it has already overcome the biggest challenge of being an agent.

But if we take that claim out of context and interpret it literally, then I agree that it's not a justified statement per se. It may be able to simulate a plausible causal explanation, but I think that is very different from actually knowing it. As long as you only have access to partial information, there are theoretical limits to what you can know about the world. But it's hard to think of contexts where that gap would matter a lot.

I think there's definitely some truth to this sometimes, but I don't think you've correctly described the main driver of genius. I actually think it's the opposite: my guess is that there's a limit to thinking speed, and genius exists precisely because some people just have better thoughts. Even Von Neumann himself attributed much of his abilities to intuition. He would go to sleep and in the morning he would have the answer to whatever problem he was toiling over.

I think, instead, that ideas for the most part emerge through some deep and incomprehensible ...