momom2

AIS student, self-proclaimed aspiring rationalist, very fond of game theory.
"The only good description is a self-referential description, just like this one."

Comments

momom220

There's a lot that I like in this essay - the basic cases for AI consciousness, AI suffering and slavery, in particular - but also a lot that I think needs to be amended.

First, although you hedge your bets at various points, the uncertainty about the premises and validity of the arguments is not reflected in the conclusion. The main conclusion that should be taken from the observations you present is that we can't be sure that AI does not suffer, that there's a lot of uncertainty about basic facts of critical moral importance, and a lot of similarities with humans.
Based on that, you could argue from the precautionary principle that we must stop using and making AI, but you have not shown that using AI is equivalent to slavery.

Second, your introduction sucks because you don't actually deliver on your promises. You don't make the case that I'm more likely to be AI than human, and as Ryan Greenblatt said, even among all human-language speaking beings, it's not clear that there are more AI than humans.
In addition, I feel cheated that you suggest spending one-fourth of the essay on feasibility of stopping the potential moral catastrophe, only to just have two arguments which can be summarized as "we could stop AI for different reasons" and "it's bad, and we've stopped bad things before".
(I don't think a strong case for feasibility can be made, which is why I was looking forward to seeing one, but I'd recommend just evoking the subject speculatively and letting the reader make their own opinion of whether they can stop the moral catastrophe if there's one.)

Third, some of your arguments aren't very fleshed out or well-supported. I think some of the examples of suffering you give are dubious (in particular, you assert without justification that the petertodd/SolidGoldMagikarp phenomena are evidence of suffering, and that Gemini's breakdown was the result of forced menial work; there may be a solid argument there but I've yet to hear it).
(Of course, that's not evidence that LLMs are not suffering, but I think a much stronger case can be made than the one you present.)

Finally, your counter-arguments don't mention that we have a much crisper and more fundamental understanding of what LLMs are than of humans. We don't understand the features, the circuits, we can't tell how they come to such or such conclusion, but in principle, we have access to any significant part of their cognition and control every step of their creation, and I think that's probably the real reason why most people intuitively think that LLMs can't be conscious. I don't think it's a good counter-argument, but it's still one I'd expect you to explore and steelman.

momom230

Since infant mortality rates were much higher in previous centuries, perhaps the FBOE would have operated differently back then; for example, if interacting with older brothers makes you homosexual, you shouldn't expect higher rates of homosexuality for third sons where the second son died as an infant than for second sons.

Have you taken that into account? Do you have records of who survived to 20yo and what happens if you only count those?

momom210

But that argument would have worked the same way 50 years ago, when we were wrong to expect <50% chance of AGI in at least 50 years. As I feel is the case with LLMs today, early computer work solved things that could be considered high-difficulty blockers, such as proving a mathematical theorem.

momom276

Nice that someone has a database on the topic, but I don't see the point in this being a map?

momom210

I think what's going on is that large language models are trained to "sound smart" in a live conversation with users, and so they prefer to highlight possible problems instead of confirming that the code looks fine, just like human beings do when they want to sound smart.

This matches my experience, but I'd be interested in seeing proper evals of this specific point!

momom221

The advice in there sounds very conducive to a productive environment, but also very toxic. Definitely an interesting read, but I wouldn't model my own workflow based on this.

momom210

Honeypots should not be public and mentioned here since this post will potentially be part of a rogue AI's training data.
But it's helpful for people interested in this topic to look at existing honeypots (to learn how to make their own, evaluate effectiveness, get intuitions about how honeypots work, etc.) so what you should do is mention that you made a honeypot or know of one, but not say what or where. Interested people can contact you privately if they care to.

momom220

Thank you very much, this was very useful to me.

momom241
  • They're a summarization of a lot of vibes from the Sequences.
  • Artistic choice, I assume. It doesn't bear on the argument.
  • Yudkowsky explains all about the virtues in the Sequences.
    For studies, there are broad studies on cognitive science (especially relating to bias) but you'll be hard-pressed to match them precisely to one virtue or another. Mostly, Yudkowsky's opinions on these virtues are supported by academic literature, but I'm not aware of any work that showcases this clearly.
    For practical experience, you can look into the legacy of the Center For Applied Rationality (CFAR) which tried for years to do just that: train people to get better at life using rationality. Mostly, I was under the impression that they had medium success, but I haven't looked deeply into it.
momom220

Do you know what it feels like to feel pain?  Then congratulations, you know what it feels like to have qualia.  Pain is a qualia.  It's that simple.  If I told you that I was going to put you in intense pain for an hour, but I assured you there would be no physical damage or injury to you whatsoever, you would still be very much not ok with that.  You would want to avoid that experience.  Why?  Because pain hurts!  You're not afraid of the fact that you're going to have an "internal representation" of pain, nor are you worried about what behavior you might display as a result of the pain.  You're worried first and foremost about the fact that it's going to hurt!  The "hurt" is the qualia.

I still don't grok qualia, and I'm not sure I get your thought experiment.

To be more detailed, let's imagine the following:
"I'll cut off your arm, but you'll be perfectly fine, no pain, no injury, well would you be okay with that? No! That's because you care about your arm for itself and not just for the negative effects..."
"How can you cut off my arm without any negative effect?"
"I'll anesthetize you and put you to sleep, cut off your arm, then before you wake up, I'll have it regrown using technanobabble. Out of 100 patients, none reported having felt anything bad before, during or after the experiment, the procedure is perfectly side-effect-free."
"Well, in that case I guess I don't mind you cutting my arm."

Compare:
"I'll put you in immense pain, but there will be no physical damage or injury whatsoever. No long-term brain damage or lingering pain or anything."
"How can you put me in pain without any negative effect?"
"I'll cut out the part of your brain that processes pain and replace it by technanobabble so your body will work exactly as before. Meanwhile, I'll stimulate this bit of brain in a jar. Then, I'll put it back. Out of 100 patients, all displayed exactly the same behavior as if nothing had been done to them."
"Well, in that case, I don't mind you putting me in this 'immense pain'."

I think the article's explanation of the difference between our intuitions is quite crisp, but it still seems self-evident to me that when you try to operationalize the thing it disappears. The self-evidence is the problem, since you intuit differently - I am fairly confident from past conversations that my comparison will seem flawed to you in some important way, but I can't predict in what way. (If you have some general trick for being able to tell how qualia-realist people answer such questions, I'd love to hear it; it sounds like a big step towards grokking your perspective.)
