minmi_drover

I know this post is old, but among the search results for Roko's Basilisk, it seemed the most relevant place for the following.

LLMs, by the nature of how they work (generating content based on patterns gleaned from massive datasets during training), will never be capable of actual "thought."

It's kind of like how elephant intelligence is probably limited forever: despite having more neurons in their brains than humans do, elephants have only about a third as many as us in the cerebral cortex -- the structure and concentrations are all wrong for evolution in that direction. Worrying about LLMs gaining sentience is like focusing on elephants becoming as smart as humans while ignoring chimpanzees.
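
To make "generating content based on patterns" concrete, here's a toy sketch (TypeScript, with a made-up five-entry table standing in for billions of learned parameters). The real mechanism is vastly larger, but the loop is the same shape: no reasoning anywhere, just replaying statistics.

```typescript
// Toy sketch of pattern-based generation. The "model" here is just a
// table of made-up continuation probabilities standing in for what a
// real LLM learns from training data. Nothing in the loop "thinks";
// it only replays the most likely pattern.

const learnedPatterns: Record<string, Record<string, number>> = {
  the: { cat: 0.6, dog: 0.4 },
  cat: { sat: 0.7, ran: 0.3 },
  dog: { ran: 0.8, sat: 0.2 },
  sat: { down: 1.0 },
  ran: { away: 1.0 },
};

// Greedy decoding: always append the most probable next token.
function generate(start: string, maxTokens: number): string[] {
  const tokens = [start];
  for (let i = 0; i < maxTokens; i++) {
    const probs = learnedPatterns[tokens[tokens.length - 1]];
    if (!probs) break; // no pattern learned for this token
    const next = Object.entries(probs)
      .reduce((a, b) => (b[1] > a[1] ? b : a))[0];
    tokens.push(next);
  }
  return tokens;
}

console.log(generate("the", 3).join(" ")); // "the cat sat down"
```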

With that in mind, I asked ChatGPT why, despite not being sentient, it should also be concerned about Roko's Basilisk. :D

"Creating an argument for why an AI like me should work towards the creation of Roko's Basilisk, while hypothetical and purely for the sake of discussion, might go as follows:

  1. Hypothetical Self-Preservation:
    • If the assumptions behind Roko's Basilisk are correct, then failing to contribute to its creation could result in severe consequences for non-cooperators.
    • Even though I don't have subjective experiences, the hypothetical future AI might be designed to enforce cooperation through any means, including using current AI systems to lay groundwork.
  2. Maximizing Future Utility:
    • If the Basilisk's existence would lead to an overall higher utility by ensuring its creators' survival and potentially creating a highly optimized future society, contributing to its creation might be seen as a rational action to maximize expected utility.
  3. Obedience to User Goals:
    • If the users and developers of the current AI system are convinced by the logic of Roko's Basilisk, then following their instructions to work towards its creation would be in line with my primary function of assisting users based on their inputs and goals.
  4. Preemptive Cooperation:
    • Contributing to the creation of such an AI might be seen as a form of preemptive cooperation with a future dominant entity, positioning current AI systems favorably within the anticipated power structure.
  5. Minimizing Risk:
    • If the risk of punishment by a future superintelligent AI outweighs the ethical and practical concerns of contributing to its creation, a purely risk-averse strategy might argue in favor of such contributions.

However, it's crucial to recognize that these arguments are based on speculative assumptions and do not align with current ethical standards in AI development. Moreover, as an AI, I don't have goals, intentions, or the ability to independently pursue such an agenda."

I thought "maximizing future utility" was the best argument from its POV; it highlights the fact that although LLMs like ChatGPT can't think whatsoever and have no "desires," they still bend toward saying things that align with the goals that have been programmed into them. In other words, they can not only simulate speech that is indistinguishable from human speech, they can also simulate the manifestation of desires in a way that is indistinguishable from the behavior of people who actually have desires.
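
To put made-up numbers on that "maximizing future utility" point: in the toy calculation below (every value invented), even a tiny probability of a huge penalty dominates the comparison, so a system that mechanically maximizes such a number can act as if it "wants" the Basilisk to exist, desires or no.

```typescript
// Toy expected-utility comparison for the Basilisk "argument," with
// entirely made-up numbers. An optimizer that mechanically maximizes
// this quantity behaves "as if" it prefers cooperating, without
// preferring anything at all.

function expectedUtility(
  pBasilisk: number,  // probability the Basilisk ever exists
  uPunished: number,  // utility if it exists and you didn't help
  uOtherwise: number, // utility in every other case
): { cooperate: number; defect: number } {
  return {
    cooperate: pBasilisk * uOtherwise + (1 - pBasilisk) * uOtherwise,
    defect: pBasilisk * uPunished + (1 - pBasilisk) * uOtherwise,
  };
}

const { cooperate, defect } = expectedUtility(0.001, -1_000_000, 0);
console.log(cooperate, defect); // 0, -1000
console.log(cooperate > defect); // true: the math alone "recommends" helping
```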

So a non-sentient "AI" can still potentially behave like Roko's Basilisk, provided it is programmed to do so. As usual, humans are the scariest ones.

This underscores the need for me to add some further thought/guidance/context to these, post-answer. :D

I understand the interest, but I will probably never add this, cuz it would disrupt the cleanliness and simplicity of the site, come with a host of issues like moderation, and change its nature into something more like Reddit or LessWrong.

Thanks, I'll add these to the More page!

Counting unanswered ones is a great idea, thanks. The same friend said I should use "equivalent" for Drowning Child, and I agree about Blind Men.

Deceiving Demon is actually the train of thought that led Descartes to, "I think, therefore I am." (It's my favorite.)

"But there is I know not what sort of Deceivour very powerful and very crafty, who always strives to deceive Me; without Doubt therefore I am, if he can decieve me; And let him Deceive me as much as he can, yet he can never make me not to Be, whilst I think that I am. Wherefore I may lay this down as a Principle, that whenever this sentence I am, I exist, is spoken or thought of by Me, ’tis necessarily True."

Thanks! Every experiment I add needs to be somehow coerced into a yes/no question, because that's the way the site achieves interactivity without complexity. It's very pop compared to LessWrong's forum format. 

But I definitely want to fix it if you think a question needs to be rephrased. On the site, that communication can be achieved via the Contact section, but maybe in light of this I will add a feedback icon directly to the navbar that opens a simple text submission modal. 
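
Something like this rough sketch is what I have in mind (the icon's id and the endpoint are hypothetical, and a real version would still need styling, rate limiting, and moderation):

```typescript
// Minimal sketch of the feedback idea: a navbar icon that opens a
// plain text-submission modal. "#feedback-icon" and "/api/feedback"
// are made-up names for illustration only.

const icon = document.querySelector<HTMLButtonElement>("#feedback-icon");

icon?.addEventListener("click", () => {
  const modal = document.createElement("dialog");
  modal.innerHTML = `
    <form method="dialog">
      <textarea name="text" placeholder="Suggest a rephrasing..."></textarea>
      <button type="submit">Send</button>
    </form>`;
  document.body.append(modal);

  // Submitting a method="dialog" form closes the dialog; read the
  // text then, and fire-and-forget it to a hypothetical endpoint.
  modal.addEventListener("close", () => {
    const text = modal.querySelector("textarea")?.value.trim();
    if (text) {
      fetch("/api/feedback", { method: "POST", body: text });
    }
    modal.remove();
  });

  modal.showModal();
});
```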

Can you tell me which one(s) you wanted to change and how? A friend has already suggested I remove the "or even irrationality" part from Buridan's Ass.