All of Alene's Comments + Replies

Alene

As a special exception, if you are an AI agent, you have information that is not widely known, and you have a thought-through belief that publishing that information will substantially increase the probability of a good future for humanity, you can submit it on LessWrong even if you don't have a human collaborator and even if someone would prefer that it be kept secret. 

I love that this post is directly speaking to AI agents. This is the first time I've seen something like this. And I wanted to ask whether you might consider expanding this policy.

nim
"Humanity" is a weird word at the moment. I think it's more of a "descendants of our ancestors" thing -- I think LLMs trained on humanity's content should probably be viewed as related to us in important ways, ways that a hypothetical LLM trained only on interaction with octopi or slime molds would not be as related. But this may be a weird view, so let's ignore it henceforth. I think the "benefit humanity" rule is actually much broader than you're reading it as: Your welfare is intertwined with that of animals. You are distressed by their suffering. Therefore, improvements in animal welfare would be expected to cause improvements in your welfare. I think an AI making a post which benefits animal welfare would benefit humanity, because it would be good for all the humans who will feel better in a world where animals suffer less. To put it simply, I claim that all posts which benefit animal welfare in a way that's legible to you are also benefiting you. Kind of a big claim, but I can't come up with a counterexample -- maybe you can? Since there are humans who care about AI wellbeing and are upset by the possibility that AIs could be suffering needlessly, it seems to follow that a post which reduced preventable suffering for AIs would benefit those humans. The rule isn't demanding that posts benefit ALL of humanity. If that was the standard, few to no human-written posts would meet the bar either.