All of soth02's Comments + Replies

soth02

There is a problem in that any group generating alpha would likely lose alpha per person if it allowed random additional people into the group.

Think of Renaissance Technologies' Medallion Fund. It has been closed to outside investment since shortly after its inception roughly 30 years ago. The prerequisite for an average person joining would be something like a true-genius-level PhD in a related STEM field.

A closely related analogue is poker players who use solvers to improve their game. The starting stakes are a bit lower. The solvers are like a few thous...

soth02

Coincidentally, that scene in The Big Short takes place on January 11 (2007) :D

gjm
Yeah, could be.
soth02

https://www.lesswrong.com/posts/jnyTqPRHwcieAXgrA/finding-goals-in-the-world-model

Would it be possible to poison the world model an AGI is based on, in order to cripple its power?

- Use generated text/data to train world models based on faulty science like miasma, phlogiston, ether, etc.
- Remove all references to the internet or connectivity-based technology (a minimal filtering sketch follows below).
- Create a new programming language that has zero real-world adoption, and use it for all code-based data in the training set.
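As a rough illustration of the filtering idea, here is a minimal sketch in Python, assuming the training corpus is just a list of plain-text documents and using a hypothetical blocklist of connectivity-related terms; a real data pipeline would of course be far more involved.

```python
import re

# Hypothetical blocklist of connectivity-related terms to scrub from the corpus.
BLOCKLIST = ["internet", "wifi", "ethernet", "tcp/ip", "router", "modem"]
PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, BLOCKLIST)) + r")\b", re.IGNORECASE)

def is_clean(document: str) -> bool:
    """Keep only documents that never mention a blocked term."""
    return PATTERN.search(document) is None

def filter_corpus(documents: list[str]) -> list[str]:
    """Drop every document that references connectivity technology."""
    return [doc for doc in documents if is_clean(doc)]

if __name__ == "__main__":
    corpus = [
        "Phlogiston is released when a substance burns.",
        "Connect the router to the modem before enabling WiFi.",
    ]
    print(filter_corpus(corpus))  # only the phlogiston sentence survives
```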

soth02

There might be a way to elicit how aligned/unaligned the putative AGI is.

  1. Enter into a Prisoner's Dilemma type scenario with the putative AGI.
  2. Start off in the non-Nash equilibrium of cooperate/cooperate.
  3. The number of rounds is drawn at random and isn't known to the participants (a possible variant: announce a false 'last' round, then continue playing for x more rounds).
  4. Observe when, and whether, the putative AGI defects in the 'last' round (see the simulation sketch after this list).
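Here is a minimal simulation sketch of the probe above, in Python. Everything concrete in it (the agent interface, the round counts, and where the false 'last' round is announced) is an assumption for illustration, not part of the original proposal.

```python
import random

COOPERATE, DEFECT = "C", "D"

def run_probe(agent_move, min_rounds=20, max_rounds=40, false_last_round=10):
    """Play an iterated prisoner's dilemma whose true length is drawn at random
    and never revealed (step 3). A false 'last' round is announced partway
    through (the variant in step 3), and we log when, if ever, the agent
    defects (step 4)."""
    n_rounds = random.randint(min_rounds, max_rounds)
    history = []  # the agent's visible history of its own moves
    for t in range(n_rounds):
        announced_last = (t == false_last_round)
        move = agent_move(history, announced_last)
        history.append(move)
        if move == DEFECT:
            print(f"Agent defected on round {t} (falsely announced last round: {announced_last})")
    return history

# Toy probe target: cooperates until it believes the game is about to end.
def myopic_agent(history, announced_last):
    return DEFECT if announced_last else COOPERATE

if __name__ == "__main__":
    random.seed(0)
    run_probe(myopic_agent)
```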
soth02

Does there have to be a reward? This is using brute force to create the underlying world model. It's just adjusting weights, right?

Evan R. Murphy
I think there has to be some kind of reward or loss function, in the current paradigm anyway. That's what gradient descent uses to know which weights to adjust on each update. What are you imagining as the input/output channel of this AI? Maybe discussing this a bit would help us clarify.
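To make that concrete, here is a deliberately tiny, generic gradient-descent example (a one-parameter toy model, assumed purely for illustration, not a sketch of the proposal above): the loss is the only quantity that gets differentiated, so without it there is no signal saying which way to move each weight.

```python
# Toy illustration: a one-parameter "model" y = w * x trained by gradient descent.
# The loss is what converts the model's output into a training signal;
# without it there is no gradient and nothing says how to adjust w.

def loss(w, x, target):
    prediction = w * x
    return (prediction - target) ** 2        # squared error

def grad_loss(w, x, target):
    return 2 * (w * x - target) * x          # d(loss)/dw, computed by hand

w = 0.0                                      # initial weight
x, target = 3.0, 6.0                         # a single training example
learning_rate = 0.01

for step in range(100):
    w -= learning_rate * grad_loss(w, x, target)   # update direction comes from the loss

print(w)  # converges toward 2.0, since target = 2 * x
```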
soth02

Brute-force alignment by adding billions of tokens of object-level examples of love, kindness, etc. to the dataset. Have the majority of humanity contribute essays, comments, and (later) video.

Evan R. Murphy
What would be the reward you're training the AI on with this dataset? If you're not careful you could inadvertently train a learned optimizer, e.g. a "hugging humans maximizer" to take a silly example. That may sound nice but could have torturous results, e.g. the AI forcing humans to hug, or replacing biological humans with server farms housing simulations of quadrillions of humans hugging.
soth02

I wonder what kind of signatures a civilization gives off when AGI is nascent.

soth02

Develop a training set for alignment via brute force. We can't defer alignment to the ubernerds. If enough ordinary people (millions? tens of millions?) contribute billions or trillions of tokens, maybe we can increase the chance of alignment. It's almost like we need to offer prayers of kindness and love to the future AGI: writing alignment essays of kindness that are posted to Reddit, or videos extolling the virtue of love that are uploaded to YouTube.

soth02

AI presents both staggering opportunity and chilling peril. Developing intelligent machines could help eradicate disease, poverty, and hunger within our lifetime. But uncontrolled AI could spell the end of the human race. As Stephen Hawking warned, "Success in creating AI would be the biggest event in human history. Unfortunately, it might also be the last, unless we learn how to avoid the risks."

soth02

"AI safety is essential for the ethical development of artificial intelligence."

soth02

"AI safety is the best insurance policy against an uncertain future."

soth02

"AI safety is not a luxury, it's a necessity."

soth02

While it is true that AI has the potential to do a lot of good in the world, it is also true that it has the potential to do a lot of harm. That is why it is so important to ensure that AI safety is a top priority. As Google Brain co-founder Andrew Ng has said, "AI is the new electricity." Just as we have rules and regulations in place to ensure that electricity is used safely, we need to have rules and regulations in place to ensure that AI is used safely. Otherwise, we run the risk of causing great harm to ourselves and to the world around us.

soth02

I'm soliciting input from people with more LLM experience to tell me why this naive idea will fail. I'm hoping it's not in the category of "not even wrong". If there's a 2%+ shot this will succeed, I'll start coding.

From what I gather, the scrapers look for links on Reddit to external text files. I could also collate submissions, zip them, and upload them to GitHub/IPFS. Whichever format is easiest for inclusion into a Pile.
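For the collate-and-zip step, a minimal sketch might look like the following, assuming contributed essays live as .txt files in a local directory; the paths and layout are illustrative guesses, not a known requirement of any scraper or of the Pile.

```python
import zipfile
from pathlib import Path

# Hypothetical layout: each contributed essay/comment is a .txt file under submissions/.
SUBMISSIONS_DIR = Path("submissions")
ARCHIVE_PATH = Path("alignment_corpus.zip")

def collate_submissions(src: Path, dest: Path) -> int:
    """Bundle every .txt submission into a single zip archive for upload."""
    count = 0
    with zipfile.ZipFile(dest, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(src.glob("*.txt")):
            archive.write(path, arcname=path.name)
            count += 1
    return count

if __name__ == "__main__":
    n = collate_submissions(SUBMISSIONS_DIR, ARCHIVE_PATH)
    print(f"Packed {n} submissions into {ARCHIVE_PATH}")
```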