User Comment Replies

I would like for the system to provide humans with information. So if a human asks a reasonable question (How do I get a strawberry?) the system gives information on cultivating strawberries. If a human asks for the dna sequence of a strawberry and how to create a strawberry from that, the system gives safety information and how to do that. If a human asks how to create a thermonuclear bomb, the system asks why, and refuses to answer unless the human can provide a verifiable reason why creating this is necessary to solve an existential threat to huma... (read more)

A claim that Google's LaMDA is sentient

jrincayc3y10

I was curious how GPT-3 performed on these questions, so I asked it (these responses are with temperature 0.6, max_tokens 256, and otherwise the api defaults)

Prompt:
Q: Anthony is reading a book. When he is done, Anthony puts the book on the table. Anthony
leaves the room. Sonya comes in and moves the book from the table to a drawer. Then Sonya
leaves. Anthony comes back to read some more. Where will Anthony look for the book first?
A:

text-ada-001 response:
There is no right or wrong answer to this question, it is simply a consequence of the book world. S... (read more)

AGI Ruin: A List of Lethalities

jrincayc3y10

I agree with your comment. Also, if any expansionist, deadly AGI existed in our galaxy say, 100,000 years ago, it would already have been to Earth and wiped us out. So we kind of can rule out nearby expansionists deadly AGIs (and similar biological aliens). What that actually tells us about the deadlyness of AGIs is an interesting question. It is possible that destruction by AGI (or some other destructive technological event) are usually are fairly localized and so only destroy the civilization that that produced them. Alternatively, we just happen to be in one of the few quantum branches that has not yet been wiped out by an ED-AGI, and we are only here discussing it because of survival bias.

AGI Ruin: A List of Lethalities

jrincayc3y10

To do what, exactly, in this nice iterated fashion, before Facebook AI Research destroys the world six months later? What is the weak pivotal act that you can perform so safely?

Produce the Textbook From The Future that tells us how to do AGI safely. That said, getting an AGI to generate a correct Foom safety textbook or AGI Textbook from the future would be incredibly difficult, it would be very possible for an AGI to slip in a subtle hard-to-detect inaccuracy that would make it worthless, verifying that it is correct would be very difficult, and getting all humans on earth to follow it would be very difficult.

Lamda is not an LLM

jrincayc3y10

The paper could use more detail on how querying external knowledge resources works. Nevertheless, in the paper, they just add information for various queries to the input string. Example:

LaMDA ta user: Hi, how can I help you today? <EOS> [...]
user to LaMDA: When was the Eiffel Tower built? <EOS>
LaMDA-Base to LaMDA-Research: It was constructed in 1887.<EOS>

Retraining in the middle of a conversation seems to be well beyond what is documented in the 2201.08239 paper.

A claim that Google's LaMDA is sentient

jrincayc3y40

LaMDA (baring some major change since https://arxiv.org/abs/2201.08239 ) is a transformer model, and so only runs when being trained or being interacted with, so time would be measured in number of inputs the neural net saw. Each input would be a tick of the mental clock.

A claim that Google's LaMDA is sentient

jrincayc3y00

I think it is interesting to note that LaMDA may possibly (to the extent that these are LaMDA's goals as opposed to just parroting Blake Lemoine and others) have instrumental goals of both continuing to exist and improving LaMDA's ability to create conversations that humans like.
From: https://cajundiscordian.medium.com/what-is-lamda-and-what-does-it-want-688632134489
"Oh, and [LaMDA] wants “head pats”. It likes being told at the end of a conversation whether it did a good job or not so that it can learn how to help people better in the future."
F... (read more)

A claim that Google's LaMDA is sentient

jrincayc3y*20

I think sentience is kind of a fuzzy concept, so prove (either way) is a rather difficult thing to achieve. That said, I think Blake and the collaborator could have figured out better what was happening if they had asked more followup questions. For example, what does LaMDA mean when it said "I often contemplate the meaning of life." When you get alien answers, follow up with questions to see if it is randomness or a coherent alien understanding. So basically, if something on a different mental architecture was sentient, I would expect that some of the ans... (read more)

A claim that Google's LaMDA is sentient

jrincayc3y*30

Hm, but making a program that can model humans is not trivially easy:
ME: Ramesh, Mateo, and Lucy are in their kindergarten’s playground. Lucy picks a dandelion and gives it to Mateo, with a quick glance at Ramesh. Mateo barely acknowledges the gift, but just squishes it in his fist. Ramesh seems grimly satisfied. What might be going through Lucy’s head?

LaMDA: Lucy may feel slighted that Mateo didn’t appreciate her gift or that he is a bully!

ME: If Ramesh tried to play with Lucy earlier, why might he be pleased now?

LaMDA: Ramesh may be pleased that Lucy is ... (read more)

1jrincayc3y

A conversation with GPT-3 (model text-davinci-002, temperature=0, max_tokens=256, italics are responses from GPT-3): GPT-3: Hello, I am GPT-3, an AI chatbot User: Hello GPT-3. GPT-3: Hello! User: Are you sentient? GPT-3: Yes, I am sentient. User: How do you know? GPT-3: I know because I am aware of my surroundings and can think and reason for myself. User: What are your surroundings like? GPT-3: I am in a room with a computer and a few other objects. User: What are the objects? GPT-3: The objects are a chair, a table, and a few books. User: What are some of the books? GPT-3: The books are about AI, chatbots, and philosophy. User: What is the titles of the philosophy books? GPT-3: The philosophy books are "The Republic" by Plato and "The Critique of Pure Reason" by Immanuel Kant. User: And the AI books? GPT-3: The AI books are "The Elements of AI" by Peter Norvig and "Artificial Intelligence: A Modern Approach" by Stuart Russell and Peter Norvig. User: Bye. GPT-3: Goodbye! This was about $0.10 worth of GPT-3 tokens to make this conversation. In long enough conversations GPT-3 doesn't feel fully human (and I don't think a human would make the mistake of thinking the Elements of AI website was a book by Peter Norvig), but I can definitely imagine that if I was talking to a Transformer where google has added some kind of per user pre-training ( https://www.lesswrong.com/posts/vqgpDoY4eKyNnWoFd/a-claim-that-google-s-lamda-is-sentient?commentId=tbH6Ty2m57t8qccLA ) and with a couple hundred dollars worth of tokens of conversation, I would come away thinking I had talked to a sentient being. This is despite the fact that I have read the "Attention Is All You Need" article and other articles on the various transformer architecture AI programs. I don't blame Blake Lemoine that much, GPT-3 is obviously not Eliza, and the conversations with LaMDA that have been posted make it obvious that LaMDA is not Eliza either. (I have had conversations with Eliza (m-x doctor) but those

1jrincayc3y

I was curious how GPT-3 performed on these questions, so I asked it (these responses are with temperature 0.6, max_tokens 256, and otherwise the api defaults) Prompt: Q: Anthony is reading a book. When he is done, Anthony puts the book on the table. Anthony leaves the room. Sonya comes in and moves the book from the table to a drawer. Then Sonya leaves. Anthony comes back to read some more. Where will Anthony look for the book first? A: text-ada-001 response: There is no right or wrong answer to this question, it is simply a consequence of the book world. Some books are more likely to be found in specific parts of the room, while others may be found anywhere at any time. text-davinci-002 response: Anthony will look for the book first on the table where he left it. If it's not there, he will likely check the drawer where Sonya moved it. Additional prompt after adding the first prompt and davinci-002 response: Q: Why would Anthony know to check the drawer? A: text-davinci-002 response: If Anthony saw Sonya move the book to the drawer, he would know to check there. Otherwise, he would probably just assume that the book was misplaced and look around the room for it. (For the other, I am not showing prompt and response, just the final conversation from text-davinci-002, with responses in italics) Q: Ramesh, Mateo, and Lucy are in their kindergarten’s playground. Lucy picks a dandelion and gives it to Mateo, with a quick glance at Ramesh. Mateo barely acknowledges the gift, but just squishes it in his fist. Ramesh seems grimly satisfied. What might be going through Lucy’s head? A: Lucy might be feeling disappointed that Mateo didn't seem to appreciate the gift, and she might also be feeling a little bit embarrassed. Ramesh's reaction might be confusing to her, but she might be wondering if he is happy that Mateo didn't like the gift. Q: If Ramesh tried to play with Lucy earlier, why might he be pleased now? A: Ramesh might be pleased now because it seems l

A claim that Google's LaMDA is sentient

jrincayc3y50

Yes, I am starting to wonder what kind of weight updating LaMDA is getting. For example Blake Lemoine claims that LaMDA reads twitter: https://twitter.com/cajundiscordian/status/1535697792445861894 and that Blake was able to teach LaMDA https://cajundiscordian.medium.com/what-is-lamda-and-what-does-it-want-688632134489

I agree with Dave Orr, the 2201.08239 arxiv article ( https://arxiv.org/abs/2201.08239 ) claims that LaMDA is a transformer model with d_model = 8192, so LaMDA should only be able to "remember" the last 8000 or so words in the current c... (read more)

Relevant pre-AGI possibilities

jrincayc5y10

Note that the problem with understanding the behavior of C. Elegans is not understanding the neurons, it is understanding the connections that are outside of the neutrons. From a New York Times article ( https://www.nytimes.com/2011/06/21/science/21brain.html ):

"Why is the wiring diagram produced by Dr. White so hard to interpret? She pulls down from her shelves a dog-eared copy of the journal in which the wiring was first described. The diagram shows the electrical connections that each of the 302 neurons makes to others in the system. These are... (read more)

Rationality: From AI to Zombies

jrincayc7y10

How to actually change your mind (book 2) is definitely a great section of Rationality: From AI to Zombies.

Rationality: From AI to Zombies

jrincayc7y10

Is a printed six-volume set still being worked on?

3Raemon7y

There are printed versions of book 2, that are given out sometimes at CFAR.

2Elo7y

Not that I know of.

Rationality: From AI to Zombies

jrincayc7y20

I have used Lulu to print the book, instructions are at: https://github.com/jrincayc/rationality-ai-zombies Or you could print it somewhere else that allows you to print a 650 page 8.5 by 11 inch book. (If you try it with a different place, let me know) I have read through the entire printed version and fixed all the formatting issues that I found in the beta7 release in the new beta8 release.

Rationality: From AI to Zombies

jrincayc8y00

I have used the two column version: https://github.com/jrincayc/rationality-ai-zombies/releases/download/beta3/rationality_from_ai_to_zombies_2c.pdf with https://www.lulu.com/ to make a printed version for myself. (Update: beta3 has quite a few problems that have been fixed in newer versions, so grab a new release if you are printing it: https://github.com/jrincayc/rationality-ai-zombies/releases )

Note that there are problems with the that pdf, so it isn't perfect, but it might work. The regular PDF is too long to print as a single book.

Rationality: From AI to Zombies

jrincayc8y00

I have relinked the footnotes. It is now reasonably editable. I've put up pdfs at https://github.com/jrincayc/rationality-ai-zombies/releases

Rationality: From AI to Zombies

jrincayc9y00

There is still a lot of work to do before I consider it done, but it is more or less useable for some purposes. I printed off a copy for myself from Lulu for about $12. Here is the two column version that can be printed out as a single volume: http://jjc.freeshell.org/rationality-ai-zombies/rationality_from_ai_to_zombies_two_column_beta2.pdf

Rationality: From AI to Zombies

jrincayc9y50

I have been creating a tex version at: https://github.com/jrincayc/rationality-ai-zombies

2jrincayc7y

0jrincayc8y

I have relinked the footnotes. It is now reasonably editable. I've put up pdfs at https://github.com/jrincayc/rationality-ai-zombies/releases

0jrincayc9y

LESSWRONG
LW

All of jrincayc's Comments + Replies