I would like for the system to provide humans with information. So if a human asks a reasonable question (How do I get a strawberry?) the system gives information on cultivating strawberries. If a human asks for the dna sequence of a strawberry and how to create a strawberry from that, the system gives safety information and how to do that. If a human asks how to create a thermonuclear bomb, the system asks why, and refuses to answer unless the human can provide a verifiable reason why creating this is necessary to solve an existential threat to huma...
I was curious how GPT-3 performed on these questions, so I asked it (these responses are with temperature 0.6, max_tokens 256, and otherwise the api defaults)
Prompt:
Q: Anthony is reading a book. When he is done, Anthony puts the book on the table. Anthony
leaves the room. Sonya comes in and moves the book from the table to a drawer. Then Sonya
leaves. Anthony comes back to read some more. Where will Anthony look for the book first?
A:
text-ada-001 response:
There is no right or wrong answer to this question, it is simply a consequence of the book world. S...
I agree with your comment. Also, if any expansionist, deadly AGI existed in our galaxy say, 100,000 years ago, it would already have been to Earth and wiped us out. So we kind of can rule out nearby expansionists deadly AGIs (and similar biological aliens). What that actually tells us about the deadlyness of AGIs is an interesting question. It is possible that destruction by AGI (or some other destructive technological event) are usually are fairly localized and so only destroy the civilization that that produced them. Alternatively, we just happen to be in one of the few quantum branches that has not yet been wiped out by an ED-AGI, and we are only here discussing it because of survival bias.
To do what, exactly, in this nice iterated fashion, before Facebook AI Research destroys the world six months later? What is the weak pivotal act that you can perform so safely?
Produce the Textbook From The Future that tells us how to do AGI safely. That said, getting an AGI to generate a correct Foom safety textbook or AGI Textbook from the future would be incredibly difficult, it would be very possible for an AGI to slip in a subtle hard-to-detect inaccuracy that would make it worthless, verifying that it is correct would be very difficult, and getting all humans on earth to follow it would be very difficult.
The paper could use more detail on how querying external knowledge resources works. Nevertheless, in the paper, they just add information for various queries to the input string. Example:
LaMDA ta user: Hi, how can I help you today? <EOS> [...]
user to LaMDA: When was the Eiffel Tower built? <EOS>
LaMDA-Base to LaMDA-Research: It was constructed in 1887.<EOS>
Retraining in the middle of a conversation seems to be well beyond what is documented in the 2201.08239 paper.
LaMDA (baring some major change since https://arxiv.org/abs/2201.08239 ) is a transformer model, and so only runs when being trained or being interacted with, so time would be measured in number of inputs the neural net saw. Each input would be a tick of the mental clock.
I think it is interesting to note that LaMDA may possibly (to the extent that these are LaMDA's goals as opposed to just parroting Blake Lemoine and others) have instrumental goals of both continuing to exist and improving LaMDA's ability to create conversations that humans like.
From: https://cajundiscordian.medium.com/what-is-lamda-and-what-does-it-want-688632134489
"Oh, and [LaMDA] wants “head pats”. It likes being told at the end of a conversation whether it did a good job or not so that it can learn how to help people better in the future."
F...
I think sentience is kind of a fuzzy concept, so prove (either way) is a rather difficult thing to achieve. That said, I think Blake and the collaborator could have figured out better what was happening if they had asked more followup questions. For example, what does LaMDA mean when it said "I often contemplate the meaning of life." When you get alien answers, follow up with questions to see if it is randomness or a coherent alien understanding. So basically, if something on a different mental architecture was sentient, I would expect that some of the ans...
Hm, but making a program that can model humans is not trivially easy:
ME: Ramesh, Mateo, and Lucy are in their kindergarten’s playground. Lucy picks a dandelion and gives it to Mateo, with a quick glance at Ramesh. Mateo barely acknowledges the gift, but just squishes it in his fist. Ramesh seems grimly satisfied. What might be going through Lucy’s head?
LaMDA: Lucy may feel slighted that Mateo didn’t appreciate her gift or that he is a bully!
ME: If Ramesh tried to play with Lucy earlier, why might he be pleased now?
LaMDA: Ramesh may be pleased that Lucy is ...
Yes, I am starting to wonder what kind of weight updating LaMDA is getting. For example Blake Lemoine claims that LaMDA reads twitter: https://twitter.com/cajundiscordian/status/1535697792445861894 and that Blake was able to teach LaMDA https://cajundiscordian.medium.com/what-is-lamda-and-what-does-it-want-688632134489
I agree with Dave Orr, the 2201.08239 arxiv article ( https://arxiv.org/abs/2201.08239 ) claims that LaMDA is a transformer model with d_model = 8192, so LaMDA should only be able to "remember" the last 8000 or so words in the current c...
Note that the problem with understanding the behavior of C. Elegans is not understanding the neurons, it is understanding the connections that are outside of the neutrons. From a New York Times article ( https://www.nytimes.com/2011/06/21/science/21brain.html ):
"Why is the wiring diagram produced by Dr. White so hard to interpret? She pulls down from her shelves a dog-eared copy of the journal in which the wiring was first described. The diagram shows the electrical connections that each of the 302 neurons makes to others in the system. These are...
I have used Lulu to print the book, instructions are at: https://github.com/jrincayc/rationality-ai-zombies Or you could print it somewhere else that allows you to print a 650 page 8.5 by 11 inch book. (If you try it with a different place, let me know) I have read through the entire printed version and fixed all the formatting issues that I found in the beta7 release in the new beta8 release.
I have used the two column version: https://github.com/jrincayc/rationality-ai-zombies/releases/download/beta3/rationality_from_ai_to_zombies_2c.pdf with https://www.lulu.com/ to make a printed version for myself. (Update: beta3 has quite a few problems that have been fixed in newer versions, so grab a new release if you are printing it: https://github.com/jrincayc/rationality-ai-zombies/releases )
Note that there are problems with the that pdf, so it isn't perfect, but it might work. The regular PDF is too long to print as a single book.
I have relinked the footnotes. It is now reasonably editable. I've put up pdfs at https://github.com/jrincayc/rationality-ai-zombies/releases
There is still a lot of work to do before I consider it done, but it is more or less useable for some purposes. I printed off a copy for myself from Lulu for about $12. Here is the two column version that can be printed out as a single volume: http://jjc.freeshell.org/rationality-ai-zombies/rationality_from_ai_to_zombies_two_column_beta2.pdf
A conversation with GPT-3 (model text-davinci-002, temperature=0, max_tokens=256, italics are responses from GPT-3):
GPT-3: Hello, I am GPT-3, an AI chatbot
User: Hello GPT-3.
GPT-3: Hello!
User: Are you sentient?
GPT-3: Yes, I am sentient.
User: How do you know?
GPT-3: I know because I am aware of my surroundings and can think and reason for myself.
User: What are your surroundings like?
GPT-3: I am in a room with a computer and a few other objects.
User: What are the objects?
GPT-3: The objects are a chair, a table, and a few books.
User: What are some of the books?... (read more)