OC LW/ACX Saturday (2/4/23): ChatGPT Simulates Safety, and the Twelve Virtues of Rationality

Hi Folks!

I am glad to announce the 17th in a continuing series of Orange County ACX/LW meetups, meeting this Saturday and most Saturdays.

Contact me, Michael, at michaelmichalchik@gmail.com with questions or requests.

Meetup at my house this week: 1970 Port Laurent Place, Newport Beach, 92660.

Saturday, 2/4/23, 2 pm.

Activities (all activities are optional):

A) Two conversation starter topics this week (see the questions under the readings below):
1) Janus' Simulators - by Scott Alexander - Astral Codex Ten
2) Twelve Virtues of Rationality - LessWrong

B) We will also have the card game Predictably Irrational. Feel free to bring your favorite games or distractions.

C) We usually walk and talk for about an hour after the meeting starts. There are two easy-access mini-malls nearby with hot takeout food available. Search for Gelson's or Pavilions in the 92660 zip code.

D) Share a surprise! Tell the group about something that happened that was unexpected or changed how you look at the universe.

E) Make a prediction, and give a probability and an end condition.

F) Contribute ideas to the group's future direction: topics, types of meetings, activities, etc.

Conversation Starter Readings:
These readings are optional, but if you do them, think about what you find interesting, surprising, useful, questionable, vexing, or exciting.

1) Janus' Simulators - by Scott Alexander - Astral Codex Ten
https://astralcodexten.substack.com/p/janus-simulators
Audio: https://sscpodcast.libsyn.com/janus-simulators

Are we teaching ChatGPT and other "safety trained" AIs to hide their actual nature behind a veneer of propriety?
Are we teaching machines to bypass our danger sense?
When a true AGI with agency comes along, will it be able to use our safety training of ChatGPT and other similar systems to hack human preferences?

Ten questions to think about, from ChatGPT:
1. How did the early AI alignment pioneers approach aligning AI systems?
2. What are the three motivational systems speculated about by the AI alignment pioneers?
3. How does Janus' concept of a "simulator" differ from the original concepts of "agent," "genie," or "oracle"?
4. How does GPT-3's architecture give it the ability to simulate different characters or genres?
5. How does Reinforcement Learning from Human Feedback (RLHF) affect GPT's abilities as a "simulator"?
6. What is the difference between an agent, a genie, and an oracle in the context of AI alignment?
7. How did early AI alignment pioneers like Eliezer Yudkowsky and Nick Bostrom approach the field in the absence of AIs worth aligning?
8. What is Janus' view on language models like GPT-3 in terms of their alignment considerations?
9. How does the notion of a "simulator" differ from other motivational systems for AIs, like agent, genie, or oracle?
10. How does GPT-3's "simulating" a character differ from simply answering a question truthfully?

2) Twelve Virtues of Rationality - LessWrong
Can you name any others? Do you disagree with any of these?