You can schedule them with me at this link: https://calendly.com/gurkenglas/consultation. We can discuss whatever you're working on, such as math or code, but usually people end up having me watch them code and give them tips. Here's how this went last time: To my memory, almost every user wrote such...
At the beginning of history, entities gained power by eating each other. They each had goals, but few could accomplish anything in that environment. Over time the survivors gained information about each other, eventually allowing cooperation. They agreed to change the system: From now on, entities would slowly transfer power...
At AI Safety Camp I noticed that people are bad at math, and when I look over their shoulder as they code, I notice nested for loops that could have been one matrix multiplication. Therefore I'll try offering this service to the public: You screenshare as you do your daily...
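To illustrate the "nested for loops that could have been one matrix multiplication" remark, here is a minimal sketch in Python with NumPy. The function name `matmul_loops` and the array shapes are hypothetical, chosen only for illustration; the post does not show the code that was actually reviewed.

```python
import numpy as np


def matmul_loops(a, b):
    """Multiply two 2-D arrays with explicit nested loops (the slow pattern)."""
    n, k = a.shape
    k2, m = b.shape
    assert k == k2, "inner dimensions must match"
    out = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            for t in range(k):
                out[i, j] += a[i, t] * b[t, j]
    return out


# Hypothetical example inputs.
rng = np.random.default_rng(0)
a = rng.normal(size=(4, 3))
b = rng.normal(size=(3, 5))

slow = matmul_loops(a, b)  # three nested Python loops
fast = a @ b               # the same computation as one matrix multiplication

print(np.allclose(slow, fast))
```

Both versions compute the same array; the `@` form is shorter and hands the inner loops to NumPy's compiled routines.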
Wishes it were crossposted from the AI Alignment Forum. Contains more technical jargon than usual. Recall Robust Cooperation in the Prisoner's Dilemma and a hint of domain theory. In the Prisoner's Dilemma, players have the opportunity to harm the opponent for minor gain. To aid decision, players may be granted...
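As a rough sketch of the "harm the opponent for minor gain" structure, the payoffs can be written out in Python. The numbers `GAIN` and `HARM` and the function `payoff` are assumptions made for illustration, not values from the post.

```python
# Hypothetical payoff numbers: defecting gains the defector a little
# and costs the opponent more.
GAIN = 1
HARM = 3


def payoff(my_move, their_move):
    """Return my payoff; moves are "C" (cooperate) or "D" (defect)."""
    score = 0
    if my_move == "D":
        score += GAIN   # minor gain from defecting
    if their_move == "D":
        score -= HARM   # harm inflicted by the opponent's defection
    return score


# Defecting is better for me whatever the opponent does...
defect_dominates = all(payoff("D", x) > payoff("C", x) for x in "CD")
# ...yet mutual cooperation beats mutual defection.
cooperation_better = payoff("C", "C") > payoff("D", "D")

print(defect_dominates, cooperation_better)
```

With any `HARM` larger than `GAIN`, both checks hold, which is what makes the game a dilemma.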
Suppose that AI capability research is done, but AI safety research is ongoing. Any of the major players can launch an AI at the press of a button to win the cosmos. The longer everyone waits, the lower the chance that the cosmos is paperclips. The default is that someone...
Tl;dr: We are attempting to make neural networks (NNs) modular and have GPT-N interpret each module for us, in order to catch mesa-alignment and inner-alignment failures. Completed Project: Train a neural net with an added loss term that enforces the sort of modularity that we see in well-designed software projects. To...
This week, in the key alignment group, we answered two questions, 5-minute timer style: 1. Map out all of alignment (25 minutes) 2. Create an image/table representing alignment (10 minutes) You are free to stop here and actually try to answer the questions yourself. Here is a link for a...