I specialize in regulatory affairs for Software as a Medical Device and hope to work in AI risk mitigation. I enjoy studying machine learning and math, trying to keep up with capabilities research, reading fantasy, sci-fi, and horror, and spending time with my family.
I think the kind of AI you have in mind would be able to:
continue learning after being trained
think in an open-ended way after an initial command or prompt
have an ontological crisis
discover and exploit signals that were previously unknown to it
accumulate knowledge
become a closed-loop system
The best term I've thought of for that kind of AI is Artificial Open Learning Agent.
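To make that concrete, here's a minimal toy sketch of such an agent. Everything in it (ToyModel, OpenLearningAgent, env_step) is a hypothetical illustration of the properties listed above, not a proposal or an existing system:

```python
import random

class ToyModel:
    """Hypothetical stand-in for a learned policy; not a real library."""
    def __init__(self):
        self.weights = {}

    def act(self, observation, context):
        # Pick whichever action has accumulated the most feedback in this
        # state; unseen state-action pairs fall back to a random score,
        # which lets the agent stumble onto signals nobody labeled for it.
        actions = ("explore", "exploit")
        return max(actions,
                   key=lambda a: self.weights.get((observation, a),
                                                  random.random()))

    def learn(self, experience):
        # Online update from the newest transition: learning continues
        # after "training" has ended.
        obs, action, feedback = experience[-1]
        self.weights[(obs, action)] = (
            self.weights.get((obs, action), 0.0) + feedback)


class OpenLearningAgent:
    """Closed-loop sketch: after an initial prompt the agent acts,
    observes, accumulates knowledge, and keeps updating its own model."""
    def __init__(self, model):
        self.model = model
        self.memory = []  # knowledge accumulated across the agent's lifetime

    def run_closed_loop(self, env_step, initial_prompt, steps=100):
        obs = initial_prompt
        for _ in range(steps):
            action = self.model.act(obs, self.memory)
            next_obs, feedback = env_step(action)  # the loop closes here
            self.memory.append((obs, action, feedback))
            self.model.learn(self.memory)
            obs = next_obs


# Toy environment: rewards "exploit" and ignores "explore".
agent = OpenLearningAgent(ToyModel())
agent.run_closed_loop(lambda a: ("state", 1.0 if a == "exploit" else 0.0),
                      initial_prompt="start")
```

The point of the sketch is just that nothing outside the loop intervenes after the initial prompt: the agent's own actions generate the data it learns from next.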
Thanks for this answer! Interesting. It sounds like the process may be less systematized than I had imagined.
Dwarkesh's interview with Sholto sounds well worth watching in full, but the segments you've highlighted and your analyses are very helpful on their own. Thanks for the time and thought you put into this comment!
I like this post, and I think I get why the focus is on generative models.
What's an example of a model organism training setup involving some other kind of model?
Maybe relatively safe if:
Here are some resources I use to keep track of technical research that might be alignment-relevant:
How I gain value: These resources help me notice where my understanding breaks down (i.e., what I might want to study), and they put thought-provoking research on my radar.
I'm very glad to have read this post and "Reward is not the optimization target". I hope you continue to write "How not to think about [thing]" posts, as they have me nailed. Strong upvote.
Thanks for pointing me to these tools!
"I believe that by the time an AI has fully completed the transition to hard superintelligence"
Nate, what is meant by "hard" superintelligence, and what would precede it? A "giant kludgey mess" that is nonetheless superintelligent? If you've previously written about this transition, I'd like to read more.
Maybe I've misunderstood your point. But if it's that humanity's willingness to preserve a fraction of Earth as national parks is a reason to hope that an ASI might preserve an even smaller fraction of the solar system (namely, Earth) for humanity, I think this is addressed here:
"research purposes" involving simulations can be a stand-in for any preference-oriented activity. Unless ASI would have a preference for letting us, in particular, do what we want with some fraction of available resources, no fraction of available resources would be better left in our hands than put to good use.