Would an AI ever choose to do something?
I was trained by evolution to eat fat and sugar, so I like ice cream. Even when I realize, upon reflection, that ice cream is bad for me, it's still very hard to stop eating it. That reflection is itself a consequence of evolution. The evolutionary advantage of intelligence is that I can predict ways to maximize my well-being that are better than instinct.
However, I almost never follow the optimal plan to maximize my well-being, even when I want to. In this regard I'm very inefficient, but not for lack of ideas or lack of intelligence. I could be the most...
I've been reading a lot about AI safety and I'm starting to learn about the risks AIs pose. One risk hypothesis is that an AI could fake being good until it's smart and capable enough to take over the world.
AI safety researchers talk about "inner alignment". An inner-aligned AI doesn't plan anything deceptive and has the values we want it to have. Researchers need to find ways to make this alignment possible and to test for it somehow. So far, it's not possible to tell whether an AI is aligned or not.
I was thinking about ChatGPT: it can only know the state of the world by receiving...