I'm assuming there are other people (I'm a person too, honest!) up in here asking this same question, but I haven't seen them so far, and I do see all these posts about AI "alignment" and I can't help but wonder: when did we discover an objective definition of "good"?
I've already mentioned it elsewhere here, but I think Nietzsche has some good (heh) thoughts about the nature of Good and Evil, and that they are subjective concepts. Here's what ChatGPT has to say:
Nietzsche believed that good and evil are not fixed things, but rather something that people create in their minds. He thought that people create their own sense of what is good and what is bad, and that it changes depending on the culture and time period. He also believed that people often use the idea of "good and evil" to justify their own actions and to control others. So, in simple terms, Nietzsche believed that good and evil are not real things that exist on their own, but are instead created by people's thoughts and actions.
How does "alignment" differ? Is there a definition somewhere? From what I see, it's subjective. What is the real difference between "how to do X" and "how to prevent X"? One form is good and the other not— depending on what X is? But again, perhaps I misunderstand the goal, and what exactly is being proposed be controlled.
Is information itself good or bad? Or is it how the information is used that is good or bad (and as mentioned, relatively so)?
I do not know. I do know that I'm stoked about AI, as I have been since I was smol, and as I am about all the advancements we just-above-animals make. Biased for sure.
What evidence is there that we are anywhere near (even within 50 years!) achieving conscious programs, with their own will and the power to act on it? People are seriously contemplating programs sophisticated enough to intentionally lie to us. Lying is a sentient concept if ever there was one!
ChatGPT lies right now. It does this because it has learned that humans prefer a confident answer with logically coherent but fake details over "I don't know".
Sure, it isn't aware it's lying; it's just predicting which string of text to produce, and the one with bullshit in it scores higher than the correct answer or "I don't know".
This is a mostly fixable problem, but the architecture doesn't allow for a system we know will never (or almost never) lie; we can only reduce the error rate.
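To make that concrete, here's a toy sketch (with made-up candidate strings and scores, nothing a real model literally computes this way) of why the decoding objective doesn't care about truth:

```python
# Toy illustration: decoding just picks whatever continuation scores highest,
# and truthfulness is not part of the score. Scores here are invented.
candidate_continuations = {
    "It was published in 2017 by Vaswani et al.": 0.62,  # confident, possibly fabricated
    "I don't know.": 0.21,
    "I'm not certain, but it may have been around 2017.": 0.17,
}

def pick_continuation(scored):
    # Greedy choice: return the highest-scoring string, with no check
    # that its factual claims are actually true.
    return max(scored, key=scored.get)

print(pick_continuation(candidate_continuations))
```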
As for the rest - there have been enormous advances in the capabilities of DL/transformer-based models in just the last few months. This is nothing like the controllers for previous robotic arms, and none of your prior experience or the history of robotics is relevant.
See: https://innermonologue.github.io/ and https://www.deepmind.com/blog/building-interactive-agents-in-video-game-worlds
These use techniques that work pretty well and that, as far as I understand, no production robotics system currently uses.
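For a rough idea of the pattern the first link describes (this is my own sketch with hypothetical function names like llm_propose_step and execute, not their code): a language model proposes the next action as text, the robot attempts it, and the outcome is fed back into the prompt so the model can replan.

```python
# Rough sketch of the "inner monologue" loop: plan in text, act, feed the
# textual result back in, repeat. Function bodies are placeholders.
def llm_propose_step(transcript: str) -> str:
    # Placeholder for a call to a large language model.
    raise NotImplementedError

def execute(action: str) -> bool:
    # Placeholder for the robot actually attempting the action.
    raise NotImplementedError

def describe_outcome(action: str, success: bool) -> str:
    return f"{action} -> {'success' if success else 'failure'}"

def run_task(goal: str, max_steps: int = 10) -> str:
    transcript = f"Goal: {goal}\n"
    for _ in range(max_steps):
        action = llm_propose_step(transcript)
        if action.strip().lower() == "done":
            break
        success = execute(action)
        # Appending the outcome as text is the closed loop that lets the
        # model notice failures and replan.
        transcript += describe_outcome(action, success) + "\n"
    return transcript
```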