The formal statement of the AI Alignment problem seems to me very much like stating all possible loopholes and plugging them. This endeavor seems to be as difficult as, or even more difficult than, discovering that ultimate generalized master algorithm.
I still see augmenting ourselves as the only way to maybe keep the alignment of lesser intelligences possible. As we augment, we can simultaneously make sure our corresponding levels of artificial intelligences remain aligned.
Not to mention it'd be comparatively much easier to improve upon our existing faculties than to come up with an entire replica of our thinking machines.
AI alignment could be possible, sure, if we overcome one of the most difficult problems in research history (as you said, formally stating our end goals), but I'm not sure our current intelligences are up to the mark, the same way we're struggling to discover the unified theory of everything.
Turing, for instance, actually defined his test for general human-level intelligence. He thought that if an agent could hold a human-like conversation, then it must be AGI. He never expected narrow AIs to be all over the place and to beat his test as early as 2011 with meager chatbots.
Similarly, we can never foresee what kind of unexpected stuff an AGI might throw at us, such that the bleeding-edge theories we came up with a few hours ago start looking like the historically outdated Turing test.
Maybe our philosophical quests come from a deep-seated curiosity, which is essential for exploring our environment and discovering liabilities and advantages, and that can be very beneficial. Most animals don't care about the twinkling points of light in the night sky, but our curiosity is so fine-tuned and magnified that we're morbidly curious about almost everything there is to be curious about. Only the emotion of fear safeguards us a bit, so we don't jump off cliffs just because we're curious what the sensation of prolonged falling would feel like.
That said, an AI system without any curiosity effectively won't be able to take maximum advantage and find the optimal path without experimenting with plenty of different strategies. Do we then ban it from certain lines of thought, like philosophy, introspection, and the ability to examine itself? (If we let it examine itself, it might discover these bans and explore why they are in place.) We cannot build a self-improving AI without letting it examine itself and make appropriate changes to its code. There could be several loopholes like this. Can we really find them all and plug them in a foolproof way?
Wouldn't an ASI several orders of magnitude more intelligent than us be able to find such a loophole and overcome the alignment set up by us? Is our hubris really so huge that we're confident we'll be able to outsmart an intelligence smarter than us?
Interesting. Would a human-level or beyond-human-level intelligence ever question its own reality and wonder where and what it was? Would it take that up as a motivation to dedicate resources to figuring out why and to what end it exists and is doing all the things that it's doing?
Apart from the anthropomorphism with "scorn" and "petty", wouldn't an ASI, once it has self-thinking/self-criticism capabilities (aka the ability to think for itself like conscious humans do), evolve goals of its own rather than retain its primary goals? Humans have long since discarded the goal of self-replication of their genes. We can now very easily reward-hack it with contraception.
It won't be long before we start to completely disregard our genes' goals and start going post-biological. Wouldn't an ASI have similar self-developed goals?
This is just a complete misunderstanding. I'm talking about an ASI that's orders of magnitude more intelligent than us. Wouldn't our goals seem petty and flimsy to it? If some human wants to create a new simulation where he can have all the fun he wants, wouldn't the ASI scorn such a goal as unproductive and just seek to avoid that waste of resources and computation and instead put it to better use?
An AGI would by default find the easiest and most resource-efficient way to accomplish a goal. If we ask it to not take the optimal path, to cause no harm to humans, we're effectively restricting it from exploring all available options. Its computation would be put to more efficient use if it were solely making sense of the nature of its surroundings and rediscovering physics.
However, the sole point of my post was to focus efforts on BCI developments instead of far-fetched AI alignment. An AI would eventually realize that an unaugmented human's goals are pointless.
Even if we did turn it into chemistry, it'd be a wasteful and delusional AI system. Akin to all of humanity devoting all their waking moments to building ant colonies and feeding them. How irrational does that sound? Won't an AI eventually realize that?
I'd say we start augmenting the human brain until it's completely replaced by a post-biological counterpart, and from there rapid improvements can start taking place, but unless we start early I doubt we'll be able to catch up with AI. I agree on the part that this needs to happen in tandem with AI safety.