Emmanuel Awosika

I have yet to read the paper, but my initial reaction is that this is a classic game-theoretic problem where players have to weigh the incentives to defect or cooperate. For example, I'm not sure a Manhattan Project-style effort for AI in the US is extremely unreasonable when China already has something of the sort.

My weakly held opinion is that you cannot get adversarial nation-states, each at a different stage of developing a particular technology, to mutually hamstring its future development. China is unlikely to halt AI development (it is already moving to restrict DeepSeek researchers from traveling) because it expects the US to accelerate AI development and wants to hedge its bets by developing AI itself. The US won't stop AI development because it doesn't trust that China will do so (even with a treaty), and the conversation around the use of AI in the military starts to look different when China has already outpaced the US in AI capabilities. Basically, each party wants to be in a position of strength and guarantee mutually assured destruction.
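To make the incentive structure concrete, here is a minimal sketch of the payoff matrix I have in mind: a standard prisoner's-dilemma framing where each side can pause or race. The payoff numbers are illustrative assumptions I made up, not figures from the paper or from any real analysis.

```python
# Illustrative payoff matrix for a two-player "pause vs. race" game.
# The numbers are made up purely to show the defect/defect equilibrium.

payoffs = {
    # (US action, China action): (US payoff, China payoff)
    ("pause", "pause"): (3, 3),   # both pause: safest joint outcome
    ("pause", "race"):  (0, 4),   # unilateral pause: the pausing side falls behind
    ("race",  "pause"): (4, 0),
    ("race",  "race"):  (1, 1),   # arms race: worse for both than mutual pause
}

def best_response(player, other_action):
    """Return the action that maximizes `player`'s payoff given the opponent's action."""
    actions = ["pause", "race"]
    if player == "US":
        return max(actions, key=lambda a: payoffs[(a, other_action)][0])
    return max(actions, key=lambda a: payoffs[(other_action, a)][1])

# Whatever the other side does, racing is each player's best response,
# so (race, race) is the only equilibrium even though (pause, pause) pays more.
for other in ["pause", "race"]:
    print("US best response to China", other, "->", best_response("US", other))
    print("China best response to US", other, "->", best_response("China", other))
```

Under these (assumed) payoffs, racing dominates for both sides, which is the dynamic I'm pointing at: mutual pause is jointly better, but neither side can trust the other to hold it.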

"But if we have an arms race and build superintelligent AI, the entire human race is going to be killed off by a rogue AI." This is a valid point, but I'll argue that the odds of getting powerful nation states to pause AI for the "global good" is extremely low. We only need to see that countries like China are still shoring up nuclear weapons despite various treaties aimed at preventing proliferation of nukes. 

AFAICT, a plausible strategy is to make sure the US keeps up in AI development and slowly opens lines of communication to agree on a collective AI security agreement that protects humanity from the dangers of unaligned superintelligence. The US would then be able to approach these negotiations from a position of strength (not weakness), which is, by and large, the most important factor in critical negotiations like this one.

Morality must scale to be useful. A common failure mode for people is to engage in "performative morality" where they choose low-cost signaling over high-cost action. What's easier to do—starting a sustainable energy company (and scaling it) or donating $10 to climate change activism and tweeting about the seas rising 24/7?

But "easy" doesn't mean "effective". A sustainable energy company has 1000x the impact of a $10 donation on reversing climate change. Sure, not everyone can—or should—build a company; but then it shouldn't be the case that the person donating $10 feels some mysterious aura of morality and feels like they're just doing just as much as the person running a sustainable energy company. 

People should feel good about doing good. I think the problem starts when we reason qualitatively about actions instead of using a quantitative framework: we end up judging the intent of an action instead of focusing on its outcomes.
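As a sketch of what I mean by a quantitative framework, here is a toy comparison of the two climate actions above by expected outcome rather than intent. Every number is a hypothetical placeholder; the point is only the shape of the reasoning (estimate impact, weight by probability, compare).

```python
# Toy comparison of two actions by expected outcome rather than intent.
# All figures are hypothetical placeholders, not real estimates.

actions = {
    "donate $10 and tweet":          {"p_success": 1.0,  "tons_co2_averted": 0.01},
    "build sustainable energy co.":  {"p_success": 0.05, "tons_co2_averted": 1_000_000},
}

for name, a in actions.items():
    expected_impact = a["p_success"] * a["tons_co2_averted"]
    print(f"{name}: expected impact ~ {expected_impact:,.2f} tons CO2 averted")
```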

This idealist vs. pragmatist argument has been repeated ad nauseam, but I think it's worth bringing up again. Why? The world is getting more complex, and the nature of the problems we're dealing with reflects that complexity. We cannot afford to think simplistically about those problems and settle for solutions that feel good on the surface but don't actually move the needle much.

We all want to feel agentic and feel like we can contribute to making the world a better place. But we're only effective at doing this when we're intellectually honest about our limitations and know where our efforts fit into the overall plan. 

AI alignment might be one of the areas that could benefit from this type of thinking. I haven't followed the AI conversation all that closely, but it seems there are two groups. Group #1 is people who were doing AI safety before LLMs went mainstream and who say an aligned superintelligence is impossible to build. Group #2 is AI safety people who emerged after the LLM boom and believe building a safe superintelligence is a resource problem, one we can solve with the right combination of capital, hardware, software, and talent.

It's common to see people from group #1 claim the moral high ground and argue that they're doing more to stop the extinction of the human race. They might berate AI companies for building software, tweet about AI risk, sign petitions to "pause AI", and donate to AI safety research. In the process, people in this group claim moral superiority on matters concerning AI development and acquire status.

But the line of thinking I've laid out above suggests that group #2 should get more airtime and recognition because they're doing harder things to solve the problem of AI misalignment. Raising capital, building hardware, attracting talent: these are all extremely difficult things to do at scale. If someone working on AI alignment does them well, then in my opinion they deserve more attention.

It's not that the OG AI safety researchers are wrong, or that the newer AI safety researchers raising billions can't be wrong. My view is based on the idea that many problems in the world are, in fact, resource coordination and allocation problems.

I think the history of technology suggests that capital formation and technological innovation go hand in hand; we have built technology to destroy each other (missiles) and technology to defend ourselves (Israel's Iron Dome). My weakly held belief is that AI alignment will follow the same pattern. 

In a world where AI alignment is a resource problem, I expect people serious about solving it to work harder on acquiring resources. It's okay to fail at this and admit failure. What is not right is refusing to do the hard work of resource acquisition while claiming specific problems are unsolvable. This, in my opinion, is intellectual hubris (failing to acknowledge that many solutions are emergent) and a failure to understand the pattern of technological breakthroughs.

More to the point, we should be wary of awarding points for intention and assigning status to the person who does a better job of signaling that they care about a particular problem but doesn't do enough to effectively solve it. This is difficult for many reasons (e.g., assessing what counts as "effective action" in a particular problem-solving domain), but if done correctly, it can raise the baseline of people working to actually solve the problems facing the world today.

By saying morality must scale, we're essentially saying that the extent to which someone cares about solving a problem can be deduced from how much impact they plan to have with their action. The hope is that people learn to do more useful things and avoid motivated stopping when moving through the solution space related to a particular problem.