Epistemic status: This is not a scientific analysis, just some personal observations. I still think they point towards some valid conclusions regarding AI alignment. I am a father of three sons. I would give my life to save each of them without a second thought. As a father, I certainly...
When I talk to Claude or ChatGPT, as far as I understand it, I’m not really talking to the underlying LLM, but to a fictional persona it selects from the near-infinite set of possible personas. If that is true, then when an AI is evaluated, what is really tested...
This is a reaction to Nora Belrose’s and Quintin Pope’s AI Optimism initiative. However, others are better qualified to criticize the specific arguments they give for their claim that AI is easy to control. Instead, I will focus on the general stance of optimism, when it can be beneficial and...
Concerns about AI safety rely on the assumption that a sufficiently powerful AI might take control of our future in an undesirable way. Meta’s head of AI, Yann LeCun, correctly points this out in a recent tweet, but then argues that this assumption is wrong because “intelligence” and “dominance” are...
On June 22nd, there was a “Munk Debate”, facilitated by the Canadian Aurea Foundation, on the question of whether “AI research and development poses an existential threat” (you can watch it here, which I highly recommend). On stage were Yoshua Bengio and Max Tegmark as proponents and Yann LeCun and Melanie...
Edit: Based on the comment by Daniel Kokotajlo, we extended the dialog in the chapter "Takeover from within" by a few lines.
The perfect virtual assistant
The year is 2026 and the race for human-level artificial general intelligence (AGI) draws to a close. One of the leading AI companies, MegaAI,...
This story is also available as a YouTube video.
A network of specialized open-source agents emerges
Developed by open-source communities, “agentic” AI systems like AutoGPT and BabyAGI begin to demonstrate increased levels of goal-directed behavior. They are built with the aim of overcoming the limitations of current LLMs by adding...