Suggestion for a different summary of my post:
Finite factored sets are a re-framing of causality: they move away from causal graphs and use a structure based on set partitions instead. Finite Factored Sets in Pictures summarizes and explains how that works. The language of finite factored sets seems useful for talking about and re-framing fundamental alignment concepts like embedded agents and decision theory.
I'm not completely happy with
Finite factored sets are a new way of representing causality that seems to be more capable than Pearlian causality, the state-of-the-art in causality analysis. This might be useful to create future AI systems where the causal dynamics within the model are more interpretable.
because
The summary has been updated to yours for both the public newsletter and this LW linkpost. And yes, they seem exciting. Connecting FFS to interpretability was a way to contextualize it in this case, until you would provide more thoughts on the use case (given your last paragraph in the post). Thank you for writing, always appreciate the feedback!
I also chatted with a few GPU researchers at NeurIPS and their take was that computing power will hit a peak, making AGI near-impossible. The newer GPUs from Google and Tesla are not necessarily better, they just avoid NVIDIA’s 4x markup on the price of GPUs.
I disagree, primarily because I think the human brain is already at the limit, and thus I see AGI as mostly possible. I also think that more energy will probably be used for computing. What I do agree with is that there probably aren't Pareto improvements we can make with AGI. (At least not without exotica working out.)
Here's a link:
https://www.lesswrong.com/posts/xwBuoE9p8GE7RAuhd/brain-efficiency-much-more-than-you-wanted-to-know
What might not be obvious from the post is that I definitely disagree with "AGI near-impossible" as well, for the same reasons. These are the thoughts of GPU R&D engineers I talked with. However, the limit on GPU performance increases is a significant update against one of the assumptions on the ladder from "scaling is all you need" to AGI.
If I were to update at all, I'd update towards: without exotic computers, AGI cannot achieve a Pareto improvement over the brain. I'd also update towards continuous takeoff. Scaling will still get you there, because AGI remains a reachable endgame goal (at least with neuromorphic chips).
I think we agree here. Those both seem like updates against scaling is all you need, i.e. (in this case) "data for DL in ANNs on GPUs is all you need".
That's where I'm disagreeing, because to my mind this doesn't undermine "scale is all you need". It does undermine the idea that a basement group could produce AGI, but overall it just gives concrete limits on what AGI can do with a given amount of energy.
Watch this week's episode on YouTube or listen to the audio version here.
Hopes and fears of the current AI safety paradigm, GPU performance predictions and popular literature on why machines will never rule the world. Welcome to the ML & AI safety Update!
Hopes & Fears of AI Safety
Karnofsky released an article in his Cold Takes blog describing his optimistic take on how current methods might lead to safe AGI.
At the same time, Christiano writes a reminder that AI alignment is distinct from applied alignment. Updating models to be inoffensive will not lead to safe artificial general intelligence, only to safer short-term systems such as ChatGPT. Steiner writes a counter-post on the usefulness of working on applied alignment as well.
Relatedly, Shlegeris publishes a piece exploring whether reinforcement learning from human feedback is a good approach to alignment. He addresses questions such as whether RLHF is better than alternative methods that achieve the same result (yes), whether it has been net positive (yes), and whether it is useful for alignment research (yes).
The alternative perspective is covered well in Steiner's piece this week on why RLHF / IDA / Debate won't solve outer alignment. In short, these methods do not optimize for truth or safety; they optimize for getting humans to "click the approve button", which can lead to many failures down the road.
GPU Performance Predictions
Hobbhahn and Besiroglu of EpochAI, the main AI capabilities prediction organization, have released a comprehensive forecasting report on how GPU performance will develop during the next 30 years.
They use a model that relates GPU performance to its hardware features and describes how those features change over time as transistors shrink. They expect GPU performance to hit a theoretical peak of 1e15 FLOP/s (floating point operations per second) before 2033.
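The report's actual model is far more detailed, but a toy illustration of performance growing towards a hard ceiling is a logistic curve. The 1e15 FLOP/s ceiling comes from the report; the midpoint year and steepness below are purely hypothetical parameters chosen for illustration:

```python
import math

def gpu_flops_forecast(year, ceiling=1e15, midpoint=2026.0, steepness=0.5):
    """Toy logistic S-curve: performance grows roughly exponentially at first,
    then flattens as it approaches the physical ceiling.
    Only the ceiling is from the Epoch report; midpoint and steepness
    are hypothetical illustration values."""
    return ceiling / (1.0 + math.exp(-steepness * (year - midpoint)))

# Under these toy parameters, growth has largely flattened out by 2033.
for year in (2023, 2028, 2033):
    print(year, f"{gpu_flops_forecast(year):.2e} FLOP/s")
```

The qualitative point is that once the curve nears its ceiling, further years buy almost no additional per-GPU performance, which is what makes the 2033 peak a meaningful forecast.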
I also chatted with a few GPU researchers at NeurIPS and their take was that computing power will hit a peak, making AGI near-impossible. The newer GPUs from Google and Tesla are not necessarily better, they just avoid NVIDIA’s 4x markup on the price of GPUs.
This offers some hope about how far away AGI development may be. Ajeya Cotra's estimate of ~1e29 FLOP (total floating point operations) required for artificial general intelligence, based on the computation done by a human during a lifetime, seems, in light of the Epoch report, to be significantly farther away than her estimates indicated. Read her estimates in the first part of her wonderful transformative AI forecasting report.
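As a back-of-the-envelope sketch of why this matters: the 1e29 figure is a total amount of computation, while the 1e15 figure is a per-GPU rate, so dividing one by the other gives a training time. The GPU-fleet size below is a hypothetical assumption, not a figure from either report:

```python
PEAK_GPU_FLOPS = 1e15        # Epoch's projected per-GPU performance ceiling (FLOP/s)
AGI_TOTAL_FLOP = 1e29        # Cotra-style lifetime-anchor estimate (total FLOP)
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Time for a single ceiling-performance GPU to do the whole computation.
years_single_gpu = AGI_TOTAL_FLOP / PEAK_GPU_FLOPS / SECONDS_PER_YEAR

# Hypothetical fleet of one million such GPUs running in parallel.
years_million_gpus = years_single_gpu / 1e6

print(f"{years_single_gpu:.2e} years on one GPU")      # ~3.2 million years
print(f"{years_million_gpus:.2f} years on a million GPUs")  # ~3.2 years
```

Even under these generous assumptions, the computation takes a million-GPU fleet several years at the theoretical peak, which is why a per-GPU performance ceiling pushes such estimates further out.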
“Why Machines Will Never Rule the World”
In the spirit of predicting how capable AGI will be, Machine Learning Street Talk, the hugely popular machine learning podcast, has interviewed Walid Saba about his review of the book from August, “Why Machines Will Never Rule the World”, by Landgrebe and Smith.
The book’s basic argument is that artificial general intelligence will not be possible for mathematical reasons. The human brain is a complex dynamical system, and they argue that systems of this sort cannot be modeled with modern neural network architectures, or within computers at all, because training data is inherently limited to a record of the past.
These arguments are in line with Searle’s 1980 Chinese room argument and Penrose’s non-computability argument based on Gödel’s incompleteness theorems. Walid Saba’s review is generally positive about the book. I personally disagree with the arguments, since we do not need to model the brain’s complex dynamics analytically; we just need to replicate them in a simulator.
Nevertheless, it is an interesting discussion about whether AGI is possible.
Other news
In other news…
Opportunities
There are some exciting winter opportunities this week! Again, thank you to AGISF for sharing opportunities in the space.
This has been the ML & AI safety update. We will take a break for two weeks over Christmas but then be back with more wonderful hackathons and ML safety updates. See you then!