On a few different views, understanding the computation done by neural networks is crucial to building neural networks that constitute human-level artificial intelligence that doesn’t destroy all value in the universe. Given that many people are trying to build neural networks that constitute artificial general intelligence, it seems important to understand the computation in cutting-edge neural networks, and we basically do not.
So, how should we go from here to there? One way is to try hard to think about understanding, until you understand understanding well enough to reliably build understandable AGI. But that seems hard and abstract. A better path would be something more concrete.
Therefore, I set this challenge: know everything that the best go bot knows about go. At the moment, the best publicly available bot is KataGo. If you’re at DeepMind or OpenAI and have access to a better go bot, I guess you should use that instead. If you think those bots are too hard to understand, you’re allowed to make your own easier-to-understand bot, as long as it’s the best.
What constitutes success?
- You have to know literally everything that the best go bot that you have access to knows about go.
- It has to be applicable to the current best go bot (or a bot that is essentially as good - e.g. you’re allowed to pick one of the versions of KataGo whose Elo is statistically hard to distinguish from that of the best version), not the best go bot as of one year ago.
- That being said, I think you get a ‘silver medal’ if you understand any go bot that was the best at some point from today on.
Why do I think this is a good challenge?
- To understand these bots, you need to understand planning behaviour, not just pick up on various visual detectors.
- In order to solve this challenge, you need to actually understand what it means for models to know something.
- There’s a time limit: your understanding has to keep up with the pace of AI development.
- We already know some things about these bots based on how they play and evaluate positions, but obviously not everything.
- We have some theory about go: e.g. we know that certain symmetries exist, we understand optimal play in the late endgame, we have some neat analysis techniques.
- I would like to play go as well as the best go bot. Or at least to learn some things from it.
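As a small illustration of the "certain symmetries exist" point above: a square go board has eight rotations and reflections, and a position's evaluation should not depend on which of them you are looking at. The sketch below is only an assumption about how one might represent that (boards as tuples of rows of `"."`, `"X"`, `"O"`; the function names are made up for this example), not anything taken from a real bot.

```python
def rotate(board):
    """Rotate a square board 90 degrees clockwise."""
    return tuple(zip(*board[::-1]))


def reflect(board):
    """Mirror a board left-to-right."""
    return tuple(row[::-1] for row in board)


def symmetries(board):
    """Return all 8 rotations/reflections of a square board."""
    current = tuple(tuple(row) for row in board)
    out = []
    for _ in range(4):
        out.append(current)
        out.append(reflect(current))
        current = rotate(current)
    return out


def canonical(board):
    """Pick one fixed representative so that symmetric positions compare equal."""
    return min(symmetries(board))


# A lone stone in one corner of a tiny 3x3 board (hypothetical example).
corner = (("X", ".", "."),
          (".", ".", "."),
          (".", ".", "."))
print(canonical(corner) == canonical(rotate(corner)))
```

Any claim about what the bot "knows" in one orientation can then be checked in the other seven for free, which is one way existing go theory gives a foothold.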
Corollaries of success (non-exhaustive):
- You should be able to answer questions like “what will this bot do if someone plays mimic go against it” without actually checking that during play. More generally, you should know how the bot will respond to novel counter-strategies.
- You should be able to write a computer program anew that plays go just like that go bot, without copying over all the numbers.
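For concreteness, here is a minimal sketch of the mimic go strategy mentioned in the first corollary: the mimicking player answers each move with its 180-degree rotation through the centre point. It assumes 0-indexed `(row, col)` coordinates and ignores captures, ko, and every other legality rule; `mirror_move` and `mimic_reply` are names invented for this example.

```python
def mirror_move(move, size=19):
    """Return the point diametrically opposite `move` on a size x size board."""
    row, col = move
    return (size - 1 - row, size - 1 - col)


def mimic_reply(opponent_move, occupied, size=19):
    """Return the mirrored reply, or None if mimicry is impossible here.

    `occupied` is the set of points that already hold a stone. Mimicry
    breaks down when the mirrored point is taken, which always happens
    when the opponent plays the centre point (it is its own mirror).
    """
    reply = mirror_move(opponent_move, size)
    if reply == opponent_move or reply in occupied:
        return None
    return reply


# The opponent plays the upper-left 4-4 point; the mimic takes the opposite one.
print(mimic_reply((3, 3), occupied={(3, 3)}))
```

The strategy itself is trivial to state; the interesting part of the challenge is predicting the bot's response to it without running the game.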
Drawbacks of success:
- You might learn how to build a highly intelligent and capable AI in a way that does not require deep learning. In this case, please do not tell the wider world or do it yourself.
- It becomes harder to check if professional human go players are cheating by using AI.
Related work:
- The work on identifying the ‘circuits’ of Inception v1
- The case for aligning narrowly superhuman models
A conversation with Nate Soares on a related topic probably helped inspire this post. Please don’t blame him if it’s dumb tho.
Yes, KataGo trains entirely through self-play.
It's not "100% pure Zero" in that it doesn't only play entire games from the start. So e.g. it gets supplied with some starting positions that are ones in which some version of KataGo was known to have blindspots (in the hope that this helps it understand those positions better and lose the blindspots) or ones that occur in human games but not in KataGo self-play games (in the hope that this helps it play better against humans and makes it more useful for analysing human games). But I believe all its training is from self-play and e.g. it's never trying to learn to play the same moves as humans did.
(The blindspot-finding is actually pretty clever. They take a lot of games and automatically search them for places where KG doesn't like the move that was actually played, but where that move leads to an outcome KG thinks is better than what it would otherwise have got. They then make a small fraction of KG's training games start from those positions, and add some bias to the move selection in those games so that the possibly-better move gets explored enough for KG to learn that it's good, if it really is.)
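To make that search a bit more concrete, here is a rough sketch of the filtering step as I've described it. This is not KataGo's actual code: the `PlayedMove` record, its fields, the thresholds and the 5% fraction are all assumptions made up for illustration, and the move-selection bias is left out entirely.

```python
from dataclasses import dataclass
import random


@dataclass(frozen=True)
class PlayedMove:
    """One move from a game record, annotated with the bot's own opinions.

    All fields are hypothetical. Values are win probabilities from the
    point of view of the player who made the move.
    """
    position_id: str
    policy_prior: float     # how much the bot liked the move actually played
    expected_value: float   # what the bot expected from its own preferred line
    resulting_value: float  # the bot's evaluation after the played move


def find_blindspot_candidates(moves, max_prior=0.02, min_gain=0.05):
    """Keep moves the bot disliked that nonetheless led to a better outcome."""
    return [
        m for m in moves
        if m.policy_prior <= max_prior
        and m.resulting_value - m.expected_value >= min_gain
    ]


def pick_start(candidates, fraction=0.05, rng=random):
    """With probability `fraction`, start a training game from a candidate.

    Returns None to mean "start from the empty board as usual".
    """
    if candidates and rng.random() < fraction:
        return rng.choice(candidates)
    return None


moves = [
    PlayedMove("a", policy_prior=0.01, expected_value=0.40, resulting_value=0.55),
    PlayedMove("b", policy_prior=0.30, expected_value=0.40, resulting_value=0.55),
    PlayedMove("c", policy_prior=0.01, expected_value=0.40, resulting_value=0.41),
]
print([m.position_id for m in find_blindspot_candidates(moves)])
```

Only "a" is both disliked by the bot and clearly better than it expected; "b" was a move it already liked, and "c" barely changed the evaluation.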
I am not surprised that your concept of local play is less crude than something I explicitly described as the "crudest and most elementary versions". It's not clear to me that we have an actual disagreement here. Isn't there a part of you that winces a little when you have to play an empty triangle, just because it's an ugly very-local configuration of stones?
Here's my (very rough-and-ready; some bits are definitely inaccurate but I don't care because this is just for the sake of high-level intuition) mental model of how a CNN-based go program understands a board position. (This is just about the "static" evaluation and move-proposing; search is layered on top of that and is also very important.)
So it starts with something similar to those "crudest and most elementary" notions of shape, and gradually refines them to deal with larger and larger scale structures and influence of stones further and further away; after enough layers we're a long way from "duh, empty triangle bad" and into "this group is kinda weak and short of eye-space, and that wall nearby is going to give it trouble, and there's a potential ladder over there that would run through the same area the group needs to run to, so playing here is probably good because in positions like this the resulting fight probably results in a stone here that will break the ladder and make my group over there safer" or "those black stones kinda-enclose some space but it's quite invadable and there's no way he's keeping all of it, especially because we'll get some free moves off that weak black group over there, which will help invade or reduce; it's probably worth about 23 points". (I don't mean to imply that there will be specific numbers the KataGo network computes that have those precise meanings, but that in the sort of position where a pro might think those things there will be things the network does that encode roughly that information.)
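To put a rough number on "larger and larger scale": if you assume a plain stack of 3x3 convolutions with stride 1 and nothing else (a deliberate simplification, and not a description of KataGo's real architecture, which may move information across the board in other ways), each layer lets a point "see" one line further in every direction. The helper names below are invented for this sketch.

```python
def receptive_field(num_layers, kernel_size=3):
    """Width of the square of input points that can affect one output point
    after `num_layers` stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel_size - 1)


def layers_to_span(distance, kernel_size=3):
    """Layers needed before a point can be affected by a stone `distance`
    lines away (measured along the longer of the two board axes)."""
    reach_per_layer = (kernel_size - 1) // 2
    return -(-distance // reach_per_layer)  # ceiling division


# After one layer a point sees its 3x3 neighbourhood: enough for an empty triangle.
print(receptive_field(1))
# Layers before the centre point of a 19x19 board can see the whole board.
print(layers_to_span(9))
# Layers before one corner can be affected by a stone in the opposite corner.
print(layers_to_span(18))
```

Under those assumptions, purely local shape is available almost immediately, while something like a ladder running across the whole board needs many layers (or some other mechanism) before the network can take it into account.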