For my PhD research, I have been tinkering with the idea of approximating complex, non-linear statistical models (Bayesian networks) as simple, quasi-linear models.
I am making some headway on the problem, but I feel I still need to clarify what exactly I want to do, and why it is important.
I am writing this blog post to 1) spell out the idea better (and get questions and pointers on how to explain it better) and 2) get feedback from the community on which parts of this feel interesting / useful, and which parts I could drop.
First I will explain why I am working on approximating Bayesian networks. Then I will explain the solution I came up with. Finally I will discuss issues with my current approach and next steps.
If you want something to play with, try out this public prototype. It showcases a (clunky) solution to the problem, and might help you understand what I am trying to do.
Prerequisites: you should be familiar with Bayesian networks, d-separation, belief propagation and vector logodds to understand the gist of this post. Shapley values and double crux are also mentioned in passing, though you should be able to skip the parts where they are mentioned and still follow the post. If there is anything else that seems unclear, you should probably let me know, because it would help me understand what I need to explain better.
Theory of research
My (post-hoc) reasoning for focusing on this problem is something like this:
I think there is a non-trivial chance that I am just fooling myself into thinking that this is something worth pursuing - in my current grant program I am under significant pressure to focus on explaining Bayesian networks and on textual explanations. So here are the reasons why my theory of research might be wrong:
I’d be quite interested in hearing other critiques of this theory of research.
What I have been doing
I have spent the last 6 months thinking about the concrete problem I pointed out above - how to modularize a Bayesian network as a series of arguments and then how to assign importance to each of them. In LessWrong lingo, what I have been doing is a way of extracting cruxes from Bayesian models.
My current approach is inspired by Ustun and Rudin’s paper on Scoring Systems [REF], combined with belief propagation [REF].
Essentially, what I have been doing is:
This is dense and complex, so let’s walk through an example. You can also try out the end result of this process in this interactive notebook.
We will use the Asia network. The outcome we are interested in is whether the patient has lung cancer, which corresponds to the variable lung taking value yes.

1. Finding all possible arguments
We first find all simple (undirected) paths to lung. For example, one such path is smoke → lung. Another, more convoluted path is asia → tub → either ← lung.
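To make this step concrete, here is a minimal sketch of how the path enumeration could look (assuming networkx and a hand-written edge list for the Asia network; this is an illustration, not necessarily my actual implementation):

```python
import networkx as nx

# Directed edges of the Asia network.
EDGES = [
    ("asia", "tub"), ("smoke", "lung"), ("smoke", "bronc"),
    ("tub", "either"), ("lung", "either"),
    ("either", "xray"), ("either", "dysp"), ("bronc", "dysp"),
]
dag = nx.DiGraph(EDGES)

# "Simple (undirected) paths" ignore edge direction, so we enumerate
# paths in the undirected skeleton of the DAG.
skeleton = dag.to_undirected()
for source in sorted(set(dag.nodes) - {"lung"}):
    for path in nx.all_simple_paths(skeleton, source, "lung"):
        # Recover the direction of each edge from the original DAG.
        arrows = ["→" if dag.has_edge(a, b) else "←" for a, b in zip(path, path[1:])]
        print(" ".join(sum(zip(path, arrows), ())) + " " + path[-1])
```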
2. Find the conditions where the arguments apply

Let’s focus on the path asia → tub → either ← lung. For this path to be open, we need to observe the first node asia, and we need to observe the collider either or one of its descendants, i.e. xray or dysp. This gives rise to 12 instantiations of the argument, e.g. (asia=yes, either=yes), (asia=yes, either=no), (asia=yes, xray=yes), (asia=yes, dysp=yes), etc.
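For readers who want the openness condition spelled out, here is a minimal sketch of a d-open test for a single path (the helper name path_is_open is mine; networkx is used only for descendant lookups):

```python
import networkx as nx

EDGES = [
    ("asia", "tub"), ("smoke", "lung"), ("smoke", "bronc"),
    ("tub", "either"), ("lung", "either"),
    ("either", "xray"), ("either", "dysp"), ("bronc", "dysp"),
]
dag = nx.DiGraph(EDGES)

def path_is_open(dag, path, observed):
    """Check whether an undirected path is d-open given observed variables."""
    for left, mid, right in zip(path, path[1:], path[2:]):
        is_collider = dag.has_edge(left, mid) and dag.has_edge(right, mid)
        if is_collider:
            # A collider must be observed, or have an observed descendant.
            openers = {mid} | nx.descendants(dag, mid)
            if not (openers & observed):
                return False
        else:
            # A chain or fork node must be unobserved.
            if mid in observed:
                return False
    return True

# (Separately, the source node asia must itself be observed
# for the argument to be instantiated at all.)
print(path_is_open(dag, ["asia", "tub", "either", "lung"], {"asia", "xray"}))  # True
print(path_is_open(dag, ["asia", "tub", "either", "lung"], {"asia"}))          # False
```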
3. Quantify the importance of each argument

Let’s focus on the argument asia → tub → either ← lung when (asia=yes, xray=yes). We run delta belief propagation from asia to tub, then from xray to either, and finally we quantify how the change in tub affects lung, given the background change in either due to the change in xray. This sounds convoluted, but the end result is an evidence vector that approximates how much this argument affects lung. We translate this vector into a logodds change and record it. See the appendix for more info on how this works exactly.
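One way to make the translation step concrete, assuming the evidence vector acts as a likelihood ratio over the two states of the binary target (my reading of the setup; the helper name is hypothetical):

```python
import numpy as np

def logodds_change(delta, states=("yes", "no"), target="yes"):
    """Convert an evidence (likelihood-ratio) vector over a binary
    variable into the logodds shift it induces on the target state."""
    other = [s for s in states if s != target][0]
    d = dict(zip(states, delta))
    return float(np.log(d[target] / d[other]))

# An evidence vector favouring lung=yes by a factor of e^0.3 ≈ 1.35:
print(logodds_change(np.array([1.35, 1.0])))  # ≈ 0.3
```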
4. Order all the arguments by importance

This is straightforward, given that we have already recorded the importance of each argument in terms of logodds changes. See figure 4 for the result.
5. Apply this for a concrete input
Let’s suppose we observe {'asia': 'no', 'dysp': 'yes', 'bronc': 'no', 'tub': 'no'}.

We can then find all arguments that apply under these conditions. So for example the argument tub=no → either ← lung works because we have observed tub=no and dysp=yes to d-open the path. But the argument starting from asia=no does not work, since tub=no blocks the influence of asia on the rest of the diagram.

The former argument has an associated score of 0.3 logodds (as we can see in row 15 of figure 4), so it will shift our credences slightly in favor of the target outcome lung=yes.

Because of how delta belief propagation works, we can give explanations of how the evidence propagates along each proposed chain of argument. I have chosen to give just a brief automated textual description; see figure 5 for an example.
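To sketch how the argument table gets applied to a concrete input (hypothetical data structures; only the 0.3 score above is taken from figure 4):

```python
# Each argument records the evidence that instantiates it, the observations
# that d-open its path, and its precomputed logodds score.
ARGUMENTS = [
    {"path": "tub=no → either ← lung",
     "requires": {"tub": "no"},
     "openers": [{"either": "yes"}, {"either": "no"}, {"xray": "yes"}, {"dysp": "yes"}],
     "score": 0.3},
]

def applicable_arguments(arguments, observed):
    """Return the arguments instantiated and d-opened by the observations."""
    out = []
    for arg in arguments:
        required_ok = all(observed.get(k) == v for k, v in arg["requires"].items())
        opened = any(all(observed.get(k) == v for k, v in opener.items())
                     for opener in arg["openers"])
        if required_ok and opened:
            out.append(arg)
    return out

observed = {"asia": "no", "dysp": "yes", "bronc": "no", "tub": "no"}
total = sum(a["score"] for a in applicable_arguments(ARGUMENTS, observed))
print(total)  # 0.3 logodds in favor of lung=yes
```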
Does this work?
Here are some results:
One problem I have here is that it is not entirely clear what I am aiming to achieve. Some things I could aim for are:
What are your thoughts?
Wrapping up
In summary: I am trying to extract from a Bayesian model the most important considerations on why an input should affect our beliefs about an outcome.
I have a system that does something, but I am confused about how to evaluate it, and about the grander point of what I am trying to achieve.
Theoretically, I can see some issues with the current framework. The main ones I see are:

- that either=yes ← lung and tub=yes → either=yes ← lung are treated as two different arguments, when really they should be treated as one

I have some ideas on how to work on these, but it will be a while until I can make them more concrete and code them up. I’d be interested in other ideas for solving these, and pointers to other issues.
Before I get to address them, I plan to spend some time clarifying what I am trying to do. To do that, my first plan is to demo this system to a bunch of XAI researchers and see what they have to say about it, plus read more literature on the goals of the field to see if I can steal any ideas.
Appendix: delta propagation
The key concept that ties together the algorithm is what I’ve taken to call delta belief propagation. Do not expect a clean explanation because I don’t have one to offer (yet).
Roughly, this is a way of taking an evidence vector δX over a variable X and a factor ϕ over variables X, Y, Z1, …, Zn, and outputting an evidence vector δY over variable Y, which roughly corresponds to how much of a difference the evidence δX makes to Y, given the probabilistic relations implied by ϕ.
Because both the input and output of this mechanism are evidence vectors, it is easy to see how it can be applied recursively to study the effect of a path of variables such as the ones we are interested in studying.
How do we actually go about implementing this? My take is to multiply δX and ϕ, then marginalize all variables but Y. So far this would be equivalent to belief propagation. However, this mixes the information intrinsic to ϕ together with the information that comes from δX. To isolate the latter, we divide the result by the result of marginalizing ϕ without multiplying by δX.
(To handle the cases where we are interested in studying the effect as mediated by a collider, we also allow additional context vectors δZ, which are multiplied into ϕ before any other computation.)
In equation form:
$$\delta_Y = \phi_1 / \phi_2$$

$$\phi_1 = \sum_{J \neq Y} (\phi \cdot \delta_X)$$

$$\phi_2 = \sum_{J \neq Y} \phi$$
In code form:
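(A minimal sketch; I assume here that factors are stored as numpy arrays with one axis per variable, tracked by a parallel list of variable names.)

```python
import numpy as np

def delta_propagation(phi, variables, delta_x, x, y, context=()):
    """Propagate evidence vector delta_x over variable x through factor phi,
    returning an evidence vector over variable y.

    phi       -- numpy array with one axis per name in `variables`
    delta_x   -- 1-d evidence vector over the states of x
    context   -- optional (variable, vector) pairs multiplied into phi first,
                 e.g. background evidence arriving at a collider
    """
    def times_vector(factor, var, vec):
        # Broadcast a 1-d vector over the axis belonging to `var`.
        shape = [1] * factor.ndim
        shape[variables.index(var)] = len(vec)
        return factor * np.asarray(vec).reshape(shape)

    for z, delta_z in context:
        phi = times_vector(phi, z, delta_z)

    axes = tuple(i for i, v in enumerate(variables) if v != y)

    # phi_1: multiply in the evidence, then marginalize all variables but y.
    phi_1 = times_vector(phi, x, delta_x).sum(axis=axes)
    # phi_2: marginalize without the evidence, to isolate what delta_x adds.
    phi_2 = phi.sum(axis=axes)

    return phi_1 / phi_2

# Example: a single factor phi[x, y] and evidence doubling the weight of x=1.
phi = np.array([[0.9, 0.1],
                [0.2, 0.8]])
print(delta_propagation(phi, ["x", "y"], np.array([1.0, 2.0]), "x", "y"))
# → roughly [1.18, 1.89]: the evidence on x pushes y towards its second state.
```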
Jaime Sevilla is a researcher at the University of Aberdeen. He is sponsored by the NL4XAI program.
I thank my supervisors Ehud Reiter, Nava Tintarev and Oren Nir for supporting me and encouraging me to write out my ideas.