This is a sequel to our previous post on the Touchette-Lloyd theorem[1]. The previous post contained some introductory material and motivation for the theorem. Here, we will walk through the proof of the theorem and explore its applications in a few worked examples. It isn't strictly necessary to read that...
This post is an informal explainer of our paper, which can be found on arXiv. This work was funded by the Advanced Research + Invention Agency (ARIA) Safeguarded AI Programme through project code MSAI-SE01-P005. Introduction There is an intuition that a powerful agent might have to contain some kind of...
As the current Dovetail research fellowship comes to a close, the fellows are giving talks on their projects. All are welcome to join! Unlike the previous cohort talks, these talks will be scheduled one at a time. This is partly because there are too many to do all in one...
This post is about one of the results described in the 2004 paper 'Information-theoretic approach to the study of control systems' by Hugo Touchette and Seth Lloyd.[1] The paper compares 'open-loop' and 'closed-loop' controllers (which we here call 'blind' and 'sighted' policies) for the task of reducing entropy and quantifies...
I recently had a conversation with a friend of a friend who has a very curious child around 5 years of age. I offered to answer some of their questions, since I love helping people understand the world. They sent me eight questions, and I answered them by handwritten letter....
I recently read Red Heart, a spy novel taking place in the core of a Chinese AGI project. Disclaimer that the author is my friend, and that I’m ideologically incentivized to promote stuff about AI safety! That said, I think you should read it. If nothing else, it’s a fun...
Every so often, I have this conversation: > Them: So you know how the other day we talked about whether we should leave for our trip on that Sunday or Monday? > Me: …doesn’t sound familiar… > Them: And you said it depended on what work you had left to...