I know you've acknowledged Friston at the end, but I'm commenting for other interested readers' benefit that this is very close to Karl Friston's active inference framework, which posits that all agents minimise the discrepancies (or prediction errors) between their internal representations of the world and their incoming sensory information, through both action and perception.
Hi Vanessa, thanks for your question! Sorry for taking a while to reply. The answer is yes if we allow mixed policies (i.e., where an agent can correlate its decision rules across different decisions using a shared random bit), but no if we restrict agents to behavioural policies (i.e., where the decision rules for each of an agent's decisions are independent, because the agent cannot access a shared random bit). This is analogous to the difference between mixed and behavioural strategies in extensive form games, where (in general) a subgame pe...
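To make the distinction concrete, here is a small (hypothetical, not from the original discussion) two-decision sketch. A mixed policy drives both decisions from one shared random bit, yielding perfect correlation; a behavioural policy randomises each decision independently, so its joint distribution always factorises and can never reproduce that correlation:

```python
import random
from collections import Counter

def mixed_policy(rng):
    # Mixed policy: a single shared random bit determines both decisions,
    # so the joint outcome is (0, 0) or (1, 1), each with probability 1/2.
    bit = int(rng.random() < 0.5)
    return (bit, bit)

def behavioural_policy(rng, p_a=0.5, p_b=0.5):
    # Behavioural policy: each decision flips its own independent coin,
    # so the joint distribution factorises as p(a) * p(b).
    return (int(rng.random() < p_a), int(rng.random() < p_b))

rng = random.Random(0)
mixed = Counter(mixed_policy(rng) for _ in range(10_000))
beh = Counter(behavioural_policy(rng) for _ in range(10_000))

# The mixed policy never produces the anti-correlated outcomes...
print(mixed[(0, 1)] + mixed[(1, 0)])       # 0
# ...whereas any behavioural policy with p_a, p_b strictly between
# 0 and 1 must put positive probability on them.
print(beh[(0, 1)] + beh[(1, 0)] > 0)       # True
```

The impossibility is easy to check algebraically: matching the mixed distribution would require p_a·p_b = 1/2 and (1−p_a)(1−p_b) = 1/2, which forces the off-diagonal mass p_a(1−p_b) + (1−p_a)p_b to zero, and that only happens for degenerate (non-randomising) policies.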
Thank you for your comment.
We are confident that ARENA's in-person programme is among the most cost-effective technical AI safety training programmes:
- ARENA is highly selective, and so all of our participants have the latent potential to contribute meaningfully to technical AI safety work
- The marginal cost per participant is relatively low compared to other AI safety programmes, since we only cover travel and accommodation expenses for 4-5 weeks (we do not provide stipends)
- The outcomes set out in the above post seem pretty strong (4/33 immediate t...