Jacob_Hilton

Announcing the ARC White-Box Estimation Challenge

ARC has teamed up with AIcrowd to launch the ARC White-Box Estimation Challenge, a contest to improve upon our estimation algorithms for random MLPs. The warm-up round begins this week, and later rounds will have a total prize pool of at least $100,000. We are very grateful to Sharada Mohanty,...

Jun 2165

Mechanistic estimation for expectations of random products

We have developed some relatively general methods for mechanistic estimation competitive with sampling by studying problems that are expressible as expectations of random products. This includes several different estimation problems, such as random halfspace intersections, random #3-SAT and random permanents. In this post, we will give a high-level introduction to...

May 15 · 50

Mechanistic estimation for wide random MLPs

This post covers joint work with Wilson Wu, George Robinson, Mike Winer, Victor Lecomte and Paul Christiano. Thanks to Geoffrey Irving and Jess Riedel for comments on the post. In ARC's latest paper, we study the following problem: given a randomly initialized multilayer perceptron (MLP), produce an estimate for the...

May 7 · 85

AlgZoo: uninterpreted models with fewer than 1,500 parameters

This post covers work done by several researchers at, visitors to and collaborators of ARC, including Zihao Chen, George Robinson, David Matolcsi, Jacob Stavrianos, Jiawei Li and Michael Sklar. Thanks to Aryan Bhatt, Gabriel Wu, Jiawei Li, Lee Sharkey, Victor Lecomte and Zihao Chen for comments. In the wake of...

Jan 26 · 181

ARC progress update: Competing with sampling

by Eric Neyman, Victor Lecomte, Wilson Wu, Mikewins, Jacob_Hilton, and George Robinson

In 2025, the Alignment Research Center (ARC) has been making conceptual and theoretical progress at the fastest pace that I (Eric) have seen since I first interned in 2022. Most of this progress has come about because of a re-orientation around a more specific goal: outperforming random sampling when it...

Nov 18, 2025 · 132

Jacob_Hilton's Shortform

May 1, 2025 · 6

A bird's eye view of ARC's research

This post includes a "flattened version" of an interactive diagram that cannot be displayed on this site. I recommend reading the original version of the post with the interactive diagram, which can be found here. Over the last few months, ARC has released a number of pieces of research. While...

Oct 23, 2024 · 121

AlgZoo: uninterpreted models with fewer than 1,500 parameters

Mechanistic estimation for wide random MLPs

Formal verification, heuristic explanations and surprise accounting

A bird's eye view of ARC's research