orthonormal

How I Learned To Stop Trusting Prediction Markets and Love the Arbitrage

This is a story about a flawed Manifold market, about how easy it is to buy significant objective-sounding publicity for your preferred politics, and about why I've downgraded my respect for all but the largest prediction markets. I've had a Manifold account for a while, but I didn't use it...

Aug 6, 2024 · 201

Run evals on base models too!

(Creating more visibility for a comment thread with Rohin Shah.) Currently, DeepMind's capabilities evals are run on the post-RL*F (RLHF/RLAIF) models and not on the base models. This worries me because RL*F will train a base model to stop displaying capabilities, but this isn't a guarantee that it trains the...

Apr 4, 2024 · 51

Mesa-Optimizers via Grokking

Summary: Recent interpretability work on "grokking" suggests a mechanism for a powerful mesa-optimizer to emerge suddenly from an ML model. Inspired by: A Mechanistic Interpretability Analysis of Grokking. Overview of Grokking: In January 2022, a team from OpenAI posted an article about a phenomenon they dubbed "grokking", where they trained...

Dec 6, 2022 · 36

Transitive Tolerance Means Intolerance

Our society is pretty messed up around arguments of whose ideas we should and shouldn't tolerate. Some of this is inevitable: even without censorship, there are cases where group X can choose to actively show respect to person Y, and members of X will argue about that, and people with...

Aug 14, 2021 · 39

Improvement for pundit prediction comparisons

[EDIT: SimonM pointed out a possibly-fatal flaw with this plan: it would probably discourage more pundits from joining the prediction-making club at all, and adding to that club is a higher priority than comparing the members more accurately.] Stop me if you've heard this one. (Seriously, I may not be...

Mar 28, 2021 · 16

Developmental Stages of GPTs

Epistemic Status: I only know as much as anyone else in my reference class (I build ML models, I can grok the GPT papers, and I don't work for OpenAI or a similar lab). But I think my thesis is original. Related: Gwern on GPT-3. For the last several years,...

Jul 26, 2020 · 140

Don't Make Your Problems Hide

I've seen a worrying trend in people who've learned introspection and self-improvement methods from CFAR, or analogous ones from CBT. They make better life decisions, they calm their emotions in the moment. But they still look just as stressed as ever. They stamp out every internal conflict they can see,...

Jun 27, 2020 · 63

orthonormal

The Loudest Alarm Is Probably False

How I Learned To Stop Trusting Prediction Markets and Love the Arbitrage

Choosing the Zero Point

Developmental Stages of GPTs
