Summary LLM outputs vary substantially across models, prompts, and simulated perspectives. I propose "opinion fuzzing": systematically sampling across these dimensions to quantify and understand this variance. The concept is simple, but making it practically usable will require thoughtful tooling. In this piece I discuss what opinion fuzzing could be...
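The sampling idea can be sketched in a few lines. This is a minimal sketch, not the piece's actual tooling: `query_llm` is a hypothetical stub standing in for a real API call, and the grid of models and personas is invented for illustration.

```python
import itertools
import statistics

def query_llm(model: str, persona: str, prompt: str) -> float:
    # Hypothetical stub: a real implementation would query the named model
    # with a persona-conditioned prompt and parse a numeric judgment.
    # A deterministic toy score keeps the sketch runnable offline.
    return (hash((model, persona, prompt)) % 100) / 100

def opinion_fuzz(models, personas, prompts):
    """Sample one judgment per (model, persona, prompt) cell and report
    the spread -- a rough proxy for opinion variance across dimensions."""
    scores = [
        query_llm(m, p, q)
        for m, p, q in itertools.product(models, personas, prompts)
    ]
    return {
        "n": len(scores),
        "mean": statistics.mean(scores),
        "stdev": statistics.pstdev(scores),
    }

result = opinion_fuzz(
    models=["model-a", "model-b"],
    personas=["skeptic", "optimist"],
    prompts=["Rate this policy from 0 to 1."],
)
print(result["n"])  # 2 models x 2 personas x 1 prompt = 4 samples
```

In practice the interesting output is the per-dimension breakdown (which axis drives the variance), which the real tooling would need to surface.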
Website version · Gestalt · Repo and data. [Figure: Change in 18 latent capabilities between GPT-3 and o1, from Zhou et al (2025).] This is the third annual review of what’s going on in technical AI safety. You could stop reading here and instead explore the data on the shallow review...
Today we're releasing RoastMyPost, a new experimental application for evaluating blog posts with LLMs. Try it here. TLDR * RoastMyPost is a new QURI application that uses LLMs and code to evaluate blog posts and research documents. * It uses a variety of LLM evaluators. Most are narrow checks: Fact...
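A "narrow check" in this sense is a small, targeted pass rather than a holistic review. The sketch below shows the shape of two such checks under an assumed text-in, issues-out interface; the function names and heuristics are hypothetical, not RoastMyPost's actual API.

```python
import re

def check_unresolved_todos(text: str) -> list:
    """Flag leftover TODO/FIXME markers -- a purely code-based check."""
    return [m.group() for m in re.finditer(r"\b(TODO|FIXME)\b", text)]

def check_bare_percentages(text: str) -> list:
    """Flag percentage claims with no link shortly after them -- a crude
    stand-in for the kind of fact-oriented check an LLM pass might refine."""
    issues = []
    for m in re.finditer(r"\d+(\.\d+)?%", text):
        window = text[m.end():m.end() + 80]
        if "http" not in window:
            issues.append(m.group())
    return issues

post = "Our tool catches 95% of errors. TODO: add citation."
print(check_bare_percentages(post))  # ['95%']
print(check_unresolved_todos(post))  # ['TODO']
```

Keeping each evaluator this narrow makes its failures legible: a flagged span either is or is not a bare statistic, which is easier to audit than a single holistic score.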
See previous discussion here. I find a lot of professional events fairly soul-crushing and have been thinking about why. I dislike small talk. Recently I attended Manifest, and noticed that it could easily take 10 minutes of conversation to learn the very basics about a person. There were hundreds of...
A common fallacy I see regarding information sources is the assumption that they either have zero impact or completely determine critical decisions. * "Our World In Data isn't directly cited by the US President, therefore it's worthless." * "Prediction markets haven't been adopted by top elected officials, so they contribute...
One of my key thoughts is that the bar for “human intellectuals” is just really low. I think that intellectual work is useful and that these intellectuals on the whole do produce some value, but I also think we can do much better. Epistemic Status A collection of thoughts I've...
Summary Task: Make an interesting and informative Fermi estimate Prize: $300 for the top entry Deadline: February 16th, 2025 Results Announcement: By March 1st, 2025 Judges: Claude 3.5 Sonnet, the QURI team Motivation LLMs have recently made it significantly easier to make Fermi estimates. You can chat with most LLMs...