LESSWRONG
LW

All of Roman Leventov's Comments + Replies

An Optimistic 2027 Timeline

But then the possibilities for 2027 branch on whether there are reliable agents, which doesn't seem knowable either way right now.

Very reliable, long-horizon agency is already in the capability overhang of Gemini 2.5 pro, perhaps even the previous-tier models (gemini 2.0 exp, sonnet 3.5/3.7, gpt-4o, grok 3, deepseek r1, llama 4). It's just the matter of harness/agent-wrapping logic and inference-time compute budget.

Agency engineering is currently in the brute-force stage. Agent engineers over rely on a "single LLM rollout" to be robust, but also often use ... (read more)

2Vladimir_Nesov1mo

I meant "realiable agents" in the AI 2027 sense, that is something on the order of being sufficient for automated AI research, leading to much more revenue and investment in the lead-up rather than stalling at ~$100bn per individual training system for multiple years. My point is that it's not currently knowable if it happens imminently in 2026-2027 or at least a few years later, meaning I don't expect that evidence exists that distinguishes these possibilities even within the leading AI companies.

Roman Leventov's Shortform

Roman Leventov2mo50

It seems that a lot of white collar jobs will become (already becoming) positional goods, such as aristocratic titles, at least for a few years, possibly longer.

AI will do 100% of the "meat" of the job better than almost all humans, and ~equally for every user (prompting won't matter much).

But business will still demand accountability for results, and that the workers can claim that they understand and attest AI outputs (these claims themselves won't be tested, though, nor would it really matter in the grand scheme of things). At the same time, the product... (read more)

Gradual Disempowerment, Shell Games and Flinches

Roman Leventov3mo4-2

Even for those not directly employed by AI labs, there are similar dynamics in the broader AI safety community. Careers, research funding, and professional networks are increasingly built around certain ways of thinking about AI risk. Gradual disempowerment doesn't fit neatly into these frameworks. It suggests we need different kinds of expertise and different approaches than what many have invested years developing. Academic incentives also currently do not point here - there are likely less than ten economists taking this seriously, trans-disciplin

Roman Leventov3mo110

My quick impression is that this is a brutal and highly significant limitation of this kind of research. It's just incredibly expensive for others to read and evaluate, so it's very common for it to get ignored.

I'd predict that if you improved the arguments by 50%, it would lead to little extra uptake.

I think this is wrong. The introduction of the GD paper takes no more than 10 minutes to read and no significant cognitive effort to grasp, really. I don't think there is more than 10% potential of making it any clearer or approachable.

3ozziegooen3mo

Even 10 minutes is a lot, for many people. I might see 100 semi-interesting Tweets and Hacker News posts that link to lengthy articles per day, and that's already filtered - I definitely can't spend 10 min each on many of them. "No significant cognitive effort" to read a nuanced semi-academic article with unique terminology? I tried spending around ~20-30min understanding this paper, and didn't find it trivial. I think it's very easy to make mistakes about what papers like this are really trying to say (In many ways, the above post lists out a bunch of common mistakes, for instance). I know the authors and a lot of related work, and even with that, I didn't find it trivial. I imagine things are severely harder for people much less close to this area.

3Knight Lee3mo

I think there's an important crux here. For people who write ideas/theories, and hope their ideas/theories get traction, the frustration is often directed at critics who reject their idea without taking the time to read it. Meanwhile, there are many supportive people in the comments, who did take the time to read the idea, and did say "yes, this is a good idea, I never thought of it! Good luck working on it." The author only sees these two groups of people, and feels that his/her fight is to push people in the former group to read their idea more clearly, so that they may move to the latter group. But the author doesn't realize that even if they did read the idea, and did move to the latter supportive group. The idea will be almost as badly ignored in the end. It would cure his/her frustration towards "people who never bothered to read," but his/her idea won't take off and succeed either. He/she will finally learn that there is something else to be frustrated about: even if everyone reads your idea and agrees with your idea, nobody has time to do anything about it. A lot of authors never reach this second stage of frustration, because there is indeed a large group of critics going around criticizing ideas without reading them. But these critics are rarely the real reason your idea is getting ignored. I'm one of the people who have strongly supported many ideas/theories, only to never talk about them again, because I don't have the time. I see many others doing this too. The real problem is still the lack of time. EDIT: actually I'm a bit confused. Maybe the real problem is the argument cannot just argue why the idea is good or why the theory is plausible, but why a reader (satisfying some criteria) should drop what she is doing and work on the idea/theory for a while. Maybe it should give a cost and benefit analysis? I'm not sure if this will fix the Idea Ignoring Problem.

The Failed Strategy of Artificial Intelligence Doomers

Roman Leventov3mo20

https://gradual-disempowerment.ai/ is mostly about institutional progress, not narrow technical progress.

AI research assistants competition 2024Q3: Tie between Elicit and You.com

Roman Leventov7mo50

Undermind.ai I think is much more useful for searching concepts and ideas in papers rather than extracting tabular info a la Elicit. Nominally Elicit can do the former, too, but is quite bad in my experience.

The Great Data Integration Schlep

Roman Leventov7mo40

https://openmined.org/ develops Syft, a framework for "private computation" in secure enclaves. It potentially reduces the barriers for data integration both within particularly bureaucratic orgs and across orgs.

My motivation and theory of change for working in AI healthtech

Roman Leventov7mo130

Thanks for the post, I agree with it!

I just wrote a post with differential knowledge interconnection thesis, where I argue that it is on net beneficial to develop AI capabilities such as

Federated learning, privacy-preserving multi-party computation, and privacy-preserving machine learning.
Federated inference and belief sharing.
Protocols and file formats for data, belief, or claim exchange and validation.
Semantic knowledge mining and hybrid reasoning on (federated) knowledge graphs and multimodal data.
Structured or semantic search.
Datastore federation for r

Roman Leventov1y60

I think the model of commercial R&D lab would often suit alignment work better than a "classical" startup company. Conjecture and AE Studio come to mind. Answer.AI, founded by Jeremy Howard (of Fast.ai and Kaggle) and Eric Ries (Lean Startup) elaborates on this business and organisational model here: https://www.answer.ai/posts/2023-12-12-launch.html.

2Judd Rosenblatt1y

Yes, excellent point, and thanks for the callout. Note though that a fundamental part of this is that we at AE Studio do intend eventually to incubate as part of our skunkworks program alignment-driven startups. We've seen that we can take excellent people, have them grow on client projects for some amount of time, get better at stuff they don't even realize they need to get better at in a very high-accountability way, and then be well positioned to found startups we incubate internally. We've not turned attention to internally-incubated startups for alignment specifically yet but hope to by later this year or early next. Meanwhile, there are not many orgs like us, and for various reasons it's easier to start a startup than to start something doing what we do. If you think you can start something like what we do, I'd generally recommend it. You're probably more likely to succeed doing something more focused though to start. Also, to start, we flailed a bit till we figured out we should get good at one thing at a time before doing more and more.