My personal take is that projects where the funder is actively excited about them and understands the work and wants frequent reports tend to get stuff done faster... And considering the circumstances, faster seems good. So I'd recommend supporting something you find interesting and inspiring, and then keep on top of it.
In terms of groups which have their eyes on a variety of unusual and underfunded projects, I recommend both the Foresight Institute and AE Studio.
In terms of specific individuals/projects that are doing novel and interesting things, which ...
Nifty
Oh, for sure mammals have emotions much like ours. Fruit flies and shrimp? Not so much. Wrong architecture, missing key pieces.
I call this phenomenon a "moral illusion". You are engaging empathy circuits on behalf of an imagined other who doesn't exist. Category error. The only unhappiness is in the imaginer, not in the anthropomorphized object. I think this is likely what's going on with the shrimp welfare people also. Maybe shrimp feel something, but I doubt very much that they feel anything like what the worried people project onto them. It's a thorny problem to be sure, since those empathy circuits are pretty important for helping humans not be cruel to other humans.
Update: Claude Code and s3.7 have been a significant step up for me. Previously, s3.6 was giving me about a 1.5x speedup, and s3.5 more like 1.2x. CC+s3.7 is solidly over 2x, with periods of more than that when working on easy, well-represented tasks in areas I don't know well myself (e.g. Node.js).
Here's someone who seems to be getting a lot more out of Claude Code though: xjdr
...i have upgraded to 4 claude code sessions working in parallel in a single tmux session, each on their own feature branch and then another tmux window with yet another claude in charge of mergi
This is a big deal. I keep bringing this up, and people keep saying, "Well, if that's the case, then everything is hopeless. I can't even begin to imagine how to handle a situation like that."
I do not find this an adequate response. Defeatism is not the answer here.
If what the bad actor is trying to do with the AI is just get a clear set of instructions for a dangerous weapon, and a bit of help debugging lab errors... that costs only a trivial amount of inference compute.
Finally got some time to try this. I made a few changes (with my own Claude Code), and now it's working great! Thanks!
This seems quite technologically feasible now, and I expect the outcome would mostly depend on the quality and care that went into the specific implementation. I am even more confident that if the bot's comments get further tuning via feedback, so that initial flaws get corrected, then the bot would quickly (after a few hundred such feedbacks) get 'good enough' to pass most people's bars for inclusion.
Yes, I was in basically exactly this mindset a year ago. Since then, my hope for a sane controlled transition with humanity's hand on the tiller has been slipping. I now place more hope in a vision with less top-down "yang" (à la Carlsmith) control, and more "green"/"yin". Decentralized contracts, many players bargaining for win-win solutions, a diverse landscape of players messily stumbling forward with conflicting agendas. What if we can have a messy world and make do with well-designed contracts with peer-to-peer enforcement mechanisms? Not a free-for-al...
I feel the point by Kromem on Xitter really strikes home here.
While I do see benefits of having AIs value humanity, I also worry about this. It feels very nearby trying to create a new caste of people who want what's best for the upper castes with no concern for themselves. This seems like a much trickier philosophical position to support than wanting what's best for Society (including all people, both biological and digital). Even if you and your current employer are being careful to not create any AI that have the necessary qualities of experience such t...
The BALROG eval includes NetHack. I want to see an LLM try to beat that.
Exciting! I'd love to hear more.
Mine is still early 2027. My timeline is unchanged by the weak showing from GPT-4.5, because my timelines were already assuming that scaling would plateau. I was also already taking RL post-training and reasoning into account. This is what I was pointing at with my Manifold Markets about post-training fine-tuning plus scaffolding resulting in a substantial capability jump. My expectation of short timelines is that just something of approximately the current capability of existing SotA models (plus reasoning and research and scaffolds and agentic iterative ...
I don't think the idea of Superwisdom / Moral RSI requires Moral Realism. Personally, I am a big fan of research being put into a Superwisdom Agenda, but I don't believe in Moral Realism. In fact, I'd be against a project which had (in my view, harmful and incorrect) assumptions about Moral Realism as a core part of its aims.
So I think you should ask yourself whether this is necessarily part of the Superwisdom Agenda, or if you could envision the agenda being at least agnostic about Moral Realism.
I mean, suicide seems much more likely to me given the circumstances... but I also wouldn't describe this as compelling evidence. Like, if he had been killed and there wasn't a fight, him being drunk makes sense as a way to have rendered him helpless in advance for someone planning to kill him? Similarly, wouldn't a cold-blooded killer be expected to be wearing gloves and to place Suchir's hand on the gun before shooting him?
Nice to see my team's work (Tice 2024) getting used!
Not always true. Sometimes the locks are 'real' but deliberately chosen to be easy to pick, and the magician practices picking that particular lock. This doesn't change the point much, which is that watching stage magicians is not a good way to get an idea of how hard it is to do X, for basically any value of X. LockPickingLawyer on YouTube is a fun way to learn about locks.
Desired AI safety tool: A combo translator/chat interface (e.g. a custom webpage) split down the middle. On one side I can type in English, and receive English translations. On the other side is a model (I give a model name, host address, and API key). The model receives all my text translated (somehow) into a language of my specification. All the model's outputs are displayed raw on the 'model' side, but then translated to English on 'my' side.
Use case: exploring and red teaming models in languages other than English
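A minimal command-line sketch of what I have in mind, assuming an OpenAI-compatible chat endpoint; translate() is just a placeholder for whatever translation backend gets plugged in (another model, an MT API, etc.), and the function names are mine for illustration, not any existing tool's.

```python
from openai import OpenAI

def translate(text: str, target_lang: str) -> str:
    """Placeholder: route through whatever translation backend you choose."""
    raise NotImplementedError

def red_team_chat(model: str, base_url: str, api_key: str, target_lang: str) -> None:
    client = OpenAI(base_url=base_url, api_key=api_key)
    history = []
    while True:
        english_in = input("you (EN) > ")
        # 'My' side: I type English; the model only ever sees the translation.
        history.append({"role": "user", "content": translate(english_in, target_lang)})
        raw = client.chat.completions.create(model=model, messages=history).choices[0].message.content
        history.append({"role": "assistant", "content": raw})
        # 'Model' side: raw output shown untranslated, plus an English rendering on 'my' side.
        print(f"model ({target_lang}, raw): {raw}")
        print(f"model (EN): {translate(raw, 'English')}")
```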
Another take on the plausibility of RSI: https://x.com/jam3scampbell/status/1892521791282614643
(I think RSI soon will be a huge deal)
Have you noticed that AI companies have been opening offices in Switzerland recently? I'm excited about it.
This is exactly why the bio team for WMDP decided to deliberately include distractors involving relatively less harmful stuff. We didn't want to publicly publish a benchmark which gave a laser-focused "how to be super dangerous" score. We aimed for a fuzzier decision boundary. This brought criticism from experts at the labs who said that the benchmark included too much harmless stuff. I still think the trade-off was worthwhile.
Also worth considering is that how much an "institution" holds a view on average may not matter nearly as much as how the powerful decision makers within or above that institution feel.
There are a lot of possible plans which I can imagine some group feasibly having which would meet one of the following criteria:
If one of these criteria or similar applies to the plan, then you can't discuss it openly without sabotaging it. Making strategic plans with all your cards laid out on the table (while your opponents hide theirs) makes things substantially harder.
A point in favor of evals being helpful for advancing AI capabilities: https://x.com/polynoamial/status/1887561611046756740
Noam Brown @polynoamial A lot of grad students have asked me how they can best contribute to the field of AI when they are short on GPUs and making better evals is one thing I consistently point to.
It has been pretty clearly announced to the world by various tech leaders that they are explicitly spending billions of dollars to produce "new minds vastly smarter than any person, which pose double-digit risk of killing everyone on Earth". This pronouncement has not yet incited riots. I feel like discussing whether Anthropic should be on the riot-target-list is a conversation that should happen after the OpenAI/Microsoft, DeepMind/Google, and Chinese datacenters have been burnt to the ground.
Once those datacenters have been reduced to rubble, and the chi...
People have said that to get a good prompt it's better to have a discussion with a model like o3-mini, o1, or Claude first, and clarify various details about what you are imagining, then give the whole conversation as a prompt to OA Deep Research.
Fair enough. I'm frustrated and worried, and should have phrased that more neutrally. I wanted to make stronger arguments for my point, and then partway through my comment realized I didn't feel good about sharing my thoughts.
I think the best I can do is gesture at strategy games that involve private information and strategic deception like Diplomacy and Stratego and MtG and Poker, and say that in situations with high stakes and politics and hidden information, perhaps don't take all moves made by all players at literally face value. Think a bit to yoursel...
I don't believe the nuclear bomb was truly built to not be used from the point of view of the US gov. I think that was just a lie to manipulate scientists who might otherwise have been unwilling to help.
I don't think any of the AI builders are anywhere close to "building AI not to be used". This seems even more clear than with nuclear, since AI has clear beneficial peacetime economically valuable uses.
Regulation does make things worse if you believe the regulation will fail to work as intended for one reason or another. For example, my argument that puttin...
I don't feel free to share my model, unfortunately. Hopefully someone else will chime in. I agree with your point and that this is a good question!
I am not trying to say I am certain that Anthropic is going to be net positive, just that that's the outcome I see as more probable.
Oops, yes.
I'm pretty sure that measures of the persuasiveness of a model which focus on text are going to greatly underestimate the true potential of future powerful AI.
I think a future powerful AI would need different inputs and outputs to perform at maximum persuasiveness.
Well, or as is often the case, the people arguing against changes are intentionally exploiting loopholes and don't want their valuable loopholes removed.
I don't like the idea. Here's an alternative I'd like to propose:
After a user gets a post or comment rejected, have them be given the opportunity to rewrite and resubmit it with the help of an AI mentor. The AI mentor should be able to give reasonably accurate feedback, and won't accept the revision until it is clearly above a quality line.
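A rough sketch of the loop I'm imagining, assuming an OpenAI-compatible API; the rubric prompt, the ACCEPT/REVISE verdict convention, and the model name are all placeholders of mine, not an existing LessWrong feature.

```python
from openai import OpenAI

client = OpenAI()  # assumes an OpenAI-compatible endpoint and API key in the environment
RUBRIC = ("You are a mentor for a rejected forum post. Give concrete feedback on clarity, "
          "reasoning, and tone, then end with exactly 'VERDICT: ACCEPT' or 'VERDICT: REVISE'.")

def mentor_loop(draft: str, model: str = "gpt-4o", max_rounds: int = 5) -> str:
    """Iterate with the AI mentor until the draft is judged clearly above the quality line."""
    for _ in range(max_rounds):
        review = client.chat.completions.create(
            model=model,
            messages=[{"role": "system", "content": RUBRIC},
                      {"role": "user", "content": draft}],
        ).choices[0].message.content
        if "VERDICT: ACCEPT" in review:
            return draft                    # mentor accepts; allow resubmission
        print(review)                       # show the feedback
        draft = input("Revised draft:\n")   # author rewrites with the feedback in hand
    return draft
```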
I don't think this is currently easy to make (well), because I think it would be too hard to get current LLMs to be sufficiently accurate in LessWrong specific quality judgement and advice. If, at some poin...
I think the correct way to address this is by also testing the other models with agent scaffolds that supply web search and a python interpreter.
I think it's wrong to jump to the conclusion that non-agent-finetuned models can't benefit from tools.
See for example:
https://x.com/Justin_Halford_/status/1885547672108511281
o3-mini got 32% on Frontier Math (!) when given access to use a Python tool. In an AMA, @kevinweil / @snsf (OAI) both referenced tool use w reasoning models incl retrieval (!) as a future rollout.
Model...
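For concreteness, here is roughly what I mean by supplying such a scaffold. This is a hedged sketch assuming an OpenAI-compatible tools/function-calling API; run_python and the model name are my own illustrative choices, not anyone's published eval harness, and a real harness would sandbox the interpreter.

```python
import json, subprocess, sys
from openai import OpenAI

client = OpenAI()
TOOLS = [{
    "type": "function",
    "function": {
        "name": "run_python",
        "description": "Execute Python code and return whatever it prints.",
        "parameters": {"type": "object",
                       "properties": {"code": {"type": "string"}},
                       "required": ["code"]},
    },
}]

def run_python(code: str) -> str:
    # Illustration only: a real eval harness would sandbox this.
    out = subprocess.run([sys.executable, "-c", code], capture_output=True, text=True, timeout=60)
    return out.stdout + out.stderr

def answer_with_tool(question: str, model: str = "o3-mini") -> str:
    messages = [{"role": "user", "content": question}]
    while True:
        msg = client.chat.completions.create(model=model, messages=messages, tools=TOOLS).choices[0].message
        if not msg.tool_calls:
            return msg.content            # final answer, no more tool use requested
        messages.append(msg)              # keep the assistant's tool-call turn in context
        for call in msg.tool_calls:
            code = json.loads(call.function.arguments)["code"]
            messages.append({"role": "tool", "tool_call_id": call.id,
                             "content": run_python(code)})
```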
Good work, thanks for doing this.
For future work, you might consider looking into inference suppliers like Hyperdimensional for DeepSeek models.
Well, I upvoted your comment, which I think adds important nuance. I will also edit my shortform to explicitly say to check your comment. Hopefully, the combination of the two is not too misleading. Please add more thoughts as they occur to you about how better to frame this.
Yeah, I just found a Cerebras post which claims 2100 serial tokens/sec.
Yeah, of course. Just trying to get some kind of rough idea at what point future systems will be starting from.
Oops, bamboozled. Thanks, I'll look into it more and edit accordingly.
[Edit 2: faaaaaaast. https://x.com/jrysana/status/1902194419190706667 ] [Edit: Please also see Nick's reply below for ways in which this framing lacks nuance and may be misleading if taken at face value.]
https://blogs.nvidia.com/blog/deepseek-r1-nim-microservice/
The DeepSeek-R1 NIM microservice can deliver up to 3,872 tokens per second on a single NVIDIA HGX H200 system.
[Edit: that's throughput including parallel batches, not serial speed! Sorry, my mistake.
Here's a claim from Cerebras of 2100 tokens/sec serial speed on Llama 80B. https://cerebras.ai/b...
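To spell out the distinction with rough numbers (the batch size here is purely my own assumption for illustration, not from the NVIDIA post):

```python
aggregate_throughput = 3872   # tok/s summed over all concurrent requests (the NVIDIA H200 figure)
concurrent_requests = 64      # assumed batch size, purely illustrative
per_stream = aggregate_throughput / concurrent_requests
print(f"~{per_stream:.0f} tok/s seen by any single request")  # ~60 tok/s, nowhere near 3,872 serially
```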
How much of their original capital did the French nobility retain at the end of the French revolution?
How much capital (value of territorial extent) do chimpanzees retain now as compared to 20k years ago?
Anthropic people had also said approximately this publicly: that it's too soon to make the rules, since we'd end up misspecifying them due to ignorance of tomorrow's models.
Some brief reference definitions for clarifying conversations.
Consciousness:
Sentient:
I have been discussing thoughts along these lines. My essay A Path to Human Autonomy argues that we need to slow AI progress and speed up human intelligence progress. My plan for how to accomplish slowing AI progress is to use novel decentralized governance mechanisms aided by narrow AI tools. I am working on fleshing out these governance ideas in a doc. Happy to share.
Well... One problem here is that a model could be superhuman at:
And be merely high-human-level at:
Such an entity as described could absolutely be an existential threat to hum...
https://www.lesswrong.com/posts/CHD5m9fnosr7L3dto/friendship-is-optimal-a-my-little-pony-fanfic-about-an?commentId=p6br8sPHG5QysfFkw