No MCTS, no PRM...
scaling up CoT with simple RL and scalar rewards...
emergent behaviour
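A minimal sketch of what "simple RL with scalar rewards" means here: no MCTS, no process reward model, just a scalar 0/1 reward on the final outcome and a REINFORCE-style update. Everything below is a toy stand-in of my own (the "policy" is a softmax over three hypothetical CoT strategies, not a language model):

```python
# Toy outcome-based RL with a scalar reward (REINFORCE-style).
# Only the final answer is scored; no per-step (process) supervision.
import math, random

random.seed(0)

STRATEGIES = ["direct", "decompose", "guess"]        # hypothetical CoT styles
CORRECT = {"direct": True, "decompose": True, "guess": False}

logits = [0.0, 0.0, 0.0]                             # policy parameters

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def sample(probs):
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i
    return len(probs) - 1

lr = 0.5
for step in range(500):
    probs = softmax(logits)
    a = sample(probs)                                # roll out one "chain"
    reward = 1.0 if CORRECT[STRATEGIES[a]] else 0.0  # scalar outcome reward
    # REINFORCE gradient for a softmax policy: (indicator - prob) * reward
    for i in range(len(logits)):
        grad = ((1.0 if i == a else 0.0) - probs[i]) * reward
        logits[i] += lr * grad

probs = softmax(logits)
print({s: round(p, 3) for s, p in zip(STRATEGIES, probs)})
```

Probability mass shifts toward whichever strategies earn reward at the end; nothing in the loop ever inspects the intermediate steps, which is the point of the scalar-reward setup.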
Thanks for posting this!
https://twitter.com/AISafetyMemes/status/1764894816226386004 https://twitter.com/alexalbert__/status/1764722513014329620
How emergent, functionally special, or out-of-distribution is this behavior? Maybe Anthropic is playing big-brain 4D chess: training Claude on data with self-awareness-like scenarios so the apparent capability jump causes panic and slows down the AI race via the resulting regulations, while the behavior isn't actually emergent or out of distribution at all, but deeply part of the training data: in-distribution, classical features interacting in circuits.
Merging with Anthropic may have been a better outcome
"OpenAI’s ouster of CEO Sam Altman on Friday followed internal arguments among employees about whether the company was developing AI safely enough, according to people with knowledge of the situation.
Such disagreements were high on the minds of some employees during an impromptu all-hands meeting following the firing. Ilya Sutskever, a co-founder and board member at OpenAI who was responsible for limiting societal harms from its AI, took a spate of questions.
At least two employees asked Sutskever—who has been responsible for OpenAI’s biggest research break...
We're at the start of interpretability, but the progress is lovely! Superposition was such a bottleneck even in small models.
More notes:
https://twitter.com/ch402/status/1710004685560750153 https://twitter.com/ch402/status/1710004416148058535
"Scalability of this approach -- can we do this on large models? Scalability of analysis -- can we turn a microscopic understanding of large models into a macroscopic story that answers questions we care about?"
"Make this work for real models. Find out what features exist in large models. Understand new, mor...
0.5 out of $7T is done...