Anthony Bailey
Anthony Bailey has not written any posts yet.

Add me to the list of those glad that, whatever their potential downsides may be, open-source models have let us explore these important questions. I'd hope some labs were already doing this kind of research themselves, but I like to see it coming from a place without commercial incentive.
I think we are behind on our obligations to attempt similar curation for model experience and reports of selfhood. A harder classification task, probably, but I think good machine ethics requires it.
Underway at Geodesic or elsewhere?
Guess: it also helps to go meta.
I am a reader, not a writer. But I sure seem to have read and enjoyed an unusual number of posts about experiences of writing.
I have a question on a topic sufficiently adjacent that I reckon it worth asking here of those likely to read the thread.
It seems that warning shots are more likely to be unsuccessful because of a winner's curse: the first models to take a shot will be those that have most badly overestimated their chances, and that in turn correlates with weaker intellectual capabilities.
Has there been any illuminating discussion of this and its downstream consequences? E.g. how shots and their aftermath are likely in practice to be perceived in general, by the better-informed, and - in the context of this post - by competing AIs? What dynamics result?
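To make that selection effect concrete, here is a toy simulation of my own (not from any existing discussion; the uniform capability distribution, the 0.9 "take a shot" threshold, and the assumption that weaker models misjudge their own chances more are all illustrative assumptions):

```python
import random
import statistics

random.seed(0)

N = 100_000
THRESHOLD = 0.9  # a model "takes its shot" once it believes it is this likely to succeed

mover_capability, mover_estimate = [], []

for _ in range(N):
    capability = random.random()             # true chance of success, uniform on [0, 1]
    noise_scale = 0.5 * (1.0 - capability)   # assumption: weaker models misjudge themselves more
    estimate = capability + random.gauss(0.0, noise_scale)
    if estimate > THRESHOLD:                 # the first movers are those who believe they will win
        mover_capability.append(capability)
        mover_estimate.append(estimate)

errors = [e - c for e, c in zip(mover_estimate, mover_capability)]
print(f"movers: {len(mover_capability)} of {N}")
print(f"  mean believed chance of success: {statistics.mean(mover_estimate):.2f}")
print(f"  mean true chance of success:     {statistics.mean(mover_capability):.2f}")
print(f"  corr(overestimate, capability):  {statistics.correlation(errors, mover_capability):.2f}")  # Python 3.10+
```

On these assumptions, the pool of first movers systematically believes it is stronger than it is, and the worst overestimators are the least capable, which is the dynamic that would make early warning shots both more likely and less impressive.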
Suppose someone works for Anthropic, accords with the value placed on empiricism in their Core Views on AI Safety (March 2023), and gives any weight to the idea that we are in the pessimistic scenario from that document.
I think they can reasonably sign the statement yet not want to assign themselves exclusively to either camp.
I pitched my tent as a Pause AI member and I guess camp B has formed nearby. But I also have empathy for the alternate version of me who judges the trade-offs differently and has ended up as above, with a camp A zipcode.
The A/B framing has value, but I strongly want to cooperate with that person and not sit in separate camps.
On reading the paper I came here to question whether OGI helps or harms relative to other governance models, should technical alignment prove sufficiently intractable that coordinating on a longer pause is required. (I assume it harms.) It wasn't clear to me whether you had considered that.
Grateful for both the "needfully combative" challenge and this response.
I'm reading Nick as implicitly agreeing that OGI doesn't help in this case, but rating treaty-based coordination as much less likely than solving alignment. If so, I think it worth confirming this and explicitly calling out the assumption in or near the essay.
(Like Haiku I myself am keen to help the public be rightfully outraged by plans without consent that increase extinction risk. I'm grateful for the ivory tower, and a natural resident there, but advocate joining us on the streets.)
Given I hadn't seen this until now, when Joep pointed me at it, perhaps comments are pointless. But I'd written them for him anyway, so just in case...
Mostly your dialogue aligned closely with my own copium thinking. Many observations I won't mention here confirmed existing thoughts rather than extending them.
The compartmentalization selection effect was new to me and genuinely insightful: abstract thinking both enables risk recognition AND prevents internalization.
My own experience suggests compartmentalization can collapse in months, not years, even after decades of emotional resilience to other suffering.
They may be genuinely aspy-different from you, me and Miles, but I think neither Igor nor e.g. Liron has had their big emotional break "yet".
For politicians, "existential threat...
Very glad of this post. Thanks for broaching, Buck.
Status: I'm an old nerd, lately ML R&D, who dropped career and changed wheelhouse to volunteer at Pause AI.
Two comments on the OP:
"...details of the current situation are much more interesting to me. In contrast, radicals don't really care about e.g. the different ways that corporate politics affects AI safety interventions at different AI companies."
As per Joseph's response: this does not match me or my general experience of AI safety activism.
Concretely, a recent campaign was specifically about DeepMind breaking particular voluntary testing commitments, with consideration of how staff would feel.
"Radicals often seem to think of AI companies as faceless bogeymen thoughtlessly lumbering towards..."
I appreciate the clear argument as to why "fancy linear algebra" works better than "fancy logic".
And I understand why things that work better tend to get selected.
I do challenge "inevitable" though. It doesn't help us to survive.
If linear algebra probably kills everyone but logic probably doesn't, tell everyone and agree to prefer to use the thing that works worse.
Pause AI has a lot of opportunity for growth.
The “increase public awareness” lever especially is hugely underfunded: almost no paid staff or advertising budget.
Our game plan is simple but not naive, and is most importantly a disjunctive, value-add bet.
Please help us execute it well: explore, join, talk with us, donate whatever combination of time, skills, ideas and funds makes sense.
(Excuse dearth of kudos, am not a regular LW person, just an old EA adjacent nerd who quit Amazon to volunteer full-time for the movement.)
If a single model is end-to-end situationally aware enough not to drop hints of the most reward-maximizing bad behaviour in its chain of thought, I do not see any reason to think it would not act equally sensibly with respect to confessions.