User Comment Replies

Hey everyone! I work on quantifying and demonstrating AI cybersecurity impacts at Palisade Research with @Jeffrey Ladish.

We have a bunch of exciting work in the pipeline, including:

demos of well-known safety issues like agent jailbreaks or voice cloning
replications of prior work on self-replication and hacking capabilities
modelling of above capabilities' economic impact
novel evaluations and tools

Most of my posts here will probably detail technical research or announce new evaluation benchmarks and tools. I also think a lot about responsible release, ... (read more)

LESSWRONG
LW

All of latterframe's Comments + Replies