Archana Vaidheeswaran

Can startups be impactful in AI safety?

With Lakera's strides in securing LLM APIs, Goodfire AI's path to scaling interpretability, and more than 20 model evaluation startups, among much else, a rising number of technical startups are attempting to secure the model ecosystem. Of course, they have varying levels of impact on superintelligence containment and security and even with...

Sep 13, 2024

Finding Deception in Language Models

This June, Apart Research and Apollo Research joined forces to host the Deception Detection Hackathon, bringing together students, researchers, and engineers from around the world to tackle a pressing challenge in AI safety: preventing AI from deceiving humans and overseers. The hackathon took place both online and in multiple physical...

Aug 20, 2024
