LESSWRONG
LW

Martin Vlach

If you get an email from aisafetyresearch@gmail.com , that is most likely me. I also read it weekly, so you can pass a message into my mind that way.
Other ~personal contacts: https://linktr.ee/uhuge

Posts

Sorted by New

Wikitag Contributions

Comments

Sorted by

Newest

Open Thread - Summer 2025

Martin Vlach9d10

Link to Induction section on https://www.lesswrong.com/lw/dhg/an_intuitive_explanation_of_solomonoff_induction/#induction seems broken on mobile Chrome, @habryka

Open Thread - Summer 2025

Martin Vlach9d20

I've heard that hypothesis in a review of that blog post of Anthropic, likely by

AI Explained

maybe by

bycloud

They've called it "Chekov's gun".

Open Thread - Summer 2025

Martin Vlach9d10

What's your view on sceptic claims about RL on transformer LMs like https://arxiv.org/abs/2504.13837v2 or one that CoT instruction yields better results than <thinking> training?

Open Source Search (Summary)

Martin Vlach9d10

Not the content I expect labeled AIb Capabilities,

although I see how that'd be vindicated.

By the way, if I write an article about LMs generating SVG, that's a plaintext and if I put an SVG illustration up, that's an image, not a plaintext?

Martin Vlach's Shortform

Martin Vlach11d10

Trivial, but do token-based LMs follow instructions like "only output tokens '1', '2', '3'" where they'd output 123 as one token without that instruction?

Martin Vlach's Shortform

Martin Vlach1mo10

I'd update my take from a very pessimist/gloom one to an (additional) excited one: Those more intelligent models building a clear view of the person they/it interacts with is a sign of emerging empathy, which is a hopeful property for alignment/respect.

Vincent Li's Shortform

Martin Vlach1mo20

False Trichotomy?

Your model assumes that one cannot be all three, however, some roles demand it, and in reality people do navigate all three traits, my top example would be empathic project managers.

Martin Vlach's Shortform

Martin Vlach1mo40

Largely Sycophantically Reasoning Models -- should we claim the term for this behavior where the language model profiles the user and customizes the responses heavily?

Launching Lightspeed Grants (Apply by July 6th)

Martin Vlach2mo30

Hello @habryka, could you please adjust the text on the page to include the year when applications closed, so that it confuses people( like me) less and they won't spend reading it all wasting their time stupidly?

THANKS!

Thou shalt not command an alighned AI

Martin Vlach2mo*10

You mean the chevrons like this is non-standard, but also sub-standard, although it has the neat property to represent >Speaker one< and >>Speaker two<<? I can see the typography of those here is meh at best.-\