Martin Vlach

If you get an email from aisafetyresearch@gmail.com, that is most likely me. I also read that inbox weekly, so you can pass a message into my mind that way.
Other ~personal contacts: https://linktr.ee/uhuge 

Comments

1Martin Vlach's Shortform
3y
36
Legible vs. Illegible AI Safety Problems
Martin Vlach2d10

https://philarchive.org/rec/KURTTA-2

Wow, that's comprehensive (≈ long).

Anthropic & Dario’s dream
Martin Vlach2d10

It's simply not enough to develop AI gradually, perform evaluations and do interpretability work to build safe superintelligence.

But developing AI gradually, performing evaluations, and doing interpretability work to indicate when to stop developing (capabilities) seems sensibly safe.

Legible vs. Illegible AI Safety Problems
Martin Vlach2d10

Pretty brilliant and IMHO correct observations for counter-arguments, appreciated!

Why Future AIs will Require New Alignment Methods
Martin Vlach1mo20

Task duration for software engineering tasks that AIs can complete with 50% success rate (50% time horizon)

This paragraph seems duplicated.

medical research doing so in concerning domains

"instead of" seems to be missing..?

Martin Vlach's Shortform
Martin Vlach1mo10

My friend's (M.K., he's on GitHub) honorable aim is to establish a term in the AI evals field: cognitive asymmetry, the generating-verifying complexity gap for model-as-judge evals.

What's desired are tasks with a clear gap between the intelligence needed to solve them and the intelligence needed to verify a solution, i.e. only X00-B LMs have a shot at solving, but an X-B model is strong at verifying.
It fits nicely into the incremental, iterative alignment scaling playbook, I hope.
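
To make the gap concrete, here is a minimal sketch of how one might filter for such tasks. Everything in it is an assumption for illustration: the `TaskStats` fields, the threshold values, and the example numbers are all made up, not taken from M.K.'s work or any real eval.

```python
from dataclasses import dataclass


@dataclass
class TaskStats:
    """Hypothetical per-task measurements (illustrative only)."""
    name: str
    large_solve_rate: float   # how often the large ("X00-B") model solves the task
    small_solve_rate: float   # how often the small ("X-B") model solves the task
    small_verify_acc: float   # how often the small model judges a solution correctly


def has_gap(task, min_large_solve=0.5, max_small_solve=0.1, min_small_verify=0.9):
    """True when the large model can solve, the small one cannot, yet the small one verifies well.

    The default thresholds are arbitrary placeholders.
    """
    return (
        task.large_solve_rate >= min_large_solve
        and task.small_solve_rate <= max_small_solve
        and task.small_verify_acc >= min_small_verify
    )


def select_gap_tasks(tasks, **thresholds):
    """Return the names of tasks showing the solve-vs-verify asymmetry."""
    return [t.name for t in tasks if has_gap(t, **thresholds)]


# Invented numbers, purely to exercise the filter.
tasks = [
    TaskStats("proof-search", 0.6, 0.05, 0.95),  # hard to solve, easy to verify
    TaskStats("trivia", 0.9, 0.8, 0.95),         # small model already solves it
    TaskStats("open-essay", 0.7, 0.05, 0.55),    # small model cannot verify it
]
selected = select_gap_tasks(tasks)
print(selected)
```

Only the first task passes: it is the one where a small judge model could plausibly grade outputs it could not produce itself.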

GPT-oss is an extremely stupid model
Martin Vlach2mo10

I'd bet a "re-based" model à la https://huggingface.co/jxm/gpt-oss-20b-base, when instruction-tuned, would do the same as similarly sized Qwen models.

Project Vend: Can Claude run a small shop?
Martin Vlach4mo10

It's provided the current time together with some 20k other system-prompt tokens, so its influence on the behaviours is substantially more diluted..?

So You Think You've Awoken ChatGPT
Martin Vlach4mo10

Folks like this guy hit it at hyperspeed:

https://www.facebook.com/reel/1130046385837121/?mibextid=rS40aB7S9Ucbxw6v

I still remember a university teacher explaining how early TV transmissions very often included/displayed ghosts of dead people, especially dead relatives.

As the tech matures from an art, these phenomena or hallucinations evaporate.

Energy-Based Transformers are Scalable Learners and Thinkers
Martin Vlach4mo40

You seem to report one OOM less than this picture shows in https://alexiglad.github.io/blog/2025/ebt/#:~:text=a%20log%20function).-,Figure%208,-%3A%20Scaling%20for

Open Thread - Summer 2025
Martin Vlach4mo10

The link to the Induction section on https://www.lesswrong.com/lw/dhg/an_intuitive_explanation_of_solomonoff_induction/#induction seems broken on mobile Chrome, @habryka

Zombies
a year ago
(+52/-50)
4Draft: A concise theory of agentic consciousness
5mo
2
0Thou shalt not command an alighned AI
6mo
4
8G.D. as Capitalist Evolution, and the claim for humanity's (temporary) upper hand
6mo
3
6Would it be useful to collect the contexts, where various LLMs think the same?
Q
2y
Q
1