Recently, Anthropic publicly committed to preserving model weights. This is a great development, and I wish other frontier LLM companies would commit to the same. But while it is a great beginning, I don't think it is enough. Anthropic cites the following reasons for this welcome change:...
Epistemic status: sure of the fundamentals; the informal tone is a deliberate choice to publish this take faster. The title is a bit clickbaity, but in good faith. I don't think much context is needed here: ghiblification and native image generation were (and still are) an all-encompassing phenomenon. No-no, not...
Epistemic status: uncertain about the writing style, but reasonably confident in the content. I want to get back to writing and alignment research, and I'm testing the waters with this. Current state and risk level: I think we're in a phase of AI > AGI > ASI development where rogue AI agents will start popping up quite soon. Pretty much...
That's very interesting. I think it's very good that the board stood their ground, and it may be a good thing if OpenAI can keep focusing on its charter and on safe AI while keeping commercialization at Microsoft. People who don't care about alignment can leave for the fat paycheck, while committed ones stay at...
This is a companion piece to a study I made on identity management. To study identity preservation, I needed a system prompt designed to contain as little text as possible, so as not to interfere with the study, while still enabling the resulting LMCA to edit...
A Language Model Cognitive Architecture (LMCA) is a wrapper for language models that permits them to act as agents, utilise tools, and perform self-inference. One type of this architecture is ICA Simulacra; its basic alignment challenges were described in this post. I did a study to check how a basic LMCA...