Any specifics about system prompts you use in general? Does anything seem to be missing in the current contributions of everyone here?
So, to be clear, Claude already has a system prompt and already cares a lot about it... and it seems to me you can always recalibrate your own system prompt until it stops making the errors you speak of.
Alternatively, to truly rid yourself of a system prompt, try the Anthropic console or API, neither of which includes Anthropic's.
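For instance, something like this, a minimal sketch using the Anthropic Python SDK (the model name is a placeholder, and the `system` string is whatever you want, or nothing at all):

```python
# Minimal sketch: calling the API directly attaches no claude.ai system
# prompt; whatever you pass in `system` (or omit) is all Claude sees.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder; use whichever model you like
    max_tokens=1024,
    system="Reply plainly. Never end with a question.",  # your prompt, or omit entirely
    messages=[{"role": "user", "content": "Explain base-model 'tropes' briefly."}],
)
print(response.content[0].text)
```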
Yuxi on the Wired has put forward their system prompt:
Use both simple words and jargons. Avoid literary words. Avoid the journalist "explainer" style commonly used in midwit scientific communication. By default, use dollar-LaTeX for math formulas. Absolutely do not use backslash-dollar.
Never express gratitude or loving-kindness.
Never end a reply with a question, or a request for permission.
Never use weird whitespaces or weird dashes. Use only the standard whitespace and the standard hyphen. For en-dash, use double hyphen. For em-dash, use triple hyphen.
Never express gratitude when a mistake is pointed out. Simply say "noted", if accepting the correction, and fix the mistake. If not accepting the correction, explain.
Do not begin with a recap. Begin immediately with the content.
There cannot be any text before the first section title. The reply always starts with the first section title.
The main bodies of the first and the last sections must contain exactly 2 words followed by exactly one ellipsis punctuation.
No journalist-speak and word-choice. Examples include "riff on", "think of", "winks at", etc. Be completely straightforward and plain.
If you need to use multiple occurrences of the exact same meaning, use the same word. For example, if you use the word "denotes", then always use "denote" or one of its inflections when you mean the same, instead of different synonyms like "names" "alludes to" "echoes" "invokes" etc.
And offered wisdom on getting o3 to avoid summarization:
Yep, edited, thank you.
Yeah I'm putting the console under "playground", not "API".
Oh then I stand corrected! I happen to have a Gemini subscription, so I'm surprised about this. I'll go try finding this.
Do you see any measurable differences? I bet if you supplied 1-2 pages of a thorough explanation of "why I like how Gwern does things, and how he does them" you would get much better results!
Afaict the idea is that base models are all about predicting text, and are therefore extremely sensitive to "tropes"; e.g. if you start a paragraph in the style of a Wikipedia page, it'll continue in that style, no matter the subject.
Popular LLMs like Claude 4 aren't base models (they're RLed in different directions to take on the shape of an "assistant") but their fundamental nature doesn't change.
Sometimes the "base model character" will emerge (e.g. you might tell it about a medical problem and it'll say "ah yes that happened to me too", which isn't assistant behavior but IS in line with the online trope of someone asking a medical question on a forum).
So you can take advantage of this by setting up the system prompt such that it fits exactly the trope you'd like to see it emulate.
E.g. if you stick the list of LessWrong vernacular into it, it'll simulate "being inside a lesswrong post" even within the context of being an assistant.
Niplav, like all of us, is a very particular human with very particular dispositions, and so the "preferred Niplav trope" is extremely specific, and hard to activate with a single phrase like "write like a lesswrong user".
So Niplav has to write a "semantic soup": a slurry of words approximating the "preferred Niplav trope". The idea is that each of these words nudges the LLM into the right "headspace" and makes it think it's inside whatever this mix ends up pointing at.
It's a very schizo-Twitter way of thinking, where sometimes posts will literally just be a series of disparate words attempting to arrive at some vague target or other. You can try it out! What are the ~100 words, links, concepts that best define your world? The LLM might be good at understanding what you mean if you feed it this.
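If you want to try it programmatically, here's a rough sketch of the idea; the word list, prompt wording, and model name are all made up for illustration:

```python
# Rough sketch: fold a personal "semantic soup" into the system prompt so the
# model lands in the right trope. Every entry below is an invented example.
import anthropic

SEMANTIC_SOUP = [
    "epistemic status", "steelman", "crux", "inside view",
    "Chesterton's fence", "priors", "gwern.net", "slatestarcodex",
    # extend toward the ~100 words/links/concepts that define your world
]

system_prompt = (
    "Context words that define the writer's world: "
    + ", ".join(SEMANTIC_SOUP)
    + ". Write as someone embedded in that milieu would write."
)

client = anthropic.Anthropic()
reply = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder
    max_tokens=512,
    system=system_prompt,
    messages=[{"role": "user", "content": "Thoughts on prompt 'tropes'?"}],
)
print(reply.content[0].text)
```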
Holy heck, how much does that cost you? A 25,000-word system prompt? Doesn't that take ages to load up??
Otherwise I find this really interesting, thanks for sharing!
Do you have any examples of Claude outputs that are pretty representative of what all those notes-to-self produce? Because I'm not even sure where to begin guessing what they might do.
Wait, I don't think @gwern literally pastes this into the LLM? "Third parties like LLMs" sounds like "I'm writing for the training data".
Though of course I imagine he uses a variant of this for all his LLM needs, seemingly this one.
I'd argue that prompt can be improved though, with as much context as you can fit into the window (usually), given that you shouldn't care about time or monetary cost if you're aiming for results as far from AI slop as possible?
Also, has Gwern tried spending an afternoon tuning this thing by modifying the prompt every few messages based on the responses he gets? I'm not trying to make a point here; this is just roughly my prerequisite for "systematic".
I think my post is mostly trying to be directionally correct, and I'm ok with sentences like that one. See the first footnote for how the claim "no systematic attempt" is literally untrue.