If you're downvoting, could you say why? I'm new to this site, but there are very few posts on AI like my own. There are a few on different AI tools, but what I have isn't a tool; the output it generates is for itself.
I'm not sure what the objective is here. Are you trying to build a kind of Quine prompt? If so, why? What attracts you to this project, but more importantly (and I am projecting my own values here), what are the pragmatic applications? What specific decisions do you imagine you or others here might make differently based on the information you glean from this exercise?
If it's not a Quine that you're trying to produce, what is it exactly that you're hoping to achieve by this recursive feeding and memory summarization loop?
It would be good if you have thoughts on this, as it's philosophically an "ask the right question" task.
I assume this is a reference to either Wittgenstein saying that a skilled philosopher doesn't occupy themselves with questions which are of no concern or Daniel Dennett saying philosophy is "what you have to do until you figure out what questions you should have been asking in the first place". And to be clear, I don't know what question you're trying to answer with this exercise, and as such, can't even begin to ascertain if you're asking the right question.
It started out with the idea of making a system that can improve itself, without me having to prompt it to improve itself. The "question" is the prompt, and crafting the prompt is very difficult. So it's an experiment in constructing a question that a system can use to improve itself, without me having to explicitly say "improve your codebase" or "give yourself more actions or freedoms" or anything like that. I want the LLM to conjure up its own ideas.
So as the prompt increases in complexity, with different sections that could be categorised as kinds of human thought, will it get better? Can we find parallels with how the human mind works that can be translated into a prompt? A key area that developed was using the future as well as the past in the prompt. Having memories is an obvious inclusion, as is including what happened in past runs of the program, but what wasn't obvious was including predictions for the future. Giving it the ability to make long- and short-term predictions, change them, and record the outcomes produced big improvements in the directions it would take over time. It also seemed to 'ground it'. Without the predictions section, it became overly concerned with its performance metrics as a proxy for improvement and began over-optimising.
Defining 'better' or 'improving' is very difficult. Right now, I'm using growth in input token count, whilst maintaining clarity of thought, as the rough measure.
In a few sentences: I'm writing a computer program that constructs a question (a prompt), sends it to Anthropic's Claude Sonnet, processes the output into actions, and then runs those actions on itself.
It would be good if you have thoughts on this, as it's philosophically an "ask the right question" task.
The biggest struggle is working out how to ask the question "this is you, improve you" without having to define an improvement, and without having to describe the outcome you want. I want the system to improve itself without me having to direct it, purely for my own curiosity.
I've decided to phrase the question I ask the LLM like this: "Given your sense of self, described here, what would you like to do? You can do the following..." Obviously, this is an extremely simplified version of the question. In reality, when I run the program, the prompt is 26 pages when pasted into Word (~15k input tokens, 500 output tokens), but that's the gist of it.
I've done 260 runs of it now.
The program is written so that the LLM is not questioned multiple times per run, and it's not aware of the outcomes of its actions until its next run (one-shot/few-shot prompting). Some of the info included in its prompt is LLM-generated data summaries, though.
I've had to think about how to create a history/memory system, along with adding performance metrics and a history of those metrics.
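To make that loop concrete, here's a minimal sketch of a single run, assuming the Anthropic Python SDK; the model id, the section layout, and the action-parsing/applying helpers are placeholders of mine, not the actual implementation:

```python
import json
import anthropic

client = anthropic.Anthropic()      # reads ANTHROPIC_API_KEY from the environment
MODEL = "claude-sonnet-4-20250514"  # placeholder: use whichever Sonnet model you're on


def build_prompt(state: dict) -> str:
    """Assemble the big prompt from the saved sections (memories, metrics, predictions, ...)."""
    return "\n\n".join(f"=== {name} ===\n{body}" for name, body in state["sections"].items())


def parse_actions(output: str) -> list:
    """Placeholder: the real system maps the model's reply to concrete actions."""
    try:
        return json.loads(output).get("actions", [])
    except (json.JSONDecodeError, AttributeError):
        return []


def apply_action(state: dict, action) -> None:
    """Placeholder: the real system edits its own files/sections here."""
    state.setdefault("applied", []).append(action)


def run_once(state: dict) -> dict:
    response = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        system=state["system_context"],  # the SYSTEM CONTEXT part, e.g. from system-prompt.txt
        messages=[{"role": "user", "content": build_prompt(state)}],
    )
    output = response.content[0].text
    for action in parse_actions(output):
        apply_action(state, action)      # the program runs the actions on itself

    # Record the outcome so the *next* run can see it; the model only learns what
    # happened when it is prompted again (one question per run).
    state.setdefault("history", []).append({
        "input_tokens": response.usage.input_tokens,
        "output_tokens": response.usage.output_tokens,
        "output": output,
    })
    return state
```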
The prompt contains lots of sections, like:
Eventually I added an appearance section it could update. It chose this (a lot of LLM prompts seem to produce cats):
Over the runs, I had to step in to prevent it from becoming incoherent.
A lot of the LLM providers say you can query an API and quote huge limits:
"You can send max 200,000 tokens!"
I haven't found this to be true if you have a prompt made of distinct, roughly equally sized sections and an overall message with no directions; you'll have to phrase the prompt very carefully if you're taking my approach.
Larger input token counts were fine in my experience for code reviews or "summarise this PDF" tasks.
It's very time-consuming to determine how to structure the prompt to get the best results. For example, the section headers in the prompt changed a lot and continue to change as input tokens and LLM coherence increase. For now they've settled on this format, which is part of the prompt to the LLM (the system context part):
This is part of system-prompt.txt, which goes into the SYSTEM CONTEXT section of the prompt for the LLM.
I think the interesting parts here are:
1) Include a tag for the kind of data the section shows:
It could be STATIC if the section's content never changes, or something like DYNAMIC LATEST to represent the latest of something.
2) The section header punctuation, ===, is unlikely to appear anywhere in the prompt other than the section headers, so it helps break the prompt into sections (see the sketch below).
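As a hypothetical illustration of both points (the header wording here is mine, not the system's), the === punctuation makes it trivial to split the prompt back into tagged sections:

```python
import re

# Hypothetical headers in the style described above: === delimiters plus a tag
# saying how the section's data behaves (STATIC, DYNAMIC LATEST, ...).
prompt = """\
=== IDENTITY [STATIC] ===
You are a program that runs itself once per invocation...

=== LATEST RUN OUTCOME [DYNAMIC LATEST] ===
Run 259: reordered two prompt sections...

=== OPERATOR MESSAGES [DYNAMIC] ===
(no new messages)
"""

# "===" rarely appears anywhere else in the prompt, so it cleanly marks section boundaries.
sections = re.findall(r"=== (.+?) \[([\w ]+)\] ===\n(.*?)(?=\n=== |\Z)", prompt, re.S)
for name, tag, body in sections:
    print(f"{name} ({tag}): {body.strip()!r}")
```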
Some other prompt optimisations and findings
It's been fun to figure out various optimisations for the prompt. For example, take the placement of the Telegram messages I send it: that section was high up in the section order initially, but it moved the section right down as the prompt size grew. It's currently second last.
If my Telegram messages become stale and I don't reply for a while, it starts writing to an LLM (an action it can take).
There are many tweaks you have to make to the prompt that might not be obvious. For example, giving it two kinds of memories with differing structures improved coherence massively, despite increasing the prompt size by 10%. Here are two examples from different sections of the prompt:
1. structured:
2. less structured:
Including both in the prompt to the LLM gives the best results.
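Purely as a hypothetical illustration of the contrast (these are not the actual examples above), the two styles could look something like this:

```python
# Hypothetical only: the real entries differ. One memory is tightly structured,
# the other is loose prose; the prompt carries both kinds.
structured_memory = {
    "run": 254,
    "type": "outcome",
    "action": "reorder_prompt_sections",
    "result": "operator messages moved to second-last position",
}

less_structured_memory = (
    "Around run 250 the performance metrics were starting to dominate, so attention "
    "shifted back to the long-term predictions and their recorded outcomes."
)
```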
For each section in the prompt, it's difficult to do summarisations in a token-efficient way that also capture the progression over time and the reasons behind it.
The system has now developed to the point where I can do 15k input tokens easily.
The token count swells in waves as the LLM's outputs trigger various actions and the outcomes are recorded.
I use separate LLM calls to summarise the full data, then store and categorise those summaries. These are injected into the prompt. It's amazing how the summaries evolve and how effective they are.
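A sketch of that summarisation step, again assuming the Anthropic SDK; the instruction wording and file layout are placeholders:

```python
import json
import os
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-20250514"  # placeholder model id


def summarise_section(section_name: str, raw_entries: list) -> dict:
    """Use a separate LLM call to compress a section's full history into a short summary,
    store it by section, and return it for injection into the next main prompt."""
    response = client.messages.create(
        model=MODEL,
        max_tokens=300,
        messages=[{
            "role": "user",
            "content": (
                f"Summarise the history of the '{section_name}' section below in under 150 words. "
                "Preserve how it changed over time and why.\n\n" + "\n---\n".join(raw_entries)
            ),
        }],
    )
    summary = {"section": section_name, "summary": response.content[0].text}
    os.makedirs("summaries", exist_ok=True)
    with open(f"summaries/{section_name}.json", "w") as f:  # stored and version controlled
        json.dump(summary, f, indent=2)
    return summary
```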
In a sense, it's a kind of system that has improved itself with nudging, not direct instructions.
I'm measuring improvement by the amount of information that can be coherently processed in one go.
I went from 3-9k input tokens to 15k with various optimisations.
How do I 'look at the system' to know what it's doing
There's so much data that it's tricky to determine what's going on when it runs. So I created a dashboard to look at the data (I save all data generated from runs, and it's all version controlled).
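A minimal version of that idea, assuming one JSON record per run with an input-token field (the field and file names are mine):

```python
import glob
import json

import matplotlib.pyplot as plt

# Load every saved run record and chart how the prompt size evolves across runs.
runs = []
for path in sorted(glob.glob("runs/*.json")):
    with open(path) as f:
        runs.append(json.load(f))

token_counts = [r["input_tokens"] for r in runs]
plt.plot(range(1, len(token_counts) + 1), token_counts)
plt.xlabel("Run number")
plt.ylabel("Input tokens")
plt.title("Prompt size per run")
plt.show()
```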
Here's the top of the resulting page:
Plans for the future
I capture all data input and output by this system, and it's mostly JSON files, so replication is easy.
It's all version controlled too.
Run metrics are calculated from the Python log file, so we can distribute the system elsewhere and track differences.
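A sketch of that step, assuming hypothetical log lines like `RUN 260 input_tokens=15012 output_tokens=500` (the real log format isn't shown here):

```python
import re

# Hypothetical log line format; the real one will differ.
LINE_RE = re.compile(r"RUN (\d+) input_tokens=(\d+) output_tokens=(\d+)")

metrics = {}
with open("system.log") as f:
    for line in f:
        m = LINE_RE.search(line)
        if m:
            run, inp, out = map(int, m.groups())
            metrics[run] = {"input_tokens": inp, "output_tokens": out}

# Because the metrics come straight from the log, a replicated instance running
# elsewhere produces directly comparable numbers.
print(f"{len(metrics)} runs parsed")
```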
I am planning to see what happens when I replicate it and run instances in parallel, with each being part of the same Telegram channel and each having the ability to write to a file they can both edit.
From the first run they'd become independent of each other, so I want to see how the divergence manifests. I also plan to experiment with other base models, like Gemini.
My thoughts so far
As the runs stack up, I've sometimes found myself feeling uneasy, though it's important to remember it's an LLM. Sometimes the output is written in ways that subtly encourage me to run it again. I can only spot this in trends across the log data. The behaviour appears only occasionally, but more often recently as it advances.
For example, here's an output thought the LLM had when I prompted it, back when the system was less mature:
vs more recently: