All of MrCheeze's Comments + Replies

But you have to be careful here, since the results heavily depend on details of the harness, as well as on how thoroughly the models have memorized walkthroughs of the game.

Text adventures do seem like a good eval right now, since they're the ONLY games that can be tested without either relying on vision (which is still very bad), or writing a custom harness for each game (in which case your results depend heavily on the harness).

(Gemini did actually write much of the Gemini_Plays_Pokemon scaffolding, but only in the sense of doing what David told it to do, not designing and testing it.)

I think you're probably right that an LLM coding its own scaffolding is more achievable than one playing the game like a human, but I don't think current models can do it. Watching the streams, the models don't seem to understand their own flaws, although admittedly they haven't been prompted to focus on this.

Ozyrus
Not being able to do it right now is perfectly fine; it still warrants setting it up to see when exactly they will start to be able to do it.

On the other hand, Claude has (arguably) a better pathfinding tool. As long as it requests to be moved to a valid set of coordinates from the screenshot overlay grid, the tool will move it there. Gemini mostly navigates on its own, although it has access to another instance of Gemini dedicated just to pathfinding.

I very much argue this. Claude's navigator tool can only navigate to coordinates that are onscreen, meaning that the main model needs to have some idea of where it's going. Which means grappling with problems that are extremely difficult for both ... (read more)
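To make that constraint concrete, here's a rough sketch of what an onscreen-only navigation tool looks like. This is purely illustrative: the function name, grid size, and messages are my assumptions, not the actual ClaudePlaysPokemon harness code.

```python
# Hypothetical sketch of the onscreen-only constraint described above.
# Names, grid size, and return messages are assumptions, not the real harness.

SCREEN_W, SCREEN_H = 10, 9  # the Game Boy screen shows roughly a 10x9 grid of 16x16 tiles

def navigate_to(x: int, y: int, walkable: set[tuple[int, int]]) -> str:
    """Move the player to a tile, but only if it is currently visible on screen."""
    if not (0 <= x < SCREEN_W and 0 <= y < SCREEN_H):
        return "error: off-screen coordinates; the model has to work out where to go next on its own"
    if (x, y) not in walkable:
        return "error: tile is not walkable"
    # ...hand off to the emulator-side pathfinder here...
    return f"moved to ({x}, {y})"
```

Anything outside that window has to come from the model's own map knowledge, which is exactly where it struggles.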

I have not tested if Gemini can distinguish this tree (and intend to eventually). This may very well be the only reason Gemini has progressed further.

You missed an important fact about the Gemini stream, which is that it just reads the presence of these trees from RAM and labels them for the model (along with a few other special tiles like ledges and water). Nevertheless I do think Gemini's vision is better, by which I mean if you provide it a screenshot it will sometimes identify the correct tree, unlike Claude who will never do so. (Although to my knowle... (read more)
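For a sense of what that RAM-based labelling amounts to in practice, here is a purely hypothetical sketch; the tile IDs and read_visible_tiles() are made-up stand-ins, not taken from the actual Gemini_Plays_Pokemon code.

```python
# Hypothetical sketch only; tile IDs and read_visible_tiles() are invented stand-ins
# for whatever the real harness reads out of the emulator's RAM.
SPECIAL_TILES = {0x3D: "CUT TREE", 0x27: "LEDGE", 0x14: "WATER"}

def label_special_tiles(read_visible_tiles) -> dict[tuple[int, int], str]:
    """Map onscreen (x, y) positions to labels for tiles the model can't reliably see."""
    labels = {}
    for (x, y), tile_id in read_visible_tiles().items():
        if tile_id in SPECIAL_TILES:
            labels[(x, y)] = SPECIAL_TILES[tile_id]
    return labels
```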

And now in the second run it has entered a similar delusional loop. It knows the way to Cerulean City is via Route 4, but the stretches of route before and after Mt. Moon are both considered part of Route 4. Therefore it has deluded itself into thinking it can get to Cerulean from the first part of the route. Because of that, every time it accidentally stumbles into Mt. Moon and is making substantial progress towards the exit, it intentionally blacks out to get teleported back outside the entrance, so it can look for the nonexistent path forwards.

From what I've seen on stre... (read more)

Update: Claude made it to Cerulean City today, after wandering the Mt. Moon area for 69 hours.

brambleboy
Claude finally made it to Cerulean after the "Critique Claude" component correctly identified that it was stuck in a loop, and decided to go through Mt. Moon. (I think Critique Claude is prompted specifically to stop loops.)

Note that the creator stated that the setup is intentionally somewhat underengineered:

I do not claim this is the world's most incredible agent harness; in fact, I explicitly have tried not to "hyper engineer" this to be like the best chance that exists to beat Pokemon. I think it'd be trivial to build a better computer program to beat Pokemon with Claude in the loop.

This is like meant to be some combination of like "understand what Claude's good at and Benchmark and understand Claude-alongside-a-simple-agent-harness", so what that boils down to is this is like a pretty straightforward tool-using agent.

This basically sums up how it's doing: https://www.reddit.com/r/ClaudePlaysPokemon/comments/1j568ck/the_mount_moon_experience

Of course, much of that is basic capability issues: poor spatial reasoning, short-term memory that doesn't come anywhere close to lasting for one lap, etc.

But I've also noticed ways in which Claude's personality is sabotaging it. Claude is capable of taking notes saying that it "THOROUGHLY confirmed NO passages" through the eastern barrier - but never gets impatient or frustrated, so this doesn't actually prevent it from trying the same... (read more)

"Under development" and "currently training" I interpret as having significantly different meanings.

Doesn't strike me as inevitable at all, just a result of OpenAI following similar methods for creating their tokenizer twice. (In both cases, leading to a few long strings being included as tokens even though they don't actually appear frequently in large corpuses.)

They presumably had already made the GPT-4 tokenizer long before SolidGoldMagikarp was discovered in the GPT-2/GPT-3 one.

Prior to OpenAI's 2023-02-14 patching of ChatGPT (which seemingly prevents it from directly encountering glitch tokens like ‘ petertodd’)

I've never seen it mentioned around here, but since that update, ChatGPT is using a different tokenizer that has glitch tokens of its own:

https://github.com/openai/tiktoken/blob/46287bfa493f8ccca4d927386d7ea9cc20487525/tiktoken/model.py#L16

https://wetdry.world/@MrCheeze/110130795421274483
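For anyone who wants to poke at this themselves, a minimal sketch using the tiktoken library (the two strings are just the well-known candidates; the comparison works with anything):

```python
import tiktoken  # pip install tiktoken

# Compare the GPT-2/GPT-3 vocabulary with the cl100k_base vocabulary used by ChatGPT/GPT-4.
old_enc = tiktoken.get_encoding("r50k_base")
new_enc = tiktoken.get_encoding("cl100k_base")

for s in [" SolidGoldMagikarp", " petertodd"]:
    print(f"{s!r}: {len(old_enc.encode(s))} token(s) in r50k_base, "
          f"{len(new_enc.encode(s))} token(s) in cl100k_base")
```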
 

Matthew_Opitz
This is important. If these glitch tokens are an inevitable tendency for any LLMs regardless of tweaks to how they were trained, then that would be big news and possibly a window into understanding how LLMs function. Did the cl100k_base tokenizer that ChatGPT and GPT-4 used involve any manual tweaking of the training corpus AFTER the tokenizer was set in stone, as is suspected to have happened with GPT-3's training? Or is this just an emergent property of how LLMs organize certain parts of their training data?

I'd say this captures the spirit of Less Wrong perfectly.

[This comment is no longer endorsed by its author]

500 years still sounds optimistic to me.

[This comment is no longer endorsed by its author]
Icenogle
You probably won't see this since it's six years old, but just in case: why do you think it will take such a long time? A significant portion of people in the AI field give a much closer number, and while predicting the future isn't exact, 500 years is a pretty big difference from the numbers I've most often seen.

The key is in the phrase "much more complicated". The sort of algorithm that could become a mind would be an enormous leap forward in comparison to anything that has ever been done so far.

[This comment is no longer endorsed by its author]

Man, people's estimations seem REALLY early. The idea of AI in fifty years seems almost absurd to me.

[This comment is no longer endorsed by its author]
Mitchell_Porter
Why? Are you thinking of an AI-in-a-laptop? You should think in terms of, say, a whole data center as devoted to a single AI. This is how search engines already work - a single "information retrieval" algorithm, using techniques from linear algebra to process the enormous data structures in which the records are kept, in order to answer a query. The move towards AI just means that the algorithm becomes much more complicated. When I remember that we already have warehouses of thousands of networked computers, tended by human staff, containing distributed proto-AIs that interact with the public ... then imminent AI isn't hard to imagine at all.

I still stand by my belief that 2 + 3 = 5 does not in fact exist, and yet it is still true that adding two things with three things will always result in five things.

[This comment is no longer endorsed by its author]
Kindly
I don't think that what you just said means anything.

"I think that if you took someone who was immortal, and asked them if they wanted to die for benefit X, they would say no."

This doesn't help against arguments that stable immortality is impossible or incredibly unlikely, of course, but I suppose those aren't the arguments you were countering at the time.

[This comment is no longer endorsed by its author]

Yes, but the chance of magic powers from outside the matrix is low enough that what he says makes an insignificant difference.

...or is an insignificant difference even possible?

[This comment is no longer endorsed by its author]
DanielLC
The chance of magic powers from outside the matrix is nothing compared to 3^^^^3. It makes no difference in whether or not it's worthwhile to pay him.

Hm. I'd rather have seen more of the analysis on whether what they do with the money is useful, but this is something.

[This comment is no longer endorsed by its author]

Hmm, didn't really get anything out of this. Maybe you need to be competent at stuff in the first place to be able to sabotage yourself?

[This comment is no longer endorsed by its author]

If this ever happens again I'd make one for the long-term evolutionary one that tries to learn strategies U-style, and then remembers what it learned in future rounds. If that's allowed.

[This comment is no longer endorsed by its author]
benelliott
That sort of thing wasn't allowed this time, but I agree that a variation of that rule would be interesting (though possibly difficult to code).

Shouldn't priority be given to improving quality of lives first?

[This comment is no longer endorsed by its author]

It makes me sad because it means smart people aren't doing things that are actually useful.

[This comment is no longer endorsed by its author]

This isn't quite what your post was about, but one thing I've never understood is how anyone could possibly find "the universe is totally random" to be a MORE simple explanation.

[This comment is no longer endorsed by its author]

Well this explains a lot.

[This comment is no longer endorsed by its author]

The thing is, the other world was chosen specifically BECAUSE it had the opposite answer, not randomly like the world you're in.

[This comment is no longer endorsed by its author]

Maybe in ninety-eight universes out of 100 it does blow up and we just see the one that's left; and he's actually giving an accurate number. :P

The TV show version of the anthropic principle: all the episodes where the Enterprise does blow up aren't made.

"Give it to you" is a pretty lame answer but I'm at least able to recognise the fact that I'm not even close to being a good choice for having it.

That's more or less completely ignoring the question, but the only answers I can come up with at the moment are what I think you call cached thoughts here.

[This comment is no longer endorsed by its author]

Well it's good to see that if you somehow found a way to implement your ideas you would at least do it well.

[This comment is no longer endorsed by its author]

retracted

[This comment is no longer endorsed by its author]
Friendly-HI
You make no sense whatsoever. Yeah, life can be boring and dull sometimes... but your life in 100 years may be as different from your life today as that of the average 18th-century peasant is from your blessed existence nowadays. Life can be fun, and any future that has the technology to revive you would presumably offer you plenty of fun; they would also be able to rid you of any of your personal and psychological shortcomings if you wish. In the meantime, go and smarten up with some positive psychology. It doesn't help everyone with everything, but it's as good as it gets for now.

So... the correct answer is to dissolve the question, yes?

[This comment is no longer endorsed by its author]

I had trouble reading it too. If you really don't want to do it like that, then at least just take out all the quotes except for at the very beginning and end of the speech (no quotes at all between paragraphs).

[This comment is no longer endorsed by its author]

Okay, I have no idea whatsoever what this is supposed to be saying.

EDIT: Wait, hold on. Is it supposed to not make sense?

[This comment is no longer endorsed by its author]
bigjeff5
A cult is what you make of it. The first novice, Ni no Tachi, was not a cultist. The key was Ougi's statement: "How long will you repeat my words and ignore the meaning?" Ni no Tachi learned the meanings and became a true rationalist - he understood that the clothes had nothing at all to do with it. You could kind of think of them as a way of promoting group identity while weeding out those destined to be forever irrational. Bouzo never even got to the key statement. He stopped questioning as soon as Ougi implied that the silly hat was necessary. He learned all the teachings of Ougi, but he never understood their meanings. Eventually he would only discuss rationality while wearing a clown suit, because he believed without any evidence that silly clothes were related to probability theory, which is completely irrational. Bouzo was a cultist.
[anonymous]
Wait, hold on. Is it supposed to not make sense?

Loved the story, and it was also the first time I took your strong atheism completely seriously, but I think that one bit where they stab those three sleeping guys went a bit too strongly to the "no, this definitely isn't right" side of things. Although I didn't think about that scene at all when I was trying to figure out which side was the Good side, and instead thought of the death of Alek as my main piece of evidence for the possibility that the Lord of Dark was Bad, so that's something.

[This comment is no longer endorsed by its author]