brambleboy

Comments
I think some long tasks are like a long list of steps that only require the output of the most recent step, so they don't really need long context. AI improves at those just by becoming more reliable and making fewer catastrophic mistakes. On the other hand, some tasks need the AI to remember and learn from everything it's done so far, and that's where it struggles; see how Claude Plays Pokémon gets stuck in loops and has to relearn things dozens of times.
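
Here's a minimal sketch of that distinction, with hypothetical `do_step` callbacks standing in for whatever the model actually does at each step (not any real agent framework):

```python
# A toy sketch (hypothetical do_step callbacks, not any real agent framework)
# of the two kinds of long tasks described above.

def run_chain_task(n_steps, do_step, start=None):
    """Each step only needs the previous step's output, so context stays small;
    per-step reliability is the bottleneck, since one bad step sinks the chain."""
    state = start
    for _ in range(n_steps):
        state = do_step(state)      # only the latest output is carried forward
    return state

def run_history_task(n_steps, do_step):
    """Each step needs everything done so far, so the agent has to remember
    (and learn from) a context that grows with the length of the task."""
    history = []
    for _ in range(n_steps):
        action = do_step(history)   # the whole history is carried forward
        history.append(action)
    return history

# A chain task succeeds only if every step succeeds, so with per-step success
# rate p the chance of finishing n steps is roughly p ** n.
p, n = 0.99, 200
print(f"~{p ** n:.0%} chance of completing a {n}-step chain at {p:.0%} per-step reliability")
```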

Claude finally made it to Cerulean after the "Critique Claude" component correctly identified that it was stuck in a loop, and decided to go through Mt. Moon. (I think Critique Claude is prompted specifically to stop loops.)

I'm glad you shared this, it's quite interesting. I don't think I've ever had something like that happen to me and if it did I'd be concerned, but I could believe that it's prevalent and normal for some people.

I don't think your truth machine would work, because you misunderstand what makes LLMs hallucinate. Predicting what a maximum-knowledge author would write induces more hallucinations, not fewer. For example, say you prompted your LLM to predict text supposedly written by an omniscient oracle, and then asked "How many fingers am I holding behind my back?" The LLM would predict an answer like "three", because an omniscient author would know that, even though the model has no way of knowing it and the answer is probably wrong.

In other words, you'd want the system to believe "this writer I'm predicting knows exactly what I do, no more, no less", not "this writer knows way more than me". Read "Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?" for evidence of this.

What would work even better is for the system to simply be Writing instead of Predicting What Someone Wrote, but nobody's done that yet (because it's hard).
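
To make the framing difference concrete, here's a toy illustration, with made-up prompt strings and no particular model or API, of why the "omniscient author" framing invites confident guesses while a knowledge-matched framing invites admitting ignorance:

```python
# Hypothetical prompt framings -- the point is what kind of continuation each
# one makes most plausible, not any particular API.

ORACLE_FRAMING = (
    "The following was written by an omniscient oracle who knows every fact "
    "about the world, including things nobody has told it."
)

KNOWLEDGE_MATCHED_FRAMING = (
    "The following was written by a careful author who knows exactly what the "
    "model knows, no more and no less, and says 'I don't know' otherwise."
)

QUESTION = "Q: How many fingers am I holding behind my back?\nA:"

# Under ORACLE_FRAMING, the most plausible continuation is a specific number
# ("Three."), because that's what an omniscient author's text would look like,
# even though the model has no way of knowing the answer.
# Under KNOWLEDGE_MATCHED_FRAMING, the most plausible continuation is something
# like "I can't see your hand, so I don't know."
for framing in (ORACLE_FRAMING, KNOWLEDGE_MATCHED_FRAMING):
    print(framing + "\n\n" + QUESTION + "\n")
```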

I've been trying to put all my long-form reading material in one place myself, and found a brand-new service called Reader which is designed specifically for this purpose. It has support for RSS, newsletters, YouTube transcripts, and other stuff. It's $10/month billed annually, or $13/month billed monthly.

Thanks for responding.

I agree with what you're saying; I think you'd want to maintain your reward stream at least partially. However, the main point I'm trying to make is that in this hypothetical, it seems like you'd no longer be able to think of your reward stream as grounding out your values. Instead it's the other way around: you're using your values to dictate the reward stream. This happens in real life sometimes, when we try to make things we value more rewarding.

You'd end up keeping your values, I think: your beliefs about what you value don't go away, and your behaviors that put them into practice don't immediately go away either, so through those your values are maintained (at least somewhat).

If you can still have values without reward signals that tell you about them, then doesn't that mean your values are defined by more than just what the "screen" shows? That even if you could see and understand every part of someone's reward system, you still wouldn't know everything about their values?

This conception of values raises some interesting questions for me.

Here's a thought experiment: imagine your brain loses all of its reward signals. You're in a depression-like state where you no longer feel disgust, excitement, or anything. However, you're given an advanced wireheading controller that lets you easily program rewards back into your brain. With some effort, you could approximately recreate your excitement when solving problems, disgust at the thought of eating bugs, and so on, or you could create brand-new responses. My questions:

  • What would you actually do in this situation? What "should" you do?
  • Does this cause the model of your values to break down? How can you treat your reward stream as evidence of anything if you made it? Is there anything to learn about the squirgle if you made the video of it?

My intuition says that life does not become pointless now that you're the author of your reward stream. This suggests that the values might be fictional, but the reward signals aren't their one true source, in the same way that Harry Potter could live on even if all the books were lost.


While I don't have specifics either, my impression of ML research is that it's a lot of work to get a novel idea working, even if the idea is simple. If you're trying to implement your own idea, you'll be banging your head against the wall for weeks or months wondering why your loss is worse than the baseline. If you try to replicate a promising-sounding paper, you'll bang your head against the wall because your loss is worse than the baseline. It's hard to tell whether you made a subtle error in your implementation or the idea simply doesn't work for reasons you don't understand, because ML has little in the way of theoretical backing. Even when it works, it won't be optimized, so you need engineers to improve the performance and make it stable when training at scale. If you want to ship a working product quickly, it's best to choose what's tried and true.

At the start of my Ph.D. 6 months ago, I was generally wedded to writing "good code". The kind of "good code" you learn in school and standard software engineering these days: object oriented, DRY, extensible, well-commented, and unit tested.

I think you'd like Casey Muratori's advice. He's a software dev who argues that "clean code" as it's taught is actually bad, and that the way to write good code efficiently is more like the way you did it intuitively before you were taught OOP and such. He advises "Semantic Compression" instead: you just straightforwardly write code that works, then pull out and reuse the parts that turn out to be repeated.
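
A tiny sketch of what that looks like in practice (using a made-up `Canvas` stub, not a real graphics library): write the obvious code first, and only factor something out once it has actually repeated.

```python
class Canvas:
    """Minimal stub that just records drawing calls, so the example runs anywhere."""
    def __init__(self):
        self.calls = []

    def rect(self, x, y, w, h, color):
        self.calls.append(("rect", x, y, w, h, color))

    def text(self, x, y, label):
        self.calls.append(("text", x, y, label))


# Step 1: write the straightforward versions, duplication and all.
def draw_window(c, x, y, w, h):
    c.rect(x, y, w, h, "gray")                   # frame
    c.rect(x + 2, y + 2, w - 4, h - 4, "white")  # inner area


def draw_button(c, x, y, w, h, label):
    c.rect(x, y, w, h, "gray")                   # same two lines as draw_window
    c.rect(x + 2, y + 2, w - 4, h - 4, "white")
    c.text(x + 4, y + 4, label)


# Step 2: the repetition has now actually shown up, so pull it out and reuse it.
def draw_panel(c, x, y, w, h):
    c.rect(x, y, w, h, "gray")
    c.rect(x + 2, y + 2, w - 4, h - 4, "white")


def draw_button_compressed(c, x, y, w, h, label):
    draw_panel(c, x, y, w, h)
    c.text(x + 4, y + 4, label)


canvas = Canvas()
draw_button_compressed(canvas, 0, 0, 40, 12, "OK")
print(canvas.calls)
```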
