Probably because the dataset of images + captions scraped from the internet consists of lots of boring photos with locations attributed to them, and comparatively few labeled screenshots of pixel art games. This is similar to how LLMs are very good at stylometry, because they have lots of experience making inferences about authors based on patterns in the text.
Another idea: real photos have lots of tiny details to notice regularities in. Pixel art images, on the other hand, can only be interpreted properly by "looking at the big picture". AI vision is known to be biased towards textures rather than shape, compared to humans.
I still think it's weird that many AI safety advocates will criticize labs for putting humanity at risk while simultaneously being paid users of their products and writing reviews of their capabilities. Like, I get it, we think AI is great as long as it's safe, we're not anti-tech, etc.... but is "don't give money to the company that's doing horrible things" such a bad principle?
"I find Lockheed Martin's continued production of cluster munitions to be absolutely abhorrent. Anyway, I just unboxed their latest M270 rocket system and I have to say I'm quite impressed..."
Presenting fabricated or cherry-picked evidence might have the best odds of persuading someone of something true, and so you could argue that doing so "maximizes the truth of the belief" they get, but that doesn't make it honest.
Just tried it. The description is in fact completely wrong! The only thing it sort of got right is that the top left square contains a rabbit.
Your 'just the image' link is the same as the other link that includes the description request, so I can't test it myself. (unless I'm misunderstanding something)
I see, I didn't read the thread you linked closely enough. I'm back to believing they're probably the same weights.
I'd like to point out, though, that in the chat you made, ChatGPT's description gets several details wrong. If I ask it for more detail within your chat, it gets even more details wrong (describing the notebook as white and translucent instead of brown, for example). In one of my other generations it also used a lot of vague phrases like "perhaps white or gray".
When I sent the image myself it got all the details right. I think this is go...
I think these sort of concerns will manifest in the near future, but it'll be confusing because AI's competence will continue to be unevenly distributed and unintuitive. I expect some AI systems will be superhuman, such as automated vehicles and some AI diagnosticians, and that incompetent AIs will gain unwarranted trust by association while the competent AIs get unwarranted distrust by association. Sometimes trusting AI will save lives, other times it will cost them.
This thread shows an example of ChatGPT being unable to describe the image it generated, though, and other people in the thread (seemingly) confirm that there's a call to a separate model to generate the image. The context has an influence on the images because the context is part of the tool call.
We should always be able to translate latent space reasoning aka neuralese (see COCONUT) to a human language equivalent representation.
I don't think this is true at all. How do you translate, say, rotating multiple shapes in parallel into text? Current models already use neuralese as they refine their answer in the forward pass. Why can't we translate that yet? (Yes, we can decode the model's best guess at the next token, but that's not an explanation.)
Chain-of-thought isn't always faithful, but it's still what the model actually uses when it does serial c...
The rocket image with the stablediffusionweb watermark on it is interesting for multiple reasons:
I think some long tasks are like a long list of steps that only require the output of the most recent step, and so they don't really need long context. AI improves at those just by becoming more reliable and making fewer catastrophic mistakes. On the other hand, some tasks need the AI to remember and learn from everything it's done so far, and that's where it struggles: see how Claude Plays Pokémon gets stuck in loops and has to relearn things dozens of times.
Claude finally made it to Cerulean after the "Critique Claude" component correctly identified that it was stuck in a loop, and decided to go through Mt. Moon. (I think Critique Claude is prompted specifically to stop loops.)
I'm glad you shared this, it's quite interesting. I don't think I've ever had something like that happen to me and if it did I'd be concerned, but I could believe that it's prevalent and normal for some people.
I don't think your truth machine would work because you misunderstand what makes LLMs hallucinate. Predicting what a maximum-knowledge author would write induces more hallucinations, not fewer. For example, say you prompted your LLM to predict text supposedly written by an omniscient oracle, and then asked "How many fingers am I holding behind my back?" The LLM would predict an answer like "three" or something, because an omniscient person would know that, even though it's probably not true.
In other words, you'd want the system to believe "this writer I'm p...
I've been trying to put all my long-form reading material in one place myself, and found a brand-new service called Reader which is designed specifically for this purpose. It has support for RSS, Newsletters, YouTube transcripts, and other stuff. $10/month billed annually, or $13 billed monthly.
Thanks for responding.
I agree with what you're saying; I think you'd want to maintain your reward stream at least partially. However, the main point I'm trying to make is that in this hypothetical, it seems like you'd no longer be able to think of your reward stream as grounding out your values. Instead it's the other way around: you're using your values to dictate the reward stream. This happens in real life sometimes, when we try to make things we value more rewarding.
You'd end up keeping your values, I think, because your beliefs about what you value do...
This conception of values raises some interesting questions for me.
Here's a thought experiment: imagine your brain loses all of its reward signals. You're in a depression-like state where you no longer feel disgust, excitement, or anything. However, you're given an advanced wireheading controller that lets you easily program rewards back into your brain. With some effort, you could approximately recreate your excitement when solving problems, disgust at the thought of eating bugs, and so on, or you could create brand-new responses. My questions:
While I don't have specifics either, my impression of ML research is that it's a lot of work to get a novel idea working, even if the idea is simple. If you're trying to implement your own idea, you'll be banging your head against the wall for weeks or months wondering why your loss is worse than the baseline. If you try to replicate a promising-sounding paper, you'll bang your head against the wall as your loss is worse than the baseline. It's hard to tell if you made a subtle error in your implementation or if the idea simply doesn't work for reasons you...
At the start of my Ph.D. 6 months ago, I was generally wedded to writing "good code". The kind of "good code" you learn in school and standard software engineering these days: object oriented, DRY, extensible, well-commented, and unit tested.
I think you'd like Casey Muratori's advice. He's a software dev who argues that "clean code" as taught is actually bad, and that the way to write good code efficiently is more like the way you did it intuitively before you were taught OOP and stuff. He advises "Semantic Compression" instead: essentially you just s...
Yeah, I think the mainstream view of activism is something like "Activism is important, of course. See the Civil Rights and Suffrage movements. My favorite celebrity is an activist for saving the whales! I just don't like those mean crazy ones I see on the news."
Pacing is a common stimming behavior. Stimming is associated with autism / sensory processing disorder, but neurotypical people do it too.
This seems too strict to me, because it says that humans aren't generally intelligent, and that a system isn't AGI if it's not a world-class underwater basket weaver. I'd call that weak ASI.
Fatebook has worked nicely for me so far, and I think it'd be cool to use it more throughout the day. Some features I'd like to see:
When I see an event with the stated purpose of opposing highly politically polarized things such as cancel culture and safe spaces, I imagine a bunch of people with shared politics repeating their beliefs to each other and snickering, and any beliefs that are actually highly controversial within that group are met with "No no, that's what they want you to think, you missed the point!" It seems possible to avoid that failure mode with a genuine truth-seeking culture, so I hope you succeeded.
It's been about 4 years. How do you feel about this now?
Bluesky has custom feeds that can bring in posts from all platforms that use the AT Protocol, but Bluesky is the only such platform right now. Most feeds I've found so far are simple keyword searches, which work nicely for having communities around certain topics, but I hope to see more sophisticated ones pop up.
While most people have super flimsy defenses of meat-eating, that doesn't mean everyone does. Some people simply think it's quite unlikely that non-human animals are sentient (besides primates, maybe). For example, IIRC Eliezer Yudkowsky and Rob Bensinger's guess is that consciousness is highly contingent on factors such as general intelligence and sociality, or something like that.
I think the "5% chance is still too much" argument is convincing, but it raises similar questions, such as "Are you really so confident that fetuses aren't sentient? How could you be so sure?"
I agree that origami AIs would still be intelligent if implementing the same computations. I was trying to point at LLMs potentially being 'sphexish': having behaviors made of baked if-then patterns linked together that superficially resemble ones designed on-the-fly for a purpose. I think this is related to what the "heuristic hypothesis" is getting at.
The paper "Auto-Regressive Next-Token Predictors are Universal Learners" made me a little more skeptical of attributing general reasoning ability to LLMs. They show that even linear predictive models, basically just linear regression, can technically perform any algorithm when used autoregressively like with chain-of-thought. The results aren't that mind-blowing but it made me wonder whether performing certain algorithms correctly with a scratchpad is as much evidence of intelligence as I thought.
Even if you know a certain market is a bubble, it's not exactly trivial to exploit if you don't know when it's going to burst, which prices will be affected, and to what degree. "The market can remain irrational longer than you can remain solvent" and all that.
Personally, while I think that investment will decrease and companies will die off, I doubt there's a true AI bubble, because there are so many articles about it being in a bubble that it couldn't possibly be a big surprise for the markets if it popped, and therefore the hypothetical pop is already p...
The fourth image is of the "Z machine", or the Z Pulsed Power Facility, which creates massive electromagnetic pulses for experiments. It's awesome.
I can second this. I recommend the Chrome extension Unhook, which allows you to disable individual parts of YouTube, and Youtube-shorts block, which makes YouTube Shorts play like normal videos.
(Disclaimer: I'm not very knowledgeable about safety engineering or formal proofs)
I notice that whenever someone brings up "What if this unexpected thing happens?", you emphasize that it's about not causing accidents. I'm worried that it's hard to define exactly who caused an accident, for the same reason that deciding who's liable in the legal system is hard.
It seems quite easy to say that the person who sabotaged the stop sign was at fault for the accident. What if the saboteur poured oil on the road instead? Is it their fault if the car crashes from sli...
If random strangers start calling you "she", that implies you look feminine enough to be mistaken for a woman. I think most men would prefer to look masculine for many reasons: not being mistaken for a woman, being conventionally attractive, being assumed to have a 'manly' rather than 'effeminate' personality, looking your age, etc.
If you look obviously masculine, then being misgendered constantly would just be bewildering. Surely something is signaling that you use feminine pronouns.
If it's just people online misgendering you based on your writing, then that's less weird. But I think it still would bother some people for some of the reasons above.
I predict that implementing bots like these into social media platforms (in their current state) would be poorly received by the public. I think many people's reaction to Grok's probability estimate would be "Why should I believe this? How could Grok, or anyone, know that?" If it were a prediction market, the answer would be "because <economic and empirical explanation as to why you can trust the markets>". There's no equivalent answer for a new bot, besides "because our tests say it works" (making the full analysis visible might help). From these co...
The images on this post appear to be broken.
If you go on Twitter/X and find the right people, you can get most of the benefits you list here. There are tastemakers that share and discuss intriguing papers, and researchers who post their own papers with explanation threads which are often more useful than the papers themselves. The researchers are usually available to answer questions about their work, and you can read the answers they've given already. You're also ahead of the game because preprints can appear way before conferences.
It may be through extrapolating too much from your (first-person, subjective) experiences with objects that seemingly possess intrinsic, observer-independent properties, like the classical objects of everyday life.
Are you trying to say that quantum physics provides evidence that physical reality is subjective, with conscious observers having a fundamental role? Rob implicitly assumes the position advocated by The Quantum Physics Sequence, which argues that reality exists independently of observers and that quantum stuff doesn't suggest otherwise. It'...
Another example in ML of a "non-conservative" optimization process: a common failure mode of GANs is mode collapse, wherein the generator and discriminator get stuck in a loop. The generator produces just one output that fools the discriminator, the discriminator memorizes it, the generator switches to another, until eventually they get back to the same output again.
In the rolling ball analogy, we could say that the ball rolls down into a divot, but the landscape flexes against the ball to raise it up again, and then the ball rolls into another divot, and so on.
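Here's a minimal toy sketch of that dynamic in Python (not an actual GAN, just simultaneous gradient descent/ascent on the made-up objective f(x, y) = x·y): the joint update direction is a rotation-like field rather than the gradient of any single potential, so the two players orbit the equilibrium instead of settling into it, much like the generator and discriminator chasing each other.

```python
import numpy as np

# Toy two-player game: player x minimizes f(x, y) = x * y,
# player y maximizes it. The joint update direction (-y, x)
# is a rotation field, not the gradient of any potential,
# so naive simultaneous updates circle the equilibrium (0, 0)
# instead of converging to it.

def simultaneous_gradient_play(x=1.0, y=1.0, lr=0.1, steps=200):
    trajectory = [(x, y)]
    for _ in range(steps):
        grad_x = y   # d/dx of x*y
        grad_y = x   # d/dy of x*y
        x, y = x - lr * grad_x, y + lr * grad_y  # descent for x, ascent for y
        trajectory.append((x, y))
    return np.array(trajectory)

traj = simultaneous_gradient_play()
radii = np.linalg.norm(traj, axis=1)
print(f"distance from equilibrium: start={radii[0]:.2f}, end={radii[-1]:.2f}")
# The distance from (0, 0) slowly grows instead of shrinking:
# the "ball" keeps getting pushed around the divot rather than settling in.
```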
So of course Robin Hanson offered polls on these so-called taboo topics. The ‘controversial’ positions got overwhelming support. The tenth question, whether demographic diversity (race, gender) in the workplace often leads to worse performance got affirmed 54%-17%, and the rest were a lot less close than that. Three were roughly 90%-1%. I realize Hanson has unusual followers, but the ‘taboo questions’ academics want to discuss? People largely agree on the answers, and the academics have decided saying that answer out loud is not permitted.
I understan...
For those curious about the performance: eyeballing the technical report, it roughly performs at the level of Llama-3 70B. It seems to have an inferior parameters-to-performance ratio because it was only trained on 9 trillion tokens, while the Llama-3 models were trained on 15 trillion tokens. It's also trained with a 4k context length as opposed to Llama-3's 8k. Its primary purpose seems to be the synthetic data pipeline thing.
I encountered this while reading about an obscure estradiol ester, estradiol undecylate, used for hormone replacement therapy and treating prostate cancer. It's very useful because it has a super long half-life, but it was discontinued. I had to reread the article to be sure I understood it: the standard dose, chosen arbitrarily in the first trials, was hundreds of times larger than necessary, leading to massive estrogen overdoses and severe side effects that killed many people through cardiovascular complications. And yet these insane doses were typical for decades and might've caused its discontinuation.
Although it has been over a decade, decent waterproof phone mounts now exist, too.
Thank you for writing this; it's by far the strongest argument tailored to leftists for taking this problem seriously that I've seen, and I'll be sharing it. Hopefully the frequent (probably unavoidable) references to EA don't turn them off too much.
Here's why determinism doesn't bother me. I hope I get it across.
Deterministic systems still have to be simulated to find out what happens. Take cellular automata, such as Conway's Game of Life or Wolfram's Rule 110. The result of all future steps is determined by the initial state, but we can't practically "skip ahead" because of what Wolfram calls 'computational irreducibility': despite the simplicity of the underlying program, there's no way to reduce the output to a calculation that's much cheaper than just simulating the whole thing. Same with a mat...
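To make that concrete, here's a quick Python sketch of Rule 110 (just an illustration, using a wrap-around row and a single live starting cell): the entire rule is an eight-entry lookup table, yet as far as anyone knows there's no general shortcut to what the row looks like at step N other than actually running the N steps.

```python
# Rule 110: each cell's next state is a fixed function of its
# left neighbor, itself, and its right neighbor. The rule is tiny,
# but there's no known shortcut to the state at step N other than
# computing all N steps.
RULE_110 = {
    (1, 1, 1): 0, (1, 1, 0): 1, (1, 0, 1): 1, (1, 0, 0): 0,
    (0, 1, 1): 1, (0, 1, 0): 1, (0, 0, 1): 1, (0, 0, 0): 0,
}

def step(cells):
    """Apply one Rule 110 update to a row of cells (wrapping at the edges)."""
    n = len(cells)
    return [RULE_110[(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])]
            for i in range(n)]

# Start from a single live cell and just run it forward.
row = [0] * 63 + [1]
for _ in range(30):
    print("".join("#" if c else "." for c in row))
    row = step(row)
```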
I disagree that the Reversal Curse demonstrates a fundamental lack of sophistication of knowledge on the model’s part. As Neel Nanda explained, it’s not surprising that current LLMs will store A -> B but not B -> A as they’re basically lookup tables, and this is definitely an important limitation. However, I think this is mainly due to a lack of computational depth. LLMs can perform that kind of deduction when the information is external, that is, if you prompt it with who Tom Cruise’s mom is, it can then answer who Mary Lee Pfeiffer’s son is. I...