Quite right. AI safety is moving very quickly and doesn’t have any methods that are well-understood enough to merit a survey article. Those are for things that have a large but scattered literature, with maybe a couple of dozen to a hundred papers that need surveying. That takes a few years to accumulate.
Could you give an example of the sort of distinction you’re pointing at? Because I come to completely the opposite conclusion.
Part of my job is applied mathematics. I’d rather read a paper applying one technique to a variety of problems, than a paper applying a variety of techniques to one problem. Seeing the technique used on several problems lets me understand how and when to apply it. Seeing several techniques on the same problem tells me the best way to solve that particular problem, but I’ll probably never run into that particular problem in my work.
But that’s just me; presumably you want something else out of reading the literature. I would be interested to know what exactly.
When I say Pokémon-type games, I don’t mean games recounting the adventures of Ash Ketchum and Pikachu. I mean games with a series of obstacles set in a large semi-open world, with things you can carry, a small set of available actions at each point, and a goal of progressing past the obstacles. Such games can be manufactured in unlimited quantities by a program. They can also be “peopled” by simple LLMs, for increased complexity. They don’t actually have to be fun to play or look at, so the design requirements are loose.
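To make that concrete, here is a minimal sketch of the kind of generator I have in mind: a grid world with locked gates as obstacles, scattered keys you can carry, and a small action set at each point. Everything in it (the names, sizes, and the key-and-gate mechanic) is illustrative, not a real benchmark.

```python
import random

def generate_world(size=12, n_gates=3, seed=None):
    """Generate a grid world: locked gates block progress, scattered keys open them."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(size) for y in range(size)]
    gates = rng.sample(cells[1:-1], n_gates)           # obstacles to get past
    keys = {g: rng.choice(cells) for g in gates}        # each gate has a matching key somewhere
    return {"size": size, "gates": gates, "keys": keys, "goal": cells[-1]}

def available_actions(state, world):
    """Small action set at each point: move in four directions, or pick up a key if one is here."""
    actions = ["north", "south", "east", "west"]
    if state["pos"] in world["keys"].values():
        actions.append("pickup")
    return actions

def step(state, action, world):
    """Apply one action; a gate only opens if the matching key is carried."""
    x, y = state["pos"]
    moves = {"north": (x, y + 1), "south": (x, y - 1), "east": (x + 1, y), "west": (x - 1, y)}
    if action == "pickup":
        state["inventory"].add(state["pos"])
    elif action in moves:
        nx, ny = moves[action]
        if 0 <= nx < world["size"] and 0 <= ny < world["size"]:
            target = (nx, ny)
            if target in world["gates"] and world["keys"][target] not in state["inventory"]:
                return state  # blocked by a locked gate
            state["pos"] = target
    state["done"] = state["pos"] == world["goal"]
    return state

world = generate_world(seed=0)
state = {"pos": (0, 0), "inventory": set(), "done": False}
print(available_actions(state, world))
```

A program like this can stamp out as many distinct worlds as you like just by varying the seed, which is the sense in which such games can be manufactured in unlimited quantities.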
There have ...
True. I was generalizing it to a system that tries to solve lots of Pokémon-like tasks in various artificial worlds, rather than just expecting it to solve Pokémon over and over. But I didn’t say that; I just imagined it in my mind and assumed everyone else would too. Thank you for making it explicit!
This is an important case to think about. I think it is understudied. What separates current AIs from the CEO role? And how long will it take them to close that gap? I see three things:
My charity extends no further than the human race. Once in a while I think about animal ethics and decide that no, I still don't care enough to make an effort.
A basic commitment of my charity group from the beginning: no money that benefits things other than people. We don't donate to benefit political groups, organizations, arts, animals, or the natural world. I'm good with that. Members of the group may of course donate elsewhere, and generally do.
We've been doing this since 1998, long before Effective Altruism was a thing. I don't have a commitment to Effective Altruism the movement, just to altruism which is effective.
It seems to me that I have done a lot of careful thinking about timelines, and that I also feel the AGI. Why can’t you have a careful understanding of what timelines we should expect, and also have an emotional reaction to that? Reasonably coming to the conclusion that many things will change greatly in the next few years deserves a reaction.
You would suppose wrong! My wife and I belong to a group of a couple of dozen people that investigates charities, picks the best ones, and sends them big checks. I used to participate more, but now I outsource all the effort to my wife. I wasn’t contributing much to the choosing process. I just earn the money 🙂.
What does this have to do with my camp#1 intuitions?
I don’t think enjoyment and suffering are arbitrary or unimportant. But I do think they’re nebulous. They don’t have crisp, definable, generally agreed-upon properties. We have to deal with them anyway.
I don’t reason about animal ethics; I just follow standard American cultural standards. And I think ethics matters because it helps us live a virtuous and rewarding life.
Is that helpful?
Well, I’m glad you’ve settled the nature of qualia. There’s a discussion downthread, between TAG and Signer, which contains several thousand words of philosophical discussion of qualia. What a pity they didn’t think to look in Wikipedia, which settles the question!
Seriously, I definitely have sensations. I just think some people experience an extra thing on top of sensations, which they think is an indissoluble part of cognition, and which causes them to find some things intuitive that I find incomprehensible.
With the lawn mower robot, we are able to say what portions of its construction and software are responsible for its charging-station-seeking behavior, in a well-understood mechanistic way. Presumably, if we knew more about the construction of the human mind, we’d be able to describe the mechanisms responsible for human enjoyment of eating and resting. Are the two mechanisms similar enough that it makes sense to refer to the robot enjoying things? I think that the answer is (a) we don’t know, (b) probably not, and (c) there is no fact of ...
An interesting and important question.
We have data about how problem-solving ability scales with reasoning time for a fixed model. This isn’t your question, but it’s related. It’s pretty much logarithmic, IIRC.
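To illustrate what “pretty much logarithmic” means operationally (the numbers below are invented, not real benchmark data), you can check whether solve rate is roughly linear in the log of thinking time:

```python
import numpy as np

# Hypothetical measurements: reasoning time (seconds) vs. fraction of problems solved.
time_s = np.array([1, 2, 4, 8, 16, 32, 64])
solve_rate = np.array([0.21, 0.30, 0.38, 0.47, 0.55, 0.64, 0.71])

# If scaling is logarithmic, solve_rate ≈ a + b * ln(time); fit a line in log-time.
b, a = np.polyfit(np.log(time_s), solve_rate, 1)
print(f"solve_rate ≈ {a:.2f} + {b:.2f} * ln(t)")

# Residuals tell you how good the logarithmic description is.
pred = a + b * np.log(time_s)
print("max residual:", np.abs(solve_rate - pred).max())
```

If the fit is good, each doubling of reasoning time buys a roughly constant increment of solve rate, which is the practical meaning of logarithmic returns.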
The important question is, how far can we push the technique whereby reasoning models are trained? They are trained by having them solve a problem with chains of thought (CoT), and then having them look at their own CoT, and ask “how could I have thought that faster?” It’s unclear how far this technique can be pushed (a...
For me, depression has been independent of the probability of doom. I’ve definitely been depressed, but I’ve been pretty cheerful for the past few years, even as the apparent probability of near-term doom has been mounting steadily. I did stop working on AI, and tried to talk my friends out of it, which was about all I could do. I decided not to worry about things I can’t affect, which has clarified my mind immensely.
The near-term future does indeed look very bright.
I am in violent agreement. Nowhere did I say that MuZero could learn a world model as complicated as the ones LLMs currently enjoy. But it could learn continuously, and execute pretty complex strategies. I don’t know how to combine that with the breadth of knowledge or cleverness of LLMs, but if we could, we’d be in trouble.
You shouldn’t worry about whether something “is AGI”; it’s an ill-defined concept. I agree that current models are lacking the ability to accomplish long-term tasks in the real world, and this keeps them safe. But I don’t think this is permanent, for two reasons.
Current large-language-model type AI is not capable of continuous learning, it is true. But AIs which are capable of it have been built. AlphaZero is perhaps the best example; it learns to play games to a superhuman level in a few hours. It’s a topic of current resear...
Welcome to Less Wrong. Sometimes I like to go around engaging with new people, so that’s what I’m doing.
On a sentence-by-sentence basis, your post is generally correct. It seems like you’re disagreeing with something you’ve read or heard. But I don’t know what you read, so I can’t understand what you’re arguing for or against. I could guess, but it would be better if you just said.
I work for a company that developed its own programming language and has been selling it for over twenty years for a great deal of money. For many of those twenty years, I worked in the group developing the language. Before working for my current employer, I participated in several language development efforts. I say this not in order to toot my own horn, but to indicate that what I say has some weight of experience behind it.
There is no way to get the funding you want. I am sorry to tell you this.
From a funder's point of view, ther...
Well, let me quote Wikipedia:
Much of the debate over the importance of qualia hinges on the definition of the term, and various philosophers emphasize or deny the existence of certain features of qualia. Some philosophers of mind, like Daniel Dennett, argue that qualia do not exist. Other philosophers, as well as neuroscientists and neurologists, believe qualia exist and that the desire by some philosophers to disregard qualia is based on an erroneous interpretation of what constitutes science.
If it was that easy to understand, we wouldn't be here arguing ...
"Humans continue to get very offended if they find out they are talking to an AI."
In my limited experience of phone contact with AIs, this is only true for distinctly subhuman AIs. Then I emotionally react like I am talking to someone who is being deliberately obtuse, and become enraged. I'm not entirely clear on why I have this emotional reaction, but it's very strong. Perhaps it is related to the Uncanny Valley effect. On the other hand, I've dealt with phone AIs that (acted like they) understood me, and we've concluded a pleasant an...
"including probably reworking some of my blog post ideas into a peer-reviewed paper for a neuroscience journal this spring."
I think this is a great idea. It will broadcast your ideas to an audience prepared to receive them. You can leave out the "friendly AI" motivation and your ideas will stand on their own as a theory of (some of) cognition.
Do we have a sense for how much of the orca brain is specialized for sonar? About a third of the human brain is specialized for visual perception. If sonar is harder than vision, evolution might have dedicated more of the orca brain to it. On the other hand, orcas don't need a bunch of brain for manual dexterity, like we do.
In humans, the prefrontal cortex is dedicated to "higher" forms of thinking. But evolution slides functions around on the cortical surface, and (Claude tells me) association areas like the prefrontal cortex are particularly prone to this. Just looking at the volume of the prefrontal cortex won't tell you how much actual thought goes on there.
Is this the consensus view? I think it’s generally agreed that software development has been sped up. A factor of two is ambitious! But that’s how it seems to me, and I’ve measured three examples of computer vision programming, each taking an hour or two, by doing them by hand and then with machine assistance. The machines are dumb and produce results that require rewriting. But my code is also inaccurate on a first try. I don’t have any references where people agree with me. And this may not apply to AI programming in general.
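For concreteness, this is the sort of accounting I mean, with invented timings standing in for my three examples; the geometric mean is the natural way to average speedup ratios.

```python
from math import prod

# Invented stand-ins: (minutes by hand, minutes with machine assistance, rework included).
timings = [(110, 55), (75, 45), (130, 60)]

ratios = [by_hand / assisted for by_hand, assisted in timings]
geo_mean = prod(ratios) ** (1 / len(ratios))
print("per-task speedups:", [round(r, 2) for r in ratios])
print(f"geometric-mean speedup: {geo_mean:.2f}x")
```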
You ask about “anony...
This isn’t crazy; people have tried related techniques. But it needs more of the details thought out.
In the chess example, the AIs start out very stupid, being wired at random. But in a game between two idiots, moving at random, eventually someone is going to win. And then you reinforce the techniques used by the winner, and de-reinforce the ones used by the loser. In any encounter, you learn, regardless of who wins. But in an encounter between a PM and a programmer, if the programmer fails, who gets reinforced? It might ...
This is a great question!
Point one:
The computational capacity of the brain used to matter much more than it matters now. The AIs we have now are near-human or superhuman at many skills, and we can measure how capability varies with resources in the near-human range. We can debate and extrapolate and argue with real data.
But for decades the only intelligent system we had was the human brain, so it was the only anchor we had for timelines. Even though it’s very hard to make good estimates from, we had to use it.
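A toy contrast between the two styles of estimate, with every number invented purely for illustration: the old style divides an uncertain brain-capacity anchor by an assumed growth rate, while the new style fits a trend through measured capability data and extrapolates a short distance.

```python
import numpy as np

# Old style: a single anchor and an assumed growth rate.
brain_flops = 1e16        # placeholder anchor; real estimates span many orders of magnitude
current_flops = 1e15      # invented figure for today's systems
doubling_time_years = 2.0
years_to_anchor = np.log2(brain_flops / current_flops) * doubling_time_years
print(f"anchor-style estimate: ~{years_to_anchor:.0f} years")

# New style: fit a trend through measured capability scores (invented here) and extrapolate.
years = np.array([2020, 2021, 2022, 2023, 2024])
score = np.array([0.18, 0.27, 0.41, 0.55, 0.68])   # fraction of some human-level benchmark
slope, intercept = np.polyfit(years, score, 1)
year_at_parity = (1.0 - intercept) / slope
print(f"trend-style estimate: benchmark reaches human level around {year_at_parity:.0f}")
```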
Point two:
M...
I disagree that there is a difference of kind between "engineering ingenuity" and "scientific discovery", at least in the business of AI. The examples you give (self-play, MCTS, ConvNets) were all used in game-playing programs before AlphaGo. The trick of AlphaGo was to combine them, and then discover that it worked astonishingly well. It was very clever and tasteful engineering to combine them, but only a breakthrough in retrospect. And the people who developed each of them earlier, for their own independent purposes? They were p...
Came here to say this, got beaten to it by Radford Neal himself, wow! Well, I'm gonna comment anyway, even though it's mostly been said.
Gallager proposed belief propagation as an approximate good-enough method of decoding a certain error-correcting code, but didn't notice that it worked on all sorts of probability problems. Pearl proposed it as a general mechanism for dealing with probability problems, but wanted perfect mathematical correctness, so confined himself to tree-shaped problems. It was their common generalization that was the...
Summary: Superintelligence in January-August, 2026. Paradise or mass death, shortly thereafter.
This is the shortest timeline proposed in these answers so far. My estimate (guess) is that there's only a 20% chance of this coming true, but it looks feasible as of now. I can't honestly assert it as fact, but I will say it is possible.
It's a standard intelligence explosion scenario: with only human effort, the capabilities of our AIs double every two years. Once AI gets good enough to do half the work, the doubling time drops to one year. Once we've do...
There’s a shorter hill with a good slope in McLellan park, about a mile away. It debouches into a flat area, so you can coast a long time and don’t have to worry about hitting a fence. If you’ve got the nerve, you can sled onto a frozen pond and really go far.
The shorter hill means it’s quicker to climb, so it provides roughly equal fun per hour.
This is a lot easier to deal with than other large threats. The CO2 keeps rising because fossil fuels are so nearly indispensable. AIs keep getting smarter because they’re harmless and useful now and only dangerous in some uncertain future. Nuclear weapons still exist because they can end any war. But there is no strong argument for building mirror life.
I read (much of) the 300-page report giving the detailed argument. They make a good case that the effects of a release of a mirror bacterium would be apocalyptic. But wha...
The Antarctic Treaty (and subsequent treaties) forbid colonization. They also forbid extraction of useful resources from Antarctica, thereby eliminating one of the main motivations for colonization. They further forbid any profitable capitalist activity on the continent. So you can’t even do activities that would tend toward permanent settlement, like surveying to find mining opportunities, or opening a tourist hotel. Basically, the treaty system is set up so that not only can’t you colonize, but you can’t even get close to colonizi...
A fascinating recent paper on the topic of human bandwidth is https://arxiv.org/abs/2408.10234. Title and abstract:
...This article is about the neural conundrum behind the slowness of human behavior. The information throughput of a human being is about 10 bits/s. In comparison, our sensory systems gather data at an enormous rate, no less than 1 gigabits/s. The stark contrast between these numbers remains unexplained. Resolving this paradox should teach us something fundamental about brain
They’re measuring a noisy phenomenon, yes, but that’s only half the problem. The other half of the problem is that society demands answers. New psychology results are a matter of considerable public interest and you can become rich and famous from them. In the gap between the difficulty of supply and the massive demand grows a culture of fakery. The same is true of nutrition: everyone wants to know what the healthy thing to eat is, and the fact that our current methods are incapable of discerning this is no obstacle to people who cl...
Here is a category of book that I really loved at that age: non-embarrassing novels about how adults do stuff. Since, for me, that age was in 1973, the particular books I name might be obsolete. There’s a series of novels by Arthur Hailey, with titles like “Hotel” and “Airport”, that are set inside the titular institutions, and follow people as they deal with problems and interact with each other. And there is no, or at least minimal, sex, so they’re not icky to a kid. They’re not idealized; there is a reasonable degree of fallibility, ven...
Doesn’t matter, because HPMOR is engaging enough on a chapter-by-chapter basis. As a kid I read lots of books whose overarching plot I didn’t understand. As long as I had a reasonable expectation that cool stuff would happen in the next chapter, I’d keep reading. I read “Stand On Zanzibar” repeatedly as a child, and didn’t understand the plot until I reread it as an adult last year. Same with the detective novel “A Deadly Shade of Gold”. I read it for the fistfights, snappy dialogue, and insights into adult life. The plot was lost on me.
"Nitpick: No single organism can destroy the biosphere; at most it can fill its niche & severely disrupt all ecosystems."
Have you read the report on mirror life that came out a few months ago? A mirror bacterium has a niche of “inside any organism that uses carbon-based biochemistry”. At least, it would parasitize all animals, plants, fungi, and the larger Protozoa, and probably kill them. I guess bacteria and viruses would be left. I bet that a reasonably smart superintelligence could figure out a way to get them too.