All of Carl Feynman's Comments + Replies

Nitpick: No single organism can destroy the biosphere; at most it can fill its niche & severely disrupt all ecosystems.

Have you read the report on mirror life that came out a few months ago?  A mirror bacterium has a niche of “inside any organism that uses carbon-based biochemistry”.  At least, it would parasitize all animals, plants, fungi, and the larger Protozoa, and probably kill them.  I guess bacteria and viruses would be left.  I bet that a reasonably smart superintelligence could figure out a way to get them too.

Quite right.  AI safety is moving very quickly and doesn’t have any methods that are well-understood enough to merit a survey article.  Those are for things that have a large but scattered literature, with maybe a couple of dozen to a hundred papers that need surveying.  That takes a few years to accumulate.

Could you give an example of the sort of distinction you’re pointing at?  Because I come to completely the opposite conclusion.  

Part of my job is applied mathematics.  I’d rather read a paper applying one technique to a variety of problems, than a paper applying a variety of techniques to one problem.  Seeing the technique used on several problems lets me understand how and when to apply it.  Seeing several techniques on the same problem tells me the best way to solve that particular problem, but I’ll probably never run into that particular problem in my work.

But that’s just me; presumably you want something else out of reading the literature.  I would be interested to know what exactly.

4Daniel Tan
I guess this perspective is informed by empirical ML / AI safety research. I don't really do applied math.  For example: I considered writing a survey on sparse autoencoders a while ago. But the field changed very quickly and I now think they are probably not the right approach.  In contrast, this paper from 2021 on open challenges in AI safety still holds up very well. https://arxiv.org/abs/2109.13916  In some sense I think big, comprehensive survey papers on techniques / paradigms only make sense when you've solved the hard bottlenecks and there are many parallelizable incremental directions you can go in from there. E.g. once people figured out scaling pre-training for LLMs 'just works', it makes sense to write a survey about that + future opportunities. 

When I say Pokémon-type games, I don’t mean games recounting the adventures of Ash Ketchum and Pikachu.  I mean games with a series of obstacles set in a large semi-open world, with things you can carry, a small set of available actions at each point, and a goal of progressing past the obstacles.  Such games can be manufactured in unlimited quantities by a program.  They can also be “peopled” by simple LLMs, for increased complexity.  They don’t actually have to be fun to play or look at, so the design requirements are loose.

There have ... (read more)

True.  I was generalizing it to a system that tries to solve lots of Pokémon-like tasks in various artificial worlds, rather than just expecting it to solve Pokémon over and over.  But I didn’t say that; I just imagined it and assumed everyone else would too.  Thank you for making it explicit!

2ChristianKl
It depends on how many Pokémon-like tasks are available. Given that a lot of capital goes into creating each Pokémon game, there aren't that many Pokémon games. I would expect the number of games that are very Pokémon-like to also be limited.

This is an important case to think about.  I think it is understudied.  What separates current AIs from the CEO role?  And how long will it take?  I see three things:

  • Long-term thinking, agency, the ability to remember things, not going crazy in an hour or two.  It seems to me like this is all the same problem, in the sense that I think one innovation will solve all of them.  This has a lot of effort focused on it.  I feel like it's been a known big problem since GPT-4 and Sydney/Bing, 2 1/2 years ago.  So, by the Lin
... (read more)

Eyes are exactly the worst tissue in which to express luciferase.  Lighting the eyeball from within prevents seeing things outside the eye, causing blindness.  Make your cat glow anywhere else!

Epistemic status: I know you're kidding.  So am I.

My charity extends no further than the human race.  Once in a while I think about animal ethics and decide that no, I still don't care enough to make an effort.

A basic commitment of my charity group from the beginning: no money that benefits things other than people.  We don't donate to benefit political groups, organizations, arts, animals, or the natural world.  I'm good with that.  Members of the group may of course donate elsewhere, and generally do.

We've been doing this since 1998, decades before Effective Altruism was a thing.  I don't have a commitment to Effective Altruism the movement, just to altruism which is effective.

It seems to me that I have done a lot of careful thinking about timelines, and that I also feel the AGI.  Why can't you have a careful understanding of what timelines we should expect, and also have an emotional reaction to that?  Reasonably coming to the conclusion that many things will change greatly in the next few years deserves a reaction.

2Nate Showell
As I use the term, the presence or absence of an emotional reaction isn't what determines whether someone is "feeling the AGI" or not. I use it to mean basing one's AI timeline predictions on a feeling.

A task like this, at which the AI is lousy but not hopeless, is an excellent feedback signal for RL.  It's also an excellent feedback signal for "grad student descent": have a human add mechanisms, and see if Claude gets better.  This is a very good sign for capabilities, unfortunately.

3ChristianKl
It's quite easy to use Pokémon playing as a feedback signal for becoming better at playing Pokémon. If you naively do that, the AI would learn how to solve the game but wouldn't necessarily train executive function.  A task like computer programming, where you have to find a lot of different solutions, likely provides better feedback for RL.

You would suppose wrong!  My wife and I belong to a group of a couple of dozen people that investigates charities, picks the best ones, and sends them big checks. I used to participate more, but now I outsource all the effort to my wife.  I wasn’t contributing much to the choosing process.  I just earn the money 🙂.

What does this have to do with my camp#1 intuitions?

1SpectrumDT
Now I am confused. Do you care about animal ethics as part of your commitment to effective altruism? If so, how can you do that without reasoning about it? Or do you just ignore the animals?

I don’t think enjoyment and suffering are arbitrary or unimportant.  But I do think they’re nebulous.  They don’t have crisp, definable, generally agreed-upon properties. We have to deal with them anyway.

I don’t reason about animal ethics; I just follow mainstream American cultural norms.  And I think ethics matters because it helps us live a virtuous and rewarding life.

Is that helpful?

1SpectrumDT
Thanks for the response.  I suppose you do not have any interest in effective altruism either?

Well, I’m glad you’ve settled the nature of qualia.  There’s a discussion downthread, between TAG and Signer, which contains several thousand words of philosophical discussion of qualia.  What a pity they didn’t think to look in Wikipedia, which settles the question!

Seriously, I definitely have sensations.  I just think some people experience an extra thing on top of sensations, which they think is an indissoluble part of cognition, and which causes them to find some things intuitive that I find incomprehensible.

3TAG
Definitions are not theories. Even if there is agreement about the meaning of the word, there can also be disagreement about the correct theory of qualia. Definitions always precede theories -- we could define "Sun" for thousands of years before we understood its nature as a fusion reactor. Shared definitions are a prerequisite of disagreement, rather than just talking past each other. The problem of defining qualia -- itself the first stage in specifying the problem -- can be much easier than the problem of coming up with a satisfactory theoretical account, a solution. It's a term that was created by an English-speaking philosopher less than a hundred years ago, so it really doesn't present the semantic challenges of some philosophical jargon. (The resistance to qualia can also be motivated by unwillingness to give up commitments -- bias, bluntly -- not just semantic confusion.) Semantic confusions and ideological rigidity already abound, so there is no need to propose differing cognitive mechanisms. Theories about how qualia work don't have to be based on direct intuition. Chalmers' arguments are complex, and stretch over hundreds of pages. Again, the definition is one thing and the "nature" -- the correct ontological theory -- is another. The definition is explained by Wikipedia; the correct theory, the ultimate explanation, is not. "Sensation" is ambiguous between a functional capacity -- stopping at a red light -- and a felt quality -- what red looks like. The felt quality is definitely over and above the function, but that's probably not your concern. It's true that some people have concluded nonphysical theories from qualia... but it doesn't follow that they must be directly perceiving or intuiting any kind of nonphysicalism in qualia themselves. Because it's not true that every conclusion has to be arrived at immediately, without any theoretical, argumentative background. Chalmers' arguments don't work that way and are in fact quite complex. Physics is a co
1dirk
And I think you believe others to experience this extra thing because you have failed to understand what they're talking about when they discuss qualia.

Well, I can experience happiness and sadness, and all the usual other emotions, so I just always assumed that was enough to make me a morally significant entity. 

With the lawn mower robot, we are able to say what portions of its construction and software are responsible for its charging-station-seeking behavior, in a well-understood mechanistic way.  Presumably, if we knew more about the construction of the human mind, we’d be able to describe the mechanisms responsible for human enjoyment of eating and resting.  Are the two mechanisms similar enough that it makes sense to refer to the robot enjoying things?  I think that the answer is (a) we don’t know, (b) probably not, and (c) there is no fact of ... (read more)

1SpectrumDT
Thanks for the response. You make it sound as though enjoyment and suffering are just arbitrary and unimportant shorthands to describe certain mechanistic processes. From that perspective, how do you reason about animal ethics? For that matter, why does any ethics matter at all?

An interesting and important question.  

We have data about how problem-solving ability scales with reasoning time for a fixed model.  This isn’t your question, but it’s related.  It’s pretty much logarithmic, IIRC.

The important question is, how far can we push the technique whereby reasoning models are trained?  They are trained by having them solve a problem with chains of thought (CoT), and then having them look at their own CoT, and ask “how could I have thought that faster?”  It’s unclear how far this technique can be pushed (a... (read more)
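
For concreteness, here is a minimal sketch of what “roughly logarithmic” scaling looks like once you fit a curve to it. The token budgets and accuracies below are invented purely for illustration, not real benchmark numbers.

```python
# Illustrative only: fit problem-solving accuracy against reasoning budget with a
# logarithmic curve. The data points are made up; only the functional form matters.
import numpy as np

tokens   = np.array([1e2, 1e3, 1e4, 1e5, 1e6])        # reasoning tokens per problem
accuracy = np.array([0.22, 0.35, 0.47, 0.58, 0.71])   # fraction solved (invented)

# Least-squares fit: accuracy ~ a + b * log10(tokens)
b, a = np.polyfit(np.log10(tokens), accuracy, 1)
print(f"accuracy ~ {a:.2f} + {b:.2f} * log10(tokens)")

# Under a fit like this, every 10x increase in thinking time buys roughly the same
# constant gain, which is why returns flatten quickly as the budget grows.
```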

For me, depression has been independent of the probability of doom.  I’ve definitely been depressed, but I’ve been pretty cheerful for the past few years, even as the apparent probability of near-term doom has been mounting steadily.  I did stop working on AI, and tried to talk my friends out of it, which was about all I could do.  I decided not to worry about things I can’t affect, which has clarified my mind immensely. 

The near-term future does indeed look very bright.

1abdallahhhm
Hey Carl, sorry to bother you; what I'm about to say is pretty irrelevant to the discussion, but I'm a high school student looking to gather good research experience and I wanted to ask a few questions. Is there any place I can reach out to you other than here? I would greatly appreciate any and all help!

I am in violent agreement.  Nowhere did I say that MuZero could learn a world model as complicated as those LLMs currently enjoy.  But it could learn continuously, and execute pretty complex strategies.  I don’t know how to combine that with the breadth of knowledge or cleverness of LLMs, but if we could, we’d be in trouble.

Whoops, meant MuZero instead of AlphaZero.

You shouldn’t worry about whether something “is AGI”; it’s an ill-defined concept.  I agree that current models are lacking the ability to accomplish long-term tasks in the real world, and this keeps them safe.  But I don’t think this is permanent, for two reasons.

Current large-language-model type AI is not capable of continuous learning, it is true.  But AIs which are capable of it have been built.  AlphaZero is perhaps the best example; it learns to play games to a superhuman level in a few hours.  It’s a topic of current resear... (read more)

1LWLW
MuZero doesn’t seem categorically different from AlphaZero. It has to do a little bit more work at the beginning, but if you don’t get any reward for breaking the rules, you will learn not to break the rules. If MuZero is continuously learning then so is AlphaZero. Also, the games used were still computationally simple, OOMs simpler than an open-world game, let alone a true World-Model. AFAIK MuZero doesn’t work on open-ended, open-world games. And AlphaStar never got to superhuman performance at human speed either.
2Carl Feynman
Whoops, meant MuZero instead of AlphaZero.

Welcome to Less Wrong.  Sometimes I like to go around engaging with new people, so that’s what I’m doing.

On a sentence-by-sentence basis, your post is generally correct.  It seems like you’re disagreeing with something you’ve read or heard.  But I don’t know what you read, so I can’t understand what you’re arguing for or against.  I could guess, but it would be better if you just said.  

 

1LWLW
hi, thank you! i guess i was thinking about claims that "AGI is imminent and therefore we're doomed." it seems like if you define AGI as "really good at STEM" then it is obviously imminent. but if you define it as "capable of continuous learning like a human or animal," that's not true. we don't know how to build it and we can't even run a fruit-fly connectome on the most powerful computers we have for more than a couple of seconds without the instance breaking down: how would we expect to run something OOMs more complex and intelligent? "being good at STEM" seems like a much, much simpler and less computationally intensive task than continuous, dynamic learning. tourist is great at codeforces, but he obviously doesn't have the ability to take over the world (i am making the assumption that anyone with the capability to take over the world would do so). the second is a much, much fuzzier, more computationally complex task than the first. i had just been in a deep depression for a while (it's embarrassing, but this started with GPT-4) because i thought some AI in the near future was going to wake up, become god, and pwn humanity. but when i think about it from this perspective, that future seems much less likely. in fact, the future (at least in the near-term) looks very bright. and i can actually plan for it, which feels deeply relieving to me.

I work for a company that developed its own programming language and has been selling it for over twenty years for a great deal of money.  For many of those twenty years, I worked in the group developing the language.  Before working for my current employer, I participated in several language development efforts.  I say this not in order to toot my own horn, but to indicate that what I say has some weight of experience behind it.

There is no way to get the funding you want.  I am sorry to tell you this.

From a funder's point of view, ther... (read more)

Well, let me quote Wikipedia:

Much of the debate over the importance of qualia hinges on the definition of the term, and various philosophers emphasize or deny the existence of certain features of qualia. Some philosophers of mind, like Daniel Dennett, argue that qualia do not exist. Other philosophers, as well as neuroscientists and neurologists, believe qualia exist and that the desire by some philosophers to disregard qualia is based on an erroneous interpretation of what constitutes science.

If it was that easy to understand, we wouldn't be here arguing ... (read more)

9dirk
Wikipedia also provides, in the first paragraph of the article you quoted, a quite straightforward definition: I am skeptical that you lack the cognitive architecture to experience these things, so I think your claim is false.

Humans continue to get very offended if they find out they are talking to an AI

In my limited experience of phone contact with AIs, this is only true for distinctly subhuman AIs.  Then I emotionally react like I am talking to someone who is being deliberately obtuse, and become enraged.  I'm not entirely clear on why I have this emotional reaction, but it's very strong.  Perhaps it is related to the Uncanny Valley effect.  On the other hand, I've dealt with phone AIs that (acted like they) understood me, and we've concluded a pleasant an... (read more)

Has anybody tried actual humans or smart LLMs?  It would be interesting to know what methods people actually use.

2JBlack
There are in general simple algorithms for determining S in polynomial time, since it's just a system of linear equations as in the post. Humans came up with those algorithms, and smart LLMs may be able to recognize the problem type and apply a suitable algorithm in chain-of-thought (with some probability of success). However, average humans don't know any linear algebra and almost certainly won't be able to solve more than a trivial-sized problem instance. Most struggle with the very much simpler "Lights Out" puzzle.
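
To make the linear-algebra framing concrete, here is a minimal sketch that solves the 3x3 Lights Out puzzle mentioned above by Gaussian elimination over GF(2). It illustrates the general method JBlack describes, not the post's exact problem; the board size and target pattern are my own choices.

```python
# Solve 3x3 Lights Out as a linear system over GF(2): find which cells to press
# so that every light toggles off. Illustration of the general method only.
import numpy as np

N = 3

def press_effect(r, c):
    """Flattened board change caused by pressing (r, c): that cell plus orthogonal neighbours."""
    v = np.zeros(N * N, dtype=int)
    for dr, dc in [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]:
        rr, cc = r + dr, c + dc
        if 0 <= rr < N and 0 <= cc < N:
            v[rr * N + cc] = 1
    return v

# Columns of A are the effects of each press; we want A x = b (mod 2),
# where b is the pattern of lights that must be toggled.
A = np.array([press_effect(r, c) for r in range(N) for c in range(N)]).T % 2
b = np.ones(N * N, dtype=int)   # example target: every light is currently on

def solve_gf2(A, b):
    """Gaussian elimination over GF(2); returns one solution, or None if inconsistent."""
    A, b = A.copy() % 2, b.copy() % 2
    n_rows, n_cols = A.shape
    pivot_cols, row = [], 0
    for col in range(n_cols):
        pivot = next((r for r in range(row, n_rows) if A[r, col]), None)
        if pivot is None:
            continue
        A[[row, pivot]], b[[row, pivot]] = A[[pivot, row]], b[[pivot, row]]
        for r in range(n_rows):
            if r != row and A[r, col]:
                A[r] ^= A[row]
                b[r] ^= b[row]
        pivot_cols.append(col)
        row += 1
    if b[row:].any():               # leftover equations with no variables: no solution
        return None
    x = np.zeros(n_cols, dtype=int)
    for i, col in enumerate(pivot_cols):
        x[col] = b[i]               # free variables (if any) are left at zero
    return x

presses = solve_gf2(A, b)
print(presses.reshape(N, N))        # 1 = press that cell, 0 = leave it
```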

If we're being pragmatic, the priors we had at birth almost don't matter.  A few observations will overwhelm any reasonable prior.  As long as we don't assign zero probability to anything that can actually happen, the shape of the prior makes no practical difference.
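
A tiny numerical illustration of that point, with made-up numbers: three quite different Beta priors over a coin's bias land on nearly the same answer after a hundred flips.

```python
# Three different Beta priors on a coin's bias, updated on the same (invented) data.
priors = {"uniform": (1, 1), "optimistic": (10, 1), "pessimistic": (1, 10)}
heads, tails = 70, 30   # pretend we observed 100 flips

for name, (a, b) in priors.items():
    post_mean = (a + heads) / (a + heads + b + tails)   # Beta posterior mean
    print(f"{name:11s} prior -> posterior mean {post_mean:.3f}")

# All three land near 0.7; the prior's weight falls off as (a + b) / (a + b + n),
# so any reasonable prior is soon swamped by the observations.
```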

including probably reworking some of my blog post ideas into a peer-reviewed paper for a neuroscience journal this spring.

I think this is a great idea.  It will broadcast your ideas to an audience prepared to receive them.  You can leave out the "friendly AI" motivation and your ideas will stand on their own as a theory of (some of) cognition.

Do we have a sense for how much of the orca brain is specialized for sonar?  About a third of the human brain is specialized for visual perception.  If sonar is harder than vision, evolution might have dedicated more of the orca brain to it.  On the other hand, orcas don't need a bunch of brain for manual dexterity, like us.

In humans, the prefrontal cortex is dedicated to "higher" forms of thinking.  But evolution slides functions around on the cortical surface, and (Claude tells me) association areas like the prefrontal cortex are particularly prone to this.  Just looking at the volume of the prefrontal cortex won't tell you how much actual thought goes on there.

3Towards_Keeperhood
I don't know. It's particularly bad for cetaceans. Their functional mapping looks completely different.

All the pictures are missing for me.

4gjm
They're present on the original for which this is a linkpost. I don't know what the mechanism was by which the text was imported here from the original, but presumably whatever it was it didn't preserve the images.

Is this the consensus view? I think it’s generally agreed that software development has been sped up. A factor of two is ambitious! But that’s how it seems to me, and I’ve measured three examples of computer vision programming, each taking an hour or two, by doing them by hand and then with machine assistance. The machines are dumb and produce results that require rewriting. But my code is also inaccurate on a first try. I don’t have any references where people agree with me. And this may not apply to AI programming in general.

You ask about “anony... (read more)

One argument against is that I think it’s coming soon, and I have a 40 year history of frothing technological enthusiasm, often predicting things will arrive decades before they actually do. 😀

These criticisms are often made of “market dominant minorities”, to use a sociologist’s term for what American Jews and Indian-Americans have in common. Here’s a good short article on the topic: https://scholarship.law.duke.edu/cgi/viewcontent.cgi?article=5582&context=faculty_scholarship

This isn’t crazy—people have tried related techniques.  But the details need more thought.

In the chess example, the AIs start out very stupid, being wired at random.  But in a game between two idiots, moving at random, eventually someone is going to win.  And then you reinforce the techniques used by the winner, and de-reinforce the ones used by the loser.  In any encounter, you learn, regardless of who wins.  But in an encounter between a PM and a programmer, if the programmer fails, who gets reinforced?  It might ... (read more)

3Purplehermann
The point was more about creating your own data being easy, just generate code then check it by running it. Save this code, and later use it for training. If we wanted to go the way of AlphaZero it doesn't seem crazy. De-enforce commands, functions, programs which output errors, for a start. I didn't think of the pm as being trained by these games, that's interesting. Maybe have two instances competing to get closer on some test cases the pm can prepare to go with the task, and have them competing on time, compute, memory, and accuracy. You can de-enforce the less accurate, and if fully accurate they can compete on time, memory, cpu. I'm not sure "hard but possible" is the bar - you want lots of examples of what doesn't work along with what does, and you want it for easy problems and hard ones so the model learns everything
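
As one concrete version of the “generate code, run it, de-enforce whatever errors out” loop being discussed, here is a minimal sketch of a reward function for generated programs. It is my own illustration, not anything from the thread: the reward values and the toy task are invented, and it assumes a `python` interpreter on the PATH.

```python
# Outcome-based credit assignment for generated programs: run each candidate,
# reward it if it passes, penalize it if it crashes, hangs, or gives wrong output.
import os, subprocess, tempfile, textwrap

def score_program(source: str, stdin: str, expected_stdout: str) -> float:
    """Run a candidate program and return a scalar reward usable as an RL signal."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(textwrap.dedent(source))
        path = f.name
    try:
        try:
            result = subprocess.run(
                ["python", path], input=stdin, capture_output=True,
                text=True, timeout=5,
            )
        except subprocess.TimeoutExpired:
            return -1.0                   # de-reinforce non-terminating programs
        if result.returncode != 0:
            return -1.0                   # de-reinforce crashes and exceptions
        return 1.0 if result.stdout.strip() == expected_stdout.strip() else -0.2
    finally:
        os.unlink(path)                   # clean up the temporary file

# Toy usage: two candidate solutions to "print the sum of two numbers on stdin".
good = "a, b = map(int, input().split()); print(a + b)"
bad  = "a, b = map(int, input().split()); print(a - b)"
print(score_program(good, "2 3", "5"))    # 1.0
print(score_program(bad,  "2 3", "5"))    # -0.2
```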

This is a great question!

Point one:

The computational capacity of the brain used to matter much more than it matters now.  The AIs we have now are near-human or superhuman at many skills, and we can measure how skill capacity varies with resources in the near-human range.  We can debate and extrapolate and argue with real data.

But we spent decades where the only intelligent system we had was the human brain, so it was the only anchor we had for timelines.  So even though it’s very hard to make good estimates from, we had to use it.

Point two:

M... (read more)

3Purplehermann
Product manager, non-technical counterpart to a team lead in a development team

I disagree that there is a difference of kind between "engineering ingenuity" and "scientific discovery", at least in the business of AI.  The examples you give-- self-play, MCTS, ConvNets-- were all used in game-playing programs before AlphaGo.  The trick of AlphaGo was to combine them, and then discover that it worked astonishingly well.  It was very clever and tasteful engineering to combine them, but only a breakthrough in retrospect.  And the people that developed them each earlier, for their independent purposes?  They were p... (read more)

9Steven Byrnes
Yeah I’m definitely describing something as a binary when it’s really a spectrum. (I was oversimplifying since I didn’t think it mattered for that particular context.) In the context of AI, I don’t know what the difference is (if any) between engineering and science. You’re right that I was off-base there… …But I do think that there’s a spectrum from ingenuity / insight to grunt-work. So I’m bringing up a possible scenario where near-future AI gets progressively less useful as you move towards the ingenuity side of that spectrum, and where changing that situation (i.e., automating ingenuity) itself requires a lot of ingenuity, posing a chicken-and-egg problem / bottleneck that limits the scope of rapid near-future recursive AI progress. Perhaps! Time will tell  :)

Came here to say this, got beaten to it by Radford Neal himself, wow!  Well, I'm gonna comment anyway, even though it's mostly been said.

Gallager proposed belief propagation as an approximate good-enough method of decoding a certain error-correcting code, but didn't notice that it worked on all sorts of probability problems.  Pearl proposed it as a general mechanism for dealing with probability problems, but wanted perfect mathematical correctness, so confined himself to tree-shaped problems.  It was their common generalization that was the... (read more)
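
For readers who have not seen belief propagation in action, here is a minimal sum-product sketch on a three-variable chain, the tree-shaped setting where Pearl's version is exact. The potentials are arbitrary numbers chosen for illustration, and the brute-force check at the end confirms the messages give the true marginal.

```python
# Sum-product belief propagation on the chain A - B - C (binary variables).
import numpy as np

# Pairwise potentials phi_AB[a, b], phi_BC[b, c] and unary potentials psi_X[x].
phi_AB = np.array([[4.0, 1.0], [1.0, 3.0]])
phi_BC = np.array([[2.0, 1.0], [1.0, 5.0]])
psi_A, psi_B, psi_C = np.array([1.0, 2.0]), np.array([1.0, 1.0]), np.array([3.0, 1.0])

# Messages from the two leaves toward B, then B's belief (normalized marginal).
m_A_to_B = (psi_A[:, None] * phi_AB).sum(axis=0)     # sum over a
m_C_to_B = (psi_C[None, :] * phi_BC).sum(axis=1)     # sum over c
belief_B = psi_B * m_A_to_B * m_C_to_B
belief_B /= belief_B.sum()

# Brute force over all 8 joint states, to confirm message passing is exact on a tree.
joint = (psi_A[:, None, None] * phi_AB[:, :, None] * psi_B[None, :, None]
         * phi_BC[None, :, :] * psi_C[None, None, :])
true_B = joint.sum(axis=(0, 2))
true_B /= true_B.sum()
print(belief_B, true_B)   # identical
```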

Summary: Superintelligence in January-August, 2026.  Paradise or mass death, shortly thereafter.

This is the shortest timeline proposed in these answers so far.  My estimate (guess) is that there's only a 20% chance of this coming true, but it looks feasible as of now.  I can't honestly assert it as fact, but I will say it is possible.

It's a standard intelligence explosion scenario: with only human effort, the capacities of our AIs double every two years.  Once AI gets good enough to do half the work, we double every one year.  Once we've do... (read more)
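
One way to play out that arithmetic in code. The comment above is truncated, so the exact schedule here is my own assumption: each doubling of AI capability halves the time the next doubling takes.

```python
# Accelerating-doubling sketch: start at one doubling per two years, and halve the
# doubling time after each doubling. The schedule is an assumption, not a forecast.
years_for_next_doubling = 2.0
elapsed = 0.0
for doubling in range(1, 11):
    elapsed += years_for_next_doubling
    print(f"doubling {doubling:2d} complete at year {elapsed:.3f}")
    years_for_next_doubling /= 2

# The elapsed time converges to 4 years (2 + 1 + 0.5 + ...), a finite-time blow-up
# on paper; in practice some other resource presumably becomes the limit first.
```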

8xpym
Is this the consensus view? I've seen people saying that those assistants give 10% productivity improvement, at best. On the other hand, the schedules for headline releases (GPT-5, Claude 3.5 Opus) continue to slip, and there are anonymous reports of diminishing returns from scaling. The current moment is interesting in that there are two essentially opposite prevalent narratives barely interacting with each other.

There’s a shorter hill with a good slope in McLellan park, about a mile away.  It debouches into a flat area, so you can coast a long time and don’t have to worry about hitting a fence.  If you’ve got the nerve, you can sled onto a frozen pond and really go far.

The shorter hill means it’s quicker to climb, so it provides roughly equal fun per hour.

This is a lot easier to deal with than other large threats.  The CO2 keeps rising because fossil fuels are so nearly indispensable.  AIs keep getting smarter because they’re harmless and useful now and only dangerous in some uncertain future.  Nuclear weapons still exist because they can end any war.  But there is no strong argument for building mirror life.

I read (much of) the 300 page report giving the detailed argument.  They make a good case that the effects of a release of a mirror bacterium would be apocalyptic.  But wha... (read more)

The advantage is that they would have neither predators nor parasites, and their prey would not have adapted defenses to them.  This would be true of any organism with a sufficiently unearthly biochemistry.  Mirror life is the only such organism we are likely to create in the near term.

Has anyone been able to get to the actual “300 page report”?  I follow the link in the second line of this article and I get to a page that doesn’t seem to have any way to actually download the report.

6dirk
When I went to the page just now there was a section at the top with an option to download it; here's the direct PDF link.

“…Solomonoff’s malignness…”

I was friends with Ray Solomonoff; he was a lovely guy and definitely not malign.

Epistemic status: true but not useful.

I was all set to disagree with this when I reread it more carefully and noticed it said “superhuman reasoning” and not “superintelligence”.  Your definition of “reasoning” can make this obviously true or probably false.  

The Antarctic Treaty (and subsequent treaties) forbid colonization.  They also forbid extraction of useful resources from Antarctica, thereby eliminating one of the main motivations for colonization.  They further forbid any profitable capitalist activity on the continent.  So you can’t even do activities that would tend toward permanent settlement, like surveying to find mining opportunities, or opening a tourist hotel.  Basically, the treaty system is set up so that not only can’t you colonize, but you can’t even get close to colonizi... (read more)

A fascinating recent paper on the topic of human bandwidth  is https://arxiv.org/abs/2408.10234.  Title and abstract:

The Unbearable Slowness of Being

Jieyu Zheng, Markus Meister

This article is about the neural conundrum behind the slowness of human behavior. The information throughput of a human being is about 10 bits/s. In comparison, our sensory systems gather data at an enormous rate, no less than 1 gigabits/s. The stark contrast between these numbers remains unexplained. Resolving this paradox should teach us something fundamental about brain

... (read more)

They’re measuring a noisy phenomenon, yes, but that’s only half the problem.  The other half of the problem is that society demands answers.  New psychology results are a matter of considerable public interest and you can become rich and famous from them.  In the gap between the difficulty of supply and the massive demand grows a culture of fakery.  The same is true of nutrition— everyone wants to know what the healthy thing to eat is, and the fact that our current methods are incapable of discerning this is no obstacle to people who cl... (read more)

Here is a category of book that I really loved at that age: non-embarrassing novels about how adults do stuff.  Since, for me, that age was in 1973, the particular books I name might be obsolete. There’s a series of novels by Arthur Hailey, with titles like “Hotel” and “Airport”, that are set inside the titular institutions, and follow people as they deal with problems and interact with each other.  And there is no, or at least minimal, sex, so they’re not icky to a kid.  They’re not idealized; there is a reasonable degree of fallibility, ven... (read more)

Doesn’t matter, because HPMOR is engaging enough on a chapter-by-chapter basis.  I read lots of books when I was a kid when I didn’t understand the overarching plot.  As long as I had a reasonable expectation that cool stuff would happen in the next chapter, I’d keep reading.  I read “Stand On Zanzibar” repeatedly as a child, and didn’t understand the plot until I reread it as an adult last year.  Same with the detective novel “A Deadly Shade of Gold”.  I read it for the fistfights, snappy dialogue, and insights into adult life.  The plot was lost on me.
