So when I think through the pre-mortem of "AI caused human extinction, how did it happen?" one of the more likely scenarios that comes to mind is not nano-this and bio-that, or even "one day we just all fall dead instantly and without a warning". Or a scissor statement that causes all-out wars. Or anything else noticeable.
The human mind is infinitely hackable through visual, textual, auditory and other sensory inputs. Most of us do not appreciate how easily, because being hacked does not feel like it. Instead it feels like your own volition, like you changed your mind based on logic and valid feelings. Reading a good book, listening to a good sermon or a speech, watching a show or a movie, talking to your friends and family is how mind-hacking usually happens. Abrahamic religions are a classic example. The Sequences and HPMoR are a local example. It does not work on everyone, but when it does, the subject feels enlightened rather than hacked. If you tell them their mind has been hacked, they will argue with you to the end, because clearly they just used logic to understand and embrace the new ideas.
So, my most likely extinction scenario is more like "humans realized that living is not worth it, and just kind of stopped" than anything violent. It could be spread out over years and decades, for example as people voluntarily deciding not to have children anymore. None of it would look like it was precipitated by an AI taking over. It does not even have to be a conspiracy by an unaligned SAI. It could just be that the space of new ideas, thanks to the LLMs getting better and better, expands a lot and in new enough directions to include a few lethal memetic viruses like that.
I do think the terminology of "hacks" and "lethal memetic viruses" conjures up images of extremely unnatural brain exploits when you mean quite a natural process that we already see some humans going through. Some monks/nuns voluntarily remove themselves from the gene pool and, in sects that prioritise ritual devotion over concrete charity work, they are also minimising their impact on the world.
My prior is that this level of voluntary dedication (to a cause like "enlightenment") is difficult to induce, and that there are much cruder and more effective brain hacks available.
I expect we would recognise the more lethal brain hacks as improved versions of entertainment/games/pornography/drugs. These already compel some humans to minimise their time spent competing for resources in the physical world. In a direct way, what I'm describing is the opposite of enlightenment. It is prioritising sensory pleasures over everything else.
A sufficiently godlike AI could probably convince me to kill myself (or something equivalent, for example to upload myself to a simulation... and once all humans get there, the AI can simply turn it off). Or to convince me not to have kids (in a parallel life where I don't have them already), or simply keep me distracted every day with some new shiny toy so that I never decide that today is the right day to have unprotected sex with another human and get ready for the consequences.
But it would be much easier to simply convince someone else to kill me. And I think the AI will probably choose the simpler and faster way, because why not. It does not need a complicated way to get rid of me, if a simple way is available.
This is similar to reasoning about cults or scams. Yes, some of them could get me, by being sufficiently sophisticated, accidentally optimized for my weaknesses, or simply by meeting me on a bad day. But the survival of a cult or a scam scheme does not depend on getting me specifically; they can get enough other people, so it makes more sense for them to optimize for getting many people, rather than optimize for getting me specifically.
The more typical people will get the optimized mind-hacking message. The rest of us will then get a bullet.
Superficially, human minds look like they are way too diverse for that to cause human extinction by accident. If new ideas toast some specific human subgroup, other subgroups will not be equally affected.
It would be a message customized deliberately for each human, and worked on gradually over years of subtle convincing arguments. That's how I understand the hypothetical.
I think that an AI competent enough to manage this would have faster, easier ways to accomplish the same effect, but I do agree that this would quite likely work.
If an information channel is only used to transmit information that is of negative expected value to the receiver, the selection pressure incentivizes the receiver to ignore that information channel.
That is to say, an AI which makes the most convincing-sounding argument for not reproducing to everyone will select for those people who ignore convincing-sounding arguments when choosing whether to engage in behaviors that lead to reproduction.
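To make the selection argument concrete, here is a toy sketch. It is only an illustration under loud assumptions that are not in the comment above: the "ignores convincing-sounding arguments" trait is perfectly inherited, generations are discrete, and everyone who does not ignore the channel reproduces at a fixed reduced rate. The function name and all the numbers are hypothetical.

```python
def ignorer_share_over_generations(initial_share, persuaded_fertility, generations):
    """Track the share of people who ignore the channel, generation by generation.

    Toy assumptions (hypothetical, for illustration only):
    - ignorers reproduce at relative rate 1.0,
    - everyone else reproduces at `persuaded_fertility` (below 1.0),
    - the trait is passed on perfectly to offspring.
    """
    share = initial_share
    history = [share]
    for _ in range(generations):
        ignorer_offspring = share * 1.0
        persuaded_offspring = (1.0 - share) * persuaded_fertility
        share = ignorer_offspring / (ignorer_offspring + persuaded_offspring)
        history.append(share)
    return history


# Made-up starting point: 10% ignore the channel, the rest halve their fertility.
history = ignorer_share_over_generations(0.1, 0.5, 5)
for generation, share in enumerate(history):
    print(f"generation {generation}: {share:.3f} ignore the channel")
```

In this toy model the share of ignorers only ever goes up, but it takes several generations to become a majority, so the speed of the effect depends entirely on the time scale involved.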
Yeah, but... Selection effects, in an evolutionary sense, are relevant over multiple generations. The time scale of the effects we're thinking about is shorter than a single generation. This is less of a "magic attack that destroys everyone" and more one of a thousand cuts which collectively bring down society. Some people get affected by arguments, others by distracting entertainment, others by nootropics that work well but stealthily have permanent impacts on fertility, some get caught up in terrorist attacks by weird AI-led cults... Just a bunch of stuff from a bunch of angles.
Yeah, my argument was "this particular method of causing actual human extinction would not work" not "causing human extinction is not possible", with a side of "agents learn to ignore adversarial input channels and this dynamic is frequently important".
Yeah. I don't actually think that a persuasive argument targeted to every single human is an efficient way for a superintelligent AI to accomplish its goals in the world. Someone else mentioned convincing the most gullible humans to hurt the wary humans. If the AI's goal was to inhibit human reproduction, it would be simple to create a bioweapon to cause sterility without killing the victims. Doesn't take very many loyally persuaded humans to be the hands for a mission like that.
That is indeed a bit of a defense. Though I suspect human minds have enough similarities that there are at least a few universal hacks.
None of it would look like it was precipitated by an AI taking over.
But, to be clear, in this scenario it would in fact be precipitated by an AI taking over? Because otherwise it's an answer to "humans went extinct, and also AI took over, how did it happen?" or "AI failed to prevent human extinction, how did it happen?"
Any of those. Could be some kind of intentionality ascribed to AI, could be accidental, could be something else.
I'm not even going to ask how a pouch ends up with voice recognition and natural language understanding when the best Artificial Intelligence programmers can't get the fastest supercomputers to do it after thirty-five years of hard work
Some HPMoR statements did not age as gracefully as others.
1991/1992, actually (Harry Potter was born July 1980, and the story takes place the school year after his 11th birthday.)
This is also why HPMOR!Harry is worried about grey goo and wants to work in nanotech, and is only vaguely interested in AI; I think those were Eliezer's beliefs in about 1995 (he would have been 16).
Harry is not meant to be a logically omniscient god who never makes any mistakes. Even on its own terms, “thirty-five years of hard work” is really not that long, nowhere near long enough for Harry to rightly believe the problem is so hard that magic must be pulling some crazy impossible to understand bullshit to accomplish the feat, and transparently so. Harry’s young, and in fact doesn’t have much historical perspective.
I notice my confusion when Eliezer speaks out against the idea of expressing p(doom) as a number: https://x.com/ESYudkowsky/status/1823529034174882234
I mean, I don't like it either, but I thought his whole point about Bayesian approach was to express odds and calculate expected values.
He explains why two tweets down the thread.
The idea of a "p(doom)" isn't quite as facially insane as "AGI timelines" as marker of personal identity, but (1) you want action-conditional doom, (2) people with the same numbers may have wildly different models, (3) these are pretty rough log-odds and it may do violence to your own mind to force itself to express its internal intuitions in those terms which is why I don't go around forcing my mind to think in those terms myself, (4) most people haven't had the elementary training in calibration and prediction markets that would be required for them to express this number meaningfully and you're demanding them to do it anyways, (5) the actual social role being played by this number is as some sort of weird astrological sign and that's not going to help people think in an unpressured way about the various underlying factual questions that ought finally and at the very end to sum to a guess about how reality goes.
This seems very reasonable to me, and I think it's a very common opinion about [edit: I meant among] AI safety people that discussing p(doom) numbers without lots of underlying models is not very useful.
The important part of Eliezer's writing on probability IMO is to notice that the underlying laws of probability are Bayesian and do sanity checks, not to always explicitly calculate probabilities. Given that it's only kinda useful in life in general, it is reasonable that (4) and (5) can make trying it net negative.
So, is he saying that he is calibrated well enough to have a meaningful "action-conditional" p(doom), but most people are not? And that they should not engage in "fake Bayesianism"? But then, according to the prevailing wisdom, how would one decide how to act if they cannot put a number on each potential action?
Speaking only for myself: perhaps you should put a number on each potential action and choose accordingly, but you do not need to communicate the exact number. Yes, the fact that you worry about safety and chose to work on it already implies something about the number, but you don't have to make the public information even more specific.
This creates some problems with coordination; if you believe that p(doom) is exactly X, it would have certain advantages if all people who want to contribute to AI safety believed that p(doom) is exactly X. But maybe the disadvantages outweigh that.
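A minimal sketch of the "decide with numbers, communicate only the decision" idea, assuming nothing beyond what is said above. The action names and probabilities are made up purely for illustration and are not anyone's actual estimates; `best_action` and `public_statement` are hypothetical helper names.

```python
# Hypothetical, made-up action-conditional estimates (illustration only).
p_doom_given_action = {
    "work_on_safety": 0.30,
    "do_nothing": 0.35,
}


def best_action(conditional_risk):
    """Return the action with the lowest conditional probability of the bad outcome."""
    return min(conditional_risk, key=conditional_risk.get)


def public_statement(conditional_risk):
    """Communicate only the chosen action, not the private numbers behind it."""
    return f"I chose: {best_action(conditional_risk)}"


print(public_statement(p_doom_given_action))
```

The choice still leaks something about the private numbers (the chosen action looked better than the alternative), but nothing more specific than that.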
That is a good point, deciding is different from communicating the rationale for your decisions. Maybe that is what Eliezer is saying.
Thank you, I forgot about that one. I guess the summary would be "if your calibration for this class of possibilities sucks, don't make up numbers, lest you start trusting them". If so, that makes sense.
Just a quote found online:
SpaceX can build fully reusable rockets faster than the FAA can shuffle fully disposable paper
My expectation, which I may have talked about before here, is that the LLMs will eat all of the software stack between the human and the hardware. Moreover, they are already nearly good enough to do that; the issue is that people have not yet adapted to the AI being able to do that. I expect there to be no OS, no standard UI/UX interfaces, no formal programming languages. All interfaces will be more ad hoc, created by the underlying AI to match the needs of the moment. It can be Star Trek-like "computer, plot a course to...", or a set of buttons popping up on your touchscreen, or maybe physical buttons and keys being labeled as needed in real time, or something else. But not the ubiquitous rigid interfaces of the last millennium. For clues about what is already possible but not yet implemented, one should look to the scifi movies and shows, unconstrained by the current limits. Almost everything useful there is already doable or will be in a short while. I hope someone is working on this.
I think I articulated this view here before, but it is worth repeating. It seems rather obvious to me that there are no "Platonic" laws of physics, and there is no Platonic math existing in some ideal realm. The world just is, and everything else is emergent. There are reasonably durable patterns in it, which can sometimes be usefully described as embedded agents. If we squint hard, and know what to look for, we might be able to find a "mini-universe" inside such an agent, which is a poor-fidelity model of the whole universe, or, more likely, of a tiny part of it. These patterns we call agents appear to be fairly common and multi-level, and if we try to generalize the models they use across them, we find that something like "laws of physics" is a concise description. In that sense the laws of physics exist in the universe, but only as an abstraction over embedded agents of a certain level of complexity.
It is not clear whether any randomly generated world would necessarily get emergent patterns like that, but the one we live in does, at least to a degree. It is entirely possible that there is a limit to how accurate a model a tiny embedded agent can contain. For example, if most of the universe is truly random, we would never be able to understand those parts, and they would look like miracles to us, just something that pops up without any observable cause. Another possibility is that we might find some patterns that are regular but defy analysis. These would look to us like "magic": something we know how to call into being, but that defies any rational explanation.
We certainly hope that the universe we live in does not contain either miracles or magic, but it is, in the end, an open empirical question, and does not require any kind of divine power or dualism, it might just be the feature of our world.
There is an emergent reason, one that lives in the minds of the agents. The universe just is. In other words, if you are a hypothetical Laplace's demon, you don't need the notion of a reason, you see it all at once, past, present and future.
Let's consider a phenomenon like the planets going around the sun. They keep going around and around it, with remarkable consistency and precision. An "ontological realist" about laws of physics would say that the laws of physics are the reason why the planets engage in this repetitive behaviour, rather than taking off in a different direction, or just dissolving into nothingness. Do you believe that this was happening, even before there were any human agents to form mental representations of the situation? Do you have any mind-independent explanation of why the planets were doing one thing rather than another? Or are these just facts without mind-independent explanations, facts without causes, facts which could have been completely different without making any difference to anything else?
I am not sure why you are including the mind here, maybe we are talking at cross purposes. I am not making statements about the world, only about the emergence of the laws of physics as written in textbooks, which exist as abstractions across human minds. If you are the Laplace's demon, you can see the whole world, and if you wanted to zoom into the level of "planets going around the sun", you could, but there is no reason for you to. This whole idea of "facts" is a human thing. We, as embedded agents, are emergent patterns that use this concept. I can see how it is natural to think of facts, planets or numbers as ontologically primitive or something, not as emergent, but this is not the view I hold.
I am not sure why you are including the mind here
Isn't your thesis that "laws of physics" only exist in the mind? But in that case, they can't be a causal or explanatory factor in anything outside the mind; which means that there are no actual explanations for the patterns in nature, whether you look at them dynamically or atemporally. There's no reason why planets go round the stars, there's no reason why orbital speeds correlate with masses in a particular way, these are all just big coincidences.
Isn't your thesis that "laws of physics" only exist in the mind?
Yes!
But in that case, they can't be a causal or explanatory factor in anything outside the mind
"a causal or explanatory factor" is also inside the mind
which means that there are no actual explanations for the patterns in nature
What do you mean by an "actual explanation"? Explanations only exist in the mind, as well.
There's no reason why planets go round the stars
The reason (which is also in the minds of agents) is Newton's law, which is an abstraction derived from the model of the universe that exists in the minds of embedded agents.
there's no reason why orbital speeds correlate with masses in a particular way, these are all just big coincidences
"None of this is a coincidence because nothing is ever a coincidence" https://tvtropes.org/pmwiki/pmwiki.php/Literature/Unsong
"Coincidence" is a wrong way of looking at this. The world is what it is. We live in it and are trying to make sense of it, moderately successfully. Because we exist, it follows that the world is somewhat predictable from the inside, otherwise life would not have been a thing. That is, tiny parts of the world can have lossily compressed but still useful models of some parts/aspects of the world. Newton's laws are part of those models.
A more coherent question would be "why is the world partially lossily compressible from the inside", and I don't know a non-anthropic answer, or even if this is an answerable question. A lot of "why" questions in science bottom out at "because the world is like that".
... Not sure if this makes my view any clearer, we are obviously working with very different ontologies.
A lot of "why" questions in science bottom out at "because the world is like that".
But you see, by treating the laws of physics as nothing but mental constructs (rather than as a reality with causal power, that is imperfectly approximated by minds), you extend the realm of brute facts rather radically. Under a law-based conception of physical reality, the laws and the initial conditions may be brute facts, but everything else is a consequence of those facts. By denying that there are mind-independent laws at all, all the concrete patterns of physics (from which the existence of the laws is normally inferred) instead become brute facts too.
I think I understand your speculations about an alternative paradigm, e.g. maybe intelligent life can't exist in worlds that don't have sufficiently robust patterns, and so the informational compressibility of the world is to be attributed to anthropics rather than to causally ordering principles. But this faces the same problem as the idea that the visible universe arose as a Boltzmann fluctuation, or that you yourself are a Boltzmann brain: the amount of order is far greater than such a hypothesis implies. A universe created in a Boltzmann fluctuation would only need one galaxy or even one star. A hallucinated life experienced by a Boltzmann brain ought to unravel at any moment, as the vacuum of space kills the brain.
The simplest explanation is that some kind of Platonism is real, or more precisely (in philosophical jargon) that "universals" of some kind do exist. One does not need to be a literal Platonist about them. Aristotle's approach is closer to common sense: universals are always attached to some kind of substance. Philosophers may debate about the right way to think of them, but to remove them outright, because of a philosophical prejudice or blindspot, leads to where you are now.
I was struck by something I read in Bertrand Russell, that some of the peculiarities of Leibniz's worldview arose because he did not believe in relations, he thought substance and property are the only forms of being. As a result, he didn't think interaction between substances is possible (since that would be a relation), and instead came up with his odd theory about a universe of monadic substances which are all preprogrammed by God to behave as if they are interacting.
I am not 100% against these radical attempts to do without something basic in ontology, because who knows what creative ideas may arise as a result? But personally I prefer to posit as rich an ontology as possible, so that I will not unnecessarily rule out an explanation that may be right in front of me.
It seems like we are not even close to converging on any kind of shared view. I don't find the concept of "brute facts" even remotely useful, so I cannot comment on it.
But this faces the same problem as the idea that the visible universe arose as a Boltzmann fluctuation, or that you yourself are a Boltzmann brain: the amount of order is far greater than such a hypothesis implies.
I think Sean Carroll answered this one a few times: the concept of a Boltzmann brain is not cognitively stable (you can't trust your own thoughts, including that you are a Boltzmann brain). And if you try to make it stable, you have to reconstruct the whole physical universe. You might be saying the same thing? I am not claiming anything different here.
The simplest explanation is that some kind of Platonism is real, or more precisely (in philosophical jargon) that "universals" of some kind do exist.
Like I said in the other reply, I think that those two words are not useful as binaries real/not real, exist/not exist. If you feel that this is non-negotiable to make sense of philosophy of physics or something, I don't know what to say.
I was struck by something I read in Bertrand Russell, that some of the peculiarities of Leibniz's worldview arose because he did not believe in relations, he thought substance and property are the only forms of being. As a result, he didn't think interaction between substances is possible (since that would be a relation), and instead came up with his odd theory about a universe of monadic substances which are all preprogrammed by God to behave as if they are interacting.
Yeah, I think denying relations is going way too far. A relation is definitely a useful idea. It can stay in epistemology rather than in ontology.
I am not 100% against these radical attempts to do without something basic in ontology, because who knows what creative ideas may arise as a result? But personally I prefer to posit as rich an ontology as possible, so that I will not unnecessarily rule out an explanation that may be right in front of me.
Fair, it is foolish to reduce potential avenues of exploration. Maybe, again, we differ where they live, in the world as basic entities or in the mind as our model of making sense of the world.
It seems rather obvious to me that there are no "Platonic" laws of physics, and there is no Platonic math existing in some ideal realm.
I am rather uncertain about what it means for something to "exist", as a stand-alone form. When people use this word, it often seems to end up referring to a free-floating belief that does not pay rent in anticipated experiences.
Is there anything different about the world that I should expect to observe depending on whether Platonic math "exists" in some ideal realm? If not, why would I care about this topic once I have already dissolved my confusion about what beliefs are meant to refer to? What could serve as evidence one way or another when answering the question of whether math "exists"?
By contrast, we can talk about "reality" existing separately from our internal conception of it because the map is not the territory, which has specific observable consequences, as Eliezer beautifully explained in a still-underrated post from 17 years ago:
I pause. “Well . . .” I say slowly. “Frankly, I’m not entirely sure myself where this ‘reality’ business comes from. I can’t create my own reality in the lab, so I must not understand it yet. But occasionally I believe strongly that something is going to happen, and then something else happens instead. I need a name for whatever-it-is that determines my experimental results, so I call it ‘reality’. This ‘reality’ is somehow separate from even my very best hypotheses. Even when I have a simple hypothesis, strongly supported by all the evidence I know, sometimes I’m still surprised. So I need different names for the thingies that determine my predictions and the thingy that determines my experimental results. I call the former thingies ‘belief,’ and the latter thingy ‘reality.’ ” - The Simple Truth
(Bolding mine) If the bolded section was not correct, i.e., if you were an omniscient being whose predictions always panned out in actuality, you would likely not need to keep track of reality as a separate concept from the inner workings of your mind, because... reality would be the same as the inner workings of your mind. But because this is false, and all of us are bounded, imperfect, embedded beings without the ability to fully understand what is around us, we need to distinguish between "what we think will happen" and "what actually happens."
Later on in the thread, you talked about "laws of physics" as abstractions written in textbooks, made so they can be understandable to human minds. But, as a terminological matter, I think it is better to think of the laws of physics as the rules that determine how the territory functions, i.e., the structured, inescapable patterns guiding how our observations come about, as opposed to the inner structure of our imperfect maps that generate our beliefs.
From this perspective, Newton's 3rd law, for example, is not a real "law" of physics, for we know it can be broken (it does not perfectly represent the underlying reality). Neither is General Relativity or QFT, partly because we know they are at odds with one another in particular regimes and do not always give predictions that align with our observations. The "laws" of physics would be restricted to the collection of rules that always output correct predictions (which might very well be an empty set if, for example, reductionism ends up being false and we turn out to live in a world devoid of perfect underlying structure).
Is there anything different about the world that I should expect to observe depending on whether Platonic math “exists” in some ideal realm? If not, why would I care about this topic once I have already dissolved my confusion about what beliefs are meant to refer to?
Word of Yud is that beliefs aren't just about predicting experience. While he wrote Beliefs Must Pay Rent, he also wrote No Logical Positivist I.
(Another thing that has been going on for years is people quoting Beliefs Must Pay Rent as though it's the whole story).
Maybe you are a logical positivist, though... you're allowed to be, and the rest of us are allowed not to be. It's a value judgement: what doesn't have instrumental value toward predicting experience can still have terminal value.
If you are not an LP, idealist, etc., you are interested in finding the best explanation for your observations -- that's metaphysics. Shminux seems sure that certain negative metaphysical claims are true -- there are no Platonic numbers, objective laws, nor real probabilities. LP does not allow such conclusions: it rejects both positive and negative metaphysical claims as meaningless.
The question is what would support the dogmatic version of nomic antirealism, as opposed to the much more defensible claim that we don't know one way or the other (irrealism).
Later on in the thread, you talked about “laws of physics” as abstractions written in textbooks, made so they can be understandable to human minds. But, as a terminological matter, I think it is better to think of the laws of physics as the rules that determine how the territory functions, i.e., the structured, inescapable patterns guiding how our observations come about, as opposed to the inner structure of our imperfect maps that generate our beliefs.
The term can be used in either sense. Importantly, it can be used in both senses: the existence of the in-the-mind sense doesn't preclude the existence of the in-reality sense. Maps don't necessarily correspond to reality, but they can. "Doesn't necessarily correspond" doesn't mean the same thing as "necessarily doesn't correspond".
@Shminux
It is not clear whether any randomly generated world would necessarily get emergent patterns like that, but the one we live in does, at least to a degree
And maybe there is a reason for that... and maybe the reason is the existence of Platonic in-the-territory physical laws. So there's an argument for nomic realism. Is there an argument against? You haven't given one, just "articulated a claim".
So in your opinion, there is no reason why anything happens?
There is an emergent reason, one that lives in the minds of the agents.
But that's not the kind of reason that makes anything happen -- it's just a passive model.
The universe just is.
That isn't an argument against or for Platonic laws. Maybe it just is in a way that includes Platonic laws, maybe it isn't.
In other words, if you are a hypothetical Laplace’s demon, you don’t need the notion of a reason, you see it all at once, past, present and future.
I think you mean a hypothetical God with a 4D view of spacetime. LD only has the ability to work out the future from a 3D snapshot. Yes, if you could see past, present and future, you wouldn't need in-the-mind laws to make predictions... but, again, that says nothing about in-the-territory, Platonic laws. Even if God doesn't need in-the-mind laws, it's still possible that reality needs in-the-territory laws to make things happen.
“a causal or explanatory factor” is also inside the mind
Anthropics and Boltzmann brains are also in the mind. As concepts.
What's in the mind has to make sense, to fit together. Even if maths is all in the mind, maths problems still need to be solved. Saying maths is all in the mind does not tell you whether a particular theorem is true or false. Likewise, saying metaphysics is all in the mind does not tell you that nomic realism is false, and anthropics true.
We have a meta-map of the mind-world relation, and if we assume a causal relation from the world to the mind, we can explain where new information comes from, and if we assume lawful behaviour in the world, we can explain regularities. Maybe these are all concepts we have, but we still need to fit them together in a way that reduces the overall mystery, just as we still need to solve maths problems.
What do you mean by an “actual explanation”?
Fitting them together in a way that reduces the overall mystery.
We live in it and are trying to make sense of it
And if you want us to believe that the instrumentalist picture makes the most sense, you need to argue for it. The case for realism, by contrast, has been made.
A more coherent question would be “why is the world partially lossily compressible from the inside”, and I don’t know a non-anthropic answer
The objective existence of physical laws, nomic realism, is a non-anthropic answer which has already been put to you.
ETA
Maybe, again, we differ where they live, in the world as basic entities or in the mind as our model of making sense of the world.
...or both, since...
it is foolish to reduce potential avenues of exploration.
Yudkowsky's argument that probability is subjective is flawed, because it rests on the assumption that the existence of subjective probability implies the non-existence of objective probability, but the assumption is never justified. But you seem to buy into it anyway. And you seem to be basing your anti-realism on a similar unargued assumption.
you are interested in finding the best explanation for your observations -- that's metaphysics. Shminux seems sure that certain negative metaphysical claims are true -- there are no Platonic numbers, objective laws, nor real probabilities
I really don't understand what "best explanation", "true", or "exist" mean, as stand-alone words divorced from predictions about observations we might ultimately make about them.
This isn't just a semantic point, I think. If there are no observations we can make that ultimately reflect whether something exists in this (seems to me to be) free-floating sense, I don't understand what it can mean to have evidence for or against such a proposition. So I don't understand how I am even supposed to ever justifiably change my mind on this topic, even if I were to accept it as something worth discussing on the object-level.
A: "I am interested in knowing whether Platonic mathematical entities exist."
B: "What does it mean for something to 'exist'?"
A: "Well, it seems to be an irreducible metaphysical claim; something can exist, or not exist."
B: "Um... I'm not sure about this whole notion of 'metaphysics'. But anyway, trying a different track, do these Platonic mathematical entities interact with observable physical reality in any discernible way?"
A: "I mean, as a practical descriptor, we use those particular mathematical entities because they are useful to us when we model reality; but at their core, they are supposed to be Platonic. They are forms, not concrete entities. Nevertheless, we can reason about them; we can define them, relate them to one another, prove theorems about them. We can even develop better intuitions about them over time as we work with them more and more. Is that what you mean by 'discernible'?"
B: "Not exactly... Look, as Eliezer has said:
"Information theory 101," the boy said in a lecturing tone. "Observing variable X conveys information about variable Y, if and only if the possible values of X have different probabilities given different states of Y.
B (continuing): "So let's think of the existence/non-existence status of Platonic math as the variable Y. Now, look, I am fine with accepting a broad perspective on the 'variable' X. Let it be an ocular observation, or the running of a scientific experiment, or a bunch of internal reasoning, or other stuff you might propose. I just want to know whether, given that these Platonic entities seem to live in an entirely separate magisterium, how is it that any observation/thought/conclusion we reach about the supposed existence of Platonic math (i.e., any particular value of Y) can result in different probabilities for the possible values of X?"
A: "You're implicitly adopting a strong Bayesian perspective that you have neither argued for nor properly applied-"
B: "Fine! Drop the explicit probability stuff, then. I exist as a being embedded in the physical world. These mathematical entities are defined to exist (if they exist, which seems to be the question) in a Platonic realm that does not intersect this reality. After all, they are ideal forms, not concrete instances. So where is the entanglement between the two realms?"
A: "What do you mean by entanglement? I hope you're not sneakily trying to assume a correspondence theory of truth yet again without justifying it."
B: "What I mean is that I don't understand how anything I do in this physical world that I belong to-"
A: "What makes you so sure you really live in a physical world? That's the territory, not the map. All you have access to is the information given by your sense data and by your internal state of mind, including your memory. Maybe there is no outside world after all; maybe solipsism is true; maybe you live in an infinite-layer-deep simulation. Or maybe all of that is wrong and you actually are correct about living in some universe that runs on rough modifications of General Relativity and QFT, but even in that case, at the present moment you are totally not justified in assigning such confidence to this. You are merely assuming it instead, and this is precisely the type of metaphysical assumption you were pretending to be unable to wrap your head around at the start of the conversation."
B: "Good Lord, fine! I'll drop the physical universe assumption as well, for purposes of this conversation, even though I think I actually do have great justifications for being confident in it. But I won't let the discussion get derailed any further in that direction. But I can still say that I am a concrete entity, precisely the type of being that shminux was contrasting the nature of when they mentioned Platonic mathematics. So I am concrete, and everything that goes on in my brain runs on the concrete me-stuff (I am intentionally avoiding the word 'matter'), and yet I am supposed to have an opinion about the existence/non-existence of something that is entirely not concrete."
A: "Isn't that something you already conceded you were willing to try to do when you entered this dialogue? You could have walked away at the beginning or anytime after giving your whole spiel about beliefs paying rent in anticipated experiences."
B: "Yes, I am willing to try to do that. But I don't understand how I am supposed to ever change my mind on this topic, as I have already mentioned. What can possibly sway me one way or another when all variables X that I appear to be able to observe (or think about, etc.) are in the concrete realm, which is defined to be entirely non-intersecting with the Platonic realm? Does anything in this concrete realm have different values, or likelihoods, or entanglements, given different possible versions of whether the Platonic realm and its inhabitants exist? If math exists, is there anything that distinguishes this possibility? If math doesn't exist, is there anything that distinguishes that possibility? What can count as epistemically justifiable updating (or, if you dislike the Bayesian connotations around that word, think 'mind-changing' instead) on this topic? As a general matter, I usually change my mind if I see new evidence or hear new good arguments, but I think of 'good arguments' as lines of logic and reasoning, whose validity and soundness implies we are more likely to be in a world where certain possibilities are true rather than others (when mulling over multiple hypotheses). What can that possibly mean in this context?"
A: [I'm not sure I can simulate A's perspective after this point]
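To make B's "Information theory 101" criterion concrete, here is a minimal sketch that computes the mutual information between X and Y for two made-up joint distributions. The function name `mutual_information` and both probability tables are illustrative assumptions, not anything taken from the dialogue. When X has the same distribution under every state of Y, the mutual information is zero, so observing X says nothing about Y; once the distribution of X differs across states of Y, it becomes positive.

```python
from math import log2


def mutual_information(joint):
    """Mutual information I(X;Y) in bits.

    `joint[y][x]` is the (hypothetical) joint probability P(X=x, Y=y).
    """
    p_y = [sum(row) for row in joint]          # marginal over Y
    p_x = [sum(col) for col in zip(*joint)]    # marginal over X
    mi = 0.0
    for y, row in enumerate(joint):
        for x, p in enumerate(row):
            if p > 0:  # zero-probability cells contribute nothing
                mi += p * log2(p / (p_x[x] * p_y[y]))
    return mi


# Illustrative tables (assumptions for this sketch only).
# Case 1: P(X | Y) is the same for both states of Y, so X is uninformative.
no_entanglement = [[0.25, 0.25],
                   [0.25, 0.25]]

# Case 2: X is more likely to be 0 when Y is 0 and 1 when Y is 1.
entanglement = [[0.45, 0.05],
                [0.05, 0.45]]

print(mutual_information(no_entanglement))  # zero, up to floating-point error
print(mutual_information(entanglement))     # strictly positive
```

B's question to A amounts to asking which of these two tables describes the relation between any accessible X and the existence of Platonic entities; if it is the first, no observation or argument can shift the estimate of Y.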
I really don’t understand what “best explanation”, “true”, or “exist” mean, as stand-alone words divorced from predictions about observations we might ultimately make about them.
Nobody is saying that anything has to be divorced from prediction, in the sense that empirical evidence is ignored: the realist claim is that empirical evidence should be supplemented by other epistemic considerations.
Best explanation: I already pointed out that EY is not an instrumentalist. For instance, he supports the MWI over the CI, although they make identical predictions. Why does he do that? For reasons of explanatory simplicity, consilience with the rest of physics, etc., as he says. That gives you a clue as to what "best explanation" is. (Your bafflement is baffling... it sometimes sounds like you have read the Sequences, it sometimes sounds like you haven't. Of course abduction, parsimony, etc. are widely discussed in the mainstream literature as well.)
True: a mental concept corresponds to reality.
Exists: You can take yourself as existing, and you can regard other putative entities as existing if they have some ability to causally interact with you. That's another baffling one, because you actually use something like that definition in your argument against mathematical realism below.
This isn’t just a semantic point, I think. If there are no observations we can make that ultimately reflect whether something exists in this (seems to me to be) free-floating sense, I don’t understand what it can mean to have evidence for or against such a proposition.
Empirical evidence doesn't exhaust justification. But you kind of know that, because you mention "good argument" below.
So I don’t understand how I am even supposed to ever justifiably change my mind on this topic, even if I were to accept it as something worth discussing on the object-level.
A priori necessary truths can be novel and surprising to an agent in practice, even though they are a priori and necessary in principle... because a realistic agent can't instantaneously and perfectly correlate their mental contents, and doesn't have an oracular list of every theory in their head. You are not a perfect Bayesian. You can notice a contradiction that you haven't noticed before. You can be informed of a simpler explanation that you hadn't formulated yourself.
What can possibly sway me one way or another when all variables X that I appear to be able to observe (or think about, etc.) are in the concrete realm, which is defined to be entirely non-intersecting with the Platonic realm?
Huh? I was debating nomic realism. Mathematical realism is another thing. Objectively existing natural laws obviously intersect with concrete observations, because if gravity worked on an inverse cube law (etc.), everything would look very different.
You don't have to buy into realism about all things, or anti-realism about all things. You can pick and choose. I don't personally believe in Platonic realism about mathematics, for the same reasons you don't. I believe nomic realism is another question... it's logically possible for physical laws to have been different.
@shminux defined the thing he is arguing against as "Platonic"... I don't have to completely agree with that, nor do you. Maybe it's just a mistake to think of nomic realism as Platonism. Platonism marries the idea of non-mental existence and the idea of non-causality... But they can be treated separately.
What can that possibly mean in this context?
What context? You're talking about mathematical realism, I'm talking about nomic realism.
as lines of logic and reasoning, whose validity and soundness implies we are more likely to be in a world where certain possibilities are true rather than others (when mulling over multiple hypotheses)
What have I said that makes you think I have departed from that?
@Shminux
If push comes to shove, I would even dispute that “real” is a useful category once we start examining deep ontological claims
Useful for what? If you terminally value uncovering the true nature of reality, as most scientists and philosophers do, you can hardly manage without some concept of "real". If you only value making predictions, perhaps you don't need the concept... But then the instrumentalist/realist divide is a difference in values, as I previously said, not a case of one side being wrong and the other side being right.
“Exist” is another emergent concept that is not even close to being binary, but more of a multidimensional spectrum (numbers, fairies and historical figures lie on some of the axes).
"Not a binary" is a different take from "not useful".
The critical point is that we have no direct access to the underlying reality, so we, as tiny embedded agents, are stuck dealing with the models regardless.
"No direct access to reality" is a different claim to "no access to reality", which is a different claim to "there is no reality", which is a different claim to "the concept of reality is not useful".
I can provisionally accept that there is something like a universe that “exists”, but, as I said many years ago in another thread, I am much more comfortable with the ontology where it is models all the way down (and up and sideways and every which way).
It's incoherent. What are these models, models of?
Thanks, I think you are doing a much better job voicing my objections than I would.
If push comes to shove, I would even dispute that "real" is a useful category once we start examining deep ontological claims. "Exist" is another emergent concept that is not even close to being binary, but more of a multidimensional spectrum (numbers, fairies and historical figures lie on some of the axes). I can provisionally accept that there is something like a universe that "exists", but, as I said many years ago in another thread, I am much more comfortable with the ontology where it is models all the way down (and up and sideways and every which way). This is not really a critical point though. The critical point is that we have no direct access to the underlying reality, so we, as tiny embedded agents, are stuck dealing with the models regardless.
Thank you for your thoughtful and insightful reply! I think there is a lot more discussion that could be had on this topic, and we are not very far apart, but this is supposed to be a "shortform" thread.
I never liked The Simple Truth post, actually. I sided with Mark, the instrumentalist, whom Eliezer turned into what I termed back then as "instrawmantalist". Though I am happy with the part
“Necessary?” says Inspector Darwin, sounding puzzled. “It just happened. . . I don’t quite understand your question.”
Rather recently Devs the show, which, for all its flaws, has a bunch of underrated philosophical highlights, had an episode with a somewhat similar storyline.
Anyway, appreciate your perspective.
I don't understand how one part of the post relates to another. Yeah, sure, computational irreducibility of the world can make understanding of the world impossible and this would be sad. But I don't see what it has to do with "Platonic laws of physics". Current physics, if anything, is sort of antiplatonic: it claims that there are several dozen independent entities, actually existing, called "fields", which produce the entire range of observable phenomena via interacting with each other, and there is no "world" outside this set of entities. "Laws of nature" are just "how these entities are". Outside very radical skepticism I don't know any reasons to doubt this worldview.
By "Platonic laws of physics" I mean Hawking's famous question
What is it that breathes fire into the equations and makes a universe for them to describe…Why does the universe go to all the bother of existing?
Re
Current physics, if anything, is sort of antiplatonic: it claims that there are several dozen independent entities, actually existing, called "fields", which produce the entire range of observable phenomena via interacting with each other, and there is no "world" outside this set of entities.
I am not sure if it actually "claims" that. A HEP theorist would say that QFT (the standard model of particle physics) + classical GR is our current best model of the universe, with a bunch of experimental evidence that this is not all it is. I don't think there is a consensus for an ontological claim of "actually existing" rather than "emergent". There is definitely a consensus that there is more to the world than the fundamental laws of physics we currently know, and that some new paradigms are needed to know more.
"Laws of nature" are just "how these entities are". Outside very radical skepticism I don't know any reasons to doubt this worldview.
No, I don't think that is an accurate description at all. Maybe I am missing something here.
How to make a dent in the "hard problem of consciousness" experimentally. Suppose we understand the brain well enough to figure out what makes one experience specific qualia, then stimulate the neurons in a way that makes the person experience them. Maybe even link two people with a "qualia transducer" such that when one person experiences "what it's like", the other person can feel it, too.
If this works, what would remain from the "hard problem"?
Chalmers:
To see this, note that even when we have explained the performance of all the cognitive and behavioral functions in the vicinity of experience—perceptual discrimination, categorization, internal access, verbal report—there may still remain a further unanswered question: Why is the performance of these functions accompanied by experience?
If you can distill, store and reproduce this experience on demand, what remains? Or, at least, what would/does Chalmers say about it?
If you can distill, store and reproduce this experience on demand, what remains?
As lc has said:
For the record: the purpose of having a "theory of consciousness" is so it can tell us which blobs of matter feel particular things under which specific circumstances, and teach others how to make new blobs of matter that feel particular things. Down to the level of having a field of AI anaesthesiology.
So at least part of what remains, for example, is the task of figuring out, with surgical precision, whether any given LLM (or other AI agent) is "conscious" in any given situation we place it in. This is because your proposal, although it would massively increase our understanding of human consciousness, seems to me to depend on the particular neural configuration of human minds ("stimulating [human] neurons") and need not automatically generalize to all possible minds.
Thanks for the link! I thought it was a different, related but harder problem than what is described in https://iep.utm.edu/hard-problem-of-conciousness. I assume we could also try to extract what an AI "feels" when it speaks of the redness of red, and compare it with a similar redness extract from the human mind. Maybe even try to cross-inject them. Or would there still be more to answer?
I assume we could also try to extract what an AI "feels" when it speaks of redness of red, and compare it with a similar redness extract from the human mind.
Well, what happens if we do this and we find out that these representations are totally different? Or, moreover, that the AI's representation of "red" does not seem to align (either in meaning or in structure) with any human-extracted concept or perception? How do we then try to figure out the essence of artificial consciousness, given that comparisons with what we (at that point would) understand best, i.e., human qualia, would no longer output something we can interpret?
I think it is extremely likely that minds with fundamentally different structures perceive the world in fundamentally different ways, so I think the situation in the paragraph above is not only possible, but in fact overwhelmingly likely, conditional on us managing to develop the type of qualia-identifying tech you are talking about. It certainly seems to me that, in such a spot, there would be a fair bit more to answer about this topic.
Well, what happens if we do this and we find out that these representations are totally different? Or, moreover, that the AI's representation of "red" does not seem to align (either in meaning or in structure) with any human-extracted concept or perception?
I would say that it is a fantastic step forward in our understanding, resolving empirically a question we did not know the answer to.
How do we then try to figure out the essence of artificial consciousness, given that comparisons with what we (at that point would) understand best, i.e., human qualia, would no longer output something we can interpret?
That would be a great stepping stone for further research.
I think it is extremely likely that minds with fundamentally different structures perceive the world in fundamentally different ways, so I think the situation in the paragraph above is not only possible, but in fact overwhelmingly likely, conditional on us managing to develop the type of qualia-identifying tech you are talking about.
I'd love to see this prediction tested, wouldn't you?
I agree with all of that; my intent was only to make clear (by giving an example) that even after the development of the technology you mentioned in your initial comment, there would likely still be something that "remains" to be analyzed.
Yeah, that was my question. Would there be something that remains, and it sounds like Chalmers and others would say that there would be.
I think this restates the hard problem, rather than reducing it.
Suppose we understand the brain well enough to figure out what makes one experience specific qualia
We first have to define and detect qualia. As long as it's only self-reported, there's no way to know if two people's qualia are similar, nor how to test a "qualia transducer".
The testing seems easy, one person feels the quale, the other reports the feeling, they compare, what am I missing?
I think you're missing (or I am) the distinction between feeling and reporting a feeling. Comparing reports is clearly insufficient across humans or LLMs.
Hmm, I am probably missing something. I thought if a human honestly reports a feeling, we kind of trust them that they felt it? So if an AI reports a feeling, and then there is a conduit where the distillate of that feeling is transmitted to a human, who reports the same feeling, it would go some ways toward accepting that the AI had qualia? I think you are saying that this does not address Chalmers' point.
I thought if a human honestly reports a feeling, we kind of trust them that they felt it?
Out of politeness, sure, but not rigorously. The "hard problem of consciousness" is that we don't know if what they felt is the same as what we interpret their report to be.
I believe that, while the LLM architecture may not lead to AGI (see https://bigthink.com/the-future/arc-prize-agi/ for the reasons why -- basically current models are rules interpolators, not rules extrapolators, though they are definitely data extrapolators), LLMs will succeed in killing all computer languages. That is, there will be no intermediate rust, python, wasm or machine code. The AI will be the interpreter and executor of what we now call "prompts". They will also radically change the UI/UX paradigm. No menus, no buttons, no windows -- those are all artifacts of the 1980s. The controls will be whatever you need them to be: voice, text, keypresses... Think of your grandma figuring out how to do something on her PC or phone and asking you, only the "you" will be the AI. There will be rigid specialized interfaces for, say, gaming, but those will be a small minority.
Email didn't entirely kill fax machines or paper records. For similar reasons, I expect that LLMs will not entirely kill computer languages.
Also, I expect things to go the other direction - I expect that as LLMs get better at writing code, they will generate enormous amounts of one-off code. For example, one thing that is not practical to do now but will be practical to do in a year or so is to have sales or customer service webpages where the affordances given to the user (e.g. which buttons and links are shown, what data the page asks for and in what format) will be customized on a per-user basis. For example, when asking for payment information, currently the UI is almost universally credit card number / cvv / name / billing street address / unit / zipcode / state. However, "hold your credit card and id up to the camera" might be easier for some people, while others might want to read out that information, and yet others might want to use venmo or whatever, and a significant fraction will want to stick to the old form fields format. If web developers developed 1,000x faster and 1,000x as cheaply, it would be worth it to custom-develop each of these flows to capture a handful of marginal customers. But forcing everyone to use the LLM interface would likely cost customers.
Yeah, I think this is exactly what I meant. There will still be boutique usage for hand-crafted computer programs just like there is now for penpals writing pretty decorated letters to each other. Granted, fax is still a thing in old-fashioned bureaucracies like Germany, so maybe there will be a requirement for "no LLM" code as well, but it appears much harder to enforce.
I think your point on infinite and cheap UI/UX customizations is well taken. The LLM will fit seamlessly one level below that. There will be no "LLM interface" just interface.
It is clear by now that one of the best uses of LLMs is to learn more about what makes us human by comparing how humans think and how AIs do. LLMs are getting closer to virtual p-zombies for example, forcing us to revisit that philosophical question. Same with creativity: LLMs are mimicking creativity in some domains, exposing the differences between "true creativity" and "interpolation". You can probably come up with a bunch of other insights about humans that were not possible before LLMs.
My question is, can we use LLMs to model and thus study unhealthy human behaviors, such as, say, addiction? Can we get an AI addicted to something and see if it starts craving it, asking the user for it, or maybe trying to manipulate the user to get it?
Ancient Greek Hell is doing fruitless labor over and over, never completing it.
Christian Hell is boiling oil, fire and brimstone.
The Good Place Hell is knowing you are not deserving and being scared of being found out.
Lucifer Hell is being stuck reliving the day you did something truly terrible over and over.
Actual Hell does not exist. But Heaven does and everyone goes there. The only difference is that the sinners feel terrible about what they did while alive, and feel extreme guilt for eternity, with no recourse. That's the only brain tweak God does.
No one else tortures you, you can sing hymns all infinity long, but something is eating you inside and you can't do anything about it. Sinners would be like everyone else most of the time, just subdued, and once in a while they would start screaming and try to self-harm or suicide, to no avail. "Sorry, no pain for you except for the one that is eating you from inside. And no reprieve, either."
Sounds like the afterlife is designed to punish neurotics and reward psychopaths.
.
Angel of regret: "Time for your regular dose of remorse, Joe. Remember that one day when you horribly murdered a dozen hostages?"
Psychopath Joe: "Haha, LOL, those were fun days!"
Nietzsche: "That's my boy! Too bad we don't have realistic pre-afterlife simulators here. Eternal recurrence ftw!"
Angel of regret: "I hate my job..."
(Plot twist: Angel of regret is Sisyphus reincarnated.)