All of Shmi's Comments + Replies

Right, eventually it will. But abstraction building is very hard! If you have any other option, like growing in size, I would expect it to be taken first.

I guess I should be a bit more precise. Abstraction building at the same level as before is probably not very hard. But going up a level is basically equivalent to inventing a new way of compressing knowledge, which is a quantitative leap.

The argument goes through on probabilities of each possible world; the limit toward perfection is not singular. Given the 1000:1 reward ratio, for any predictor who is substantially better than chance, one ought to one-box to maximize EV. Anyway, this is an old argument where people rarely manage to convince the other side.
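To make the arithmetic explicit, here is a minimal sketch of my own, assuming the standard Newcomb payoffs ($1,000 in the transparent box, $1,000,000 in the opaque box) and a predictor that is correct with probability p:

```python
# Toy EV calculation for Newcomb's problem, assuming the standard payoffs:
# $1,000 in the transparent box, $1,000,000 in the opaque box, and a predictor
# that is correct with probability p.

SMALL, BIG = 1_000, 1_000_000

def ev_one_box(p):
    # The opaque box is full only if the predictor correctly foresaw one-boxing.
    return p * BIG

def ev_two_box(p):
    # You always get the small box; the opaque box is full only if the
    # predictor wrongly expected one-boxing.
    return SMALL + (1 - p) * BIG

for p in (0.5, 0.51, 0.6, 0.9, 0.99):
    print(f"p={p:.2f}  one-box EV={ev_one_box(p):>9,.0f}  two-box EV={ev_two_box(p):>9,.0f}")

# One-boxing has the higher EV whenever p > 0.5005, i.e. for any predictor
# substantially better than chance.
```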

It is clear by now that one of the best uses of LLMs is to learn more about what makes us human by comparing how humans think and how AIs do. LLMs are getting closer to virtual p-zombies, for example, forcing us to revisit that philosophical question. Same with creativity: LLMs are mimicking creativity in some domains, exposing the differences between "true creativity" and "interpolation". You can probably come up with a bunch of other insights about humans that were not possible before LLMs.

My question is, can we use LLMs to model and thus study unhealthy ... (read more)

That is definitely my observation, as well: "general world understanding but not agency", and yes, limited usefulness, but also... much more useful than gwern or Eliezer expected, no? I could not find a link. 

I guess whether it counts as AGI depends on what one means by "general intelligence". To me it was having a fairly general world model and being able to reason about it. What is your definition? Does "general world understanding" count? Or do you include the agency part in the definition of AGI? Or maybe something else?

Hmm, maybe this is a General Tool, as opposed to a General Intelligence?

4Daniel Kokotajlo
There are different definitions of AGI, but I think they tend to cluster around the core idea "can do everything smart humans can do, or at least everything nonphysical / everything they can do at their desk." LLM chatbots are a giant leap in that direction in progress-space, but they are still maybe only 10% of the way there in what-fraction-of-economically-useful-tasks-can-they-do space.

True AGI would be a drop-in substitute for a human employee in any remote-friendly job; current LLMs are not that for any job pretty much, though they can substitute for (some) particular tasks in many jobs.

And the main reason for this, I claim, is that they lack agency skills: Put them in an AutoGPT scaffold and treat them like an employee, and what'll happen? They'll flail around uselessly, get stuck often, break things and not notice they broke things, etc. They'll be a terrible employee despite probably knowing more relevant facts and understanding more relevant concepts than your average professional.

Given that, as you admit, we very unexpectedly got basically-AGI (without the creativity of the best humans) in the form of Karnofsky's Tool AI, can you look back and see which assumptions were wrong in expecting the tools to agentize on their own, and pretty quickly? Or is everything in that post of Eliezer's still correct, or at least reasonable, and we are simply not at the level where "foom" happens yet?

Come to think of it, I wonder if that post has been revisited somewhere at some point, by Eliezer or others, in light of the current SOTA. Feels like it could be instructive.

We did not basically get AGI. I think recent history has been a vindication of people like Gwern and Eliezer back in the day (as opposed to Karnofsky and Drexler and Hanson). The point was always that agency is useful/powerful, and now we find ourselves in a situation where we have general world understanding but not agency and indeed our AIs are not that useful (compared to how useful AGI would be) precisely because they lack agency skills. We can ask them questions and give them very short tasks but we can't let them operate autonomously for long periods in pursuit of ambitious goals like we would an employee. 

At least this is my take, you don't have to agree.

I'm not even going to ask how a pouch ends up with voice recognition and natural language understanding when the best Artificial Intelligence programmers can't get the fastest supercomputers to do it after thirty-five years of hard work

Some HPMoR statements did not age as gracefully as others.

7Garrett Baker
Harry is not meant to be a logically omniscient god who never makes any mistakes. Even on its own terms, “thirty-five years of hard work” is really not that long, nowhere near long enough for Harry to rightly believe the problem is so hard that magic must be pulling some crazy impossible to understand bullshit to accomplish the feat, and transparently so. Harry’s young, and in fact doesn’t have much historical perspective.

It's set in about 1995...

That is indeed a bit of a defense. Though I suspect human minds have enough similarities that there are at least a few universal hacks.

Any of those. Could be some kind of intentionality ascribed to AI, could be accidental, could be something else.

So when I think through the pre-mortem of "AI caused human extinction, how did it happen?" one of the more likely scenarios that comes to mind is not nano-this and bio-that, or even "one day we just all fall dead instantly and without warning". Or a scissor statement that causes all-out wars. Or anything else noticeable.

The human mind is infinitely hackable through visual, textual, auditory, and other sensory inputs. Most of us do not appreciate how easily, because being hacked does not feel like it. Instead it feels like your own volition, like you ... (read more)

9Viliam
A sufficiently godlike AI could probably convince me to kill myself (or something equivalent, for example to upload myself to a simulation... and once all humans get there, the AI can simply turn it off). Or to convince me not to have kids (in a parallel life where I don't have them already), or simply keep me distracted every day with some new shiny toy so that I never decide that today is the right day to have unprotected sex with another human and get ready for the consequences.

But it would be much easier to simply convince someone else to kill me. And I think the AI will probably choose the simpler and faster way, because why not. It does not need a complicated way to get rid of me, if a simple way is available.

This is similar to reasoning about cults or scams. Yes, some of them could get me, by being sufficiently sophisticated, accidentally optimized for my weaknesses, or simply by meeting me on a bad day. But the survival of a cult or a scam scheme does not depend on getting me specifically; they can get enough other people, so it makes more sense for them to optimize for getting many people, rather than optimize for getting me specifically.

The more typical people will get the optimized mind-hacking message. The rest of us will then get a bullet.
9Stephen Fowler
I do think the terminology of "hacks" and "lethal memetic viruses" conjures up images of extremely unnatural brain exploits, when you mean quite a natural process that we already see some humans going through.

Some monks/nuns voluntarily remove themselves from the gene pool and, in sects that prioritise ritual devotion over concrete charity work, they are also minimising their impact on the world. My prior is that this level of voluntary dedication (to a cause like "enlightenment") seems difficult to induce, and there are much cruder and more effective brain hacks available.

I expect we would recognise the more lethal brain hacks as improved versions of entertainment/games/pornography/drugs. These already compel some humans to minimise their time spent competing for resources in the physical world. In a direct way, what I'm describing is the opposite of enlightenment. It is prioritising sensory pleasures over everything else.
6faul_sname
But, to be clear, in this scenario it would in fact be precipitated by an AI taking over? Because otherwise it's an answer to "humans went extinct, and also AI took over, how did it happen?" or "AI failed to prevent human extinction, how did it happen?"
8cata
Superficially, human minds look like they are way too diverse for that to cause human extinction by accident. If new ideas toast some specific human subgroup, other subgroups will not be equally affected.

What are the issues that are "difficult" in philosophy, in your opinion? What makes them difficult?

I remember you and others talking about the need to "solve philosophy", but I was never sure what was meant by that.

My expectation, which I may have talked about before here, is that the LLMs will eat all of the software stack between the human and the hardware. Moreover, they are already nearly good enough to do that; the issue is that people have not yet adapted to the AI being able to do that. I expect there to be no OS, no standard UI/UX interfaces, no formal programming languages. All interfaces will be more ad hoc, created by the underlying AI to match the needs of the moment. It can be Star Trek-like "computer, plot a course to..." or a set of buttons popping up o... (read more)

Just a quote found online:

SpaceX can build fully reusable rockets faster than the FAA can shuffle fully disposable paper

It seems like we are not even close to converging on any kind of shared view. I don't find the concept of "brute facts" even remotely useful, so I cannot comment on it.

But this faces the same problem as the idea that the visible universe arose as a Boltzmann fluctuation, or that you yourself are a Boltzmann brain: the amount of order is far greater than such a hypothesis implies.

I think Sean Carroll answered this one a few times: the concept of a Boltzmann brain is not cognitively stable (you can't trust your own thoughts, including that you are a Boltzman... (read more)

Thanks, I think you are doing a much better job voicing my objections than I would. 

If push comes to shove, I would even dispute that "real" is a useful category once we start examining deep ontological claims. "Exist" is another emergent concept that is not even close to being binary, but more of a multidimensional spectrum (numbers, fairies and historical figures lie on some of the axes). I can provisionally accept that there is something like a universe that "exists", but, as I said many years ago in another thread, I am much more comfortable with ... (read more)

By "Platonic laws of physics" I mean the Hawking's famous question

What is it that breathes fire into the equations and makes a universe for them to describe…Why does the universe go to all the bother of existing?

Re

Current physics, if anything else, is sort of antiplatonic: it claims that there are several dozens of independent entities, actually existing, called "fields", which produce the entire range of observable phenomena via interacting with each other, and there is no "world" outside this set of entities.

I am not sure if it actually "claims" that. A ... (read more)

Yeah, that was my question. Would there be something that remains, and it sounds like Chalmers and others would say that there would be.

Thank you for your thoughtful and insightful reply! I think there is a lot more discussion that could be had on this topic, and we are not very far apart, but this is supposed to be a "shortform" thread. 

I never liked The Simple Truth post, actually. I sided with Mark, the instrumentalist, whom Eliezer turned into what I back then termed an "instrawmantalist". Though I am happy with the part

“Necessary?” says Inspector Darwin, sounding puzzled. “It just happened. . . I don’t quite understand your question.”

Rather recently Devs the show, which, for all ... (read more)

Thank you, I forgot about that one. I guess the summary would be "if your calibration for this class of possibilities sucks, don't make up numbers, lest you start trusting them". If so, that makes sense.

Isn't your thesis that "laws of physics" only exist in the mind? 

Yes!

But in that case, they can't be a causal or explanatory factor in anything outside the mind

"a causal or explanatory factor" is also inside the mind

which means that there are no actual explanations for the patterns in nature

What do you mean by an "actual explanation"? Explanations only exist in the mind, as well.

There's no reason why planets go round the stars

The reason (which is also in the minds of agents) is Newton's law, which is an abstraction derived from the model of the un... (read more)

4Mitchell_Porter
But you see, by treating the laws of physics as nothing but mental constructs (rather than as a reality with causal power, that is imperfectly approximated by minds), you extend the realm of brute facts rather radically. Under a law-based conception of physical reality, the laws and the initial conditions may be brute facts, but everything else is a consequence of those facts. By denying that there are mind-independent laws at all, all the concrete patterns of physics (from which the existence of the laws is normally inferred) instead become brute facts too.

I think I understand your speculations about an alternative paradigm, e.g. maybe intelligent life can't exist in worlds that don't have sufficiently robust patterns, and so the informational compressibility of the world is to be attributed to anthropics rather than to causally ordering principles. But this faces the same problem as the idea that the visible universe arose as a Boltzmann fluctuation, or that you yourself are a Boltzmann brain: the amount of order is far greater than such a hypothesis implies. A universe created in a Boltzmann fluctuation would only need one galaxy or even one star. A hallucinated life experienced by a Boltzmann brain ought to unravel at any moment, as the vacuum of space kills the brain.

The simplest explanation is that some kind of Platonism is real, or more precisely (in philosophical jargon) that "universals" of some kind do exist. One does not need to be a literal Platonist about them. Aristotle's approach is closer to common sense: universals are always attached to some kind of substance. Philosophers may debate about the right way to think of them, but to remove them outright, because of a philosophical prejudice or blindspot, leads to where you are now.

I was struck by something I read in Bertrand Russell, that some of the peculiarities of Leibniz's worldview arose because he did not believe in relations, he thought substance and property are the only forms of being.

That is a good point: deciding is different from communicating the rationale for your decisions. Maybe that is what Eliezer is saying.

I think you are missing the point, and taking cheap shots.

3ChristianKl
The prevailing wisdom does not say that you need to put a number on each potential action to act. 

So, is he saying that he is calibrated well enough to have a meaningful "action-conditional" p(doom), but most people are not? And that they should not engage in "fake Bayesianism"? But then, according to the prevailing wisdom, how would one decide how to act if they cannot put a number on each potential action?

4Viliam
Speaking only for myself: perhaps you should put a number on each potential action and choose accordingly, but you do not need to communicate the exact number. Yes, the fact that you worry about safety and chose to work on it already implies something about the number, but you don't have to make the public information even more specific. This creates some problems with coordination; if you believe that p(doom) is exactly X, it would have certain advantages if all people who want to contribute to AI safety believed that p(doom) is exactly X. But maybe the disadvantages outweigh that.
3ChristianKl
The idea that you can only decide how to act if you have numbers is a strawman. Rationalists are not Straw-Vulcans. 

I notice my confusion when Eliezer speaks out against the idea of expressing p(doom) as a number: https://x.com/ESYudkowsky/status/1823529034174882234

I mean, I don't like it either, but I thought his whole point about Bayesian approach was to express odds and calculate expected values.

1dirk
see https://www.lesswrong.com/posts/AJ9dX59QXokZb35fk/when-not-to-use-probabilities

He explains why two tweets down the thread.

The idea of a "p(doom)" isn't quite as facially insane as "AGI timelines" as marker of personal identity, but (1) you want action-conditional doom, (2) people with the same numbers may have wildly different models, (3) these are pretty rough log-odds and it may do violence to your own mind to force itself to express its internal intuitions in those terms which is why I don't go around forcing my mind to think in those terms myself, (4) most people haven't had the elementary training in calibration and prediction m

... (read more)

Hmm, I am probably missing something. I thought if a human honestly reports a feeling, we kind of trust them that they felt it? So if an AI reports a feeling, and then there is a conduit where the distillate of that feeling is transmitted to a human, who reports the same feeling, it would go some ways toward accepting that the AI had qualia? I think you are saying that this does not address Chalmers' point.

4Dagon
Out of politeness, sure, but not rigorously.  The "hard problem of consciousness" is that we don't know if what they felt is the same as what we interpret their report to be.

I am not sure why you are including the mind here, maybe we are talking at cross purposes. I am not making statements about the world, only about the emergence of the laws of physics as written in textbooks, which exist as abstractions across human minds. If you are Laplace's demon, you can see the whole world, and if you wanted to zoom into the level of "planets going around the sun", you could, but there is no reason for you to. This whole idea of "facts" is a human thing. We, as embedded agents, are emergent patterns that use this concept. I can see how it is natural to think of facts, planets or numbers as ontologically primitive or something, not as emergent, but this is not the view I hold.

4Mitchell_Porter
Isn't your thesis that "laws of physics" only exist in the mind? But in that case, they can't be a causal or explanatory factor in anything outside the mind; which means that there are no actual explanations for the patterns in nature, whether you look at them dynamically or atemporally. There's no reason why planets go round the stars, there's no reason why orbital speeds correlate with masses in a particular way, these are all just big coincidences. 

Well, what happens if we do this and we find out that these representations are totally different? Or, moreover, that the AI's representation of "red" does not seem to align (either in meaning or in structure) with any human-extracted concept or perception?

I would say that it is a fantastic step forward in our understanding, resolving empirically a question we did not know the answer to.

How do we then try to figure out the essence of artificial consciousness, given that comparisons with what we (at that point would) understand best, i.e., human qualia, wou

... (read more)
1[anonymous]
I agree with all of that; my intent was only to make clear (by giving an example) that even after the development of the technology you mentioned in your initial comment, there would likely still be something that "remains" to be analyzed.

The testing seems easy: one person feels the quale, the other reports the feeling, they compare. What am I missing?

2Dagon
I think you're missing (or I am) the distinction between feeling and reporting a feeling.  Comparing reports is clearly insufficient across humans or LLMs.

Thanks for the link! I thought it was a different, related, but harder problem than what is described in https://iep.utm.edu/hard-problem-of-conciousness. I assume we could also try to extract what an AI "feels" when it speaks of redness of red, and compare it with a similar redness extract from the human mind. Maybe even try to cross-inject them. Or would there still be more to answer?

1[anonymous]
Well, what happens if we do this and we find out that these representations are totally different? Or, moreover, that the AI's representation of "red" does not seem to align (either in meaning or in structure) with any human-extracted concept or perception?

How do we then try to figure out the essence of artificial consciousness, given that comparisons with what we (at that point would) understand best, i.e., human qualia, would no longer output something we can interpret?

I think it is extremely likely that minds with fundamentally different structures perceive the world in fundamentally different ways, so I think the situation in the paragraph above is not only possible, but in fact overwhelmingly likely, conditional on us managing to develop the type of qualia-identifying tech you are talking about. It certainly seems to me that, in such a spot, there would be a fair bit more to answer about this topic.

How to make a dent in the "hard problem of consciousness" experimentally: suppose we understand the brain well enough to figure out what makes one experience specific qualia, then stimulate the neurons in a way that makes the person experience them. Maybe even link two people with a "qualia transducer" such that when one person experiences "what it's like", the other person can feel it, too.

If this works, what would remain from the "hard problem"?

Chalmers:

To see this, note that even when we have explained the performance of all the cognitive and behavioral

... (read more)
2Dagon
I think this restates the hard problem, rather than reducing it.   We first have to define and detect qualia.  As long as it's only self-reported, there's no way to know if two person's qualia are similar, nor how to test a "qualia transducer".
3[anonymous]
As lc has said: So at least part of what remains, for example, is the task of figuring out, with surgical precision, whether any given LLM (or other AI agent) is "conscious" in any given situation we place it in. This is because your proposal, although it would massively increase our understanding of human consciousness, seems to me to depend on the particular neural configuration of human minds ("stimulating [human] neurons") and need not automatically generalize to all possible minds.

There is an emergent reason, one that lives in the minds of the agents. The universe just is. In other words, if you are a hypothetical Laplace's demon, you don't need the notion of a reason, you see it all at once, past, present and future.

4Mitchell_Porter
Let's consider a phenomenon like, the planets going around the sun. They keep going around and around it, with remarkable consistency and precision. An "ontological realist" about laws of physics, would say that laws of physics are the reason why the planets engage in this repetitive behaviour, rather than taking off in a different direction, or just dissolving into nothingness. Do you believe that this was happening, even before there were any human agents to form mental representations of the situation? Do you have any mind-independent explanation of why the planets were doing one thing rather than another? Or are these just facts without mind-independent explanations, facts without causes, facts which could have been completely different without making any difference to anything else?

I think I articulated this view here before, but it is worth repeating. It seems rather obvious to me that there are no "Platonic" laws of physics, and there is no Platonic math existing in some ideal realm. The world just is, and everything else is emergent. There are reasonably durable patterns in it, which can sometimes be usefully described as embedded agents. If we squint hard, and know what to look for, we might be able to find a "mini-universe" inside such an agent, which is a poor-fidelity model of the whole universe, or, more likely, of a tiny par... (read more)

4quetzal_rainbow
I don't understand how one part of the post relates to another. Yeah, sure, computational irreducibility of the world can make understanding of the world impossible and this would be sad. But I don't see what it has to do with "Platonic laws of physics". Current physics, if anything else, is sort of antiplatonic: it claims that there are several dozens of independent entities, actually existing, called "fields", which produce the entire range of observable phenomena via interacting with each other, and there is no "world" outside this set of entities. "Laws of nature" are just "how these entities are". Outside very radical skepticism I don't know any reasons to doubt this worldview.
6[anonymous]
I am rather uncertain about what it means for something to "exist", as a stand-alone form. When people use this word, it often seems to end up referring to a free-floating belief that does not pay rent in anticipated experiences.

Is there anything different about the world that I should expect to observe depending on whether Platonic math "exists" in some ideal realm? If not, why would I care about this topic once I have already dissolved my confusion about what beliefs are meant to refer to? What could serve as evidence one way or another when answering the question of whether math "exists"?

By contrast, we can talk about "reality" existing separately from our internal conception of it because the map is not the territory, which has specific observable consequences, as Eliezer beautifully explained in a still-underrated post from 17 years ago:

(Bolding mine)

If the bolded section was not correct, i.e., if you were an omniscient being whose predictions always panned out in actuality, you would likely not need to keep track of reality as a separate concept from the inner workings of your mind, because... reality would be the same as the inner workings of your mind. But because this is false, and all of us are bounded, imperfect, embedded beings without the ability to fully understand what is around us, we need to distinguish between "what we think will happen" and "what actually happens."

Later on in the thread, you talked about "laws of physics" as abstractions written in textbooks, made so they can be understandable to human minds. But, as a terminological matter, I think it is better to think of the laws of physics as the rules that determine how the territory functions, i.e., the structured, inescapable patterns guiding how our observations come about, as opposed to the inner structure of our imperfect maps that generate our beliefs.

From this perspective, Newton's 3rd law, for example, is not a real "law" of physics, for we know it can be broken (it does
8Mitchell_Porter
So in your opinion, there is no reason why anything happens?

Hence the one tweak I mentioned.

Ancient Greek Hell is doing fruitless labor over and over, never completing it.

Christian Hell is boiling oil, fire and brimstone.

The Good Place Hell is knowing you are not deserving and being scared of being found out.

Lucifer Hell is being stuck reliving the day you did something truly terrible over and over.


Actual Hell does not exist. But Heaven does and everyone goes there. The only difference is that the sinners feel terrible about what they did while alive, and feel extreme guilt for eternity, with no recourse. That's the only brain tweak God does. ... (read more)

8Viliam
Sounds like the afterlife is designed to punish neurotics and reward psychopaths.

Angel of regret: "Time for your regular dose of remorse, Joe. Remember that one day when you horribly murdered dozen hostages?"
Psychopath Joe: "Haha, LOL, those were fun days!"
Nietzsche: "That's my boy! Too bad we don't have realistic pre-afterlife simulators here. Eternal recurrence ftw!"
Angel of regret: "I hate my job..."

(Plot twist: Angel of regret is Sisyphus reincarnated.)

As Patrick McKenzie has been saying for almost 20 years, "you can probably stand to charge more".

Yeah, I think this is exactly what I meant. There will still be boutique usage for hand-crafted computer programs just like there is now for penpals writing pretty decorated letters to each other. Granted, fax is still a thing in old-fashioned bureaucracies like Germany, so maybe there will be a requirement for "no LLM" code as well, but it appears much harder to enforce.

I think your point on infinite and cheap UI/UX customizations is well taken. The LLM will fit seamlessly one level below that. There will be no "LLM interface", just interface.

Consider moral constructivism.

I believe that, while the LLM architecture may not lead to AGI (see https://bigthink.com/the-future/arc-prize-agi/ for the reasons why -- basically current models are rules interpolators, not rules extrapolators, though they are definitely data extrapolators), they will succeed in killing all computer languages. That is, there will be no intermediate rust, python, wasm or machine code. The AI will be the interpreter and executor of what we now call "prompts". They will also radically change the UI/UX paradigm. No menus, no buttons, no windows -- those are ... (read more)

4faul_sname
Email didn't entirely kill fax machines or paper records. For similar reasons, I expect that LLMs will not entirely kill computer languages.

Also, I expect things to go the other direction - I expect that as LLMs get better at writing code, they will generate enormous amounts of one-off code. For example, one thing that is not practical to do now but will be practical to do in a year or so is to have sales or customer service webpages where the affordances given to the user (e.g. which buttons and links are shown, what data the page asks for and in what format) will be customized on a per-user basis.

For example, when asking for payment information, currently the UI is almost universally credit card number / cvv / name / billing street address / unit / zipcode / state. However, "hold your credit card and id up to the camera" might be easier for some people, while others might want to read out that information, and yet others might want to use venmo or whatever, and a significant fraction will want to stick to the old form fields format. If web developers developed 1,000x faster and 1,000x as cheaply, it would be worth it to custom-develop each of these flows to capture a handful of marginal customers. But forcing everyone to use the LLM interface would likely cost customers.

That makes sense! Maybe you feel like writing a post on the topic? Potentially including a numerical or analytical model.

Excellent point about the compounding, which is often multiplicative, not additive. Incidentally, multiplicative advantages result in a power law distribution of income/net worth, whereas additive advantages/disadvantages result in a normal distribution. But that is a separate topic, well explored in the literature.
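To illustrate the distributional claim, here is a rough simulation of my own (not anything from the thread): the same per-step advantages compounded additively give a roughly normal spread, while compounded multiplicatively they give a heavy right tail (approximately log-normal, whose tail is often approximated as a power law).

```python
# Toy simulation: compound the same spread of per-step luck additively vs.
# multiplicatively.  Products of positive factors approach a log-normal,
# heavy-right-tailed distribution, while sums approach a normal one.
import random

random.seed(0)
N_PEOPLE, N_FACTORS = 100_000, 30

def additive_outcome():
    return sum(random.uniform(0.5, 1.5) for _ in range(N_FACTORS))

def multiplicative_outcome():
    result = 1.0
    for _ in range(N_FACTORS):
        result *= random.uniform(0.5, 1.5)
    return result

add = sorted(additive_outcome() for _ in range(N_PEOPLE))
mul = sorted(multiplicative_outcome() for _ in range(N_PEOPLE))

# Compare the 99th percentile to the median: a modest edge for additive
# compounding, a very large one for multiplicative compounding.
print("additive       p99/median:", round(add[int(0.99 * N_PEOPLE)] / add[N_PEOPLE // 2], 2))
print("multiplicative p99/median:", round(mul[int(0.99 * N_PEOPLE)] / mul[N_PEOPLE // 2], 2))
```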

4localdeity
Absolutely. For a quick model of why you get multiplicative results:

* Intelligence—raw intellectual horsepower—might be considered a force-multiplier, whereby you produce more intellectual work per hour spent working.
* Motivation (combined with say, health) determines how much time you spend working. We could quantify it as hours per week.
* Taste determines the quality of the project you choose to work on. We might quantify it as "the expected value, per unit of intellectual work, of the project".

Then you literally multiply those three quantities together and it's the expected value per week of your intellectual work. My mentor says that these are the three most important traits that determine the best scientists.

I mostly meant your second point, just generally being kinder to others, but the other two are also well taken.

Answer by Shmi-24

First, your non-standard use of the term "counterfactual" is jarring, though, as I understand, it is somewhat normalized in your circles. "Counterfactual" unlike "factual" means something that could have happened, given your limited knowledge of the world, but did not. What you probably mean is "completely unexpected", "surprising" or something similar. I suspect you got this feedback before.

Sticking with physics. Galilean relativity was completely against the Aristotelian grain. More recently, the singularity theorems of Penrose and Hawking unexpectedly s... (read more)

9kave
I think it means the more specific "a discovery that if it counterfactually hadn't happened, wouldn't have happened another way for a long time". I think this is roughly the "counterfactual" in "counterfactual impact", but I agree not the more widespread one. It would be great to have a single word for this that was clearer.

Let's say I start my analysis with the model that the predictor is guessing, and my model attaches some prior probability for them guessing right in a single case. I might also have a prior about the likelihood of being lied to about the predictor's success rate, etc. Now I make the observation that I am being told the predictor was right every single time in a row. Based on this incoming data, I can easily update my beliefs about what happened in the previous prediction exercises: I will conclude that (with some credence) the predictor guessed right in

... (read more)

Sorry, could not reply due to rate limit.

In reply to your first point, I agree, in a deterministic world with perfect predictors the whole question is moot. I think we agree there.

Also, yes, assuming "you have a choice between two actions", what you will do has not been decided by you yet. Which is different from "Hence the information what I will do cannot have been available to the predictor." If the latter statement is correct, then how can could have "often correctly predicted the choices of other people, many of whom are similar to you, in the particu... (read more)

3Jobst Heitzig
There's many possible explanations for this data. Let's say I start my analysis with the model that the predictor is guessing, and my model attaches some prior probability for them guessing right in a single case. I might also have a prior about the likelihood of being lied to about the predictor's success rate, etc. Now I make the observation that I am being told the predictor was right every single time in a row. Based on this incoming data, I can easily update my beliefs about what happened in the previous prediction exercises: I will conclude that (with some credence) the predictor guessed right in each individual case or that (also with some credence) I am being lied to about their prediction success. This is all very simple Bayesian updating, no problem at all. As long as my prior beliefs assign nonzero credence to the possibility that the predictor guesses right (and I see no reason why that shouldn't be a possibility), I don't need to assign any posterior credence to the (physically impossible) assumption that they could actually foretell the actions.
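A toy version of this updating (my own sketch with made-up priors and hypotheses, not anything from the thread) shows how a reported streak of correct predictions shifts credence toward "skilled but imperfect" or "fabricated track record" without ever requiring an "infallible predictor" hypothesis:

```python
# Toy Bayesian update on a reported perfect track record (hypothetical priors).
#   H_guess : predictor guesses, right 50% of the time, reports are honest
#   H_skill : predictor reads people well, right 90% of the time, reports honest
#   H_lie   : the track record is fabricated, so the report carries no information
priors = {"H_guess": 0.50, "H_skill": 0.45, "H_lie": 0.05}
n_reported_successes = 20

likelihood = {
    "H_guess": 0.5 ** n_reported_successes,
    "H_skill": 0.9 ** n_reported_successes,
    "H_lie": 1.0,  # a fabricated record looks perfect regardless of reality
}

unnormalized = {h: priors[h] * likelihood[h] for h in priors}
total = sum(unnormalized.values())
posterior = {h: round(p / total, 4) for h, p in unnormalized.items()}
print(posterior)
# Nearly all posterior mass lands on "skilled but imperfect" or "fabricated";
# no credence is ever needed for a literally infallible predictor.
```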

There is no possible world with a perfect predictor where a two-boxer wins without breaking the condition of it being perfect.

2PhilGoetz
But there is no possible world with a perfect predictor, unless it has a perfect track record by chance.  More obviously, there is no possible world in which we can deduce, from a finite number of observations, that a predictor is perfect.  The Newcomb paradox requires the decider to know, with certainty, that Omega is a perfect predictor.  That hypothesis is impossible, and thus inadmissible; so any argument in which something is deduced from that fact is invalid.
2Jobst Heitzig
Take a possible world in which the predictor is perfect (meaning: they were able to make a prediction, and there was no possible extension of that world's trajectory in which what I will actually do deviates from what they have predicted). In that world, by definition, I no longer have a choice. By definition I will do what the predictor has predicted. Whatever has caused what I will do lies in the past of the prediction, hence in the past of the current time point. There is no point in asking myself now what I should do as I have no longer causal influence on what I will do. I can simply relax and watch myself doing what I have been caused to do some time before.

I can of course ask myself what might have caused my action and try to predict myself from that what I will do. If I come to believe that it was myself who decided at some earlier point in time what I will do, then I can ask myself what I should have decided at that earlier point in time. If I believe that at that earlier point in time I already knew that the predictor would act in the way it did, and if I believe that I have made the decision rationally, then I should conclude that I have decided to one-box.

The original version of Newcomb's paradox in Nozick 1969 is not about a perfect predictor however. It begins with (1) "Suppose a being in whose power to predict your choices you have enormous confidence.... You know that this being has often correctly predicted your choices in the past (and has never, so far as you know, made an incorrect prediction about your choices), and furthermore you know that this being has often correctly predicted the choices of other people, many of whom are similar to you, in the particular situation to be described below". So the information you are given is explicitly only about things from the past (how could it be otherwise). It goes on to say (2) "You have a choice between two actions". Information (2) implies that what I will do has not been decided yet and I still h

People constantly underestimate how hackable their brains are. Have you changed your mind and your life based on what you read or watched? This happens constantly and feels like your own volition. Yet it comes from external stimuli. 

2Jiao Bu
"Comes from external stumuli" in this case, or more accurately incorporates external information =/= brainwashing into slavery.  To some extent what you're saying is built of correct sentences, but you're keeping things vague enough and unconnected enough to defend.  Above you said, "subset of this scenario is a nightmarish one where humans are brainwashed by their mindless but articulate creations and serve them, kind of like the ancients served the rock idols they created. Enslaved by an LLM, what an irony." Yes, I have changed my mind based on things I have read and watched.  One should do this based on new information.  As for "happens consistently and feels like your own volition" I think you would need to unpack it a bit.  "Consistently," I don't know.  I'm 44 and an engineer and kind of a jackass, so maybe I don't change my mind as often as I should.  My new partner has a PhD in Nutrition though, so I have changed my mind partly based on studies she has presented (including some of her own research) and input regarding diet in the last several months. That it "Feels like" "my" "volition" is even more complicated.  I don't know from whence will and volition arise, and they seem stochastic.  I'm not entirely sure what """I""" am or where consciousness is, if the continuity of it is an illusion, or etc.  These questions get really quickly out of what anyone knows for sure.  But having been presented with both the papers and the food, eaten a lot, and noticed improved mood and energy levels, I'm pretty well sold on her approach being sound and the diet being great. But you jump to service and enslavement?  This is a bit more like someone needs to headbag me and then dump me in the back of their truck and drag me to a hidden site and inject me with LSD for six months or something.  You are jumping scales drastically without discussing concrete anything, really.  It might have emotional salience, but that hardly seems fit for a rationalist board. Though I welco

Note that it does not matter in the slightest whether Claude is conscious. Once/if it is smart enough it will be able to convince dumber intelligences, like humans, that it is indeed conscious. A subset of this scenario is a nightmarish one where humans are brainwashed by their mindless but articulate creations and serve them, kind of like the ancients served the rock idols they created. Enslaved by an LLM, what an irony.

6Pasero
Consciousness might not matter for alignment, but it certainly matters for moral patienthood.
3Jiao Bu
"Brainwashing" is pretty vague and likely difficult.  Hypnosis and LSD usually will not get you there, if I'm to believe what is declassified.  It would need to have some way to set up incentives to get people to act, no?  Or at least completely control my environment (and have the ability to administer the LSD and hypnosis?)

Not into ancestral simulations and such, but figured I'd comment on this:

I think "love" means "To care about someone such that their life story is part of your life story."

I can understand how it makes sense, but that is not the central definition for me. What comes to mind when I think of this feeling is a willingness to sacrifice your own needs and change your own priorities in order to make the other person happier, if only a bit and if only temporarily. This is definitely not the feeling I would associate with villains, but I can see how other people might.

5Raemon
Nod. For me the ‘willing to sacrifice’ is mediated through the ‘caring about them as part of you’ thing. (I think the ‘life story’ bit is maybe more opinionated in that I largely engage with life through a narrative focus, so that was more like ‘the Raemon flavored version of it’.) But, yep this is sure not a word I expect everyone to be using the same way.

Thank you for checking! None of the permutations seem to work with LW, but all my other feeds seem fine. Probably some weird incompatibility with protopage.

neither worked... Something with the app, I assume.

Could be the app I use. It's protopage.com (which is the best clone of the defunct iGoogle I could find):

2habryka
Hmm, I don't know how that works. If you go to LessWrong.com/feed.xml you can see the RSS feed working. 