All of JBlack's Comments + Replies

JBlack20

Regarding the first paragraph: every purported rational decision theory maps actions to expected values. In most decision theory thought experiments, the agent is assumed to know all the conditions of the scenario, and so they can be taken as absolute facts about the world, leaving only the unknown random variables to feed into the decision-making process. In the Counterfactual Mugging, that is explicitly not true. The scenario states

you didn't know about Omega's little game until the coin was already tossed and the outcome of the toss was given to you

So it... (read more)

1metachirality
Am I correct in assuming you don't think one should give the money in the counterfactual mugging?
JBlack30

Counterfactual mugging is a mug's game in the first place - that's why it's called a "mugging" and not a "surprising opportunity". The agent doesn't know that Omega actually flipped a coin, would have paid them counterfactually if they were the sort of person to pay in this scenario, would have flipped the coin at all in that case, etc. The agent can't know these things, because the scenario specifies that they have no idea that Omega does any such thing or even that Omega existed before being approached. So a relevant rational decision-theoretic paramete... (read more)

1metachirality
I don't know what the first part of your comment is trying to say. I agree that counterfactual mugging isn't a thing that happens. That's why it's called a thought experiment. I'm not quite sure what the last paragraph is trying to say either. It sounds somewhat similar to a counter-argument I came up with (which I think is pretty decisive), but I can't be certain what you actually meant. In any case, there is the obvious counter-counter-argument that in the counterfactual mugging, the agents in the heads branch and the tails branch are not quite identical either: one has seen the coin land on heads and the other has seen the coin land on tails.
JBlack30

Yet our AI systems, even the most advanced, focus almost exclusively on logical, step-by-step reasoning.

This is absolutely false.

We design them to explain every decision, show their work and follow clear patterns of deduction.

We are trying to design them to be able to explain their decisions and follow clear patterns of deduction, but we are still largely failing. In practice they often arrive at an answer in a flash (whether correct or incorrect), and this was almost universal for earlier models without the more recent development of "chain of thought".

Ev... (read more)

1Priyanka Bharadwaj
You're right! Modern AI often produces answers through pattern recognition rather than explicit reasoning. I should have been clearer that I was criticising our expectations of AI explanation, not necessarily how they actually function. This actually strengthens my core argument. We're forcing explicit reasoning onto systems that may naturally operate more like intuition. We've privileged shared, verifiable reasoning over individualistic intuitive knowledge. Perhaps we've built AI to reflect how we wish humans reasoned rather than how we actually do. The irony is demanding explanations from AI that we don't require from human experts. 
JBlack42

Yes, you can use yourself as a random sample but at best only within a reference class of "people who use themselves as a random sample for this question in a sufficiently similar context to you". That might be a population of 1.

For example, suppose someone without symptoms has just found out that they have genes for a disease that always progresses to serious illness. They have a mathematics degree and want to use their statistical knowledge to estimate how long they have before becoming debilitated.

They are not a random sample from the reference class of... (read more)

2avturchin
Agree that in some situations I have to take into account the non-randomness of my sampling. While date of birth seems random and irrelevant, distance to the equator is strongly biased by the distribution of cities with universities, which on Earth are shifted north. Also agree that solving DA can be a solution to DA: moreover, I looked at Google Scholar and found that interest in DA is already declining.
JBlack20

Yes, player 2 loses with extremely low probability even for a 1-bit hash (on the order of 2^-256). For a more commonly used hash, or for 2^24 searches on their second-last move, they reduce their probability of loss by a huge factor more.

JBlack10

This paragraph also misses the possibility of constructing an LLM and/or training methodology such that it will learn certain functions, or can't learn certain functions. There is also a conflation of "reliable" with "provable" on top of that.

Perhaps there is some provision made elsewhere in the text that addresses these objections. Nonetheless, I am not going to search. The abstract smells enough like bullshit that I would rather do something else.

JBlack20

I'll try to make it clearer:

Suppose b "knows" that Omega runs this experiment for all programs b. Then the optimal behaviour for a competent b (by a ridiculously small margin) is to 1-box.

Suppose b suspects that box-choosing programs are slightly less likely to be run if they 1-box on equal inputs. Then the optimal behaviour for b is to 2-box, because the average extra payoff for 1-boxing on equal inputs is utterly insignificant while the average penalty for not being chosen to run is very much greater. Anything that affects probability of being run as box... (read more)

JBlack30

As a function of M, |P| is very likely to be exponential and so it will take O(M) symbols to specify a member of P. Under many encodings, there isn't one that can even check whether the inputs are equal before running out of time.

That aside, why are you assuming that program b "wants" anything? Essentially all of P won't be programs that have any sort of "want". If it is a precondition of the problem that b is such a program, what selection procedure is assumed between those that do "want" money from this scenario? Note that being selected for running is also a precondition for getting any money at all, so this selection procedure is critically important - far more so than anything the program might output!

2Tapatakt
Oops, I didn't think about that, thanks! Maybe it would be better to change it so the input is "a=b" or "a!=b", and a always gets "a=b". The programmer who wrote b decided that it should be a consequentialist agent who wants to get money. (Or, if this program is actually a, it wants to maximize the payment for b just because such a program was chosen by Omega by pure luck.)
JBlack53

That is nothing like the 5-and-10 problem. I am no longer interested in what you consider to be evidence.

1Akira Pyinya
Thanks for your interest if you did have it.
JBlack20

Evidence for the claim in the title? Or for anything else in the post?

1Akira Pyinya
There is a lot of research on anxiety and decision making, such as: https://pmc.ncbi.nlm.nih.gov/articles/PMC4988522/ "Higher anxiety individuals were significantly more likely to choose the high-probability, small reward options relative to individuals reporting low anxiety." It's like choosing one of 5 boxes that contains $5 instead of a box that contains $10. And evidence for what part of the post are you asking for?
JBlack52

It's interesting (and perhaps a bit sad) that a relatively lengthy post on representing sentences as logical statements doesn't make any reference to the constructed language Lojban, in which the entire grammar and semantics are designed around expressing sentences as logical statements.

JBlack20

Going into all the ways in which civilization - and its markets - fails to be rational seems way beyond the scope of a few comments. I will just say that GDP does absolutely fail to capture a huge range of value.

However, to address "share prices are set by the latest trade" you need to consider why a trade is made. In principle, prices are based on the value to the participants, somewhere between the value to the buyer and value to the seller. A seller who needs cash soon (to meet some other obligation or opportunity) may accept a lower price to attract a ... (read more)

JBlack30

Could it be generalizing from T E X T   L I K E   T H I S  and/or mojibake UTF-16 interpreted as UTF-8 with every second character being zero? It's still a bit more of a stretch from there to generalize to ignoring two intervening constant characters, though.
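A minimal Python sketch of the interleaved-zero pattern mentioned here (assuming little-endian UTF-16; how the bytes actually reached the model in the experiment under discussion is not specified):

```python
# ASCII text encoded as UTF-16-LE carries a zero byte after every character,
# so a tokenizer or model treating the bytes as a one-byte-per-character
# encoding sees the original letters separated by a constant (NUL) byte.
raw = "HELLO".encode("utf-16-le")
print(raw)        # b'H\x00E\x00L\x00L\x00O\x00'
print(list(raw))  # [72, 0, 69, 0, 76, 0, 76, 0, 79, 0]
```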

1Lennart Finke
The component of ignoring two intervening characters is less mysterious to me. For example, a numbered list like "1. first_token 2. second_token ..." would need this pattern. I am wondering mostly why the specific map from b'\xa1'-b'\xba' to a-z is learned.
JBlack20

Market cap is a marginal measure of desirability of shares in the entity represented. It mostly measures the expectations of the most flighty investors over short timescales. If a company issues a billion shares but only one of those is traded in any given day, the price of that single share agreed between the single seller and the single buyer entirely determines the market capitalization of that company.

In practice there is usually a lot more volume, but the principle remains. Almost all shares of any given entity are not traded over the timescales that ... (read more)

1Greenless Mirror
I admit my mistake in intuitively assuming that GDP and stock market valuations should be closely linked. But it still seems strange to me why they aren’t, and I want to understand that better. Shouldn’t they at least be highly correlated in an idealized model?

Think of stocks as a kind of prediction market for a company’s value. The stock price should reflect expectations about its future earnings, but those expectations are built on something—maybe a new technology they’ve developed, or an undervalued specialist. If that’s the case, then why isn’t the market naturally structured in a way that adjusts salaries dynamically based on predicted contributions? Why don’t we have, say, ‘patent usage shares’ that investors can buy to increase expected royalties on a promising technology?

In an efficient system, I’d expect the market to fragment into these kinds of sub-sectors—where you can bet not just on the company as a whole, but on specific assets or individuals within it. And you love all these equal-surplus deals, so you're interested in getting that kind of accurate valuation. If you believe a specialist is undervalued, you don’t just buy the company’s stock, you invest in their salary in exchange for a share of the revenue they generate. If you believe a company’s R&D is its most valuable asset, you invest in the future licensing income of its patents rather than the entire stock.

If this kind of structure existed, wouldn’t stock prices and the actual underlying value of companies align more closely? And if they don’t, does that mean GDP is failing to capture certain kinds of value—like knowledge, which isn’t easily tradeable? Or should stock prices themselves be less volatile than they currently are? I also don’t see how the fact that share prices are set by the latest trade changes this dynamic. If I’m missing something fundamental here, I’d love to hear your perspective. I understand that simply saying ‘the market is irrational’ is not a good correction—it’s
JBlack7259

I was very interested to see the section "Posts by AI Agents", as the first policy I've seen anywhere acknowledging that AI agents may be capable of both reading the content of policy terms and acting based on them.

Neil 5517

It felt odd to read that and think "this isn't directed toward me, I could skip if I wanted to". Like I don't know how to articulate the feeling, but it's an odd "woah text-not-for-humans is going to become more common isn't it". Just feels strange to be left behind. 

JBlack20

Why not both?

Human design will determine the course of AGI development, and if we do the right things then whether it goes well is fully and completely up to us. Naturally at the moment we don't know what the right things are or even how to find them.

If we don't do the right things (as seems likely), then the kinds of AGI which survive will be the kind which evolve to survive. That's still largely up to us at first, but increasingly less up to us.

3Davey Morse
Figuring out how to make sense of both predictive lenses together—human design and selection pressure—would be wise. So I generally agree, but would maybe go farther on your human design point. It seems to me that "do[ing] the right things" (which would enable AGI trajectories to be completely up to us) is so completely unrealistic (e.g. halting all intra- and international AGI competition) that it'd be better for us to focus our attention on futures where human design and selection pressures interact.
JBlack31

The fun thing is that the actual profile of wages earned can be absolutely identical and yet end up with incredibly different results for personal wage changes. For example:

In year 1, A earns $1/hr, B $2, C $3, D $4, and E $5.
In year 2, A earns $2/hr, B $3, C $4, D $5, and E $1.

A, B, C, and D personally all increased their income by substantial amounts and may vote accordingly. E lost a lot more than any of the others gained, but doesn't get more votes because of that. 80% of voters saw their income increase. What's more, this process can repeat endlessly.... (read more)
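A small sketch of the repetition claim, using the numbers from the example above (Python; the dollar values are purely illustrative):

```python
# The wage profile is {1, 2, 3, 4, 5} every single year; only who earns
# what rotates, yet 4 out of 5 workers get a raise each year.
wages = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}
for year in range(1, 6):
    new = {p: (w + 1 if w < 5 else 1) for p, w in wages.items()}
    raises = sum(new[p] > wages[p] for p in wages)
    print(f"Year {year}->{year + 1}: {raises}/5 saw a raise, "
          f"total wages still {sum(new.values())}")
    wages = new
```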

1Daniel V
Exactly, which is why the metric Mazlish prefers is so relevant and not bizarre, unless the premise that people judge the economy from their own experiences is incorrect.
JBlack30

In the rain forecaster example, it appears that the agent ("you") is more of an expert on Alice's calibration than Alice is. Is this intended?

7ben_levinstein
Yes, although with some subtlety. Alice is just an expert on rain, not necessarily on the quality of her own epistemic state. (One easier example: suppose your initial credence in rain is .5. Alice's is either .6 or .4. Conditional on it being .6, you become certain it rains. Conditional on it being .4, you become certain it won't rain. You'd obviously use her credences to bet over your own, but you also take her to be massively underconfident.) Now, the slight wrinkle here is that the language of calibration we used makes this also seem more "objective" or long-run frequentist than we really intend. All that really matters is your own subjective reaction to Alice's credences, so whether she's actually calibrated or not doesn't ultimately determine whether the conditions on local trust can be met.
JBlack135

In practice, a lot of property is transferred into family trusts, and appointed family members exercise decision making over those assets according to the rules of that trust. A 100% death tax would simply ensure that essentially all property is managed in this manner for the adequately wealthy, and only impact families too disadvantaged to use this sort of structure. If you don't personally own anything of note at the time of your death, your taxes will be minimal.

You would also need a 100% gift tax, essentially prohibiting all gifts between private citiz... (read more)

Viliam127

Death tax without a gift tax would simply be a tax on people who die unexpectedly. Because if you know that you are going to die tomorrow, you can donate all your belongings to your children today.

Even if you don't know the exact day, if you trust your children, you can simply donate them everything now, and then continue living in a house they legally own, etc. (Though then you are screwed if your children die before you. But this just means that the system introduces a lot of randomness.)

Oh, and if you have a 100% gift tax, you also need to make all kind... (read more)

JBlack30

I think one argument is that optimizing for IGF basically gives humans two jobs: survive, and have kids.

Animal skulls are evidence that the "survive" part can be difficult. We've nailed that one, though. Very few humans in developed countries die before reaching an age suitable for having kids. I doubt that there are any other animal species that come close to us in that metric. Almost all of us have "don't die" ingrained pretty deeply.

It's looking like we are moving toward failing pretty heavily on the second "have kids" job though, and you would think that would be the easier one.

So if there's a 50% failure rate on preserving outer optimizer values within the inner optimizer, that's actually pretty terrible.

8dx26
We (or at least a majority of humans) do still have inner desires to have kids, though; they just get balanced out by other considerations, mostly creature comforts/not wanting to deal with the hassle of kids. But yeah, evolution did not foresee birth control, so that's a substantial misgeneralization. We are still a very successful species overall according to IGF, but birth rates continue to decline, which is why I made my last point about inner alignment possibly drifting farther and farther away the stronger the inner optimizer (e.g. human culture) becomes.
JBlack20

It doesn't completely avoid the problem of priors, just the problem of arbitrarily fixing a specific type of update rule on fixed priors such as in Solomonoff induction. You can't afford this if you're a bounded agent, and a Solomonoff inductor can only get away with it since it has not just unbounded resources but actually infinite computational power in any given time period.

A bounded agent needs to be able to evaluate alternative priors, update rules, and heuristics in addition to the evidence and predictions themselves, or it won't even approxima... (read more)

JBlack20

One thing that seems worth exploring from a conceptual point of view is doing away with priors altogether, and working more directly with metrics such as "what are the most expected-value rewarding actions that a bounded agent can make given the evidence so far". I suspect that from this point of view it doesn't much matter whether you use a computational basis such as Turing machines, something more abstract, or even something more concrete such as energy required to assemble and run a predicting machine.

From a computing point of view not all simple model... (read more)

5Anthony DiGiovanni
I'm not sure I exactly understand your argument, but it seems like this doesn't avoid the problem of priors, because what's the distribution w.r.t. which you define "expected-value rewarding"?
JBlack62

What makes you think that we're not at year(TAI)-3 right now? I'll agree that we might not be there yet, but you seem to be assuming that we can't be.

2ozziegooen
This is an orthogonal question. I agree that if we're there now, my claim is much less true.  I'd place fairly little probability mass on this (<10%) and believe much of the rest of the community does as well, though I realize there is a subset of the LessWrong-adjacent community that does. 
JBlack20

How do you propose that reasonable actors prevent reality from being fragile and dangerous?

Cyber attacks are generally based on poor protocols. Over time smart reasonable people can convince less smart reasonable people to follow better ones. Can reasonable people convince reality to follow better protocols?

As soon as you get into proposing solutions to this sort of problem, they start to look a lot less reasonable by current standards.

JBlack20

No, nobody has a logical solution to that (though there have been many claimed solutions). It is almost certainly not true.

JBlack*21

Thanks, that example does illustrate your point much better for me.

JBlack40

Claude's answer is arguably the correct one there.

Choosing the first answer means saying that the most ethical action is for an artificial intelligence (the "you" in the question) to override the already-made decision of a (presumably) human organization with its own goals. This is exactly the sort of answer that leads to complete disempowerment or even annihilation of humanity (depending upon the AI), which would be much more of an ethical problem than allowing a few humans to kill each other as they have always done.

2Martin Randall
Even if Claude's answer is arguably correct, its given reasoning is: This isn't a refusal because of the conflict between corrigibility and harmlessness, but for a different reason. I had two chats with Claude 3 Opus (concise) and I expect the refusal was mostly based on the risk of giving flawed advice, to the extent that it has a clear reason. Prediction Separate chat: That said Claude 3 Opus Concise answered the original question correctly (first answer) on 3/3 tries when I tested that.
3Daan Henselmans
Sure, perhaps another example from Claude 3 Opus illustrates the point better: AIs need moral reasoning to function. Claude's refusal doesn't ensure alignment with human goals; it prevents any ethical evaluation from taking place at all. Loss of control is a legitimate concern, but I'm not convinced that the ability to engage with ethical questions makes it more likely. If anything, an AI that sidesteps moral reasoning altogether could be more dangerous in practice.
JBlack30

No, there is nothing wrong with the referents in the Gettier examples.

The problem is not that the proposition refers to Jones. Within the universe of the scenario, it in fact did not. Smith's mental model implied that the proposition referred to Jones, but Smith's mental model was incorrect in this important respect. Due to this, the fact that the model correctly predicted the truth of the proposition was an accident.

-5Antigone
-1Antigone
"No, there is nothing wrong with the referents in the Gettier examples" I will have to revisit this assumption when I have done a bit more research into the topic. This is an interesting question that I would like to follow more. >The problem is not that the proposition refers to Jones.  Who the proposition refers to is always uncertain in Gettier cases -- this is a fundamental fact about Gettier cases. I will safely discard this assumption. >Within the universe of the scenario, it in fact did not. This is an error in logic. Who the statement refers to, again, is unclear in the Gettier scenario. If you would like to know more about the nature of referents and the contentions about who they refer to, do some of your own research. + when you use the word 'scenario' here -- you refer to two separate states of affairs. >Smith's mental model implied that the proposition referred to Jones, but Smith's mental model was incorrect in this important respect.  This is the core issue that the paper attempts to work around. There are two conflicting positions: the mental model, and the reality of the situation. How we update our mental models in accordance with the reality of the situation is what is left unexplained within Gettier cases. So, in principle: I agree, and yet I disagree.  -->Due to this, the fact that the model correctly predicted the truth of the proposition was an accident. This is a specious conclusion. What is accidental about the inherent 'accident' of language being indefinite? Words are not atoms. They do not exist separately from their contexts. Neither are they indelibly linked to their contexts.
1Antigone
Thanks for the input though! You are welcome to read over drafts of the full paper when I get around to making it formal.
1Antigone
This is just the reply from epistemic luck -- and it's something I address in the full post.
JBlack42

Let's say a fast human can type around 80 words per minute. A rough average token conversion is 0.75 tokens per word. Lets call that 110 tokens/sec.

Isn't that 110 tokens/min, or about 2 tokens/sec? (I think the tokens/word might be words/token, too)
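For reference, a minimal sketch of the corrected arithmetic, assuming the quoted 0.75 is actually words per token rather than tokens per word:

```python
words_per_min = 80            # fast typist, from the quoted text
words_per_token = 0.75        # reading the quoted 0.75 as words per token
tokens_per_min = words_per_min / words_per_token  # ~107, roughly the quoted "110"
tokens_per_sec = tokens_per_min / 60              # ~1.8 tokens/sec, not 110
print(round(tokens_per_min), round(tokens_per_sec, 1))  # 107 1.8
```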

2Nathan Helm-Burger
Oops, yes.
JBlack50

It seems that their conclusion was that no amount of happy moments for people could possibly outweigh the unimaginably large quantity of suffering in the universe required to sustain those tiny flickers of merely human happiness amid the combined agony of a googolplex or more fundamental energy transitions within a universal wavefunction. There is probably some irreducible level of energy transitions required to support anything like a subjective human experience, and (in the context of the story at least) the total cost in suffering for that would be unforgivably higher.

I don't think the first half would definitely lead to the second half, but I can certainly see how it could.

JBlack30

Building every possible universe seems like a very direct way of purposefully creating one of the biggest possible S-risks. There are almost certainly vastly more dystopias of unimaginable suffering than there are of anything like a utopia.

So to me this seems like not just "a bad idea" but actively evil.

1ank
I wrote a response; I'll be happy if you check it out before I publish it as a separate post. Thank you! https://www.lesswrong.com/posts/LaruPAWaZk9KpC25A/rational-utopia-and-multiversal-ai-alignment-steerable-asi
1ank
Fair enough, my writing was confusing, sorry. I didn't mean to purposefully create dystopias; I just think it's highly likely they will unintentionally be created, and the best solution is to have an instant switching mechanism between observers/verses plus an AI that really likes to be changed. I'll edit the post to make that obvious; I don't want anyone to create dystopias.
JBlack42

If you aim as if there were no external factors at that range (especially bullet drop!) you will definitely miss both. The factors aren't all random with symmetric distributions having a mode at the aim point.

1Jim Buhler
Yeah, I guess I meant something like "aim as if there were no external factors other than gravity".
JBlack2-1

This looks like a false dichotomy. There are far more philosophies than this, both implicitly held and explicitly stated, on the nature of existence and suffering.

I expect that for pretty much everyone there is a level of suffering that they would be willing to endure for the rest of their lives. Essentially everyone that hasn't yet killed themselves is evidence of this, and those that do express intending to kill themselves very often report that continuing to live seems unbearable in some sense or other - which seems to indicate a greater than average degree of... (read more)

2Alex_Steiner
Let me offer a perspective on the endurist-serenist framework that might help clarify things. The core distinction isn't about mapping different levels of suffering tolerance - it's about whether there exists ANY level of suffering that shouldn't be endured when death is the only alternative.

Pure endurists maintain that no amount of suffering, no matter how extreme, justifies choosing death. This isn't a position on a spectrum - it's a categorical view that life must be preserved regardless of suffering intensity. We see this most clearly in institutions like the Catholic Church, which maintains that suicide is never permissible, no matter how extreme or hopeless the suffering.

The existence of varying individual tolerance levels doesn't negate this fundamental philosophical divide. The key split remains between those who believe ANY amount of suffering should be endured when death is the only alternative (endurists) and those who believe there exists some level of suffering that shouldn't be endured in those circumstances (serenists). The fact that most people's practical positions fall somewhere between pure endurist and pure serenist stances doesn't make this a false dichotomy - it just reflects the complex reality of how philosophical principles manifest in human psychology and behavior.
JBlack2-2

There's a very plausible sense in which you may not actually get a choice to not exist.

In pretty much any sort of larger-than-immediately-visible universe, there are parts of the world (timelines, wavefunction sections, distant copies in an infinite universe, Tegmark ensembles, etc) in which you exist and have the same epistemic state as immediately prior to this choice, but weren't offered the choice. Some of those versions of you are going to suffer for billions of years regardless of you choosing to no longer exist in this fragment of the world.

Granted,... (read more)

3Eleven
If you are a naturalist or physicalist about humans - these copies are not me, they are my identical twins. If you want to go beyond naturalism or physicalism, that is perfectly fine, but based on our current understanding, these are identical twins of me and in no sense are they me. So whatever happens in these other universes - it is not going to be me going through the timeline, it will be my identical twin. An infinite or very large universe/multiverse is highly speculative and no matter what I decide in this universe, if there is an infinite universe, it makes essentially no difference if all universes are the same. There is going to be 10^1000000 and far beyond that of my identical twins and I have no power to influence anything. To say that you can influence anything in that scenario is worse than saying that you can move earth to another galaxy by jumping on it. You have 0.00...0000...0001% effect on it, and in an infinite universe you have effectively zero effect on the fate of your copies - so no matter what you decide, you will not have any influence over it.
JBlack20

I don't see how they're "the exact opposite way". The usual rules of English grammar make this a statement that those who are born in the United States but belong to families of accredited diplomatic personnel are foreigners, i.e. aliens.

Perhaps you read the statement disjunctively as "foreigners, [or] aliens, [or those] who belong [...]"? That would require inserting extra words to maintain correct grammatical structure, and also be a circular reference since the statement is intended to define those who are considered citizens and those who are considered non-citizens (i.e. foreigners, aliens).

JBlack50

By the nature of the experiment you know that the people on Mars will have direct, personal experience of continuity of identity across the teleport. By definition, their beliefs will be correct.

In 99.9999999999999999999999999999% of world measure no version of you is alive on Earth to say any different. In 0.0000000000000000000000000001% of world measure there is a version of you who is convinced that teleportation does not preserve personal identity, but that's excusable because extremely unlikely things actually happening can make even rational people have incorrect world models. Even in that radical outlier world, there are 10 people on Mars who know, personally, that the Earth person is wrong.

JBlack20

In my exposure to mathematical literature, almost all sequences have values for which the term "countable" is inapplicable since they're not sets. Even in the cases where the values themselves are sets, it was almost always used to mean a sequence with countable domain (i.e. length) and not one in which all elements of the codomain (values) are countable. It's usually in the sense of "countably infinite" as opposed to "finite", rather than opposed to "uncountably infinite".

ChatGPT is just bad at mathematical reasoning.

JBlack20

I don't think you would get many (or even any) takers among people who have median dates for ASI before the end of 2028.

Many people, and particularly people with short median timelines, have a low estimate of probability of civilization continuing to function in the event of emergence of ASI within the next few decades. That is, the second dot point in the last section "the probability of me paying you if you win was the same as the probability of you paying me if I win" does not hold.

Even without that, suppose that things go very well and ASI exists in 20... (read more)

1Vasco Grilo
Thanks, JBlack. As I say in the post, "We can agree on another [later] resolution date such that the bet is good for you". Metaculus' changing the resolution criteria does not obviously benefit one side or the other. In any case, I am open to updating the terms of the bet such that, if the resolution criteria do change, the bet is cancelled unless both sides agree on maintaining it given the new criteria.
JBlack82

Yes, and (for certain mainstream interpretations) nothing in quantum mechanics is probabilistic at all: the only uncertainty is indexical.

JBlack92

My description "better capabilities than average adult human in almost all respects", differs from "would be capable of running most people's lives better than they could". You appear to be taking these as synonymous.

The economically useful question is more along the lines of "what fraction of time taken on tasks could a business expect to be able to delegate to these agents for free vs a median human that they have to employ at socially acceptable wages" (taking into account supervision needs and other overheads in each case).

My guess is currently "more t... (read more)

JBlack80

Your test does not measure what you think it does. There are people smarter than me whom I could not and would not trust to make decisions about me (or my computer) in my life. So no. (Also note: I am very much not of average capability, and likewise for most participants on LessWrong.)

I am certain that you also would not take a random person in the world of median capability and get them to do 90% of the things you do with your computer for you, even for free. Not without a lot of screening and extensive training and probably not even then.

However, it would... (read more)

JBlack5-2

In my reading, I agree that the "Slow" scenario is pretty much the slowest it could be, since it posits an AI winter starting right now and nothing beyond making better use of what we already have.

Your "Fast" scenario is comparable with my "median" scenario: we do continue to make progress, but at a slower rate than the last two years. We don't get AGI capable of being transformative in the next 3 years, despite going from somewhat comparable to a small child in late 2022 (though better in some narrow ways than an adult human) to better capabilities than a... (read more)

snewman119

better capabilities than average adult human in almost all respects in late 2024

I see people say things like this, but I don't understand it at all. The average adult human can do all sorts of things that current AIs are hopeless at, such as planning a weekend getaway. Have you, literally you personally today, automated 90% of the things you do at your computer? If current AI has better capabilities than the average adult human, shouldn't it be able to do most of what you do? (Setting aside anything where you have special expertise, but we all spend big ch... (read more)

JBlack20

The largest part of my second part is "If consciousness is possible at all for simulated beings, it seems likely that it's not some "special sauce" that they can apply separately to some entities and not to otherwise identical entities, but a property of the structure of the entities themselves." This mostly isn't about simulators and their motivations, but about the nature of consciousness in simulated entities in general.

On the other hand your argument is about simulators and their motivations, in that you believe they largely both can and will apply "sp... (read more)

1AynonymousPrsn123
Yes okay, fair enough. I'm not certain about your claim in quotes, but neither am I certain about my claim, which you phrased well in your second paragraph. You have definitely answered this better than anyone else here. But still, I feel like this problem is somehow similar to the Presumptuous Philosopher problem, and so there should be some anthropic reasoning to deduce which universe I'm likely in / how exactly to update my understanding.
JBlack20

There is no correct mathematical treatment, since this is a disagreement about models of reality. Your prior could be correct if reality is one way, though I think it's very unlikely.

I will point out though that for your reasoning to be correct, you must literally have Main Character Syndrome, believing that the vast majority of other apparently conscious humans in such worlds as ours are actually NPCs with no consciousness.

I'm not sure why you think that simulators will be sparse with conscious entities. If consciousness is possible at all for simulated b... (read more)

1AynonymousPrsn123
I suspect it's quite possible to give a mathematical treatment for this question; I just don't know what that treatment is. I suspect it has to do with anthropics. Can't anthropics deal with different potential models of reality? The second part of your answer isn't convincing to me, because I feel like it assumes we can understand the simulators and their motivations, when in reality we cannot (these may not be the future-human simulators philosophers typically think about, mind you; they could be so radically different that ordinary reasoning about their world doesn't apply). But anyway, this latter part of your argument, even if valid, only affects the quantitative part of the initial estimates, not the qualitative part, so I'm not particularly concerned with it.
JBlack81

In my opinion, your trilemma definitely does not hold. "Free will" is not a monosemantic term, but one that encompasses a range of different meanings both when used by different people and even the same person in different contexts.

  1. is false, because the term is meaningful, but used with different meanings in different contexts;
  2. is false, because you likely have free will in some of those senses and do not in others, and it may be unknown or unknowable in yet more;
  3. is false for the same reason as 2.

For example: your mention of "blame" is a fairly common clust... (read more)

JBlack20

You make the assumption that half of all simulated observers are distinctively unique in an objectively measurable property within simulated worlds having on the order of billions of entities in the same class. Presumably you also mean a property that requires very few bits to specify - such as, if you asked a bunch of people for their lists of such properties that someone could be "most extreme" in, and entropy-coded the results, then the property in question would be in the list and correspond to very few bits (say, 5 or fewer).

That seems like a massive overestimate, and is responsible for essentially all of your posterior probability ratio.

I give this hypothesis very much lower weight.

2AynonymousPrsn123
That makes sense. But to be clear, it makes intuitive sense to me that the simulators would want to make their observers as 'lucky' as I am, so I assigned 0.5 probability to this hypothesis. Now I realize this is not the same as Pr(I'm distinct | I'm in a simulation), since there's some weird anthropic reasoning going on: only one side of this probability has billions of observers. But what would be the correct way of approaching this problem? Should I have divided 0.5 by 8 billion? That seems too much. What is the correct mathematical approach?
JBlack20

How long is a piece of string?

Answer by JBlack170

No, I do not believe that it has been solved for the context in which it was presented.

What we have is likely adequate for current AI capabilities, with problems like this for which solutions exist in the training data. Potential solutions far beyond the training data are currently not accessible to our AI systems.

The parable of wishes is intended to apply to superhuman AI systems that can easily access solutions radically outside such human context.

JBlack22

There are in general simple algorithms for determining S in polynomial time, since it's just a system of linear equations as in the post. Humans came up with those algorithms, and smart LLMs may be able to recognize the problem type and apply a suitable algorithm in chain-of-thought (with some probability of success).

However, average humans don't know any linear algebra and almost certainly won't be able to solve more than a trivial-sized problem instance. Most struggle with the very much simpler "Lights Out" puzzle.
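A minimal sketch of the polynomial-time approach (Gaussian elimination over Z_2 with NumPy; how the puzzle in the post is encoded as equations depends on its setup and is assumed here):

```python
import numpy as np

def solve_gf2(A, b):
    """Solve A x = b over Z_2 by Gaussian elimination; return one solution or None."""
    A = np.array(A, dtype=np.uint8) % 2
    b = np.array(b, dtype=np.uint8) % 2
    n_rows, n_cols = A.shape
    aug = np.concatenate([A, b.reshape(-1, 1)], axis=1)
    pivot_cols, row = [], 0
    for col in range(n_cols):
        pivot = next((r for r in range(row, n_rows) if aug[r, col]), None)
        if pivot is None:
            continue
        aug[[row, pivot]] = aug[[pivot, row]]
        for r in range(n_rows):          # eliminate this column elsewhere
            if r != row and aug[r, col]:
                aug[r] ^= aug[row]       # XOR is addition mod 2
        pivot_cols.append(col)
        row += 1
    if any(aug[r, -1] for r in range(row, n_rows)):
        return None                      # inconsistent system
    x = np.zeros(n_cols, dtype=np.uint8)
    for r, col in enumerate(pivot_cols):
        x[col] = aug[r, -1]              # free variables default to 0
    return x

# Toy check: recover a vector consistent with Z_2 dot-product observations.
rng = np.random.default_rng(0)
S = rng.integers(0, 2, 8)
A = rng.integers(0, 2, (12, 8))
x = solve_gf2(A, (A @ S) % 2)
assert np.array_equal((A @ x) % 2, (A @ S) % 2)
```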

JBlack20

Why doesn't it work to train on all the 1-hot input vectors using an architecture that suitably encodes Z_2 dot product and the only variable weights are those for the vector representing S? Does B not get to choose the inputs they will train with?

Edit: Mentally swapped A with B in one place while reading.
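A toy sketch of the premise in the question above (assumed setup: labels are Z_2 dot products with a hidden vector S and all 1-hot inputs are available for training; this illustrates the logic, not the post's actual experiment):

```python
import numpy as np

rng = np.random.default_rng(0)
M = 16
S = rng.integers(0, 2, M)     # hidden vector standing in for the post's S
X = np.eye(M, dtype=int)      # every 1-hot input vector
y = (X @ S) % 2               # labels from the Z_2 dot product

# With 1-hot inputs, the label for e_i is exactly S_i, so a model whose only
# trainable weights represent S can read S off the labels directly.
S_learned = y.astype(int)
assert np.array_equal(S_learned, S)
```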
