quiet_NaN

Let us assume that the utility of personal wealth is logarithmic, which is intuitive enough: $10k matters a lot more to you if you are broke than if your net worth is $1M.
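
To put rough numbers on that intuition, here is a minimal sketch (my own illustration, with arbitrary figures: a 'broke' person at $1k net worth versus a millionaire):

```python
import math

# Marginal utility of an extra `gain` dollars under log utility, for a given
# starting net worth. The specific numbers are arbitrary.
def log_utility_gain(net_worth: float, gain: float) -> float:
    return math.log(net_worth + gain) - math.log(net_worth)

print(log_utility_gain(1_000, 10_000))      # ~2.40: a life-changing jump when broke
print(log_utility_gain(1_000_000, 10_000))  # ~0.01: barely noticeable for a millionaire
```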

Then by your definition of exploitation, every transaction where a poor person pays a rich person, and the rich person profits in the process, is exploitative. The worker surely needs the rent money more than the landlord, so the landlord should cut the rent to the point where he makes no profit. Likewise the physician providing aid to the poor, or the CEO selling smartphones to the middle class.

Classifying most of capitalism as "exploitative" (per se) is of course not helpful. You can add epicycles to your definition by considering money rather than its subjective utility, but under that definition all financial transactions would be 'fair': the poor person values $1k more than his kidney, while the rich person values the kidney more than $1k, so they trade and everyone is better off (even though we would likely consider this trade exploitative).

More generally, to have a definition of fairness, we need some unit of interpersonal utility. Consider the Ultimatum game. If both participants are of similar wealth, have similar needs, and have worked a similar amount for the gains, then a 50:50 split seems fair. But if we don't have a common frame of utility, perhaps because one of them is a peasant and the other a feudal king, then objectively determining what a fair split is seems impossible.

Some comments.

 

[...] We will quickly hit superintelligence, and, assuming the superintelligence is aligned, live in a post-scarcity technological wonderland where everything is possible.

Note, firstly, that money will continue being a thing, at least unless we have one single AI system doing all economic planning. Prices are largely about communicating information. If there are many actors and they trade with each other, the strong assumption should be that there are prices (even if humans do not see them or interact with them). Remember too that however sharp the singularity, abundance will still be finite, and must therefore be allocated.

 

Personally, I am reluctant to tell superintelligences how they should coordinate. It feels like some ants looking at the moon and thinking "surely if some animal is going to make it to the moon, it will be a winged ant." Just because market economies have absolutely dominated the period of human development we might call 'civilization' does not mean that ASIs will not come up with something better.

The era of human achievement in hard sciences will probably end within a few years because of the rate of AI progress in anything with crisp reward signals.

As an experimental physicist, I have opinions about that statement. Doing stuff in the physical world is hard. The business case for AI systems which can drive motor vehicles on the road is obvious to anyone, and yet autonomous vehicles remain the exception rather than the rule. (Yes, regulations are part of that story, but not all of it.) By contrast, the business case for an AI system which can cable up a particle detector is basically non-existent. I can see an AI either using a generic mobile robot developed for other purposes to plug in all the BNC cables, or using a minimum-wage worker with a heads-up display as a bio-drone -- but more likely in two decades than in a few years.

Of course, experimental physics these days is very much a team effort -- the low-hanging fruit has mostly been picked, and nobody is going to discover radium or fission again; at most, they will be a small cog in a large machine which discovers the Higgs boson or gravitational waves.[1] So you might argue that experimental physics today is already not a place for peak human excellence (a la the Humanists in Terra Ignota).

 

More broadly, I agree that if ASI happens, most unaugmented humans are unlikely to stay at the helm of our collective destiny (to the limited degree they ever were). Even if some billionaire manages to align the first ASI to maximize his personal wealth, if he is clever he will obey the ASI just like all the peasants. His agency is reduced to not following the advice of his AI on some trivial matters. ("I have calculated that you should wear a blue shirt today for optimal outcomes." -- "I am willing to take a slight hit in happiness and success by making the suboptimal choice to wear a green shirt, though.") 

Relevant fiction: Scott Alexander, The Whispering Earring.

Also, if we fail to align the first ASI, human inequality will drop to zero. 

  1. ^

     Of course, being a small cog in some large machine, I will say that.

Some additional context.

This fantasy world is copied from a role-playing game setting—a fact I discovered when Planecrash literally linked to a Wiki article to explain part of the in-universe setting.

The world of Golarion is a (or the?) setting of the Pathfinder role-playing game, which is a fork of the D&D 3.5 rules[1] (but notably different from Forgotten Realms, which is owned by WotC/Hasbro). The core setting is defined in some twenty-odd books which cover everything from the political landscape in dozens of polities to detailed rules for how magic and divine spells work. From what I can tell, the Planecrash authors mostly take what is given and fill in the blanks in a way which makes the world logically coherent, in the same way Eliezer did with the world of Harry Potter in HPMOR.

Also, Planecrash (aka Project Lawful) is a form of collaborative writing (or free-form roleplaying?) called glowfic. In this case, EY writes Keltham and the setting of dath ilan (which exists only in Keltham's mind, as far as the plot is concerned), plus a few other minor characters. Lintamande writes Carissa and most of the world -- it is clear that she is an expert in the Pathfinder setting -- as well as most of the other characters. At some point, other writers (including Alicorn) join and write minor characters. If you have read through Planecrash and want to read more from Eliezer in that format, glowfic.com has you covered. For example, here is a story which sheds some light on the shrouded past of dath ilan (plus more BDSM, of course).

 

  1. ^

    D&D 3 was my first exposure to D&D (via Bioware's Neverwinter Nights), so it is objectively the best edition. Later editions are mostly WotC milking the franchise with a complete overhaul every few years. AD&D 2 is also fine, if a bit less streamlined -- THAC0, armor classes going negative, and all that.

One big aspect of Yudkowskian decision theory is how to respond to threats. Following causal decision theory means you can neither make credible threats nor commit to deterrence to counter threats. Yudkowsky endorses not responding to threats to avoid incentivising them, while also having deterrence commitments to maintain good equilibria. He also implies this is a consequence of using a sensible functional decision theory. But there's a tension here: your deterrence commitment could be interpreted as a threat by someone else, or vice versa.

I have also noted this tension. Intuitively, one might think that it depends on the morality of the action -- the robber who threatens to blow up a bank unless he gets his money might be seen as a threat, while a policy of blowing up your own banks in case of robberies might be seen as a deterrence commitment. 

However, this cannot be it, because decision theory works with arbitrary utility functions.

The other idea is that 'to make threats' is one of those irregular verbs: I make credible deterrence commitments, you show a willingness to escalate, they try to blackmail me with irrational threats. This is of course just as silly in the context of game theory.

One axis of difference might be whether you physically restrict your own options to prevent yourself from backing out of your threat (like a Chicken player removing their steering wheel, or the doomsday machine from Dr. Strangelove). But this only matters for agents known to follow causal decision theory, who try to maximize utility in whatever branch of reality they find themselves in. From my understanding, adherents of functional decision theory do not need to physically constrain their options -- they would be happy to burn the world in one branch of reality if that was the dominant strategy before their opponent made their choice.

Consider the ultimatum game (which gets covered in Planecrash, naturally), where one party makes a proposal on how to split $10 and the other party can either accept (gaining their share) or reject it (in which case neither party gains anything). In Planecrash, the dominant strategy is presented as rejecting unfair allocations with some probability, so that the expected value for the proposing party is lower than if they had proposed a fair split. However, this hinges on the concept of fairness. If each dollar has the same utility to every participant, then a 50:50 split seems fair. But in the more general case, the utilities of the two parties might be utterly incomparable, or their efforts might be very different -- an isomorphic situation is a silk merchant encountering the first of possibly many highwaymen and having to agree on a split, with both parties having the option to burn all the silk if they don't agree. Agreeing to a 50:50 split each time could easily make the silk merchant's business model impossible.
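
To make the rejection strategy concrete, here is a minimal sketch under the simplest assumption that every dollar has the same utility for both players (the function name and the epsilon are mine, not from the book):

```python
def accept_probability(offer: float, total: float = 10.0,
                       fair_share: float = 5.0, epsilon: float = 0.01) -> float:
    """Probability of accepting `offer` (the responder's share) such that the
    proposer's expected take from an unfair split stays just below what a
    fair split would have given them."""
    if offer >= fair_share:
        return 1.0                      # fair or generous offers are always accepted
    proposer_share = total - offer
    return max(0.0, (fair_share - epsilon) / proposer_share)

# A 9:1 split gets accepted ~55% of the time, so the proposer expects
# ~0.55 * 9 ≈ 4.99 < 5 and gains nothing by lowballing.
print(accept_probability(1.0))
```

The silk-merchant case is exactly where this sketch breaks down: it is no longer obvious what number should play the role of fair_share.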

"This is my strategy, and I will not change it no matter what, so you better adapt your strategy if you want to avoid fatal outcomes" is an attitude likely to lead to a lot of fatal outcomes. 

Relatedly, if you perform an experiment n times, and the probability of success is p, and the expected number of total successes np is much smaller than one, then np is a reasonable approximation for the probability of getting at least one success, because the probability of getting more than one success can be neglected.

For example, if Bob plays the lottery for ten days, and each day has a 1:1,000,000 chance of winning, then overall he has a chance of about 1:100,000 of winning once.

This is also why micromorts are roughly additive: if travelling by railway has a mortality of one micromort per 10 Mm, then travelling 50 Mm will set you back 5 micromorts. Only if you leave what I would call the 'Newtonian regime of probability', e.g. by somehow managing to travel 10 Tm by railway, are you required to do proper probability math, because naive addition would tell you that you will surely have a fatal accident (1 mort) over that distance, which is clearly wrong.
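
A quick numerical check of both regimes (my own sketch; the exact probability of at least one event in n independent trials is 1 - (1-p)^n):

```python
def p_at_least_one(n: int, p: float) -> float:
    # Exact probability of at least one occurrence in n independent trials.
    return 1.0 - (1.0 - p) ** n

# Bob's lottery: 10 draws at 1:1,000,000 each -- naive addition is fine here.
print(10 * 1e-6, p_at_least_one(10, 1e-6))                 # 1.0e-05 vs ~9.99996e-06

# 10 Tm of rail at one micromort per 10 Mm: a million 10 Mm segments.
print(1_000_000 * 1e-6, p_at_least_one(1_000_000, 1e-6))   # 1.0 vs ~0.632
```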

Getting downvoted to -27 is an achievement. Most things judged 'bad AI takes' only go to -11 or so; even that recent P=NP proof only got to -25. Of course, if the author is right, then downvoting further is providing helpful incentives to him.

I think that bullying is quite distinct from status hierarchies. The latter are unavoidable. There will always be some clique of cool kids in the class who will not invite the non-cool kids to their parties. This is ok. Sometimes, status is correlated with behaviors which are pro-social (kids not smoking; donating to EA), sometimes it is correlated with behaviors which are net-negative (kids smoking; serving in the SS). I was not part of the cool kids circle, and I was fine with that. Live and let live and all that. 

'Bullying' has a distinctly negative connotation. The central example is someone who is targeted for sport for being different from the others. The bullies don't want their victims to change their ways; they just like making their lives miserable for thrills. I am sure that sometimes this unintentionally helps a victim: if you push enough people, at some point you are bound to push someone out of the path of a car or bullet. In the grand scheme of things, however, bullies are a net negative for their victims and for society overall.

I see this as less of an endorsement of linear models and more of a scathing review of expert performance. 

This. Basically, if your job is to do predictions, and the accuracy of your predictions is not measured, then (at least the prediction part of) your job is bullshit. 

I think that if you compared simple linear models with experts in domains where people actually care about their predictions, the outcome would be different. For example, if simple models predicted stock performance better than the experts at investment banks, anyone with a spreadsheet could quickly become rich. There are few if any cases of 'I started with Excel and $1000, and now I am a billionaire'. Likewise, I would be highly surprised to see a simple linear model outperform Nate Silver or the weather forecast.

Even predicting chess outcomes from mid-game board configurations is something where I would expect human experts to outperform simple statistical models working on easily quantifiable data (e.g. number of pieces remaining, number of possible moves, being in check, etc).
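
For concreteness, the kind of simple model I have in mind would look something like the following sketch; the features and training rows are made-up placeholders, not a real dataset:

```python
from sklearn.linear_model import LogisticRegression

# Each row: [material balance, mobility difference, side-to-move in check] --
# crude, easily quantifiable mid-game features (placeholder values).
X = [
    [ 3,   5, 0],
    [-2,  -8, 1],
    [ 0,   1, 0],
    [ 5,  10, 0],
    [-4,  -3, 1],
    [ 1,  -2, 0],
]
y = [1, 0, 1, 1, 0, 0]  # 1 = white eventually won

model = LogisticRegression().fit(X, y)
print(model.predict_proba([[2, 4, 0]]))  # predicted outcome probabilities for a new position
```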

Neural networks contained in animal brains (which includes human brains) are quite capable of implementing linear models, and as such should perform at least equally well when properly trained. A wolf pack deciding whether to chase some prey has direct evolutionary skin in the game of making its prediction of success as accurate as possible, which the average school counselor predicting academic success simply does not have.

--

You touch on this a bit in 'In defense of explanatory modeling', but I want to emphasize that uncovering causal relationships and pathways is central to world modelling. Often, we don't want just predictions, we want predictions conditional on interventions. If you don't have that, you will end up trying to cure chickenpox with makeup, since 'visible blisters' are negatively correlated with good outcomes.

Likewise, if we know the causal pathway, we have a much better basis to judge if some finding can be applied to out-of-distribution data. No matter how many anvils you have seen falling, without a causal understanding (e.g. Newtonian mechanics), you will not be able to reliably apply your findings to falling apples or pianos. 
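
The chickenpox point can be illustrated with a toy simulation (my own, with made-up numbers): the blisters predict bad outcomes in observational data, but intervening on the blisters does nothing, because the disease, not its visible symptom, drives the outcome.

```python
import random
random.seed(0)

def draw(hide_blisters: bool = False):
    sick = random.random() < 0.1                    # 10% have chickenpox
    visible = sick and not hide_blisters            # the predictive feature
    bad = sick and random.random() < 0.3            # outcome caused by the disease only
    return visible, bad

# Observational data: blisters strongly predict bad outcomes ...
obs = [draw() for _ in range(100_000)]
p_bad_given_blisters = sum(b for v, b in obs if v) / sum(v for v, _ in obs)
p_bad_overall = sum(b for _, b in obs) / len(obs)
print(p_bad_given_blisters, p_bad_overall)          # ~0.30 vs ~0.03

# ... but the makeup "intervention" leaves the outcome rate unchanged.
interv = [draw(hide_blisters=True) for _ in range(100_000)]
print(sum(b for _, b in interv) / len(interv))      # still ~0.03
```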
 


What I don't understand is why there should be a link between trapped priors and moral philosophy.

I mean, if moral realism were correct, i.e. if moral tenets such as "don't eat pork", "don't have sex with your sister", or "avoid killing sentient beings" had a universal truth value for all beings capable of moral behavior, then one might argue that the reason people's ethics differ is that they have trapped priors which prevent them from recognizing these universal truths.

This might be my trapped priors talking, but I am a non-cognitivist. I simply believe that assigning truth values to moral sentences such as "killing is wrong" is pointless, and they are better parsed as prescriptive sentences such as "don't kill" or "boo on killing". 

In my view, moral codes are intrinsically subjective. There is no factual disagreement between Harry and Professor Quirrell which they could hope to overcome through empiricism, they simply have different utility functions.

--

My second point is that if moral realism were true, and one of the key roles of religion was to free people from trapped priors so they could recognize these universal moral truths, then at least during the founding of religions we should see some evidence of higher moral standards, before they invariably mutate into institutions devoid of moral truths. I would argue that either our commonly accepted humanitarian moral values are all wrong, or this mutation process happened almost instantly:

  • Whatever Jesus thought about gender equality when he achieved moral enlightenment, Paul had his own ideas a few decades later. 
  • Mohammed was clearly not opposed to offensive warfare.
  • Martin Luther evidently believed that serfs should not rebel against their lords. 

On the other hand, instances where religions did advocate for tenets compatible with humanitarianism, such as Christian abolitionism, do not seem to correspond to strong spirituality. Was Pope Benedict XIV condemning the slave trade because he was more spiritual (and thus in touch with the universal moral truth) than his predecessors who had endorsed it?

--

My last point is that, especially with regard to relational conflicts, our map not corresponding to the territory might often not be a bug but a feature. Per Hanson, we deceive ourselves so that we can better deceive others. Evolution has not shaped our brains to be objective cognitive engines. In some cases, objective cognition is what is advantageous -- if you are alone hunting a rabbit, no amount of self-deception will fill your stomach -- but in any social situation, expect evolution to put its thumb on the scales of your impartial judgement. Arguing that your son should become the new chieftain because he is the best hunter and strongest warrior is much more effective than arguing he should get it simply because he is your son -- and the best way to argue that is to believe it, whether or not it is objectively true.

The adulterer, the slave owner and the wartime rapist all have solid evolutionary reasons to engage in behaviors most of us might find immoral. I think their moral blind spots are likely not caused by trapped priors, like an exaggerated fear of dogs is. Also, I have no reason to believe that I don't have similar moral blind spots hard-wired into my brain by evolution.

I would bet that most of the serious roadblocks to a true moral theory (if such a thing existed) are of that kind, rather than maladaptive trapped priors. Thus, even if religion and spirituality are effective at overcoming maladaptive trapped priors, I don't see how they would bring us closer to true moral cognition.

Note: there is an AI audio version of this text over here: https://askwhocastsai.substack.com/p/eliezer-yudkowsky-tweet-jul-21-2024

I find the AI narrations offered by askwho generally ok, worse than what a skilled narrator (or team) could do but much better than what I could accomplish. 

[...] somehow humanity's 100-fold productivity increase (since the days of agriculture) didn't eliminate poverty.

That feels to me about as convincing as saying: "Chemical fertilizers have not eliminated hunger, just the other weekend I was stuck on a campus with a broken vending machine." 

I mean, sure, both the broken vending machine and actual starvation can be called hunger, just as working 60h/week to make ends meet and sending your surviving kids into the mines (or prostituting them) can both be called poverty, but the implication that either scourge of humankind has not lost most of its terror seems clearly false.

Sure, being poor in the US sucks, but I would rather spend a year living the life of someone in the bottom 10% income bracket in the 2024 US than spend a month living the life of a poor person during the English Industrial Revolution.

I am also not convinced that 60h/week is what it actually takes to survive in the US. I can totally believe that this amount of unskilled labor might be required to rent accommodations in cities, though. 
