Thanks, I mostly agree.
But even in colonialism, individual traits played a role. For example, compare King Leopold II's rule over the Congo Free State vs. other colonial regimes.
While all colonialism was exploitative, under Leopold's personal rule the Congo saw extraordinarily brutal policies, e.g., his rubber quota system led soldiers to torture and cut off the hands of workers, including children, who failed to meet quotas. Under his rule, 1.5-15 million Congolese people died; the total population was only around 15 to 20 million. The brutality was s...
The British weren't much more compassionate. North America and Australia were basically cleared of their native populations and repopulated with Europeans. Under British rule in India, tens of millions died in recurring famines, which stopped almost immediately after independence.
Colonialism didn't end due to benevolence. Wars of colonial liberation continued well after WWII and were very brutal (the Algerian War, for example). I think the actual reason is that colonies stopped making economic sense.
So I guess the difference between your view and mine is that I think c...
Thanks, good point! I suppose it's a balancing act and depends on the specifics in question and the amount of shame we dole out. My hunch would be that a combination of empathy and shame ("carrot and stick") may be best.
I agree that the problem of "evil" is multifactorial with individual personality traits being only one of several relevant factors, with others like "evil/fanatical ideologies" or misaligned incentives/organizations plausibly being overall more important. Still, I think that ignoring the individual character dimension is perilous.
...It seems to me that most people become much more evil when they aren't punished for it. [...] So if we teach AIs to be as "aligned" as the average person, and then AIs increase in power beyond our ability to punish them, we
Thanks. Sorry for not being clearer: I pasted a screenshot (I'm reading the book on Kindle and can't copy-paste) and asked Claude to transcribe the image into written text.
Again, this is not the first time this has happened. Claude refused to help me translate a passage from the Quran (I wanted to check which of two translations was more accurate), refused to transcribe other parts of the above-mentioned Kindle book, and refused to provide me with details about what happened at Tuol Sleng prison. I was eventually able to persuade Claude in all of these case...
I downvoted Claude's response (i.e., clicked the thumbs-down symbol below the response) and selected "overactive refusal" as the reason. I didn't get in contact with Anthropic directly.
I had to cancel my Claude subscription (and signed up for ChatGPT) because Claude (3.5 Sonnet) constantly refuses to transcribe or engage with texts that discuss extremism or violence, even if it's clear that this is done in order to better understand and prevent extremist violence.
Example text Claude refuses to transcribe below. For context, the text discusses the motivations and beliefs of Yigal Amir who assassinated the Israeli Prime Minister in 1995.
...God gave the land of Israel to the Jewish People," he explained, and he, Yigal Amir, was making ce
Really great post!
It’s unclear how much human psychology can inform our understanding of AI motivations and relevant interventions, but it does seem relevant that spitefulness correlates highly (Moshagen et al., 2018, Table 8, N = 1,261) with several other “dark traits”, especially psychopathy (r = .74), sadism (r = .59), and Machiavellianism (r = .59).
(Moshagen et al. (2018) therefore suggest that “[...] dark traits are specific manifestations of a general, basic dispositional behavioral tendency [...] to maximize one’s individual...
Great post, thanks for writing!
Most of this matches my experience pretty well. I think I had my best ideas (others seem to agree) during phases when I was unusually low on guilt- and obligation-driven EA/impact-focused motivation and was just playfully exploring ideas for fun and out of curiosity.
One problem with letting your research/ideas be guided by impact-focused thinking is that you basically train your mind to immediately ask yourself, after entertaining a certain idea for a few seconds, "well, is that actually impactful?". And basically all of ...
Lol, thanks. :)
Thanks for this post, I thought this was useful.
I needed a writing buddy to pick up the momentum to actually write it
I'd be interested in knowing more how this worked in practice (no worries if you don't feel like elaborating/don't have the time!).
...I think mostly I expect us to continue to overestimate the sanity and integrity of most of the world, then get fucked over like we got fucked over by OpenAI or FTX. I think there are ways to relating to the rest of the world that would be much better, but a naive update in the direction of "just trust other people more" would likely make things worse.
[...]
Again, I think the question you are raising is crucial, and I have giant warning flags about a bunch of the things that are going on (the foremost one is that it sure really is a time to reflect on your r
This is mentioned in the introduction.
I'm biased, of course, but it seems fine to write a post like this. (Similarly, it's fine for CFAR staff members to write a post about CFAR techniques. In fact, I prefer if precisely these people write such posts because they have the relevant expertise.)
Would you like us to add a more prominent disclaimer somewhere? (We worried that this might look like advertising.)
A quick look through https://www.goodtherapy.org/learn-about-therapy/types/compassion-focused-therapy gives an impression of yet another mix of CBT, DBT and ACT, nothing revolutionary or especially new, though maybe I missed something.
In my experience, ~nothing in this area is downright revolutionary. Most therapies are heavily influenced by previous concepts and techniques. (Personally, I'd still say that CFT brings something new to the table.)
I guess what matters is whether it works for you or not.
...Is this assertion borne out by twin studies? Or is believin
From studying and using all of the above my conclusion is that IFS offers the most tractable approach to this issue of competing 'parts'. And in many ways the most powerful.
In our experience, different people respond to different therapies. I know several people for whom, say, CFT worked better than IFS. Glad to hear that IFS worked for you!
When you read about modern therapies, they all borrow from one another in a way that did not occur say 50 years ago where there were very entrenched schools of thought.
Yes, that's definitely the case. My sense is ...
For what it's worth, I read/skimmed all of the listed IDA explanations and found this post to be the best explanation of IDA and Debate (and how they relate to each other). So thanks a lot for writing this!
Thanks a lot for this post (and the whole sequence), Kaj! I found it very helpful already.
Below is a question I first wanted to ask you via PM, but others might also benefit from an elaboration on this.
You describe the second step of the erasure sequence as follows (emphasis mine):
>Activating, at the same time, the contradictory belief and having the experience of simultaneously believing in two different things which cannot both be true.
When I try this myself, I feel like I cannot actually experience two things simultaneously. There...
The post Reducing long-term risks from malevolent actors is somewhat related and might be of interest to you.
Cool post! Daniel Kokotajlo and I have been exploring somewhat similar ideas.
In a nutshell, our idea was that a major social media company (such as Twitter) could develop a feature that incentivizes forecasting in two ways. First, the feature would automatically suggest questions of interest to the user, e.g., questions thematically related to the user’s current tweet or currently trending issues. Second, users who make more accurate forecasts than the community will be rewarded with increased visibility.
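For concreteness, here's a minimal sketch of how that second mechanism could work in Python. The Brier-score comparison and the particular boost formula are just illustrative assumptions on my part, not a worked-out part of the proposal:

```python
# Sketch: boost the visibility of users whose forecasts beat the community.
# Brier scoring and the boost formula are illustrative assumptions only.

def brier_score(forecasts: list[float], outcomes: list[int]) -> float:
    """Mean squared error between probabilistic forecasts and 0/1 outcomes (lower is better)."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

def visibility_boost(user_forecasts, community_forecasts, outcomes, max_boost=2.0):
    """Return a multiplier >= 1.0 that grows with the user's edge over the community."""
    edge = brier_score(community_forecasts, outcomes) - brier_score(user_forecasts, outcomes)
    return 1.0 + max(0.0, min(4 * edge, max_boost - 1.0))  # clamp to [1.0, max_boost]

# A user who was confidently right where the crowd hedged gets boosted:
print(visibility_boost([0.9, 0.2], [0.6, 0.5], [1, 0]))  # -> 1.72
```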
Our idea is different in two major ways: ...
Regarding how melatonin might cause more vivid dreams, I found the theory put forward here quite plausible:
There are user reports that melatonin causes vivid dreams. Actually, all sleep aids appear to some users to produce more vivid dreams.
What is most likely happening is that the drug modifies the sleep cycle so the person emerges from REM sleep (when dreams are most vivid) to waking quickly – more quickly than when no drug is used. The user subjectively reports the drug as producing vivid dreams.
Great that you're thinking about this issue! A few sketchy thoughts below:
I) As you say, autistic people seem to be more resilient with regard to tribalism. And autistic tendencies and following rationality communities arguably correlate as well. So intuitively, it seems that something like higher rationality and awareness of biases could be useful for reducing tribalism. Or is there another way of making people "more autistic"?
Given this and other observations (e.g., autistic people seem to have poorer mental health, on average), it seems ...
Can one use the service reflect even if one is not located in the Bay Area? Or do you happen to know of similar services outside the Bay Area or the US? Thanks a lot in advance.
The open beta will end with a vote of users with over a thousand karma on whether we should switch the lesswrong.com URL to point to the new code and database
How will you alert these users? (I'm asking because I have over 1000 karma but I don't know where I should vote.)
One of the more crucial points, I think, is that positive utility is – for most humans – complex and its creation is conjunctive. Disutility, in contrast, is disjunctive. Consequently, the probability of creating the former is smaller than that of creating the latter – all else being equal (of course, all else is not equal).
In other words, the scenarios leading towards the creation of (large amounts of) positive human value are conjunctive: to create a highly positive future, we have to eliminate (or at least substantially reduce) physical pain and boredom and injustice ...
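A toy calculation illustrates the asymmetry (the independence assumption and the specific numbers are purely illustrative, not a claim about the actual probabilities):

```python
# Toy example: n independent conditions, each satisfied with probability p.
# A conjunctive outcome needs ALL of them; a disjunctive one needs ANY of them.
p, n = 0.5, 3

p_conjunctive = p ** n            # e.g., eliminating pain AND boredom AND injustice
p_disjunctive = 1 - (1 - p) ** n  # e.g., any single failure mode suffices

print(p_conjunctive)  # 0.125
print(p_disjunctive)  # 0.875
```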
The article that introduced the term "s-risk" was shared on LessWrong in October 2016. The content of the article and the talk seem similar.
Did you simply not come across it or did the article just (catastrophically) fail to explain the concept of s-risks and its relevance?
Here is another question that would be very interesting, IMO:
“For what value of X would you be indifferent about the choice between A) creating a utopia that lasts for one-hundred years and whose X inhabitants are all extremely happy, cultured, intelligent, fair, just, benevolent, etc. and lead rich, meaningful lives, and B) preventing one average human from being horribly tortured for one month?"
I think it's great that you're doing this survey!
I would like to suggest two possible questions about acausal thinking/superrationality:
1)
Newcomb’s problem: one box or two boxes?
- Accept: two boxes
- Lean toward: two boxes
- Accept: one box
- Lean toward: one box
- Other
(This is the formulation used in the famous PhilPapers survey.)
2)
In a one-shot Prisoner’s Dilemma against other community members, would you cooperate or defect?
- Definitely cooperate
- Leaning toward: cooperate
- Leaning toward: defect
- Definitely defect
- Other
I think that these questions a...
First of all, I don't think that morality is objective, as I'm a proponent of moral anti-realism. That means I don't believe that there is such a thing as "objective utility" that you could objectively measure.
But, to use your terms, I also believe that there currently exists more "disutility" than "utility" in the world. I'd formulate it this way: I think there exists more suffering (disutility, disvalue, etc.) than happiness (utility, value, etc.) in the world today. Note that this is just a consequence of my own pers...
Great list!
IMO, one should add Prescriptions, Paradoxes, and Perversities to the list. Maybe to the section "Medicine, Therapy, and Human Enhancement".
I don't understand why you exclude risks of astronomical suffering ("hell apocalypses").
Below you claim that those risks are "Pascalian" but this seems wrong.
Cool that you are doing this!
Is there also a Facebook event?
That's not true -- for example, in cases where the search costs for the full space are trivial, pure maximizing is very common.
Ok, sure. I probably should have written that pure maximizing or satisficing is hard to find in important, complex, and non-contrived instances. I had in mind domains such as career, ethics, romance, and so on. I think it's hard to find a pure maximizer or satisficer there.
...My objection is stronger. The behavior of optimizing for (gain - cost) does NOT lie on the continuum between satisficing and maximizing as defined in your po
But you don't seem to have made a compelling argument that such people are worse off than epistemic maximisers.
If we just consider personal happiness, then I agree with you – it's probably even the case that epistemic satisficers are happier than epistemic maximizers. But many of us don't live for the sake of happiness alone. Furthermore, it's probably the case that epistemic maximizers are good for society as a whole. If every human had been an epistemic satisficer, we never would have discovered the scientific method or eradicated smallpox, for examp...
Continuing my previous comment
That's not satisficing because I don't take the first alternative that is good enough. That's also not maximizing as I am not committed to searching for the global optimum.
I agree: It's neither pure satisficing nor pure maximizing. Generally speaking, in the real world it's probably very hard to find (non-contrived) instances of pure satisficing or pure maximizing. In reality, people fall on a continuum from pure satisficers to pure maximizers (I did acknowledge this in footnotes 1 and 2, but I probably should have ...
I see no mention of costs in these definitions.
Let's try a basic and, dare I say it, rational way of trying to achieve some outcome: you look for a better alternative until your estimate of costs for further search exceeds your estimate of the gains you would get from finding a superior option.
Agree. Thus in footnote 3 I wrote:
[3] Rational maximizers take the value of information and opportunity costs into account.
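As a toy illustration of that footnote, the stopping rule could look like this in Python (the uniform distribution, the cost term, and the expected-improvement formula are all just illustrative assumptions):

```python
import random

def search(draw, expected_improvement, cost_per_draw, max_draws=1000):
    """Sample options until the expected gain from one more draw no longer
    exceeds its cost, i.e., stop when the value of information drops below it."""
    best = draw()
    for _ in range(max_draws):
        if expected_improvement(best) <= cost_per_draw:
            break  # further search is no longer worth it
        best = max(best, draw())
    return best

# Illustrative setup: option values are uniform on [0, 1], so the expected
# improvement over the current best b from one more draw is (1 - b)**2 / 2.
best = search(
    draw=random.random,
    expected_improvement=lambda b: (1 - b) ** 2 / 2,
    cost_per_draw=0.01,
)
print(best)  # search stops once best >= 1 - sqrt(0.02), i.e., around 0.86
```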
You've got me there :)
But what does one maximize?
Expected utility :)
We can not maximize more than one thing (except in trivial cases).
I guess I have to disagree. Sure, in any given moment you can maximize only one thing, but this is simply not true for longer time horizons. Let's illustrate this with a typical day of Imaginary John: he wakes up and goes to work at an investment bank to earn money (money maximizing), which he later donates to GiveWell (ethical maximizing). Later at night he goes on OKCupid or to a party to find his true soulmate (romantic maximizing). He maximi...
Again, I'm just giving quick feedback. Hopefully you've already given more detail in the essay. Other than that, your summary seems fine to me.
Thanks! And yeah, ending aging and death are some of the examples I gave in the complete essay.
I wrote an essay about the advantages (and disadvantages) of maximizing over satisficing, but I’m a bit unsure about its quality, which is why I would like to ask for feedback here before I post it on LessWrong.
Here’s a short summary:
According to research, there are so-called “maximizers” who tend to search extensively for the optimal solution. Other people, “satisficers”, settle for good enough and tend to accept the status quo. One can apply this distinction to many areas:
Epistemology/Belief systems: Some people, one could describe them as epistemic max...
Great post. Some cases of "attempted telekinesis" seem to be similar to "shoulding at the universe".
To stay with your example: I can easily imagine that if I were in your place and experienced this stressful situation with CFAR, my system 1 would have become emotionally upset and "shoulded" at the universe: "I shouldn't have to do this alone. Someone should help me. It is so unfair that I have so much responsibility."
This is similar to attempted telekinesis in the sense that my system 1 somehow thinks that just by...
Two words: Interindividual differences.
They also recommend 8-9 hours of sleep. Some people need more, some people need less. The same point applies to many other phenomena.
I think Bostrom puts it nicely in his new book "Superintelligence":
A colleague of mine likes to point out that a Fields Medal (the highest honor in mathematics) indicates two things about the recipient: that he was capable of accomplishing something important, and that he didn't.
I'm reminded of my petroleum engineering professor who assured me that a friend would eventually stop wasting his time on physics and come around to what was really important, namely petroleum engineering.
WTF. That's a fucking ignorant remark.
You know, I'm having a bit of a bad day, so there's more venom in me than there normally is. And I might sometimes hesitate to attack a person for being stupid, since I might have committed an isomorphic stupidity myself.
But today, I am not going to care, I am just going to vent. Right now, I feel contempt for the arrogant ignorance of whoever said that. Lacking context, it's hard to know exactly where they are coming from. Is it some transhumanist, whose definition of "something important" reduces to resea...
I translated the essay Superintelligence and the paper In Defense of Posthuman Dignity by Nick Bostrom into German in order to publish them on the blog of GBS Schweiz.
He thanked me by sending me a signed copy of his new book "Superintelligence", which made me pretty happy.
I changed the privacy settings. Link should work now.
Don't know how useful that is, but I created a FB event: https://www.facebook.com/events/360486800773506/?ref_dashboard_filter=upcoming&source=1
Cool, yeah, I'm going to the Berlin Meetup. See you there!
You got me kinda scared. I just use Evernote or WordPress for all my important writing. That should be enough, right?
Great post of course.
If it took a mutant to do monstrous things, the history of the human species would look very different. Mutants would be rare.
Maybe I'm missing something, but shouldn't it read "Mutants would not be rare"? Many monstrous things have happened in human history, so if only mutants could do evil deeds, there would have to be a lot of them. Furthermore, mutants actually are rare, so there is no need for the subjunctive "would".
But... I read quickly through it, and I saw no meta-analysis. Just a literature review. What's with the post title?
You're right. I don't remember why I wrote "meta-analysis" (probably because it sounds fancy and smart). I updated the title.
Is this referring to effect sizes or p-values?
p-values.
Eh. Absence of improvement != damage.
True.
...Randal 2004 didn't find a statistically-significant decrease...
No. In Randall et al. (2004), participants in the 200 mg modafinil condition made significantly more errors (p < 0.05) in the Intra...
Well, I take modafinil primarily as a motivation-enhancer.
Hm, I don't think so. What about Lincoln, JFK, Roosevelt, Marcus Aurelius, Adenauer, etc.?