Weird things CAN happen if others can cause you to kill people with your bare hands (see the Lexi-Pessimist Pump here). But assuming you can choose never to be in a world where you kill someone with your bare hands, I don't think there are problems either? Those world states may as well just not exist.
(Also, not a money pump, but consider: say I have 10^100 perfectly realistic mannequin robots and one real human captive. I give the constrained utilitarian the choice between choking one of the bodies with their bare hands or letting me wipe out humanity. Does the agent really choose not to risk killing someone themselves?)
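To make the asymmetry concrete, here is a minimal expected-disutility sketch; the magnitudes are hypothetical, chosen only so the ratio is visible, not anyone's actual utility function:

```python
# Hypothetical expected-disutility comparison for the mannequin dilemma.
# All magnitudes are made up for illustration; only the ratio matters.

N_BODIES = 10**100            # realistic mannequins plus one real captive
P_HUMAN = 1 / N_BODIES        # chance the body you choke is the real human

KILL_DISUTILITY = 1.0         # killing one person with your bare hands
EXTINCTION_DISUTILITY = 8e9   # humanity wiped out (~one unit per person)

choke = P_HUMAN * KILL_DISUTILITY   # ~1e-100 expected disutility
refuse = EXTINCTION_DISUTILITY      # ~8e9, guaranteed

# On plain expected value, choking wins by ~110 orders of magnitude;
# a lexical "never kill with your bare hands" constraint ignores this.
print(choke < refuse)  # True
```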
I didn't want this change; it just happened.
I might be misunderstanding, but isn't this what the question was? Whether we should want (or be willing) to change our values?
Sometimes I felt like a fool afterward, having believed in stupid things.
The problem with this is: if I change your value system in any direction, the hypnotized "you" will always believe that the intervention was positive. If I hypnotized you to believe that being carnivorous was more moral by changing your underlying value system to value animal suffering, then that version of you would view the change as an improvement.
My language was admittedly overly dramatic, but I don't think it makes rational sense to want to change your values just for the sake of having the new value. If I wanted to value something, then by definition I would already value that thing. That said, I might not take actions based on that value if:
I think that actions like becoming vegan are more like overcoming the above points than fundamentally changing your values.
If I were convinced to value different things, I would no longer be myself. Changing values is suicide.
You might somehow convince me through hypnosis that eating babies is actually kind of fun, and after that, that-which-inhabits-my-body would enjoy eating babies. However, that being would no longer be me. I'm not sure what a necessary and sufficient condition is for recognizing another version of myself, but sharing values is at least part of the necessary condition.
I'd think the goal for 1, 2, and 3 is to find and fix the failure modes? And for 4, to find a definition of "optimizer" that fits evolution and humans but not paperclips? I'm less sure about 5 and 6, but there is something similar to the others about "finding the flaw in the reasoning."
Here's my take on the prompts:
This is beside the point of your own comment, but "how big are bullshit jobs as a % of GDP" is exactly 0 by definition!
Most metrics of productivity and success in my life sit at a stable equilibrium. For example:
When I am looking for rationalist content and can't find it, Metaphor (free) usually finds what I want, sometimes even without a rationalist-specific prompt. Could be the data it was trained on? In any case, it does what I want.
Don't there already exist browser extensions that let you whitelist certain websites (parental locks and the like)? I'd think you could just copy-paste a list of rationalist blogs into something like that? This seems like what you are proposing to create, unless I misunderstand.
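For what it's worth, a minimal sketch of the no-extension version, using Google's standard `site:` operator; the whitelist here is a placeholder, not a recommendation:

```python
from urllib.parse import quote_plus

# Placeholder whitelist; substitute whatever blogs you actually want.
WHITELIST = ["lesswrong.com", "astralcodexten.com", "overcomingbias.com"]

def whitelisted_query(query: str) -> str:
    """Build a Google search URL restricted to the whitelisted domains
    by joining `site:` clauses with OR."""
    sites = " OR ".join(f"site:{domain}" for domain in WHITELIST)
    return "https://www.google.com/search?q=" + quote_plus(f"{query} ({sites})")

print(whitelisted_query("goodhart's law"))
```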