Joey KL

You mean this substance? https://en.wikipedia.org/wiki/Mesembrine

Do you have a recommended brand, or places to read more about it?

Joey KL

I would love to hear the principal’s take on your conversation.

Joey KL

Interesting, I can see why that would be a feature. I don't mind the taste at all, actually. Before, I had some of their smaller citrus-flavored kind, and they dissolved super quickly and made me a little nauseous. I can see these ones being better in that respect.

Joey KL

I ordered some of the Life Extension lozenges you said you were using; they are very large and take a long time to dissolve. It's not super unpleasant or anything, I'm just wondering if you would count this against them?

Joey KL

Thank you for your extended engagement on this! I understand your point of view much better now.

Joey KL

Oh, I think I get what you’re asking now. Within-lifetime learning is a process that includes something like a training process for the brain, where we learn to do things that feel good (a kind of training reward). That’s what you’re asking about if I understand correctly?

I would say no, we aren’t schemers relative to this process, because we don’t gain power by succeeding at it. I agree this is a subtle and confusing question, and I don’t know whether Joe Carlsmith would agree, but the subtlety seems to me to belong more to the nuances of the situation and the analogy than to any imprecision in the definition.

(Ordinary mental development includes something like a training process, but it also includes other stuff more analogous to building out a blueprint, so I wouldn’t overall consider it a kind of training process.)

Joey KL

If you're talking about this report, it looks to me like it does contain a clear definition of "schemer" in section 1.1.3, pg. 25: 

It’s easy to see why terminally valuing reward-on-the-episode would lead to training-gaming (since training-gaming just is: optimizing for reward-on-the-episode). But what about instrumental training-gaming? Why would reward-on-the-episode be a good instrumental goal?

In principle, this could happen in various ways. Maybe, for example, the AI wants the humans who designed it to get raises, and it knows that getting high reward on the episode will cause this, so it training-games for this reason.

The most common story, though, is that getting reward-on-the-episode is a good instrumental strategy for getting power—either for the AI itself, or for some other AIs (and power is useful for a very wide variety of goals). I’ll call AIs that are training-gaming for this reason “power-motivated instrumental training-gamers,” or “schemers” for short.

By this definition, a human would count as a schemer if they gamed something analogous to a training process in order to gain power. For example, if a company tries to instill loyalty in its employees, an employee who insincerely professes loyalty as a means to a promotion would be a schemer (as I understand it).

Joey KL

I think this post would be a lot stronger with concrete examples of these terms being applied in problematic ways. A term being vague is only a problem if it creates some kind of miscommunication, confused conceptualization, or opportunity for strategic ambiguity. I'm willing to believe these terms could pose these problems in certain contexts, but this is hard to evaluate in the abstract without concrete cases where they posed a problem.

Joey KL

I'm not sure I can come up with a distinguishing principle here, but I feel like some, but not all, unpleasant emotions feel similar to physical pain, such that I would call them a kind of pain ("emotional pain"), and cringing at a bad joke can be painful in this way.
