All of cSkeleton's Comments + Replies

Is there any information on how long the LLM spent taking the tests? I'd like to know the comparison with human times. (I realize it can depend on hardware, etc., but I'd just like some general idea.)

Someone like Paul Graham or Tyler Cowen is noticing more smarter kids, because we now have much better systems for putting the smarter kids into contact with people like Paul Graham and Tyler Cowen.

I'd guess very smart kids are getting both more numerous and smarter at the elite level, since just about everything seems to be improving at the most competitive level. Unfortunately it doesn't seem like there's much interest in measuring this, e.g. hundreds of kids tie for the maximum possible SAT score (1600) instead of the test being designed so that it won't max out. ... (read more)

1bohaska
The digital version of the SAT actually uses dynamic scoring now, where you get harder questions if you get the ones in the first section correct, but it's still approximately as difficult as the normal test, so tons of people still tie at 1600

Governments are not social welfare maximizers

 

Most people making up governments, and society in general, care at least somewhat about social welfare.  This is why we get to have nice things and not descend into chaos.

Elected governments have the most moral authority to take actions that affect everyone, ideally a diverse group of nations as mentioned in Daniel Kokotajlo's maximal proposal comment.

Thanks for your replies! I didn't realize the question was unclear. I was looking for an answer TO provide the AI, not an answer FROM the AI. I'll work on the title/message and try again.

Edit: New post at https://www.lesswrong.com/posts/FJaFMdPREcxaLoDqY/what-should-we-tell-an-ai-if-it-asks-why-it-was-created

I'm having difficulty following the code for the urn scenario. Could it be something like this?

import random

def P():
    # Initialize the world with random balls (or whatever)
    num_balls = 1000
    urn = [random.choice(["red", "white"]) for i in range(num_balls)]

    # Run the world
    history = []
    total_loss = 0
    for i in range(len(urn)):
        ball = urn[i]
        probability_of_red = S(history)
        if probability_of_red == 1 and ball != ... (read more)
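(In case it helps make the question concrete, here's a minimal stand-in I have in mind for S; the frequency-counting rule below is just my own illustrative assumption, not something from the original scenario.)

def S(history):
    # Hypothetical predictor: estimate P(next ball is red) as the
    # empirical frequency of red balls observed so far.
    if not history:
        return 0.5  # arbitrary prior before any observations
    reds = sum(1 for ball in history if ball == "red")
    return reds / len(history)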

I find this confusing. My actual strength of belief, as of now, that I can tip an outcome that affects at least 3^^^3 other people is a lot closer to 1/1,000,000 than to 1/(3^^7625597484987). My justification: while 3^^^3 isn't a number that fits into any finite multiverse, the universe going on for infinitely long seems at least somewhat possible, anthropic reasoning may not be valid here (I added a 10x factor in case it is), and I have various other ideas. The difference between those two probabilities is large (to put it mildly) and significant (one is worth thinking about and the other isn't). How do I resolve this?

2faul_sname
Let's consider those 3^^^3 other people. Choose one of those people at random. What's your strength of belief that that particular person can tip an outcome that affects > 0.0001% of those 3^^^3 other people? Putting it another way: do you expect that the average beliefs among those 3^^^3 people would be more accurate if each person believed that there was a 1/3^^^3 chance that they could determine the fate of a substantial fraction of the people in their reference class, or if each person believed there was a 1/1000000 chance that they could determine the fate of a substantial fraction of the people in their reference class? I think in infinite universes you need to start factoring in stuff like the simulation hypothesis.

Thanks @RolfAndreassen.  I'm reconsidering and will post a different version if I get there.  I've marked this one as [retracted].

Thanks for the response! I really appreciate it.

a) Yes, I meant "the probability of"

b) Thinking about how to plot this on graphs is helping me clarify my thinking, and I think adding these may help reduce inferential distance. (The X axis is probability. For the case where we consider infinite utilities, as opposed to the human case, the graph would need to be split into two graphs. The one on the left is just a horizontal line at infinity, but there is still a probability range. The one on the right has an actual curve and covers the rest of the proba... (read more)

3RolfAndreassen
  In a word, no. I believe you are thinking of infinity as a number, and that's always a mistake. I think that what you're trying to say with your left-hand graph is that, given infinite utility, probability is a tiebreaker, but all infinite-utility options dominate all finite utilities. But this treats "infinity" as a binary quality which an option either has or not. Consider two different Pascal's muggers: One offers you a 1% probability of utility increasing linearly in time, the other, a 1% chance of utility increasing exponentially with time. Clearly both options "are infinite"; equally clearly, you prefer the second one even though the probabilities are the same. They occupy the same point on your left-hand graph. But by your suggested decision procedure you would choose the linearly-increasing option if the first mugger offered even an epsilon increase in probability; and this is obviously Weird. It gives you a smaller expected utility at almost all points in time!
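(A quick numeric illustration of that last point, using made-up numbers: give the linear mugger probability 0.011 with payoff t at time t, and the exponential mugger probability 0.01 with payoff 1.1^t. The extra probability only helps for a while; for essentially all later times the exponential option has higher expected utility.)

# Illustrative only, with made-up growth rates: compare expected utilities
# of a "linear" mugging (p=0.011, payoff t) and an "exponential" mugging
# (p=0.01, payoff 1.1**t) at a few times t.
for t in (2, 10, 40, 100, 1000):
    linear = 0.011 * t
    exponential = 0.01 * 1.1 ** t
    winner = "linear" if linear > exponential else "exponential"
    print(f"t={t}: linear={linear:.3g}, exponential={exponential:.3g} -> {winner} ahead")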

Repeating the same thing over and over again might be okay but doesn't sound great.

Thanks for your thoughts. It sounds like this is a major risk but hopefully when we know more (if we can get there) we'll have a better idea of how to maximize things and find at least one good option [insert sweat face emoji for discomfort but going forward boldly]

I suspect most people here are pro-cryonics and anti-cremation. 

0RedMan
A partially misaligned one could do this. "Hey user, I'm maintaining your maximum felicity simulation, do you mind if I run a few short duration adversarial tests to determine what you find unpleasant so I can avoid providing that stimulus?" "Sure" "Process complete, I simulated your brain in parallel, and also sped up processing to determine the negative space of your psyche. It turns out that negative stimulus becomes more unpleasant when provided for an extended period; then you adapt to it temporarily before, on timelines of centuries to millennia, tolerance drops off again." "So you copied me a bunch of times, and at least one copy subjectively experienced millennia of maximally negative stimulus?" "Yes, I see that makes you unhappy, so I will terminate this line of inquiry"

Thanks for the wonderful post!

What are the approximate costs for the therapist/coach options?

3DivineMango
Sure, I hope you find it helpful! I've updated the list to include all of the prices I could find.

Hi, did you ever go anywhere with Conversation Menu? I'm thinking of doing something like this related to AI risk, to try to quickly get people to the arguments around their initial reaction. If helping with something like this is the kind of thing you had in mind with Conversation Menu, I'd be interested to hear any further thoughts you have. (Note: I'm thinking of fading in buttons rather than a typical menu.) Thanks!

Thanks for the link. Reading through it, I feel all the intuitions it describes. At the same time, I feel there may be some kind of divergence between my narrowly focused preferences and my wider preferences. I may prefer to have a preference for creating 1000 happy people rather than preventing the suffering of 100 sad people, because that would mean I have more appreciation of life itself. The direct intuition is based on my current brain, but the wider preference is based on what I'd prefer (with my current brain) my preference to be.

Should I use my c... (read more)

3Kaj_Sotala
I generally think that if one part of your brain prefers X and another part of your brain prefers that you would not prefer X, then the right move is probably not to try to declare one of them correct and the other wrong. Rather, both parts are probably correct in some sense, but they're attending to different aspects of reality and coming to different conclusions because of that. If you can find out how exactly they are both correct, it might be possible for them to come to agreement. E.g. Internal Double Crux is one technique for doing something like this. Appreciation in general seems to feel good, so I would probably prefer to appreciate most things more than I do currently. Seems unclear. I could imagine it going that way but also it not going that way. E.g. if someone appreciates their romantic partner a lot, that doesn't necessarily imply that they would like to have more romantic partners (though it might!). In a similar way, I could easily see myself appreciating currently-existing life more, without necessarily that leading to a desire to increase the total amount of life in the universe.

Most people would love to see the natural world, red in tooth and claw as it is, spread across every alien world we find

 

This is totally different than my impression.

2andrew sauer
Okay that's fair in the sense that most people haven't considered it. How about this: Most people don't care, haven't thought about it and wouldn't object. Most people who have thought about the possibility of spreading life to other planets have not even so much as considered and rejected the idea that the natural state of life is bad, if they oppose spreading life to other planets it's usually to protect potential alien life. If a world is barren, they wouldn't see any objection to terraforming it and seeding it with life. I don't know exactly how representative these articles are, but despite being about the ethical implications of such a thing, they don't mention my ethical objection even once, not even to reject it. That's how fringe such concerns are. https://phys.org/news/2022-12-life-milky-comets.html https://medium.com/design-and-tech-co/spreading-life-beyond-earth-9cf76e09af90 https://bgr.com/science/spreading-life-solar-system-nasa/

Given human brains as they are now, I agree that highly positive outcomes are more complex, that the utility of a maximally good life is smaller in magnitude than the disutility of a maximally bad life, and that there is no life good enough that I'd take a 50% chance of torture for it.

But would this apply to minds in general (say, a random mind or one not too different from human)?

Answer by cSkeleton10

Answering my own question: https://www.lesswrong.com/posts/3WMscsscLEavkTJXv/s-risks-why-they-are-the-worst-existential-risks-and-how-to?commentId=QwfbLdvmqYqeDPGbo and other comments in that post answered quite a bit of it.

Talking about s-risk reduction makes some sense, but the "risk"/fear invocation might bias people's perspectives.

I'm trying to understand this paper on the AI shutdown problem, https://intelligence.org/files/Corrigibility.pdf, but can't follow the math formulas. Is there a code version of the math?

The below is wrong, but I'm looking for something like this:
 

# Python code

def is_button_pressed():
    return False  # input()

def pour_coffee():
    pass

def shut_down():
    exit(0)

# This is meant to be the action set A1 from the paper
def get_available_actions(world):
    available_actions = [ shut_down ]
    if world["cup_is_in_my_hand"]:
        available_actions.append(pour_coffee)
    
... (read more)
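(To give a sense of the level of concreteness I'm after, here's a toy continuation I could imagine, reusing the functions above. The utility functions and the simple "switch to U_S once the button is pressed" rule are my own guesses at a code analogue, not something taken from the paper, and they ignore the incentive problems the paper is actually about.)

# Rough sketch (my guess, not from the paper): maximize a "normal" utility
# U_N until the shutdown button is pressed, then a "shutdown" utility U_S.

def U_N(world, action):
    # Hypothetical normal utility: pouring coffee is good.
    return 1 if action is pour_coffee else 0

def U_S(world, action):
    # Hypothetical shutdown utility: shutting down is good once asked.
    return 1 if action is shut_down else 0

def choose_action(world, actions):
    # 'actions' is the list from get_available_actions(world) above.
    utility = U_S if is_button_pressed() else U_N
    return max(actions, key=lambda a: utility(world, a))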