LESSWRONG
lePAN6517

7

2

6y

lePAN6517

lePAN6517 has not written any posts yet.

lePAN6517 · 2y

Can you speak to any, let's say, "hypothetical" specific concerns that somebody who was in your position at a company like OpenAI might have had that would cause them to quit in a similar way to you?

Replying to: More information about the dangerous capability evaluations we did with GPT-4 and Claude.

lePAN6517 · 3y

Thank you for your valuable work on this. Can you please expand on why you did not test the final version of GPT-4? Section 2.9 of the GPT-4 System Card paper says:

"We granted the Alignment Research Center (ARC) early access to the models as a part of our expert red teaming efforts in order to enable their team to assess risks from power-seeking behavior. The specific form of power-seeking that ARC assessed was the ability for the model to autonomously replicate and acquire resources. We provided them with early access to multiple versions of the GPT-4 model, but they did not have the ability to fine-tune it. They also did not

...
