habryka

Running Lightcone Infrastructure, which runs LessWrong. You can reach me at habryka@lesswrong.com.

Sequences

A Moderate Update to your Artificial Priors
A Moderate Update to your Organic Priors
Concepts in formal epistemology

Comments

habryka

My current best guess is that actually cashing out the vested equity is tied to an NDA, but I am really not confident. OpenAI has a bunch of really weird equity arrangements.

habryka

Hmm, I have sympathy for this tag, but also I do feel like the tagging system probably shouldn't implicitly carry judgement. Seems valuable to keep your map separate from your incentives and all that.

Happy to discuss here what to do. I do think allowing people to somehow tag content that seems to increase capabilities in some dangerous way would be good, but it should come with less judgement in the site's voice (judgement in a user's voice is totally fine, but the tagging system speaks more with the voice of the site than any individual user does).

habryka

Oh, yeah, admins currently have access to a purely recommended view, and I prefer it. I would be in favor of making that accessible to users (maybe behind a beta flag, or maybe not, depending on uptake).

I think the priors here are very low, so while I agree it looks suspicious, I don't think it's remotely suspicious enough for the correct posterior to be "about zero chance that wasn't murder". Corporations, at least in the U.S., really very rarely murder people.

habryka

My understanding is that the scope of NDAs can differ a lot between implementations, so it might be hard to speak in generalities here. Judging from the revealed behavior of the people I've poked who have worked at OpenAI full-time, the OpenAI NDAs seem very comprehensive and limiting. My guess is that the NDAs for contractors and for events are a very different beast and much less limiting.

Also, the de-facto result of signing a non-disclosure agreement is that people don't feel comfortable navigating the legal ambiguity and default very strongly to sharing approximately no information about the organization at all.

Maybe people would handle this better with more legal guidance, and I agree that you don't generally seem super constrained in what you feel comfortable saying, but I sure have run into lots of people who seem constrained by NDAs they signed (even without any non-disparagement component). Also, if the NDA has a gag clause that covers the existence of the agreement itself, there is no way to verify the extent of the NDA, which makes navigating this kind of stuff super hard and also majorly contributes to people avoiding the topic completely.

habryka

I think having signed an NDA (and especially a non-disparagement agreement) with a major capabilities company should probably rule you out of any kind of leadership position in AI Safety, and especially any kind of policy position. Given that I think Daniel has a pretty decent chance of doing either or both of these things, and that that work is very valuable and bottlenecked on the kind of person Daniel is, I would be very surprised if this wasn't worth it on altruistic grounds.

Edit: As Buck points out, different non-disclosure agreements can differ hugely in scope. To be clear, I think non-disclosure agreements that cover specific data or information you were given seem fine, but non-disclosure agreements that cover their own existence, or that are worded so broadly that they prevent you from talking about basically anything related to an organization, are pretty bad. My sense is that the stuff OpenAI employees are asked to sign when they leave is very constraining, but my guess is that the kind of stuff people have to sign for a small amount of contract work or for events is not very constraining, though I would definitely read any contract in this space carefully.

Oh, hmm, I sure wasn't tracking a 1000-character limit. If you can submit it, I wouldn't be worried about it (and feel free to put that into your references section). I certainly have never paid attention to whether anyone stayed within the character limit.

I haven't engaged with this in enough detail, but some people who engaged with Scott's sequence and who I can imagine being interested in this: @Scott Garrabrant, @James Payor, @Quinn, @Nathan Helm-Burger, @paulfchristiano

Promoted to curated: I sure tend to have a lot of conversations about honesty and integrity, and this specific post was useful in 2-3 conversations I've had since it came out. I like having a concept handle for "trying to actively act with an intent to inform", I like the list of concrete examples of the above, and I like how the post situates this as something with benefits and drawbacks (while also not shying away too much from making concrete recommendations on what would be better on the margin).

Despite my general interest in open inquiry, I will avoid talking about my detailed hypothesis of how to construct such a virus. I am not confident this is worth the tradeoff, but the costs of speculating about the details here in public do seem non-trivial.
