TFD

Posts

A different take on the Musk v OpenAI preliminary injunction order

13d

0

Comments

AI #106: Not so Fast

TFD18d10

A judge has ruled that on the merits Musk is probably correct that the conversion is not okay

The question that stopped Musk from getting one was whether Musk has standing to sue based on his donations. The judge thinks that is a toss-up. But the judge went out of their way to point out that if Musk does have standing, he’s a very strong favorite to win, implicitly 75%+ and maybe 90%.

Can you explain what parts of the order lead to these conclusions? For several of the counts the Court does find standing issues, but the count relevant here (breach of charitable trust) is addressed in section III.c. of the order. I'm not a lawyer and definitely might have missed or misunderstood something, but reading that section, it isn't clear to me that the main issue is standing. My read is that the Court thinks there is a factual question about whether a trust/contract exists at all, and that this question goes to the merits of the count, not to standing.

I think footnote 11 is consistent with this reading. This footnote occurs in section III.c. in the following context (document link):

Because the threshold question of whether a charitable trust was created remains a toss-up,
Musk has not demonstrated likelihood of success on the merits sufficient to obtain an injunction. The request for an injunction barring any steps towards OpenAI’s conversion to a for-profit entity
is DENIED.11

And the text of the footnote:

Defendants also challenge plaintiffs’ standing. As Musk has not been directly affiliated
with OpenAI for several years, any standing must come from an interest in OpenAI’s assets.
California Corporations Code Section 5142(a). The Court is aware of the distinction between
Restatement (Second) of Trusts § 391 and Restatement (Third) of Trusts § 94 and cmt. g, plus the
California state authorities following the Restatement (Third). Thus, for purposes of this motion,
the Court finds plaintiffs’ standing sufficient as a settlor given the modern trend in that direction.
The motion to dismiss on this issue is DENIED. Further briefing on this topic is not necessary.

Again, I'm not a lawyer, but I don't see how this footnote can be consistent with standing being the main issue.

A lot of the comments about whether the conversion would be in the public interest occur in the context of evaluating the preliminary injunction factors, of which the public interest is one. The order first addresses likelihood of success on the merits, finding this to be a toss-up. The Court then says about the other factors (emphasis in original):

Because of this, the remaining Winter factors are derivative of, and dependent on, the first.
If a trust was indeed created, then preventing or remedying breach would be in the public interest.

This finding about the public interest is expressly conditional ("derivative of, and dependent on") on the assumption that a trust was created, but whether one was created is "a toss-up". Likewise, the public interest the Court identifies is specifically in "preventing or remedying breach".

I think it is helpful to distinguish two arguments against the conversion:

Musk/plaintiffs advance a theory that such a conversion would breach a trust/agreement that OpenAI/defendants made when Musk donated to OpenAI.
Separately, people have raised issues with whether the conversion would be consistent with the OpenAI non-profit's charitable purpose (which seems based on various documents to heavily overlap with what is in the "public interest").

Only the first of those issues is before the Court that is writing this order. I interpret some of the commentary around this order as suggesting that the Court is commenting (perhaps implicitly) on the second issue. It's not clear to me that this is the case. I think it's entirely possible that the Court is only commenting on the first issue (because it is the one relevant to the order) and isn't expressing any opinion on the second. It seems to me the Court is saying something like "assuming OpenAI/defendants did make a commitment not to convert OpenAI into a for-profit, it can't possibly be against the public interest or contrary to the balance of the equities to hold them to that commitment for the relatively short period from now until trial". But in the Court's view the other preliminary injunction factors essentially collapse into the likelihood-of-success-on-the-merits factor, because they are conditional on that determination. Since the Court doesn't think Musk/plaintiffs' evidence is quite strong enough to say success on the merits is "likely", the Court denies the preliminary injunction. I think this is much narrower than some of your comments, and some that you quote, imply. I can see an argument that, if you "read between the lines" in some parts of the order, the judge might be casting some shade at OpenAI, but I think those parts are pretty far from definitive.

and Judge Rogers pointed this out several times to make sure that message got through

Can you elaborate on what parts of the order you had in mind for this?

*edited to fix link

Sabotage Evaluations for Frontier Models

TFD5mo32

One question about the threat model presented here. If we consider a given sabotage evaluation, does the threat model include the possibility of that sabotage evaluation itself being subject to sabotage (or sandbagging, "deceptive alignment", etc.)? "Underperforming on dangerous-capability evaluations" would arguably include this, but the paper introduces the term "sabotage evaluations". So depending on whether the authors consider sabotage evaluations a subset of dangerous-capability evaluations or a distinct set, I could see this going either way based on the text of the paper.

To put my cards on the table here, I'm very skeptical of the ability of any evaluation that works strictly on inputs and outputs (not looking "inside" the model) to address the threat model where those same evals are subject to sabotage, at least without some relevant assumptions. I believe the conclusion that current models aren't capable of sophisticated sabotage is correct, but to arrive there I am implicitly relying on the assumption that current models aren't powerful enough for that level of behavior. Can that idea, "not powerful enough", be demonstrated based on evals in a non-circular way (without relying on evals that could be sabotaged within the threat model)? It's not clear to me whether that is possible.

*edited to fix spelling errors
