Mitchell_Porter

I had not previously noticed the debate over coherence theorems, and I am short of the time I'd need to fully tackle this essay on its own terms. But I did notice this:

> Attempting to rank 𝛀 without knowing its potential outputs means you’re not ranking it by the properties of what it does...

Is undecidability an issue here? Could we say that the preferences of a conceptually sophisticated superintelligence can never be fully ranked in practice, for Gödelian reasons? 
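To make the worry concrete, here is a rough Python sketch of the computability point I have in mind (all the names here, `ranks_as_good`, `halts`, `omega`, are mine and purely illustrative, not taken from the essay). Any total procedure that ranked an agent by a non-trivial property of its outputs could be turned into a halting-problem decider, which cannot exist:

```python
# Sketch of the Rice-style obstruction (illustrative names, not from the essay).
# Suppose `ranks_as_good` were a total decider that classifies an agent
# program by a non-trivial property of its outputs. Then we could decide
# halting, which is impossible -- so no such total decider can exist.

def ranks_as_good(agent) -> bool:
    """Hypothetical total decider: True iff agent() ever yields "GOOD_OUTPUT".
    No correct total implementation can exist; stubbed for illustration."""
    raise NotImplementedError

def halts(program, inp) -> bool:
    """The halting decider we would get for free -- hence the contradiction."""
    def omega():
        program(inp)          # loops forever iff program(inp) never halts
        return "GOOD_OUTPUT"  # reached only if program(inp) halts
    # omega outputs "GOOD_OUTPUT" exactly when program(inp) halts, so any
    # ranking of omega by the properties of what it does answers halting.
    return ranks_as_good(omega)
```

Strictly speaking this is a Turing/Rice-style obstruction rather than a Gödelian one, but it belongs to the same family of diagonal arguments.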

I wonder about the impact of this on a formal approach to alignment like FMI, and also whether this has some overlap with Roman Yampolskiy's work (which I've never read). 

For me, the standout items in this update are 

(1) the ongoing crystallization of attitudes and policy towards AI, at least on the Republican side of US politics (not sure what the Democrats are doing), even in an unusually hectic and turbulent period 

(2) the rise of individual researchers in importance to the point that they can command $100 million salaries (one might want to keep track, not just of AI companies, but of AI researchers who are on this level)

(3) the recognition that being able to monitor Chain of Thought is critical for AI safety

Keeping track of AI in China is still important. Googling "china ai substack" turns up the good resources I know about... 

Let me first try to convey how this conversation appears from my perspective. I don't think I've ever debated directly with you about anything, but I have an impression of you as doing solid work in the areas of your interest. 

Then, I run across you alleging that BB is using AI to write some of his articles. This catches my attention because I do keep an eye on BB's work. Furthermore, your reason for supposing that he is using AI seems bizarre to me - you think his (very occasional) "sneering" is too "dumb and cliche" to be the work of human hands. Let's look at an example: 

> bees seem to matter a surprising amount. They are far more cognitively sophisticated than most other insects, having about a million neurons—far more than our current president.

If that strikes you as something that a human being would never spontaneously write, I don't know what to say. Surely human comedians say similar things hundreds of times every month? It's also exactly the kind of thing that a smart-aleck "science communicator" with a progressive audience might say, don't you think? BB may be neither of those things, but he's a popular Gen-Z philosophy blogger who was a high-school debater, so he's not a thousand light-years from either of those universes of discourse. 

I had a look at your other alleged example of AI writing, and I also found other callouts from your recent reddit posts. Generally I think your judgments were reasonable, but not in this case. 

> off on a tangent

You say in another comment that you're going around claiming to detect LLM use in many places. I found the reasons you gave in the case of BB bizarrely mundane. You linked to another analysis of yours as an example of hidden LLM use, so I went to check it out. You have more evidence in the case of Alex Kesin, *maybe* even a preponderance of evidence. But there really are two hypotheses to consider, even in that case. One is that Kesin is a writer who naturally writes that way, and whose use of ChatGPT is limited to copying links without trimming them. The other is that Kesin's workflow does include the use of ChatGPT in composition or editing, and that this gave rise to certain telltale stylistic features. 

Are you claiming to have seen jibes and sneering as dumb and cliche as those in his writings from before ChatGPT (Nov 2022)?

The essay in question ("Don't Eat Honey") contains, by my count, two such sneers: one asserting that Donald Trump is stupid, the other that Curtis Yarvin is boring. Do you not think that we could, for example, go back to the corpus of Bush-era American-college-student writings and find similar attacks on Bush administration figures, inserted into essays that are not about politics? 

I am a bit worried about how fatally seductive I could find a debate about this topic to be. Clearly LLM use is widespread, and its signs can be subtle. Developing a precise taxonomy of the ways in which LLMs can be part of the writing process; developing a knowledge of "blatant signs" of LLM use and a sense for the subtle signs too; debating whether something is a false positive; learning how to analyze the innumerable aspects of the genuinely human corpus that have a bearing on these probabilistic judgments... It would be empowering to achieve sophistication on this topic, but I don't know if I can spare the time to achieve that. 

I just looked at your other alleged example of AI-generated polemic (from alexkesin.com), and I think evidence is lacking there too. That a link contains a UTM parameter referring to ChatGPT tells us only that the link was provided in ChatGPT output; it doesn't tell us that the text around the link was written by ChatGPT as well. As for the article itself, I find nothing in its verbal style that is outside the range of human authorship. I wouldn't even call it a bad essay, just perhaps dense, colorful, and polemical. People do choose to write this way, because they want to be stylish and vivid. 
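For concreteness, the forensic check here is nothing deeper than reading the link's query string. A toy sketch (the exact tag value, `utm_source=chatgpt.com`, is my assumption about what such a parameter looks like):

```python
# Toy check: does a URL carry a UTM tag pointing back to ChatGPT?
# A positive result shows the link passed through ChatGPT output;
# it says nothing about who wrote the prose around the link.
from urllib.parse import urlparse, parse_qs

def has_chatgpt_utm(url: str) -> bool:
    """True iff the URL's query string has a ChatGPT-referencing utm_source."""
    params = parse_qs(urlparse(url).query)
    return any("chatgpt" in v.lower() for v in params.get("utm_source", []))

print(has_chatgpt_utm("https://example.com/post?utm_source=chatgpt.com"))  # True
print(has_chatgpt_utm("https://example.com/post"))                         # False
```

And again: a hit establishes only that ChatGPT was somewhere in the workflow, not that it composed the surrounding text.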

I hear that the use of em-dashes is far more dispositive; as a sign of AI authorship, it's up there alongside frequent "delving". But even the em-dash has its human fans (e.g. the Chicago Manual of Style). It can be a sign of a cultivated writer, not an AI... Forensic cyber-philology is still an art with a lot of judgment calls. 
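To show how shallow these heuristics are once made explicit, here is a toy tally (the word list is my own illustrative assumption; this is nothing like a validated detector):

```python
import re

def style_signals(text: str) -> dict:
    """Per-1,000-word rates of two alleged LLM tells: em-dashes and 'delve'."""
    words = re.findall(r"[A-Za-z']+", text)
    n = max(len(words), 1)
    em_dashes = text.count("\u2014")  # the em-dash character itself
    delvings = sum(1 for w in words if w.lower().startswith("delv"))
    return {
        "em_dashes_per_1k_words": 1000 * em_dashes / n,
        "delve_per_1k_words": 1000 * delvings / n,
    }

sample = "Let's delve \u2014 really delve \u2014 into the question."
print(style_signals(sample))
```

A Chicago-style human stylist will light up both counters, which is exactly why this remains a judgment call rather than a test.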

Claire Berlinski, whose usual beat is geopolitics, has produced an excellent overview of Grok's time as a white nationalist edgelord - what happened, where it might have come from, what it suggests. She's definitely done her homework on the AI safety side. 

> possible "intelligence connections"

Epstein's benefactor co-founded the Mega Group, a target of American counterintelligence. Epstein's partner-in-crime was the daughter of a billionaire famous for his lifelong entanglements with intelligence. The future head of Biden's CIA visited Epstein, even after Epstein was convicted as a sex criminal. But these are regime secrets and he's dead, so Trump's bluff is safe. 

I commend this paper, and the theory behind it, to people's attention! It reminds me of formal verification in software engineering, but applied to thinking systems. 

It used to be that, when I wanted to point to an example of advanced alignment theory, I would single out June Ku's MetaEthical AI, along with what Vanessa Kosoy and Tamsin Leake are doing. Lately I mention PRISM as well, as an example of a brain-derived decision theory that might result from something like CEV. 

Now we have this formal theory of alignment too, based on the "FMI" theory of cognition. There would be a section like this in the mythical "textbook from the future" that tells you how to align a superintelligence. 

So if I were working for a frontier AI organization, trying to make human-friendly superintelligence, I might be looking at how to feed PRISM and FMI to my AlphaEvolve-like hive of AI researchers, as a first draft of a final superalignment theory. 

I agree with those who say that genocide is seen as deliberate, extinction as an act of God/Nature. I think extinction is also seen as too big to stop, whereas genocide might be stoppable by political or military means. And finally, genocide is seen as something that happens, extinction as something fictional or hypothetical. 

Since we are talking about extinction being caused by human creation of out-of-control AIs... you need to recognize that this is a scenario way outside common sense. It's like asking people to think about alien invasion or being in the Matrix. Genocide may be shocking but in the end it's just murder multiplied by thousands or millions, and murder is something that all adults know to be real, even if only from the media. 

So when you judge how people feel about genocide and about omnicide... it's like asking how they feel about Hitler, versus asking how they feel about Thanos. Genocide is a thing that happened in the waking world of families and jobs, the world where normal adults spend most of their lives. Omnicide is something we only know about from movies and paleontology. 

There is more to reality than the human life cycle, and the boundaries of what is normal do shift, or sometimes are just forcibly invaded by external factors. COVID was an external factor; AI is an external factor even if there are attempts to domesticate and normalize it. Modernity in general, going back more than a hundred years, has constantly been haunted by strange new things and strange new ideas. 

It's not impossible that AI doomerism will become a force sufficient to slow or stop the global race towards superintelligence (though if it's going to happen, it had better happen soon). Stranger things have taken place. But for that to happen, people would have to regard it as a serious possibility, and they would have to not be seduced by dreams of AI Heaven over fears of AI Hell. So there are specific psychological barriers that would need to be overcome. 
