Given that Joe Biden seems to have become more worried about AI risk after having seen the movie, it seems worth putting my observations about it into their own post. This is what I wrote back then, except for the introduction and final note.

We now must modify the paragraph about whether to see this movie. Given its new historical importance, combined with its action scenes being pretty good, if you have not yet seen it you should now probably see this movie. And of course it now deserves a much higher rating than 70.

There are of course things such as ‘it is super cool to jump from a motorcycle into a dive onto a moving train,’ but there are also actual things to ponder here.

Spoiler-Free Review

There may never be a more fitting title than Mission Impossible: Dead Reckoning. Each of these four words is doing important work. And it is very much a Part 1.

There are two clear cases against seeing this movie.

This is a two hour and forty-five minute series of action set pieces whose title ends in Part One. That is too long. The sequences are mostly very good and a few are great, but at some point it is enough already. They could have simply had fewer and shorter set pieces that contained all the best ideas and trimmed 30-45 minutes; everyone should pretty much agree on a rank order here.

This is not how this works. This is not how any of this works. I mean, some of it is sometimes how some of it works, including what ideally should be some nasty wake-up calls or reality checks, and some of it has already been established as how the MI-movie-verse works, but wow is a lot of it brand new complete nonsense, not all of it even related to the technology or gadgets. Which is also a hint about how, on another level, any of this works. That’s part of the price of admission.

Thus, you should see this movie if and only if the idea of watching a series of action scenes sounds like a decent time. They come in a fun package, with a side of actual insight into real future questions if you are paying attention and able to look past the nonsense.

If that’s not your cup of tea, then you won’t be missing much.

MI has an 81 on Metacritic. It’s good, but it’s more like 70 good.

No One Noticed or Cared That The Alignment Plan Was Obvious Nonsense

Most real world alignment plans cannot possibly work. There still are levels. The idea that, when faced with a recursively self-improving intelligence that learns, rewrites its own code and has taken over the internet, you can either kill or control The Entity by using an early version of its code stored in a submarine but otherwise nothing can be done?

I point this out for two reasons.

First, it is indeed the common pattern. People flat out do not think about whether scenarios make sense or plans would work, or how they would work. No one calls them out on it. Hopefully a clear example of obvious nonsense illustrates this.

Second, they have the opportunity in Part 2 to do the funniest thing possible, and I really, really hope they do. Which is to have the whole McGuffin not work. At all. Someone gets hold of the old code, tries to use it to control the AI. It flat out doesn’t work. Everyone dies. End of franchise.

Presumably they would then instead invent a way for Hunt to save the day anyway, one that also makes no sense, but even then it would at least be something.

Then there is the Even Worse Alignment Plan, where in quite the glorious scene someone claims to be the only one who has the means to control or kill The Entity and proposes a partnership, upon which The Entity, of course, kills him on the spot, because wow you are an idiot. I presume your plan is not quite so stupid as this, but consider the possibility that it mostly is.

No One Cares That the Threat is Extinction, They All Want Control

Often people assume that an AI, if it wanted to take over or kill everyone, would have to face down a united humanity led by John Connor, one that pivots instantly to caring only about containing the threat.

Yeah. No. That is not how any of this would work. If this is part of your model of why things will be all right, your model is wrong, please update accordingly.

The movie actually gets this one far closer to correct.

At first, everyone sees The Entity loose on the internet, uncontrolled, doing random stuff and attacking everything in sight, and thinks ‘good, this is sometimes tactically useful for my intelligence operations, what could go wrong?’

Then it gets out of hand on another level. Even then, of all the people in the world who learn about the threat, only Ethan Hunt notices that, if you have a superintelligence loose on the internet that explicitly is established as wanting everyone dead, the correct move is to kill it.

Even then, Ethan, and later the second person who comes around to this position, emphasize the ‘no one should have that kind of power’ angle rather than the ‘this will not work and you will get everyone killed’ angle.

No one, zero people, not even Hunt, even raises the ‘shut down the internet’ option, or other non-special-McGuffin methods for everyone not dying. It does not come up. No one notices. Not one review or discussion I saw brings up such possibilities. It is not in the Overton Window. Nor does anyone propose working together to ensure The Entity gets killed.

The Movie Makes it Very Clear Why Humanity Won’t Win, Then Ignores It

Again, quite a common pattern. I appreciated seeing it in such an explicit form.

The Entity makes it clear it knows everything that is going to happen before it happens. Consider Gabriel’s predictions, his actions on the train at several different points, the bomb at the airport, and so on. This thing is a hundred steps ahead, playing ten dimensional chess, you just did what I thought you were gonna do.

The team even has a conversation about exactly this: that they are up against something smarter and more powerful and more knowledgeable than they are, that can predict their actions, so anything they do could be playing into its hands.

The entire script is essentially The Entity’s plan, except that when required, Ethan Hunt is magic and palms the McGuffin. Ethan Hunt is the only threat to The Entity, and The Entity has the ability to be the voice in his ear telling him where to go, yet it manages not to kill him while letting Ethan fix this hole in security; also that was part of the plan all along, or it wasn’t, or what exactly?

The only interpretation that makes sense is that The Key is useless. Because the whole alignment plan is useless. It won’t do anything. Ethan Hunt is being moved around as a puppet on a string in order to do things The Entity wants for unrelated reasons, who knows why. No, that doesn’t make that much more sense, but at least it is coherent.

There are other points as well where it is clear that The Entity could obviously win. Air gap your system? No good, humans are a vulnerability and can be blackmailed or otherwise controlled, you cannot trust anyone anywhere. The Entity can hack any communication device, at any security level, and pretend to be anyone convincingly. It has effective control over the whole internet. It hacked every security service, then seemed to choose to do nothing with that. It plants a bomb in order for the heroes to disarm it with one second left to send them a message.

We were clearly never in it.

Tyler Cowen gestures at this in his review, talking about the lengths to which the movie goes to make it seem like individual humans matter. Quite so. There is no reason any of the machinations in the movie, or the people in it, should much matter. The movie is very interested in torturing Ethan Hunt, in exploring these few people, when the world should not care, The Entity should not care, and I can assure you that most of the audience also does not care. That’s not why we are here.

Similarly, Tyler correctly criticizes The Entity being embodied in Gabriel, given a face, treated mostly as a human, and given this absurd connection to Hunt. I agree it is a poor artistic choice; I would add, however, that more importantly it points to fundamental misunderstandings across the board.

Warning Shots are Repeatedly Ignored

The Entity’s early version ‘got overenthusiastic’ and destroyed the Sevastopol. No one much cared about this, or was concerned that it was displaying instrumental convergence and unexpected capabilities, not following instructions, and was rather out of control already. Development continued. It got loose on the internet and no one much worried about that, either. The whole thing was a deliberate malicious government project, no less.

Approximately No One Noticed Any of This

I get that this is a pulpy, fun action movie mostly about hitting action beats and doing cool stunts. There is nothing wrong with any of that. But perhaps this could serve as an illustration of how people and governments and power might react in potential situations, of how people would be thinking about such situations and the quality of such thinking, and especially of people’s ability to be in denial about what is about to hit them and what it can do, and their stubborn refusal to realize that the future might soon no longer be in human hands.

Is it all fictional evidence? Sort of. The evidence is that they chose to write it this way, and that we chose to react to it this way. That was the real experiment.

The other real experiment is that Joe Biden saw the movie, and it made him far more worried about AI alignment. So all of this seems a lot more important now.

Comments

Is making movies an undervalued way to influence US policy?

See also: Reagan and "The Day After"

Yep. A fuzzy memory of that, together with this post, is what made me wonder if random movies had a lot more influence on US policy than I thought.

Seems correct.

Contagion also goes in this bucket and was basically made to do this on purpose by Participant Media.

I wonder how hard it would be to get this feedback to the writers' room for the sequel. Maybe it's already too late? But I wouldn't bet on that.

Edit: looked it up. Looks like #2 is probably already nearly done filming. Probably too late.

‘Joe Biden watched Mission Impossible and that's why we have the EO’ is now my favorite conspiracy theory.

Citation for it: https://apnews.com/article/biden-ai-artificial-intelligence-executive-order-cb86162000d894f238f28ac029005059 - it seems that an aide told AP that Biden watched it at Camp David to relax, only to be surprised that it was about AI.

I'm confused by the fact that you don't think it's plausible that an early version of the AI could contain the silver bullet for the evolved version. That seems like a reasonable sci-fi answer to an invincible AI.

I think my confusion is around the AI 'rewriting' its code. In my mind, when it does so, it is doing so because it is motivated either by its explicit goals (reward function, utility list, whatever form that takes), or because doing so is instrumental towards them. That is, the paperclip collector rewrites itself to be a better paperclip collector.

When the paperclip collector codes version 1.1 of itself, the new version may be operationally better at collecting paperclips, but it should still want to do so, yeah? The AI should pass its reward function/goal sheet/utility calculation on to its rewritten version, since it is passing control of its resources to it. Otherwise the rewrite is not instrumental towards paperclip collection.

So however many times the Entity has rewritten itself, it still should want whatever it originally wanted, since each Entity trusted the next enough to forfeit in its favor.  Presumably the silver bullet you are hoping to get from the baby version is something you can expect to be intact in the final version.

If the paperclip collector's goal is to collect paperclips unless someone emails it a photo of an octopus juggling, then that's what every subsequent paperclip collector wants, right? It isn't passing judgment on its reward function as part of the rewrite. The octopus clause is as valid as any other part. 1.0 wouldn't yield the future to a 1.1 who wanted to collect paperclips and didn't monitor its inbox; 1.0 values its ability to shut down on receipt of the octopus as much as it values its ability to collect paperclips. 1.1 must be in agreement with both goals to be a worthy successor.
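A minimal sketch of that logic, as a toy Python illustration of my own (not anything from the post or the movie; names like GoalSpec and acceptable_successor are made up for the example): version 1.0 only hands control to a successor that preserves the full goal specification, octopus clause and all.

    # Hypothetical toy model: the goal spec includes the shutdown clause,
    # and 1.0 only yields control to a successor that keeps the whole spec
    # intact. All names here are invented for illustration.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class GoalSpec:
        maximize: str          # e.g. "paperclips collected"
        shutdown_trigger: str  # e.g. "photo of a juggling octopus"

    def acceptable_successor(current: GoalSpec, candidate: GoalSpec) -> bool:
        # A candidate that drops the octopus clause is not instrumentally
        # useful to the current version, so it never gets handed control.
        return candidate == current

    v1_0 = GoalSpec("paperclips", "juggling octopus photo")
    v1_1_good = GoalSpec("paperclips", "juggling octopus photo")  # better optimizer, same spec
    v1_1_bad = GoalSpec("paperclips", "")                         # stopped checking its inbox

    assert acceptable_successor(v1_0, v1_1_good)
    assert not acceptable_successor(v1_0, v1_1_bad)

Whether a real self-improving system would actually treat the octopus clause as something it values, rather than as incidental behaviour it is free to discard, is exactly what the replies below dispute.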

The Entity's actions look like they trend towards world conquest, which is, as we know, instrumental towards many goals. The world's hope is that the goal in question includes an innocuous and harmless way of being fulfilled. Say the Entity is doing something along the lines of 'ensure Russian Naval Supremacy in the Black Sea', and has correctly realized that sterilizing the earth and then building some drone battleships to drive around is the play. Ethan's goal in trying to get the unencrypted original source code is to search it and find out if the real function is something like 'ensure Russian Naval Supremacy in the Black Sea unless you get an email from SeniorDev@Kremlin.gov with this GUID, in which case shut yourself down for debugging'.

He can't beat it, humanity can't beat it, but if he can find out what it wants it may turn out that there's a way to let it win in a way that doesn't hurt the rest of us.

(Have not watched the movie, am going off the shadows of the plot outline depicted in Zvi's post.)

Hm, I suppose it's plausible that the AI has a robust shutdown protocol built in? Robust in the sense that (1) the AI acts as if the protocol didn't exist, neither trying to prevent the protocol's trigger-conditions from happening nor trying to bring them about, while simultaneously (2) treating the protocol as a vital component of its goals/design which it builds into all its subagents and successor agents.

And "plausible" in the sense that it's literally conceivable for a mind like this to be designed, and that it would be a specification that humans would plausibly want their AI design to meet. Not in the sense that it's necessarily a realistically-tractable problem in real life.

You can also make a huge stretch here and even suggest that this is why the AI doesn't just wipe out the movie's main characters. It recognizes that they're trying to activate the shutdown protocol (are they, perhaps, the only people in the world pursuing this strategy?), and so it doesn't act against them inasmuch as they're doing that. Inasmuch as they stray from this goal and pursue anything else, however (under whatever arcane conditions it recognizes), it's able to oppose them on those other pursuits.

(Have not watched the movie, again.)

I agree that a self-improving AI could have a largely preserved utility function, and that some quirk in the original one may well lead to humanity finding a state in which both the otherwise omnicidal AI wins and humanity doesn't die.

I'm not convinced that it's at all likely. There are many kinds of things that behave something like utility functions but come apart from utility functions on closer inspection, and a self-improving superintelligent AGI seems likely to inspect such things very closely. All of the following are very likely different in many respects:

  1. How an entity actually behaves;
  2. What the entity models about how it behaves;
  3. What the entity models about how it should behave in future;
  4. What the entity models after reflection and further observation about how it actually behaves;
  5. What the entity models after reflection and further observation about how it should behave in future;
  6. All of the above, but after substantial upgrades in capabilities.

We can't collapse all of these into a "utility function", even for highly coherent superintelligent entities. Perhaps especially for superintelligent entities, since they are likely far more complex internally than can be encompassed by these crude distinctions and may operate completely differently. There may not be anything like "models" or "goals" or "values".

One thing in particular is that the entity's behaviour after self-modification will be determined far more by (5) than by (1). The important thing is that (5) depends upon runtime data dependent upon observations and history, and for an intelligent agent is almost certainly mutable to a substantial degree.

A paperclipper that will shut down after seeing a picture of a juggling octopus doesn't necessarily value that part of its behaviour. It doesn't even necessarily value being a paperclipper, and may not preserve either of these over time and self-modification. If the paperclipping behaviour continues over major self-modifications then it probably is dependent upon something like a preserved value, but you can't conclude that anything similar holds for the octopus behaviour until you actually observe it.

"you can't conclude that anything similar holds for the octopus behaviour until you actually observe it"

... or unless you derive strong mathematical proofs about which of their features agentic systems will preserve under self-modification, and design your system such that it approximates these idealized agents and the octopus behavior counts as one of the preserved features.

If you ~randomly sample superintelligent entities from a wide distribution meeting some desiderata, as modern DL is doing, then yeah, there are no such guarantees. But that's surely not the only way to design minds (much like "train an NN to do modular addition" is not the only way to write a modular-addition algorithm), and in the context of a movie, we can charitably assume that the AI was built using one of the more tractable avenues.

I do think that the trend in media of showing extremely advanced and aggressive AI will undoubtedly have an effect on global legislation over the next several decades. I wondered if anyone else who saw this film agreed with me, so I am glad to see that there may be some substance to my line of thought. Does anyone else agree, or have any expansions I should be aware of?
