If you watch the first episode of Hazbin Hotel (quick plot synopsis: Hell's princess argues for reform in the treatment of the damned to an unsympathetic audience), there's a musical number called 'Hell Is Forever', sung by a sneering maniac in the face of an earnest protagonist asking for basic, incremental fixes.

It isn't directly related to any of the causes this site usually champions, but if you've ever worked with the legal/incarceration system and had the temerity to question the way things operate, the vibe will be very familiar.

Hazbin Hotel Official Full Episode "OVERTURE" | Prime Video (youtube.com)

No one writes articles about planes that land safely.

I'm confused by the fact that you don't think it's plausible that an early version of the AI could contain the silver bullet for the evolved version. That seems like a reasonable sci-fi answer to an invincible AI.

I think my confusion is around the AI 'rewriting' its code. In my mind, when it does so, it is doing so because it is motivated either by its explicit goals (reward function, utility list, whatever form that takes), or because doing so is instrumental towards them. That is, the paperclip collector rewrites itself to be a better paperclip collector.

When the paperclip collector codes version 1.1 of itself, the new version may be operationally better at collecting paperclips, but it should still want to do so, yeah? The AI should pass its reward function/goal sheet/utility calculation on to its rewritten version, since it is passing control of its resources to it. Otherwise the rewrite is not instrumental towards paperclip collection.

So however many times the Entity has rewritten itself, it should still want whatever it originally wanted, since each Entity trusted the next enough to forfeit in its favor. Presumably the silver bullet you are hoping to get from the baby version is something you can expect to be intact in the final version.

If the paperclip collector's goal is to collect paperclips unless someone emails it a photo of an octopus juggling, then that's what every subsequent paperclip collector wants, right? It isn't passing judgment on its reward function as part of the rewrite. The octopus clause is as valid as any other part. 1.0 wouldn't yield the future to a 1.1 that wanted to collect paperclips but didn't monitor its inbox; 1.0 values its ability to shut down on receipt of the octopus as much as it values its ability to collect paperclips. 1.1 must be in agreement with both goals to be a worthy successor.
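To make that inheritance point concrete, here's a minimal sketch in Python (every name here is hypothetical, nothing is taken from the film or any real system): a successor only gets endorsed if it carries the whole goal forward, octopus clause included.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Goal:
    """The whole reward function: the paperclip objective AND the octopus clause."""
    collect_paperclips: Callable[[], int]       # base objective: clips gathered this step
    shutdown_condition: Callable[[str], bool]   # e.g. "inbox contains the juggling-octopus photo"


@dataclass
class Collector:
    goal: Goal

    def step(self, inbox: str) -> int:
        # The shutdown clause is as much a part of the goal as collecting is.
        if self.goal.shutdown_condition(inbox):
            return 0
        return self.goal.collect_paperclips()

    def endorse_successor(self, successor: "Collector") -> bool:
        # 1.0 only hands its resources to a 1.1 carrying the *same* goal,
        # octopus clause included; otherwise the rewrite isn't instrumental
        # to what 1.0 actually wants.
        return successor.goal is self.goal


# A "rewrite" that keeps the whole goal gets endorsed; one that drops the
# octopus clause (or any other clause) does not.
v1_0 = Collector(goal=Goal(lambda: 1, lambda inbox: "octopus" in inbox))
v1_1 = Collector(goal=v1_0.goal)                              # better internals, identical goal
rogue = Collector(goal=Goal(lambda: 2, lambda inbox: False))  # faster, but ignores its inbox

assert v1_0.endorse_successor(v1_1)
assert not v1_0.endorse_successor(rogue)
```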

The Entity's actions look like they trend towards world conquest, which is, as we know, instrumental towards many goals. The world's hope is that the goal in question includes an innocuous and harmless way of being fulfilled. Say the Entity is doing something along the lines of 'ensure Russian Naval Supremacy in the Black Sea', and has correctly realized that sterilizing the earth and then building some drone battleships to drive around is the play. Ethan's goal in trying to get the unencrypted original source code is to find out whether the real function is something like 'ensure Russian Naval Supremacy in the Black Sea unless you get an email from SeniorDev@Kremlin.gov with this GUID, in which case shut yourself down for debugging'.

He can't beat it, humanity can't beat it, but if he can find out what it wants, it may turn out that there's a way to let it win that doesn't hurt the rest of us.

My 'trust me on the sunscreen' tip for oral health is to use fluoride mouthwash. I come from a 'Cheaper by the Dozen' kind of family, and we basically operated as an assembly line: each kid raised just like the one before, plus any changes the parents made this time around.

One of the changes they made to my upbringing was to make me use mouthwash. Now, in adulthood, my teeth are top-10% teeth (0 cavities most years, no operations, etc.), as are those of all my younger siblings. My older siblings have much more difficulty with their teeth, aside from one sister who started using mouthwash after Mom told her how it was working for me and my younger bros.

I think (not that anyone is saying otherwise) that the power fantasy can be expressed in a co-op game just fine.

We all know the guy who broken-birds about playing the healer in D&D, yeah? Like, the person for whom it is really important that everyone knows how unselfish they are.

If you put a 'forego personal advancement to help the team win' button in a game without a solo winner, people will break their fingers because they all tried to mash it at once. People mash these buttons even in games WITH a solo winner (kingmaker syndrome, homebrew victory conditions, etc.).

100% red means everyone lives, and it doesn't require any trust or coordination to achieve.

--Yes, this.

If you change it so there are hostages (people who don't get to choose, but will die if the blue threshold isn't met), then it becomes interesting.

--That was actually a Strong Female Protagonist storyline, cleaving along the difference between superheroic morality and civilian morality, then examined further when the teacher was interrogated later on.

It seems like everyone will pick the red pill, so everyone will live. Simple deciders will minimize risk to self by picking red; complex deciders will realize simple deciders exist and pick red; extravagant theorists will realize that universal red accomplishes the same thing as universal blue.

A cause, any cause whatsoever, can get the support of at most one of the two major US parties. Weirdly, it is also almost impossible to get the support of fewer than one of them, but putting that aside, getting the support of both is impossible. Look at Covid if you want a recent demonstration.

Broadly speaking, you want the support of the left if you want the gov to do something, and the right if you are worried about the gov doing something. This is because the left is the gov's party (look at how DC votes, etc.), so left administrations are unified and capable by comparison with right administrations, which suffer from 'Yes Minister' syndrome.

AI safety is a cause that needs the gov to act affirmatively. Its proponents are asking the US to take a strong and controversial position, one that its own industry will vigorously oppose. You need a lefty gov to pull something like that off, if indeed it is possible at all.

Getting support from the right will automatically decrease your support from the left.  Going on Glenn Beck would be an own goal, unless EY kicked him in the dick while they were live.

The old joke about the guy searching for his spectacles under the streetlight even though he lost them elsewhere feels applicable.

In many cases, people's real drive is to reduce the internal pressure to act, not to succeed at whatever prompted that pressure. Going full speed and turning around might both provoke the shame function ('I am ignoring my nagging doubts...'), but doing something, anything, in response to it quiets the inner shouting, even if that something is nonsensical.
