Comment author: Matthew_Opitz 05 September 2014 01:34:23AM 3 points [-]

Okay, wow, I don't know if I quite understand any of this, but this part caught my attention:

The Omohundrian/Yudkowskian argument is not that we can take an arbitrary stupid young AI and it will be smart enough to self-modify in a way that preserves its values, but rather that most AIs that don't self-destruct will eventually end up at a stable fixed-point of coherent consequentialist values. This could easily involve a step where, e.g., an AI that started out with a neural-style delta-rule policy-reinforcement learning algorithm, or an AI that started out as a big soup of self-modifying heuristics, is "taken over" by whatever part of the AI first learns to do consequentialist reasoning about code.

I have sometimes wondered whether the best way to teach an AI a human's utility function might not be to program it in directly (since that would require figuring out what we really want in a precisely-defined way, which seems like a gargantuan task), but rather to "raise" the AI like a kid, at a stage where the AI would have only minimal and restricted ways of interacting with human society (to minimize harm... much like a toddler thankfully does not have the muscles of Arnold Schwarzenegger to use during its temper tantrums), and where we would then "reward" or "punish" the AI for seeming to demonstrate better or worse understanding of our utility function.

It always seemed to me that this strategy had the fatal flaw that we would not be able to tell if the AI was really already superintelligent and was just playing dumb and telling us what we wanted to hear so that we would let it loose, or if the AI really was just learning.

In addition to that fatal flaw, it seems to me that the above quote suggests another fatal flaw to the "raising an AI" strategy: that there would be a limited time window in which the AI's utility function would still be malleable. It would appear that, as soon as part of the AI figures out how to do consequentialist reasoning about code, its "critical period" in which we could still mould its utility function would be over. Is this the right way of thinking about this, or is this line of thought waaaay too amateurish?

Comment author: lfghjkl 05 September 2014 07:01:03PM 6 points [-]

Very relevant article from the sequences: Detached Lever Fallacy.

Not saying you're committing this fallacy, but it does explain some of the bigger problems with "raising an AI like a child" that you might not have thought of.

Comment author: gwern 26 July 2014 08:03:19PM 12 points [-]

I did enjoy the rest of the chapter, however. Quirrell's statements about horcruxes were initially surprising - if he is telling the truth, then how is he still alive? If not, then wouldn't he want Harry experimenting with horcruxes in order to turn him to the dark side?

Perhaps he is a Horcrux transfer (as long speculated) but a failed one; introspecting about how different he is from his memories of 'himself', he would realize 'he' hadn't survived and all that was left was a weird mishmash of Monroe's personality and Voldemort's memories, and this was entirely worthless as immortality.

What argument could be more convincing to Quirrell than personally embodying the failure of horcruxes as an immortality strategy?

Comment author: lfghjkl 26 July 2014 10:13:10PM *  5 points [-]

I've also been thinking along these lines; does anyone remember this part from the opening ceremony?

The young, thin, nervous man who Harry had first met in the Leaky Cauldron slowly made his way up to the podium, glancing fearfully around in all directions. Harry caught a glimpse of the back of his head, and it looked like Professor Quirrell might already be going bald, despite his seeming youth.

"Wonder what's wrong with him," whispered the older-looking student sitting next to Harry. Similar hushed comments were being exchanged elsewhere along the table.

Professor Quirrell made his way up to the podium and stood there, blinking. "Ah..." he said. "Ah..." Then his courage seemed to fail him utterly, and he stood there in silence, occasionally twitching.

"Oh, great," whispered the older student, "looks like another long year in Defence class -"

"Salutations, my young apprentices," Professor Quirrell said in a dry, confident tone.

It seems to imply that becoming the second victim of a Horcrux might not necessarily create a mishmash of personalities, but instead leave them competing as separate (maybe "partially mixed"?) identities. This would also explain why Harry considers his "dark side" to be different from himself.

Comment author: [deleted] 18 July 2014 01:27:14PM *  11 points [-]

"Doesn't work against a perfectly rational, informed agent" does not preclude "works quite well against naïve, stupid newbie LW'ers that haven't properly digested the sequences."

Memetic hazard is not a fancy word for coverup. It means that the average person accessing the information is likely to reach dangerous conclusions. That says more about the average of humanity than the information itself.

Comment author: lfghjkl 18 July 2014 04:43:28PM *  9 points [-]

Good point. To build on that, here's something I thought of when trying (but most likely not succeeding) to model/steelman Eliezer's thoughts at the time of his decision:

This basilisk is clearly bullshit, but there's a small (and maybe not vanishingly small) chance that with enough discussion people can come up with a sequence of "improved" basilisks that suffer from less and less obvious flaws until we end up with one worth taking seriously. It's probably better to just nip this one in the bud. Also, creating and debunking all these basilisks would be a huge waste of time.

At least Eliezer's move has focused all attention on the current (and easily debunked) basilisk, and it has made it sufficiently low-status to try and think of a better one. So in this sense it could even be called a success.

Comment author: Lumifer 14 June 2014 04:50:04AM 14 points [-]

I am not sure about the attention example; there looks to be an issue with units. For example, if we think in terms of percentages, going from juggling 10 things to 9 gives ~11% more attention to the nine remaining things. Going from 2 things to 1 gives 100% more attention to the remaining one. And that's just math, not increasing marginal utility.
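
To spell out the arithmetic (a rough sketch, assuming attention divides evenly across whatever is being juggled): going from $n$ things to $n-1$ raises per-thing attention from $1/n$ to $1/(n-1)$, a relative gain of

$$\frac{1/(n-1) - 1/n}{1/n} = \frac{1}{n-1},$$

which is $1/9 \approx 11\%$ for $n = 10$ and $1/1 = 100\%$ for $n = 2$.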

And if we're talking about resources to be amassed by societies, pretty much anything with a network effect qualifies.

Comment author: lfghjkl 15 June 2014 12:25:29AM 3 points [-]

Going from 2 things to 1 gives 100% more attention to the remaining one.

The effect will be much higher than that:

Because the brain cannot fully focus when multitasking, people take longer to complete tasks and are predisposed to error. When people attempt to complete many tasks at one time, “or [alternate] rapidly between them, errors go way up and it takes far longer—often double the time or more—to get the jobs done than if they were done sequentially,” states Meyer.[9] This is largely because “the brain is compelled to restart and refocus”.[10] A study by Meyer and David Kieras found that in the interim between each exchange, the brain makes no progress whatsoever. Therefore, multitasking people not only perform each task less suitably, but lose time in the process.

Source.

So, by focusing your attention on a single task instead of trying to do two at the same time, you'll be done with that task in less than a quarter of the time (not merely half, as one would naively expect).
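
To spell out where the quarter comes from (a rough sketch, taking the quoted "double the time or more" at face value): suppose each of two tasks takes time $T$ when done alone, so doing them one after the other takes $2T$. Interleaving them then takes

$$t_{\text{multitask}} \ge 2 \times 2T = 4T,$$

and since the tasks are interleaved, neither finishes much before the end. Focusing on one task finishes it at time $T$, i.e. a quarter (or less) of the multitasked completion time, whereas the naive "half the attention, twice the time" picture would only predict $2T$, i.e. half.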

Comment author: shminux 23 March 2014 07:09:37AM 0 points [-]

On average, twice as long.

Comment author: lfghjkl 23 March 2014 08:03:37AM 11 points [-]

Hofstadter's Law: It always takes longer than you expect, even when you take into account Hofstadter's Law.

Comment author: jaibot 13 January 2014 02:07:16PM 2 points [-]

Why?

Comment author: lfghjkl 15 January 2014 12:15:55AM 0 points [-]

Comment author: RichardKennaway 25 November 2013 12:26:58PM 1 point [-]

In intuitionistic logic, it is still the case that nothing can be both true and false.

Comment author: lfghjkl 25 November 2013 02:05:30PM 1 point [-]

Sorry, misread your comment and thought you referred to the law of excluded middle. The problem with reading while I should be sleeping.

Comment author: [deleted] 25 November 2013 01:23:15AM *  5 points [-]

The Law of Non-Contradiction. Try going against this law and you may find how figured-out, in the bag, dusted and done it is. Tremendously useful.

In response to comment by [deleted] on What do we already have right?
Comment author: lfghjkl 25 November 2013 12:07:56PM *  0 points [-]

Unless you're dealing with Intuitionistic logic:

Semantically, intuitionistic logic is a restriction of classical logic in which the law of excluded middle and double negation elimination are not admitted as axioms.
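
To make the distinction concrete, here is a minimal sketch in Lean 4 (the theorem names are mine): non-contradiction goes through with purely constructive reasoning, while excluded middle is not an intuitionistic theorem, even though its double negation is.

```lean
-- Non-contradiction needs no classical axioms: from a proof of P and a proof
-- of ¬P we derive False directly.
theorem non_contradiction (P : Prop) : ¬(P ∧ ¬P) :=
  fun h => h.2 h.1

-- Excluded middle itself is not derivable constructively, but its double
-- negation is, which is as close as intuitionistic logic gets.
theorem not_not_em (P : Prop) : ¬¬(P ∨ ¬P) :=
  fun h => h (Or.inr (fun hp => h (Or.inl hp)))
```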

Comment author: James_Miller 25 October 2013 06:20:30PM *  4 points [-]

If you have devoted a lot of resources to a "crazy" political party, there is probably something wrong with you.

Comment author: lfghjkl 25 October 2013 09:37:00PM 3 points [-]

Not if you consider it the "least crazy" alternative, and with only two parties in your country there doesn't seem to be much choice.

Comment author: Ishaan 10 October 2013 03:32:11AM *  -2 points [-]

Check out "Story of your life" by the same author.

Ur'f znqr nyvraf jub jbhyq cebonoyl bcrengr ol GQG, zl cuvybfbcuvpny dhvooyrf jvgu GQG abgjvgufgnaqvat.

Comment author: lfghjkl 10 October 2013 10:46:54PM 0 points [-]

Hmm, I just read that story before checking your spoiler, and it was interesting, despite the author's poor grasp of the physics he tried to explain. A light ray going from point A to point B is not taking the shortest path (measured in time) because it wants to reach B; the point B is merely a point on the geodesic curve the light ray is currently travelling along.

In other words, these light rays are taking the least time to reach the points they pass without intending to reach them; the points are just in the way.
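
For reference, the standard variational statement behind all this (Fermat's principle, with $n(\mathbf{r})$ the refractive index along a candidate path $\gamma$ between A and B):

$$T[\gamma] = \frac{1}{c}\int_{\gamma} n(\mathbf{r})\,ds, \qquad \delta T[\gamma] = 0,$$

i.e. the realized ray is a path along which the travel time is stationary; B enters only as the endpoint of that path, not as a goal the light is "trying" to reach.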

That said, thanks for the recommendation! This story was still pretty good.

V qvfnterr gung gurfr nyvraf ner sbyybjvat GQG (be nal bgure qrpvfvba gurbel sbe gung znggre), fvapr gurl ner nyjnlf npgvat va n cerqrgrezvarq znaare naq arire npghnyyl znxr nal qrpvfvbaf. Gur jubyr pbaprcg bs n qrpvfvba gurbel jbhyq zrnavatyrff gb gurz.

What are your philosophical quibbles with TDT, if I may ask?
