Meetup : MIRIxAtlanta - MIRI Research Guide + Corrigibility

1 Adele_L 16 November 2014 10:34PM

Discussion article for the meetup : MIRIxAtlanta - MIRI Research Guide + Corrigibility

WHEN: 22 November 2014 06:00:00PM (-0500)

WHERE: 2388 Lawrenceville Hwy. Unit L. Decatur, GA 30033

We'll go over the new research guide http://intelligence.org/research-guide/ which discusses the mathematical background needed for doing FAI research, and surveys the major lines of research done at MIRI.

We will also look at a new line of research called corrigibility. From the research guide: "As artificially intelligent systems grow in intelligence and capability, some of their available options may allow them to resist intervention by their programmers. We call an AI system “corrigible” if it cooperates with what its creators regard as a corrective intervention, despite default incentives for rational agents to resist attempts to shut them down or modify their preferences."

There will also be snacks and cats! Hope to see you there!


Comment author: Kaj_Sotala 11 November 2014 06:45:20AM 1 point [-]

Quines don't say anything about human working memory limitations, or the amount of time a human would need to learn to understand the whole system. Furthermore, they only talk about printing the source code, not understanding it, so I'm not sure how they're relevant here.

Comment author: Adele_L 11 November 2014 03:17:05PM 2 points [-]

I wouldn't be too surprised if the hypothesis is true for unmodified humans, but for systems in general I expect it to be untrue. Whatever 'understanding' is, the diagonal lemma should be able to find a fixed point for it (or at the very least, an arbitrarily close approximation) - it would be very surprising if it didn't hold. Quines are just an instance of this general principle that you can actually play with and poke around and see how they work - which helps demystify the core idea and gives you a picture of how this could be possible.
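To make the quine idea concrete, here is a minimal sketch in Python (the thread doesn't specify a language, so this choice is an assumption): a program whose output is exactly its own source code, i.e. a fixed point of "print yourself".

```python
# A minimal Python quine: the program prints its own source code.
# The %r substitutes the string's own repr into itself, and %% escapes
# the literal percent sign, closing the self-referential loop.
s = 's = %r\nprint(s %% s)'
print(s % s)
```

Running this prints the two lines above verbatim, which is the "fixed point" being gestured at: the program's description of itself is complete, with no infinite regress.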

Comment author: Fluttershy 11 November 2014 04:20:52AM 6 points [-]

In case it was not obvious, the correct takeaway from this article is that you should go and get a flu shot, if you haven't gotten one already this year. If you have already gotten a flu shot this year, and you reply to this comment with a message that states that you have done so, I would be more than happy to upvote you.

Comment author: Adele_L 11 November 2014 05:44:52AM 5 points [-]

I got one this year! I didn't get one last year, and someone else ended up getting very sick as a direct consequence... :(

Comment author: polymathwannabe 11 November 2014 05:24:22AM 0 points [-]

I have a hypothesis based on systems theory, but I don't know how much sense it makes.

A system can only simulate a less complex system, not one at least as complex as itself. Therefore, human neurologists will never come up with a complete theory of the human mind, because they won't be able to conceive of it in full, i.e. the human brain cannot contain a complete model of itself. Even if collectively they get to understand all the parts, no single brain will be able to see the complete picture.

Am I missing some crucial detail?

Comment author: Adele_L 11 November 2014 05:41:34AM *  1 point [-]

Seems unlikely, given the existence of things like quines, and the fact that self-reference comes pretty easily. I recommend reading Gödel, Escher, Bach - it discusses your original question in the context of this sort of self-referential mathematics, and is also very entertaining.

Comment author: gwern 08 November 2014 04:00:48AM 1 point [-]

Already has been, see Reddit.

Comment author: Adele_L 08 November 2014 04:33:34AM *  2 points [-]

What was the string that generated the hash, then?

ETA: See Lumifer's link above.

Comment author: Gunnar_Zarncke 06 November 2014 08:14:01AM 3 points [-]

Your poll is somewhat broken (last option missing). Note that the ability to rotate things in the mind is expressed very differently across people. Some do it effortlessly, some even with multiple elements (Tesla was said to be able to animate whole machines in his mind). Therefore I'd recommend providing a scaled or indexical poll ("not at all", "partial/limited", "single element single rotation", "single element, multiple motions/changes", "multiple elements interacting (gears)", "whole machines"). As only 4 people (me included) have voted, I recommend reposting the poll with these extensions.

Comment author: Adele_L 06 November 2014 04:51:37PM 2 points [-]

Thanks for catching the error, and I think the rest of your suggestion is good, but unfortunately 32 people have taken it now (wow!) and I don't think I can change it without breaking it.

Comment author: Adele_L 06 November 2014 06:42:45AM *  7 points [-]

It's well known that men are better at mental rotation and other forms of spatial reasoning than women. I've always been pretty good at it - my default technique is to carefully check the relations (i.e. count the number of cubes in the segment, note the relative angle of the joint, and make sure they match). It was only recently that I realized that some people actually just rotate it in their head, and 'look' to see if it is the same.

Anyway, I was wondering if maybe the technique used was correlated with gender.

What sex were you assigned at birth?

With what gender do you primarily identify?

What method do you use to do mental rotations?

(Something else)


Comment author: Gunnar_Zarncke 24 October 2014 06:20:14AM *  34 points [-]

Most comments show exactly one downvote without a clear pattern why. I'd guess that a single person downvoted all these short comments. Can it be that this user doesn't know the custom of upvoting survey-takers?

ADDED 2014-10-25T16:20 UTC: The single downvotes disappeared.

ADDED 2014-10-26T21:10 UTC: The single downvotes reappeared again (at least for a lot of high scoring comments).

Comment author: Adele_L 31 October 2014 05:47:25PM 0 points [-]

Almost everyone has a downvote again. What's more interesting is the short list of people who don't...

Comment author: TobyBartels 30 October 2014 05:44:06AM 5 points [-]

It has been reported here that largest volume, longest length, and largest mass all give the same result.

Comment author: Adele_L 31 October 2014 05:36:13PM 7 points [-]

That still doesn't help for the purposes of calibration, when you have uncertainty over whether these are all the same.

Comment author: Nornagest 30 October 2014 10:21:22PM *  3 points [-]

I agree about all of that except for contrarianism (and yes, I'm aware of the irony). You want to have some amount of contrarianism in your ecosystem, because people sometimes aren't satisfied with the hivemind and they need a place to go when that happens. Sometimes they need solutions that work where the mainstream answers wouldn't, because they fall into a weird corner case or because they're invisible to the mainstream for some other reason. Sometimes they just want emotional support. And sometimes they want an argument, and there's a place for that too.

What you don't want is for the community's default response to be "find the soft bits of this statement, and then go after them like a pack of starving hyenas tearing into a piñata made entirely of ham". There need to be safe topics and safe stances, or people will just stop engaging -- no one's always in the mood for an argument.

On the other hand, too much agreeableness leads to another kind of failure mode -- and IMO a more sinister one.

Comment author: Adele_L 31 October 2014 12:31:57AM 2 points [-]

The article talked about endless contrarianism, where people disagree as a default reaction rather than because of a pre-existing difference in models. I think that is a problem in the LW community.
