Vladimir_Nesov comments on Hard problem? Hack away at the edges. - Less Wrong

45 Post author: lukeprog 26 September 2011 10:03AM

Comment author: Vladimir_Nesov 26 September 2011 01:46:38PM *  5 points

(A lot of stuff seems potentially relevant only until you've studied the problem for a few years and learned that mostly it's actually not.)

Comment author: scav 26 September 2011 02:25:50PM 13 points

I expect a lot of actually relevant stuff doesn't seem relevant until you've studied it in connection with the problem for a few years. But maybe you don't get that far, because it didn't seem relevant :(

Friendly AI is a monster problem partly because nearly everything any human experiences, believes, wants to believe, or has an opinion on is potentially relevant. You could be forgiven for thinking maybe there isn't a well-defined problem buried under all that mess after all. But there may be some useful sub-problems around the edges.

Personally, even if AI-that-goes-FOOM-catastrophically isn't very likely, I think we shouldn't even need that reason to study what sort of life and environment would be optimal for humans. It doesn't have to be about asking dangerous wishes of some technological genie-in-a-bottle. We already have supra-human entities such as governments and corporations making decisions with non-zero existential risk attached, and we probably want them to be a bit friendlier if possible.

Comment author: jsalvatier 26 September 2011 04:41:19PM 4 points

Do you have specific examples in mind?

Comment author: Vladimir_Nesov 26 September 2011 06:26:23PM *  3 points

Machine learning (in particular, graphical models), more general AI, philosophy, game theory, algorithmic complexity, cognitive science, and neuroscience seem mostly useless (beyond the basics) for attacking the friendliness content problem. Pure mathematics seems potentially useful.

Comment author: VNKKET 26 September 2011 09:15:43PM 5 points

Of the things on your list, I'm most surprised by cognitive science and maybe game theory, unless you're talking about the fields' current insights rather than their expected future insights. In that case, I'm still somewhat surprised game theory is on this list. I'd love to learn what led you to this belief.

It's possible I only know the basics, so feel free to say "read more about what the fields actually offer and it'll be obvious if you've been on Less Wrong long enough."

Comment author: Will_Newsome 26 September 2011 09:58:24PM *  0 points

Agreed but would add algorithmic information theory, deep theoretical computer science, and maybe quantum information theory. There are some interesting questions about hypercomputation, getting information from context, and concrete semi-"physical" AI coordination problems. (Also reversible computing is just trippy as hell. Intuitions, especially "moral" intuitions, gawk at it.) These are of course secondary to study of updateless-like decision theories.

Comment author: wedrifid 27 September 2011 12:52:48AM 1 point

Also reversible computing is just trippy as hell. Intuitions, especially "moral" intuitions, gawk at it.

They do? Why? I haven't experienced moral trippiness myself. This may be because I haven't considered the same things you have or because my intuitions are eccentric. (Assume I mean 'eccentric in a different way to how your moral intuitions are eccentric' or not depending on whether you prefer to be seen as having typical moral intuitions or atypical ones.)

Comment author: nhamann 27 September 2011 12:39:47AM 1 point

I agree with most of this, but would you mind explaining why you think neuroscience is "mostly useless"? My intuition is the opposite. Also agreed that pure mathematics seems useful.

Comment author: vallinder 27 September 2011 04:09:31PM 1 point

I was also wondering about neuroscience. If we take a CEV approach, wouldn't neuroscience be useful for actually determining the volitions to be extrapolated?

Comment author: Vladimir_Nesov 29 September 2011 04:26:31PM *  3 points

Even if we knew everything about brains, right now we lack the conceptual/philosophical insight to turn that data into something useful. Moreover, neuroscience is not even primarily concerned with getting such data; it develops its own generalizations that paint a picture of roughly how brains work. But that picture probably won't be detailed enough to capture the complexity of human (extrapolated) value, even if we knew how to interpret it, which we don't.

Comment author: Zetetic 27 September 2011 01:48:29AM 5 points

I would really, really like to know: What areas of pure mathematics stand out to you now?

Comment author: aletheilia 28 September 2011 10:02:49AM 0 points

He may have changed his mind since then, but in case you missed it: Recommended Reading for Friendly AI Research

Comment author: Zetetic 28 September 2011 05:14:48PM 1 point

I've looked over that list, but the problem is that it essentially consists of a list of items to catch you up to the state of the discussion as it was a year ago, along with a list of general mathematics texts.

I'm pretty well acquainted with mathematical logic; the main item on the list that I'm particularly weak in would be category theory, and I'm not sure why category theory is on the list. I've a couple of ideas about the potential use of category theory in, maybe, knowledge representation or something along those lines, but I have no clue how it could be brought to bear on the friendliness content problem.

Comment author: Vladimir_Nesov 29 September 2011 04:16:42PM *  3 points

The book list is somewhat obsolete (the list of LW posts is not), but I'm not ready to make the next iteration. The state of decision theory hasn't changed much since then.

Roughly, the central mystery seems to be the idea of acausal control. It feels like it might even be useful for inferring friendliness content, along the lines of what I described here. But we don't understand that idea. It first appeared more or less explicitly in UDT, with its magical mathematical intuition module, and became more concrete in ADT, where proofs are used instead (at the cost of making it useless wherever complete proofs can't be expected, which is almost everywhere outside very simple thought experiments).

The problem is this: given an action-definition and a utility-definition, an agent can find a function between their sets of possible values and use it as a "utility function". But other "utility functions" are correct as well; the agent just isn't capable of finding them, and somehow that is a good thing, which is why it works (see this post). What makes some of the functions "better" than others? Can we generalize this to inference of dependencies between facts other than the action and the utility value? What particular properties of agents constructed in one of the standard ways allow them to be controlled by some dependencies but not others? What kinds of "facts" are relevant? What constitutes a "fact"? (In ADT, a "fact" is an axiomatic definition of a structure, which refers to some particular class of structures and not to others; decision theory then considers the ways in which some of these "facts" can control other "facts", that is, make the structures defined by certain definitions be a certain way, given control over other structures that contain the agent's action.)
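The proof-based setup described above can be caricatured in a few lines of code. The sketch below is only illustrative and is not ADT's actual formalism: a real proof-based agent searches for formal proofs, from its own source code, of statements of the form "A() = a implies U() = u", whereas here the "prover" is a stand-in that simply evaluates a toy world model. All names (`world_utility`, `provable_dependencies`, the payoff table) are hypothetical choices for the example.

```python
# Toy sketch of a proof-style decision procedure. For each candidate
# action a, the mock "prover" establishes one statement of the form
# "A() = a  implies  U() = u" by evaluating a world model directly;
# the agent then performs the action with the highest proven utility.
# In ADT this step is a search for formal proofs about the agent's
# own definition, not a function call into the world.

def world_utility(action: str) -> int:
    """World model: maps the agent's action to a utility value."""
    payoffs = {"cooperate": 3, "defect": 1}
    return payoffs[action]

def provable_dependencies(actions):
    """Mock proof search: the dependencies 'A()=a -> U()=u' that our
    stand-in prover happens to find (here, exactly one per action)."""
    return {a: world_utility(a) for a in actions}

def decide(actions):
    """Pick the action whose proven consequence has maximal utility."""
    deps = provable_dependencies(actions)
    return max(deps, key=deps.get)

if __name__ == "__main__":
    print(decide(["cooperate", "defect"]))  # -> cooperate
```

The puzzle in the paragraph above lives in `provable_dependencies`: many other action-to-utility mappings are logically consistent with the agent's definition, and the scheme works only because the prover finds the "right" one, which this toy version simply hard-codes by construction.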

It feels like mathematics is the discipline for clarifying questions like this (and it's perhaps not useful to prioritize its areas, though some emphasis on foundations seems right). An important milestone would be to produce a useful problem statement about this idea of acausal dependence that can be communicated at least to the mathematicians on LW.

Comment author: lukeprog 27 September 2011 12:58:36AM 1 point

Of course you want to fail as quickly as you can, though you and I do seem to have slightly different intuitions about what is likely to end up being useful for friendliness content. Or rather, I have a slightly broader set of things that I think have a decent chance of being useful.