Wei Dai offered 7 tips on how to answer really hard questions:
- Don't stop at the first good answer.
- Explore multiple approaches simultaneously.
- Trust your intuitions, but don't waste too much time arguing for them.
- Go meta.
- Dissolve the question.
- Sleep on it.
- Be ready to recognize a good answer when you see it. (This may require actually changing your mind.)
Some others from the audience include:
- Transform the problem into a different domain.
- Ask people who have worked on similar problems.
- Decompose the problem into subproblems. (Analysis)
I'd like to offer one more technique for tackling hard questions: Hack away at the edges.
General history books compress time so much that they often give the impression that major intellectual breakthroughs result from sudden strokes of insight. But when you read a history of just one breakthrough, you realize how much "chance favors the prepared mind." You realize how much of the stage had been set by others, by previous advances, by previous mistakes, by a soup of ideas crowding in around the central insight made later.
It's this picture of the history of mathematics and science that makes me feel quite comfortable working on hard problems by hacking away at their edges.
I don't know how to build Friendly AI. Truth be told, I doubt humanity will figure it out before going extinct. The whole idea might be impossible or confused. But I'll tell you this: I doubt the problem will be solved by getting smart people to sit in silence and think real hard about decision theory and metaethics. If the problem can be solved, it will be solved by dozens or hundreds of people hacking away at the tractable edges of Friendly AI subproblems, drawing novel connections, inching toward new insights, drawing from others' knowledge and intuitions, and doing lots of tedious, boring work.
Here's what happened when I encountered the problem of Friendly AI and decided that, for the time being, I should do research on the problem rather than, say, trying to start a few businesses and donating money. I realized that I didn't see a clear path toward solving the problem, but I did see tons of apparently relevant research that could be done around the edges of the problem, especially with regard to friendliness content (because metaethics is my background). Snippets of my thinking process look like this:
Friendliness content is about human values. Who studies human values, besides philosophers? Economists and neuroscientists. Let's look at what they know. Wow, neuroeconomics is far more advanced than I had realized, and almost none of it has been mentioned by anybody researching Friendly AI! Let me hack away at that for a bit, and see if anything turns up.
Some people approach metaethics/CEV with the idea that humans share a concept of 'ought', and that figuring out what that concept is will help us figure out what human values are. Is that the right way to think about it? Let me see if there's research on what concepts are, how much they're shared between human brains, etc. Ah, there is! I'll hack away at this next.
CEV involves the modeling of human preferences. Who studies that? Economists do it in choice modeling, and AI programmers do it in preference elicitation. They even have models for dealing with conflicting desires. Let me find out what they know...
CEV also involves preference extrapolation. Who has studied that? Only philosophers, unfortunately, but maybe they've found something. They call such approaches "ideal preference" or "full information" accounts of value. I can check into that.
You get the idea.
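To give a concrete taste of the preference-elicitation angle mentioned above: at its simplest, eliciting preferences means reconstructing a ranking from observed choices, and "conflicting desires" show up as cycles in those choices. Here is a minimal sketch; all names are invented for illustration, and real choice-modeling and preference-elicitation methods are statistical and far richer than this.

```python
def elicit_ranking(options, choices):
    """Infer a ranking over options from observed pairwise choices.

    `choices` is a list of (winner, loser) pairs. Returns the options
    sorted from most to least preferred, or raises ValueError if the
    revealed preferences conflict (i.e., the choices contain a cycle).
    """
    beats = {o: set() for o in options}
    for winner, loser in choices:
        beats[winner].add(loser)

    ranking = []
    remaining = set(options)
    while remaining:
        # An option is "undominated" if nothing still remaining beats it.
        undominated = [o for o in remaining
                       if not any(o in beats[p] for p in remaining if p != o)]
        if not undominated:
            raise ValueError("conflicting desires: choices contain a cycle")
        top = undominated[0]
        ranking.append(top)
        remaining.remove(top)
    return ranking
```

For example, the choices `[("coffee", "tea"), ("tea", "water"), ("coffee", "water")]` yield the ranking `["coffee", "tea", "water"]`, while a cyclic set of choices raises an error instead of silently producing a ranking.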
This isn't the only way to solve hard problems, but when problems are sufficiently hard, then hacking away at their edges may be just about all you can do. And as you do, you start to see where the problem is more and less tractable. Your intuitions about how to solve the problem become more and more informed by regular encounters with it from all angles. You learn things from one domain that end up helping in a different domain. And, inch by inch, you make progress.
Of course you want to be strategic about how you tackle the problem. But you also don't want to end up thinking in circles because the problem is too hard to even think strategically about.
You also shouldn't do 3 months of thinking and never write any of it down because you know what you've thought isn't quite right. Hacking away at a tough problem involves lots of wrong solutions, wrong proposals, wrong intuitions, and wrong framings. Maybe somebody will know how to fix what you got wrong, or maybe your misguided intuitions will connect to something they know and you don't and spark a useful thought in their head.
Okay, that's all. Sorry for the rambling!
The book list is somewhat obsolete (the list of LW posts is not), but I'm not ready to make the next iteration. The state of decision theory hasn't changed much since then.
Roughly, the central mystery seems to be the idea of acausal control. It feels like it might even be useful for inferring friendliness content, along the lines of what I described here. But we don't understand the idea. It first appeared more or less explicitly in UDT, with its magical mathematical intuition module, and became more concrete in ADT, where proofs are used instead (at the cost of making it useless wherever complete proofs can't be expected, which is almost always the case outside of very simple thought experiments).
The problem is this: given an action-definition and a utility-definition, an agent can find a function between their sets of possible values and use it as a "utility function". But other "utility functions" are correct as well; the agent just isn't capable of finding them, and somehow that is a good thing: it's why the method works (see this post). What makes some of these functions "better" than others? Can we generalize this to inferring dependencies between facts other than action and utility-value? What particular properties of agents constructed in one of the standard ways allow them to be controlled by some dependencies but not others? What kinds of "facts" are relevant? What constitutes a "fact"? (In ADT, a "fact" is an axiomatic definition of a structure, one that refers to some particular class of structures and not to others; decision theory then considers ways in which some of these "facts" can control other "facts", that is, make the structures defined by certain definitions be a certain way, given control over other structures that contain the agent's action.)
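To make the "find a function from action-values to utility-values and act on it" move concrete, here is a drastically simplified sketch. It is not the real ADT formalism: where ADT searches for proofs of implications like "action = a implies utility = u", this toy checks each implication by simulating a known world program, and it does not capture the spurious-dependency subtlety raised above. All names are invented for illustration.

```python
def adt_style_choice(actions, utility_values, world):
    """Toy sketch of an ADT-style agent.

    The agent does not evaluate utility(action) directly. Instead it
    searches for a dependency: a mapping from each possible action to a
    utility value u such that "action = a implies utility = u" holds.
    In real ADT that search is proof search in a formal theory; here we
    cheat and check each candidate implication by running the world.
    """
    dependency = {}
    for a in actions:
        for u in utility_values:
            if world(a) == u:  # stand-in for proving "a implies u"
                dependency[a] = u
                break
    # Act on the dependency found: take the action whose "proved"
    # utility value is highest.
    return max(dependency, key=dependency.get)
```

For instance, with actions `["cooperate", "defect"]` and a world program that assigns them utilities 2 and 1 respectively, the agent settles on `"cooperate"`. The interesting questions in the paragraph above begin exactly where this sketch stops: when proofs, not simulations, pick out one dependency among many "correct" ones.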
It feels like mathematics is the right discipline for clarifying questions like this (and it's perhaps not useful to prioritize particular areas of it, though some emphasis on foundations seems right). An important milestone would be a useful statement of this problem of clarifying acausal dependence, one that can be communicated at least to the mathematicians on LW.