In How I do research, TurnTrout writes:

[I] Stare at the problem on my own, ignoring any existing thinking as much as possible. Just think about what the problem is, what's confusing about it, what a solution would look like. In retrospect, this has helped me avoid anchoring myself. Also, my prior for existing work is that it's confused and unhelpful, and I can do better by just thinking hard.

The MIRI alignment research field guide has a similar sentiment:

It’s easy to fall into a trap of (either implicitly or explicitly) conceptualizing “research” as “first studying and learning what’s already been figured out, and then attempting to push the boundaries and contribute new content.”

The problem with this frame (according to us) is that it leads people to optimize for absorbing information, rather than seeking it instrumentally, as a precursor to understanding. (Be mindful of what you’re optimizing in your research!) [...]

... we recommend throwing out the whole question of authority. Just follow the threads that feel alive and interesting. Don’t think of research as “study, then contribute.” Focus on your own understanding, and let the questions themselves determine how often you need to go back and read papers or study proofs.

Approaching research with that attitude makes the question “How can meaningful research be done in an afternoon?” dissolve. Meaningful progress seems very difficult if you try to measure yourself by objective external metrics. It is much easier when your own taste drives you forward.

And I'm pretty sure that I have also seen this notion endorsed elsewhere on LW: do your own thinking, don't anchor on the existing thinking too much, don't worry too much about justifying yourself to established authority. It seems like a pretty big theme among rationalists in general.

At the same time, it feels like there are fields where nobody would advise this, or where trying to do this is a well-known failure mode. TurnTrout's post continues:

I think this is pretty reasonable for a field as young as AI alignment, but I wouldn't expect this to be true at all for e.g. physics or abstract algebra. I also think this is likely to be true in any field where philosophy is required, where you need to find the right formalisms instead of working from axioms.

It is not particularly recommended that people try to invent their own math instead of studying existing math. Trying to invent your own physics without studying real physics just makes you into a physics crank, and most fields seem to have some version of "this is an intuitive assumption that amateurs tend to believe, but is in fact wrong, though the reasons are sufficiently counterintuitive that you probably won't figure it out on your own".

But "do this in young fields, not established ones" doesn't seem quite right either. For one, philosophy is an old field, yet it seems reasonable that we should indeed sometimes do it there. And it seems that even within established fields where you normally should just shut up and study, there will be particular open questions or subfields where "forget about all the existing work and think about it on your own" ought to be good advice.

But how does one know when that is the case?

8 Answers

Shmi

My field is theoretical physics, so this is where my views come from. (Disclaimer: I have not had a research position since finishing my PhD in General Relativity some 10 years ago.) Assuming you want to do original research, and you are not a genius like Feynman (in which case you would not be interested in my views anyway; what do you care what other people think?):

  • Map the landscape first: what is known, which areas of research are active, which are inactive. No need to go super deep; just get a feel for what is where.
  • Gain a basic understanding of why the landscape is the way it is. Why are certain areas being worked on? Is it fashion, ease of progress, tradition, something else? Why are other areas ignored or stagnant? Are they too hard, too boring, unlikely to get you a research position, just overlooked, or something else?
  • Find a promising area which is not well researched, does not appear super hard, and yet which you find interesting. An interdisciplinary outlook can be useful here.
  • Figure out what you are missing to make a meaningful original contribution there. Evaluate what it would take to learn the prerequisites. Alternate between learning and trying to push the original research forward.
  • Most likely you will gain unexpected insights, not into the problem you are trying to solve, but into the reason why it's not being actively worked on. Go back and reevaluate whether the area is still promising and interesting. Odds are, your new perspective will lead you to get excited about something related but different.
  • Repeat until you are sure that you have learned something no one else has: a question no one asked, a model no one constructed or applied in this case, or maybe a map from a completely unrelated area.
  • Do a thorough literature search on the topic. Odds are, you will find that someone else tried it already. Reevaluate. Iterate.
  • Eventually you might find something where you can make a useful original contribution, no matter how small. Or you might not. Still, you will likely end up knowing more and having a valuable perspective and a skill set.

Physics examples: don't go into QFT, string theory or loop quantum gravity. No way you can do better than, say, Witten and Maldacena and thousands of theorists with IQ 150+ and the energy and determination of a raging rhino. Quantum foundations might still have some low-hanging fruit, but the odds are against it. No idea about condensed matter research. A positive example: numerical relativity hit a sweet spot about 15 years ago, because the compute and the algorithms converged, and there were only a few groups doing it. Odds are something similar is possible again; you just need to find where.

Also, Kaj, your research into multi-agent models of the mind, for example, might yield something really exciting and new, if looked at in the right way, whatever that is.

Rohin Shah

I basically disagree with the recommendation almost always, including for AI alignment. I do think that

The problem [...] is that it leads people to optimize for absorbing information, rather than seeking it instrumentally, as a precursor to understanding.

I often see the sentiment, "I'm going to learn linear algebra, probability theory, computational complexity, machine learning and deep RL, and then I'll have the prerequisites to do AI safety". (Possible reasons for this: the 80K AI safety syllabus, CHAI's bibliography, a general sense that you have to be an expert before you can do research.) This sentiment seems wrong to me; you definitely can and should think about important questions before learning everything that could potentially be considered "background".

The advice

let the questions themselves determine how often you need to go back and read papers or study proofs.

sounds to me like "when you feel like existing research would be useful, then go ahead and look at it, but don't feel like it's necessary", whereas I would say "as soon as you have questions, which should be almost immediately, one of the first things you should do is find the existing research and read it". The justification for this is the standard one -- people have already done a bunch of work that you can take advantage of.

The main disadvantage of this approach is that you lose the opportunity to figure things out from first principles. When you figure things out from first principles, you often find many branches that don't work out, which helps build intuitions about why things are the way they are; you don't get that nearly as well by reading about research, and once you already know the answer, you can't go back and figure it out from first principles. But this first-principles reasoning is extremely expensive (in time), and is almost never worthwhile.

Another potential disadvantage is that you might be incorrectly convinced that a technique is good, because you don't spot the flaws in it when reading existing research, even though you could have figured it out from first principles. My preferred solution is to become good at noticing flaws (e.g. by learning how to identify and question all of the assumptions in an argument), rather than to ignore research entirely.

Side note: In the case of philosophy, if you're trying to get a paper published, then I'm told you often want to make some novel argument (since that's what gets published), which makes existing research less useful (or only useful for figuring out what not to think about). If you want to figure out the truth, I expect you would do well to read existing research.

TL;DR: Looking at existing research is great because you don't have to reinvent the wheel, but make sure you need the wheel in the first place before you read about it (i.e. make sure you have a question you are reading existing research to answer).

ETA: If your goal is "maximize understanding of X", then you should never look at existing research about X, and figure everything out from first principles. I'm assuming that you have some reason for caring about X that means you are willing to trade off some understanding for getting it done way faster.

I often see the sentiment, "I'm going to learn linear algebra, probability theory, computational complexity, machine learning and deep RL, and then I'll have the prerequisites to do AI safety". (Possible reasons for this: the 80K AI safety syllabus, CHAI's bibliography, a general sense that you have to be an expert before you can do research.) This sentiment seems wrong to me

See also my shortform post about this.

Rohin Shah
+1, I agree with the "be lazy in the CS sense" prescription; that's basically what I'm recommending here.

I often see the sentiment, "I'm going to learn linear algebra, probability theory, computational complexity, machine learning and deep RL, and then I'll have the prerequisites to do AI safety".

Yeah, that feels like a natural extension of "I'm not allowed to have thoughts on this yet, so let me get enough social markers to be allowed to think for myself." Or "...to be allowed a thinking license."

Vanessa Kosoy

IMO the correct rule is almost always: first think about the problem yourself, then go read everything about it that other people did, and then do a synthesis of everything you learned inside your mind. Some nuances:

  • Sometimes thinking about the problem yourself is not useful because you don't have all the information to start. For example: you don't understand even the formulation of the problem, or you don't understand why it is a sensible question to ask, or the solution has to rely on empirical data which you do not have.

  • Sometimes you can so definitively solve the problem during the first step (unprimed thinking) that the rest is redundant. Usually this is only applicable if there are very clear criteria to judge the solution, for example: mathematical proof (but, beware of believing you easily proved something which is widely considered a difficult open problem) or something easily testable (for instance, by writing some code).

  • As John S. Wentworth observed, even if the problem was already definitively solved by others, thinking about it yourself first will often help you learn the state of the art later, and is a good exercise for your mind regardless.

  • The time you should invest in the first step depends on (i) how quickly you realistically expect to make progress and (ii) how much progress you expect other people to have made by now. If this is an open problem on which many talented people have worked for a long time, then expecting to make fast progress yourself is unrealistic, unless you have some knowledge to which most of those people had no access, or your talent in this domain is truly singular. In this case you should think about the problem enough to understand why it is so hard, but usually not much longer. If this is a problem on which only a few people have worked, or only for a short time, or which is obscure enough that you doubt it got the attention of talented researchers, then making comparatively fast progress can be realistic. Still, I recommend proceeding to the second step (learning what other people did) once you reach the point where you feel stuck, in the "metacognitive" sense that you don't believe you will get unstuck soon (beware of giving up too easily).

After the third step (synthesis), I also recommend doing some retrospective: what have those other researchers understood that I didn't, how did they understand it, and how can I replicate it myself in the future.

[anonymous]

Trying to invent your own physics without studying real physics just makes you into a physics crank

This is demonstrably not (always) the case. Famously, Richard Feynman recommends that students always derive physics and math from scratch when learning. In fact his Nobel prize was for a technique (Feynman diagrams) which he developed on the fly in a lecture he was attending. What the speaker was saying didn’t make sense to him so he developed what he thought was the same theory using his own notation. Turns out what he made was more powerful for certain problems, but he only realized that much later when his colleagues questioned what he was doing on the whiteboard. (Pulled from memory from one of Feynman’s memoirs.)

One of the other comments here recommends against this unless you are a Feynman-level genius, but I think the causality is backwards on this. Feynman’s gift was traditional rationality, something which comes through very clearly in his writing. He tells these anecdotes in order to teach people how to think, and IMHO his thoughts on thinking are worth paying attention to.

Personally, I always try to make sure I can re-derive what I learn from first principles or from the evidence. Only when I'm having particular trouble, or when I have the extra time, do I try to work it out from scratch in order to learn it. But when I do, I come away with a far deeper understanding.

[anonymous]

I like this answer, but do question the point about Feynman's gift being mainly traditional rationality.

One of the other comments here recommends against this unless you are a Feynman-level genius, but I think the causality is backwards on this. Feynman’s gift was traditional rationality, something which comes through very clearly in his writing. He tells these anecdotes in order to teach people how to think, and IMHO his thoughts on thinking are worth paying attention to.

I agree that Feynman portrays it that way in his memoirs, but accounts from other [...]

This is demonstrably not (always) the case. Famously, Richard Feynman recommends that students always derive physics and math from scratch when learning.

What Feynman recommended was to learn a topic, then put the book aside and see if you can rederive what you have supposedly learned on your own. This has little to do with the thesis you had quoted. I can take a 1000:1 bet that anything a person who has not studied "real physics" can propose as their own physics will be at best a duplication of long-ago models and most likely just straight up [...]

[anonymous]
I think we may be talking past each other. You say

What Feynman recommended was to learn a topic, then put the book aside and see if you can rederive what you have supposedly learned on your own.

That's what I meant when I said "Feynman recommends that students always derive physics and math from scratch when learning." You know the context. You know the evidence. You know, in the form of propositional statements, what the answer is. So make an attempt at deriving it yourself from the evidence, not the other way around.

In doing so, I often find that I didn't really understand the original theory. What is built up in the from-scratch derivation is an intuitive understanding that is far more useful than the "book knowledge" you get from traditional learning. So, I would say, you never really learned it in the first place. But now we're debating word definitions.

The other thing that you develop is experience deriving things "from scratch," with just a couple of hints as necessary along the way, which sets you up for doing innovative research once you hit the frontier of knowledge. Otherwise you fall victim to hindsight bias: all those theorems you read in books seem so obvious, yet discovering something new seems so hard. In reality, there is a skill to research that you only pick up by doing, and not practicing that skill now, when the answers can be looked up if you get stuck, is a lost opportunity.

Gordon Seidoh Worley

I suspect it's mostly proportional to the answer to the question "how much progress can you expect to make building on the previous work of others?" in a particular field. This is why (for example) philosophy is weird (you can make a lot of progress without paying attention to what previous folks have said), physics and math benefit from study (you can do a lot more cool stuff if you know what others know), and AI safety may benefit from original thinking (there's not much worth building off of (yet)).

Jan Kulveit

I basically agree with Vanessa:

the correct rule is almost always: first think about the problem yourself, then go read everything about it that other people did, and then do a synthesis of everything you learned inside your mind.

Thinking about the problem myself first often helps me understand existing work, since it makes the motivations easier to see, and solving already-solved problems is good training.

I would argue this is the case even in physics and math. (My background is in theoretical physics, and during my high-school years I took some pride in not remembering physics and re-deriving everything when needed. That stopped being a good approach for physics from roughly 1940 onward, and it somewhat backfired.)

The mistake members of "this community" (LW/rationality/AI safety) sometimes make is skipping the second step, or bouncing off it if it turns out to be actually hard.

The second mistake is not doing the third step properly, which leads to a somewhat strange and insular culture that can be off-putting to external experts (e.g. people partially crediting themselves for discoveries which are already known to outsiders).

kithpendragon

I think one important context for not reading the existing literature first is calibration. Examining the difference between how you are thinking about a question and how others have thought about the same question can be instructive in a couple of ways. You might have found a novel approach that is worth exploring, or you might be way off in your thinking. Perhaps you've stumbled upon an obsolete way of thinking about something. Figuring out how your own thinking process lines up with the field can teach you a great deal, and is super useful if you want your eventual original work to be meaningful. At the very least, you can identify your own common failure modes and work to avoid them.

The fastest and easiest way to accomplish all this is by using a sort of research loop where you collect your own thoughts and questions, then compare them with the literature and try to reconcile the two, then repeat. If you just read all the literature first, you have no way to calibrate your explorations when you finally get there.

FactorialCode

I think this is mainly a function of how established the field is and how much time you're willing to spend on the subject. The point of thinking about a field before looking at the literature is to avoid getting stuck in the same local optima as everyone else. However, making progress by yourself is far slower than just reading what everyone has already figured out.

Thus, if you don't plan to spend a large amount of time in a field, it's far quicker and more effective to just read the literature. However, if you're going to spend a large amount of time on the problems in the field, then you want to be able to "see with fresh eyes" before looking at what everyone else is doing. This prevents everyone's approaches from clustering together.

Likewise, in a very well established field like math or physics, we can expect everyone to have already clustered around the "correct answer". It doesn't make as much sense to try and look at the problem from a new perspective, because we already have a very good understanding of the field. This reasoning breaks down once you get to the unsolved problems in the field. In that case, you want to do your own thinking to make sure you don't immediately bias your thinking towards solutions that others are already working on.

4 comments

Possibly tangential, but I have found that the "try it yourself before studying" method is a very effective way to learn about a problem/field. It also lends a gut-level insight which can be useful for original research later on, even if the original attempt doesn't yield anything useful.

One example: my freshman year of college, I basically spent the whole month of winter break banging my head against 3-SAT, trying to find an efficient algorithm to solve it and also just generally playing with the problem. I knew it was NP-complete, but hadn't studied related topics in any significant depth. Obviously I did not find any efficient algorithm, but that month was probably the most valuable-per-unit-time I've spent in terms of understanding complexity theory. Afterwards, when I properly studied the original NP-completeness proof for 3-SAT, reduction proofs, the polynomial hierarchy, etc., it was filled with moments of "oh yeah, I played with something like this, that's a clever way to apply it".
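(A minimal illustration of the kind of "playing with the problem" described above, not anything from the original comment: a brute-force 3-SAT search in Python over a made-up instance. Its exponential blow-up is exactly what an efficient algorithm would have to avoid.)

```python
from itertools import product

# A hypothetical 3-SAT instance, just for illustration: each clause has three
# literals, where a positive integer i means variable x_i and a negative
# integer -i means "not x_i". This tiny formula is made up, not from the post.
clauses = [(1, 2, -3), (-1, 3, 4), (2, -4, 3), (-2, -3, -4)]

def brute_force_3sat(clauses):
    """Exhaustively try every assignment; exponential in the number of variables.

    This is the naive baseline one quickly runs into when playing with 3-SAT:
    it works for toy instances but blows up as 2^n, which is exactly the cost
    an efficient algorithm would need to avoid.
    """
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for bits in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        # A clause is satisfied if any of its literals matches the assignment.
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return assignment  # found a satisfying assignment
    return None  # unsatisfiable

print(brute_force_3sat(clauses))
```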

Better example: I've spent a huge amount of time building models of financial markets over the years. At one point I noticed some structures had shown up in one model which looked an awful lot like utility functions, so I finally got around to properly studying Arrow & Debreu-style equilibrium models. Sure enough, I had derived most of it already. I even had some pieces which weren't in the textbooks (pieces especially useful for financial markets). That also naturally led to reading up on more advanced economic theory (e.g. recursive macro), which I doubt I would have understood nearly as well if I hadn't been running into the same ideas in the wild already.
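(For readers unfamiliar with the reference, a sketch of the kind of structure meant here, standard textbook material rather than anything from the models described above: in an Arrow-Debreu economy, an agent with subjective probabilities for each state who chooses state-contingent consumption under a budget constraint satisfies a first-order condition tying state prices to marginal utilities.)

```latex
% Agent's problem in an Arrow-Debreu economy (textbook sketch, notation assumed):
%   maximize    \sum_s \pi_s \, u(c_s)
%   subject to  \sum_s p_s \, c_s \le w
% With Lagrange multiplier \lambda, the first-order condition is
\pi_s \, u'(c_s) = \lambda \, p_s \qquad \text{for every state } s.
% State prices are proportional to probability-weighted marginal utilities,
% which is the sort of "utility-function-shaped" object that can emerge from
% a market model even before one has read the equilibrium literature.
```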


Since you mention physics, it's worth noting Feynman was a big proponent of this for physics, and seemed to have multiple reasons for it.

A big proponent of people studying the existing material, or doing their own experiments?

It is not particularly recommended that people try to invent their own math instead of studying existing math.

It might be useful for people to start by figuring out a) what math they want to study or b) what problems they want to solve/what tools they could use. "Learn all of math" is a daunting task. (Though perhaps more useful than "Learn every programming language".)

Trying to invent your own physics without studying real physics just makes you into a physics crank,

(I'm curious about the probabilities here.)

It'd be slow going, though I wouldn't say failure would be guaranteed. (If I went to the Leaning Tower of Pisa, dropped a piece of paper and a rock, and had someone else on the ground time* when (and where) they hit the ground, I might conclude that a rock dropped from the Leaning Tower of Pisa hits the ground sooner than a piece of paper does.)

*Making videos would be ideal, actually. People put a lot of stock in writing things down, but if the whole process is also filmed (and made available live), that could be better than pre-registration** and would allay concerns regarding data tampering.

**absent concerns regarding censoring via the platform