nhamann comments on Best career models for doing research? - Less Wrong
I believe that most people hoping to do independent academic research vastly underestimate both the amount of prior work done in their field of interest and the advantages of working with other very smart and knowledgeable people. Note that it isn't just about working with other people, but with other very smart people. That is, there is a difference between "working at a university / research institute" and "working at a top university / research institute". (For instance, if you want to do AI research in the U.S., you probably want to be at MIT, Princeton, Carnegie Mellon, Stanford, Caltech, or UC Berkeley. I don't know about other countries.)
Unfortunately, my general impression is that most people on LessWrong are unaware of the progress made in statistical machine learning (presumably the brand of AI that most LWers care about) and cognitive science in the last 20 years (I mention these two fields because I assume they are the most popular on LW, and also because I know the most about them). And I'm not talking about impressive-looking results that dodge around the real issues; I'm talking about fundamental progress towards resolving the key problems in artificial intelligence. Anyone planning to do AI research should probably at least understand this progress first, along with what the remaining obstacles are.
You aren't going to understand this without doing a lot of reading, and by the time you've done that reading, you'll probably have identified a research group whose work clearly reflects your personal research goals. At this point it seems like the obvious next step is to apply to work with that group as a graduate student / post-doc. This circumvents the problem of having to work on research you aren't interested in. As for other annoyances, while teaching can potentially be a time-sink, the rest of the "wasted" time seems to be about publishing your work. I really find it hard to justify not publishing your work, since (a) other people need to know about it, and (b) writing up your results formally often leads to a noticeably deeper understanding than you would otherwise have. Of course, you can waste time trying to make your results look better than they are, but this certainly isn't a requirement and has obvious ethical issues.
EDIT: There is the eventual problem that senior professors spend more and more of their time on administrative work / providing guidance to their lab, rather than doing research themselves. But this isn't going to be an issue until you get tenure, which is, if you do a post-doc, something like 10-15 years out from starting graduate school.
Could you point me towards some articles here? I fully admit I'm unaware of most of this progress, and would like to learn more.
A good overview would fill up a post on its own, but some relevant topics are given below. I don't think any of it is behind a paywall, but if it is, let me know and I'll link to another article on the same topic. In cases where I learned about the topic by word of mouth, I haven't necessarily read the provided paper, so I can't guarantee the quality for all of these. I generally tried to pick papers that either gave a survey of progress or solved a specific clearly interesting problem. As a result you might have to do some additional reading to understand some of the articles, but hopefully this is a good start until I get something more organized up.
Learning:
Online concept learning: rational rules for concept learning [a somewhat idealized situation but a good taste of the sorts of techniques being applied]
Learning categories: Bernoulli mixture model for document classification, spatial pyramid matching for images
Learning category hierarchies: nested Chinese restaurant process, hierarchical beta process
Learning HMMs (hidden Markov models): HDP-HMMs. This is pretty new, so the details haven't been hammered out, but the article should give you a taste of how people are approaching the problem. I haven't read this article myself, and I forget where I first read about HDP-HMMs; another paper on HDPs is this one. I think the original article I read was one of Erik Sudderth's, which are here. An older algorithm for this problem is the Baum-Welch algorithm.
Learning image characteristics: deep Boltzmann machines
Handwriting recognition: hierarchical Bayesian approach, basically the same as the previous research
Learning graphical models: a survey paper
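For readers who haven't met the Chinese restaurant process named above, here is a minimal Python sketch of the plain (non-nested) version, which is the building block of the nested variant. It is not taken from any of the linked papers: the function name `sample_crp`, the concentration parameter name `alpha`, and the seed are illustrative choices. Each new customer joins an existing table with probability proportional to its size, or opens a new table with probability proportional to `alpha`.

```python
import random


def sample_crp(n_customers, alpha, rng):
    """Seat customers one at a time under a Chinese restaurant process.

    Returns (assignments, table_sizes): the table index chosen by each
    customer, and the final number of customers at each table.
    alpha > 0 is the concentration parameter (hypothetical name).
    """
    table_sizes = []
    assignments = []
    for _ in range(n_customers):
        # Existing table k has weight table_sizes[k]; a new table has weight alpha.
        weights = table_sizes + [alpha]
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(table_sizes):
            table_sizes.append(1)      # open a new table
        else:
            table_sizes[choice] += 1   # join an existing table
        assignments.append(choice)
    return assignments, table_sizes


rng = random.Random(0)  # fixed seed only so the run is repeatable
assignments, table_sizes = sample_crp(100, alpha=1.0, rng=rng)
print(len(table_sizes), "tables for", len(assignments), "customers")
```

Larger values of `alpha` tend to produce more tables. Roughly speaking, the nested version repeats this choice at each level of a tree, which is how it gives a prior over category hierarchies.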
Planning:
Planning in MDPs: value iteration, plus LQR trees for many physical systems
Planning in POMDPs: I don't actually know much about this; my impression is that we need to do more work in this area, but approaches include reinforcement learning. A couple of interesting papers: Bayes risk approach, plus a survey of hierarchical methods
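To make the value iteration mentioned above for MDP planning concrete, here is a minimal Python sketch on a toy two-state problem. The toy MDP, the function name `value_iteration`, and the `transitions` / `rewards` layout are all made up for illustration and are not from the linked material. The loop repeatedly applies the Bellman optimality backup until the values stop changing, then reads off a greedy policy.

```python
def value_iteration(transitions, rewards, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Value iteration for a small finite MDP.

    transitions[s][a] is a list of (probability, next_state) pairs and
    rewards[s][a] is the immediate reward for action a in state s.
    Returns (values, policy).
    """
    n_states = len(transitions)

    def q_values(s, values):
        # One-step lookahead: immediate reward plus discounted expected value.
        return [
            rewards[s][a] + gamma * sum(p * values[s2] for p, s2 in transitions[s][a])
            for a in range(len(transitions[s]))
        ]

    values = [0.0] * n_states
    for _ in range(max_iters):
        new_values = [max(q_values(s, values)) for s in range(n_states)]
        delta = max(abs(new - old) for new, old in zip(new_values, values))
        values = new_values
        if delta < tol:
            break

    # Greedy policy with respect to the converged values.
    policy = []
    for s in range(n_states):
        q = q_values(s, values)
        policy.append(q.index(max(q)))
    return values, policy


# Toy MDP (hypothetical): staying in state 1 pays reward 1; everything else pays 0.
# Action 0 = stay, action 1 = move to the other state; all moves are deterministic.
transitions = [
    [[(1.0, 0)], [(1.0, 1)]],  # state 0
    [[(1.0, 1)], [(1.0, 0)]],  # state 1
]
rewards = [
    [0.0, 0.0],  # state 0
    [1.0, 0.0],  # state 1
]
values, policy = value_iteration(transitions, rewards, gamma=0.5)
print(values, policy)
```

With a discount of 0.5, the value of state 1 converges to 1 / (1 - 0.5) = 2 and the value of state 0 to 0.5 * 2 = 1, so the greedy policy moves from state 0 to state 1 and then stays there. This tabular approach only scales to small state spaces, which is part of why methods like LQR trees matter for physical systems.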