nhamann comments on Best career models for doing research? - Less Wrong

27 Post author: Kaj_Sotala 07 December 2010 04:25PM


Comment author: jsteinhardt 08 December 2010 02:24:21AM * 34 points

I believe that most people hoping to do independent academic research vastly underestimate both the amount of prior work done in their field of interest and the advantages of working with other very smart and knowledgeable people. Note that it isn't just about working with other people, but with other very smart people. That is, there is a difference between "working at a university / research institute" and "working at a top university / research institute". (For instance, if you want to do AI research in the U.S., you probably want to be at MIT, Princeton, Carnegie Mellon, Stanford, Caltech, or UC Berkeley. I don't know about other countries.)

Unfortunately, my general impression is that most people on LessWrong are largely unaware of the progress made in statistical machine learning (presumably the brand of AI that most LWers care about) and cognitive science in the last 20 years (I mention these two fields because I assume they are the most popular on LW, and also because I know the most about them). And I'm not talking about impressive-looking results that dodge around the real issues; I'm talking about fundamental progress towards resolving the key problems in artificial intelligence. Anyone planning to do AI research should probably at least understand this progress first, and what the remaining obstacles are.

You aren't going to understand this without doing a lot of reading, and by the time you've done that reading, you'll probably have identified a research group whose work clearly reflects your personal research goals. At this point it seems like the obvious next step is to apply to work with that group as a graduate student / post-doc. This circumvents the problem of having to work on research you aren't interested in. As for other annoyances, while teaching can potentially be a time-sink, the rest of the "wasted" time seems to be about publishing your work; I really find it hard to justify not publishing your work, since (a) other people need to know about it, and (b) writing up your results formally often leads to a noticeably deeper understanding than otherwise. Of course, you can waste time trying to make your results look better than they are, but this certainly isn't a requirement and has obvious ethical issues.

EDIT: There is the eventual problem that senior professors spend more and more of their time on administrative work / providing guidance to their lab, rather than doing research themselves. But this isn't going to be an issue until you get tenure, which is, if you do a post-doc, something like 10-15 years out from starting graduate school.

Comment author: nhamann 10 December 2010 11:27:35AM 6 points

... the progress made in statistical machine learning (presumably the brand of AI that most LWers care about) and cognitive science in the last 20 years... And I'm not talking about impressive-looking results that dodge around the real issues, I'm talking about fundamental progress towards resolving the key problems in artificial intelligence.

Could you point me towards some articles here? I fully admit I'm unaware of most of this progress, and would like to learn more.

Comment author: jsteinhardt 11 December 2010 03:56:43AM * 11 points

A good overview would fill up a post on its own, but some relevant topics are given below. I don't think any of it is behind a paywall, but if it is, let me know and I'll link to another article on the same topic. In cases where I learned about the topic by word of mouth, I haven't necessarily read the provided paper, so I can't guarantee the quality for all of these. I generally tried to pick papers that either gave a survey of progress or solved a specific clearly interesting problem. As a result you might have to do some additional reading to understand some of the articles, but hopefully this is a good start until I get something more organized up.

Learning:

Online concept learning: rational rules for concept learning [a somewhat idealized situation but a good taste of the sorts of techniques being applied]

Learning categories: Bernoulli mixture model for document classification, spatial pyramid matching for images

Learning category hierarchies: nested Chinese restaurant process, hierarchical beta process

Learning HMMs (hidden Markov models): HDP-HMMs. This is pretty new, so the details haven't been hammered out, but the article should give you a taste of how people are approaching the problem. I haven't read this article myself, and I forget where I first read about HDP-HMMs, although another paper on HDPs is this one. I think the original article I read was one of Erik Sudderth's, which are here. An older algorithm is the Baum-Welch algorithm.

Learning image characteristics: deep Boltzmann machines

Handwriting recognition: hierarchical Bayesian approach, basically the same as the previous research

Learning graphical models: a survey paper


Planning:

Planning in MDPs: value iteration, plus LQR trees for many physical systems

Planning in POMDPs: I don't actually know much about this; my impression is that we need to do more work in this area, but approaches include reinforcement learning. A couple of interesting papers: a Bayes risk approach, plus a survey of hierarchical methods
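
For readers who have not seen the value iteration mentioned above under planning in MDPs, here is a minimal sketch of the standard textbook algorithm. It is not taken from any of the linked papers. The two-state MDP (`toy_mdp`), its rewards, the discount factor, and the helper names are all made up for illustration.

```python
from typing import Dict, List, Tuple

# transitions[state][action] is a list of (probability, next_state, reward).
# A state with no actions is treated as terminal.
Outcome = Tuple[float, str, float]
Transitions = Dict[str, Dict[str, List[Outcome]]]


def q_value(outcomes: List[Outcome], values: Dict[str, float], gamma: float) -> float:
    """Expected one-step reward plus discounted value of the successor state."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(
    transitions: Transitions,
    gamma: float = 0.9,
    tol: float = 1e-8,
    max_iters: int = 10_000,
) -> Tuple[Dict[str, float], Dict[str, str]]:
    """Repeatedly apply the Bellman optimality update until values stop changing."""
    values = {s: 0.0 for s in transitions}
    for _ in range(max_iters):
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # terminal state keeps value 0
                continue
            best = max(q_value(outs, values, gamma) for outs in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best  # in-place update
        if delta < tol:
            break

    # Read off a greedy policy from the converged values.
    policy = {}
    for s, actions in transitions.items():
        if actions:
            policy[s] = max(actions, key=lambda a: q_value(actions[a], values, gamma))
    return values, policy


# Hypothetical toy problem: from "start", "safe" reaches the goal for reward 1,
# while "risky" reaches it half the time for reward 3 and otherwise stays put.
toy_mdp: Transitions = {
    "start": {
        "safe": [(1.0, "goal", 1.0)],
        "risky": [(0.5, "goal", 3.0), (0.5, "start", 0.0)],
    },
    "goal": {},
}

values, policy = value_iteration(toy_mdp)
print(values, policy)
```

This only works when the state space is small enough to enumerate, which is part of why methods like LQR trees are needed for continuous physical systems.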