
Author of Learning Deep Learning here.

Solving a practical problem requires much more well-rounded skills than mastering one machine learning algorithm or another (in fact, some problems don't require ML at all).

For a more general introduction to data science, see http://p.migdal.pl/2016/03/15/data-science-intro-for-math-phys-background.html. So yes: discussing things with the client, getting data, cleaning data, realising it is not enough, asking the client if they have more/different data, exploring it, seeing that some of it is rubbish, semi-manually cleaning it, creating a model, seeing that it's OK, discovering that it fitted to some artefact, ... (and dozens and dozens of further steps).
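To make the "explore, find the rubbish, clean, go back to the client" part of that loop concrete, here is a hypothetical sketch of one such pass in pandas (the file name and columns are invented for illustration):

```python
# Hypothetical sketch of one exploration/cleaning pass; file and column names are invented.
import pandas as pd

df = pd.read_csv("client_export.csv")  # whatever the client handed over

# Explore: how much of this is usable at all?
print(df.shape)
print(df.dtypes)
print(df.isna().mean().sort_values(ascending=False))  # fraction of missing values per column

# Clean: drop nearly-empty columns, fix the obvious rubbish.
df = df.dropna(axis="columns", thresh=int(0.5 * len(df)))  # keep columns that are >=50% filled
df = df[df["amount"] >= 0]                                  # negative amounts turn out to be artefacts
df["date"] = pd.to_datetime(df["date"], errors="coerce")    # unparseable dates become NaT

# ...and at this point you often go back to the client and ask for more or different data.
```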

"In fact, my biggest regret is delaying learning it, because of the perceived difficulty. To start, all you need is really basic programming, very simple mathematics and knowledge of a few machine learning concepts. I will explain where to start with these requirements. In my opinion, the best way to start is from a high-level interactive approach. For that reason, I suggest starting with image recognition tasks in Keras, a popular neural network library in Python."

Sometimes I feel that the deep-learning tutorials I encounter on the web (and I've encountered a great many) rarely mention how little of your time will actually be spent designing and running a deep-learning model.

The problem is that, in "the wild", you almost never encounter situations that resemble the scenarios presented in these tutorials. For example, a typical project at our company might look like this:

  • Client gives you access to an enormous database with terabytes of unstructured text data like intracompany emails, computer network usage, key-card swipes, transaction data if it's a financial firm, etc., distributed over a wide array of servers that may or may not have the correct software you need in order to run your stuff.
  • They will ask you to find some vaguely-defined thing like "fraud" or "network intrusion".
  • Nothing is labeled, so you won't be able to make use of supervised learning techniques, which are what these deep-learning models are most successful at.
  • The attributes defining the data are very opaque; each subset of attributes is maintained by a separate expert who has to be found, consulted, and whose knowledge then has to be incorporated into your (often complex) understanding of how the data is generated and represented.
  • Most of the work is spent doing the above, plus figuring out how to do the complex joins and transformations necessary to feed the data into a machine-learning algorithm (with deep nets in particular, every field has to be numeric and within a specific range); see the sketch after this list.
  • Clients often demand deep-learning-level accuracy while also demanding complete model transparency. They almost never tolerate black-box models.
  • Clients will usually keep their archaic, hand-designed if-statement rules rather than adopt modern AI techniques unless the latter meet all of their requirements perfectly - the burden of proof is on you to show that your method is guaranteed to be a return on their investment with zero risk for them.
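To make the "nothing is labeled" and "joins and transformations" points concrete, here is a hypothetical sketch (table names, columns, and the choice of anomaly detector are all invented for illustration) of the kind of preprocessing and unsupervised fallback such a project usually ends up as:

```python
# Hypothetical sketch; table names, columns, and model choice are invented for illustration.
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.ensemble import IsolationForest

emails = pd.read_csv("email_metadata.csv")   # e.g. employee_id, date, recipient, ...
swipes = pd.read_csv("keycard_swipes.csv")   # e.g. employee_id, date, door, ...

# Join and aggregate the raw tables into one numeric row per employee per day.
email_counts = (emails.groupby(["employee_id", "date"])
                      .size().rename("emails_sent").reset_index())
swipe_counts = (swipes.groupby(["employee_id", "date"])
                      .size().rename("swipes").reset_index())
features = email_counts.merge(swipe_counts, on=["employee_id", "date"], how="outer").fillna(0)

# Models (deep nets especially) want purely numeric inputs in a fixed range.
X = MinMaxScaler().fit_transform(features[["emails_sent", "swipes"]])

# No labels, so no supervised learning: fall back on an unsupervised anomaly detector
# and hand the highest-scoring rows to a human to judge whether they look like "fraud".
detector = IsolationForest(contamination=0.01, random_state=0).fit(X)
features["anomaly_score"] = -detector.score_samples(X)   # higher = more anomalous
suspicious = features.sort_values("anomaly_score", ascending=False).head(20)
print(suspicious)
```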

I will add a caveat:

  • Occasionally we do encounter projects that not only have labeled data, but enough of it, with features straightforward and numeric enough, that neural networks can be used and are easily shown to be more successful (see the sketch below). But this is the exception rather than the rule.
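A hypothetical sketch of what "easily shown to be more successful" means on such a project: fit a simple baseline and a small net on the same labeled numeric table and compare held-out scores (the feature matrix X, labels y, and model sizes below are placeholders):

```python
# Hypothetical baseline-vs-net comparison; X and y stand in for the real labeled table.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from tensorflow import keras
from tensorflow.keras import layers

X = np.random.rand(10000, 30).astype("float32")      # stand-in for the real numeric features
y = (X[:, 0] + X[:, 1] ** 2 > 1.0).astype("int32")   # stand-in for the real labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Simple, transparent baseline first.
baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("baseline accuracy:", baseline.score(X_test, y_test))

# Small neural network on the same features.
net = keras.Sequential([
    keras.Input(shape=(X.shape[1],)),
    layers.Dense(64, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
net.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
net.fit(X_train, y_train, epochs=10, verbose=0)
print("neural net accuracy:", net.evaluate(X_test, y_test, verbose=0)[1])
```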

I sometimes wonder if we're just getting an unusual subset of corporations with these characteristics, but these are large, well-established firms that seem to share many of the same business practices, so I somewhat doubt it.

But in general, there seems to be a far larger share of articles covering how to do basic things with Keras and TensorFlow, and too few covering the "hard problems" of data science work.