whpearson comments on Open Thread June 2010, Part 3 - Less Wrong

6 Post author: Kevin 14 June 2010 06:14AM




Comment author: whpearson 14 June 2010 04:22:04PM 0 points

I've had a look, and I don't see much that would make the techniques generalize easily to my problems (or to any problem with similar characteristics, such as very large amounts of possibly relevant data). Oh, I am planning to use Bayesian techniques. But "easy" is not how I would characterize translating the problem.

Comment author: SilasBarta 14 June 2010 04:28:32PM * 3 points

Now that you mention it, one of the reasons I'm trying to get acquainted with the methods Thrun uses is to see how much they rely on advance knowledge of exactly how the sensor works (i.e. its true likelihood function). Then, I want to see if it's possible to infer enough relevant information about the likelihood function (such as through unsupervised learning) so that I can design a program that doesn't have to be given this information about the sensors.

And that's starting to sound more similar to what you would want to do.
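To make the dependence on the likelihood function concrete, here is a minimal sketch of a single Bayesian measurement update over a discrete 1-D world. It is not taken from Thrun's material; the world layout, the function name `measurement_update`, and the `p_hit`/`p_miss` values are all assumptions chosen for illustration. The point is only that the posterior changes when the assumed sensor model changes, which is why one might want to infer that model rather than be handed it.

```python
def measurement_update(prior, world, reading, p_hit, p_miss):
    """One Bayes measurement step: posterior is proportional to likelihood * prior.

    p_hit and p_miss are an assumed sensor model (hypothetical values):
    P(reading | cell matches reading) and P(reading | cell does not match).
    """
    unnormalized = [
        p * (p_hit if cell == reading else p_miss)
        for p, cell in zip(prior, world)
    ]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical five-cell world and a uniform prior over the robot's position.
world = ["green", "red", "red", "green", "green"]
prior = [0.2] * 5

# Same reading, two different assumed likelihood functions.
posterior_noisy = measurement_update(prior, world, "red", p_hit=0.6, p_miss=0.2)
posterior_sharp = measurement_update(prior, world, "red", p_hit=0.9, p_miss=0.1)

print(posterior_noisy)
print(posterior_sharp)
```

With the noisier sensor model each red cell ends up with probability 1/3; with the sharper one each gets 3/7. The inference is only as good as the likelihood function it is given.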

Comment author: whpearson 14 June 2010 04:47:26PM 1 point

That'd be interesting. More posts on the real-world use of Bayesian models would be good for Less Wrong, I think.

But I'm not sure how relevant it is to my problem. I'm in the process of writing up my design deliberations, and you can judge better once you have read them.

Comment author: SilasBarta 15 June 2010 12:01:05AM 1 point

Looking forward to it!

The reason I say that our problems are related is that inferring the relevant properties of a sensor's likelihood function looks like a standard case of finding out how the probability distribution clusters. Your problem, that of identifying a file type from its binary bitstream, is doing something similar -- finding which file types correspond to which clusters of probability distributions.
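As a rough illustration of the clustering idea, the sketch below treats each byte stream as an empirical distribution over the 256 byte values and measures how far apart two streams are. Everything in it is an assumption made for the example: the helper names, the synthetic "text-like" and "binary-like" samples, and the choice of total variation distance. It is not a description of whpearson's design, and real file formats would need far richer features than a single byte histogram.

```python
from collections import Counter


def byte_distribution(data):
    """Empirical probability of each byte value (0..255) in a byte string."""
    counts = Counter(data)
    n = len(data)
    return [counts.get(b, 0) / n for b in range(256)]


def total_variation(p, q):
    """Total variation distance between two discrete distributions (0 to 1)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Synthetic samples (hypothetical stand-ins for real files).
text_a = byte_distribution(b"the quick brown fox jumps over the lazy dog " * 20)
text_b = byte_distribution(b"probability distributions cluster by file type " * 20)
binary_a = byte_distribution(bytes(range(256)) * 4)
binary_b = byte_distribution(bytes((7 * i + 3) % 256 for i in range(1024)))

within_text = total_variation(text_a, text_b)
within_binary = total_variation(binary_a, binary_b)
across_types = total_variation(text_a, binary_a)

print(within_text, within_binary, across_types)
```

In this toy setup the two text-like samples sit closer to each other than either does to the binary-like samples, which is the sense in which a file type would show up as a cluster of probability distributions.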