
bogus comments on Open thread, Sep. 14 - Sep. 20, 2015 - Less Wrong Discussion

3 Post author: MrMind 14 September 2015 07:10AM


Comments (192)


Comment author: philh 14 September 2015 05:26:58PM 4 points

I have a variant on linear regression. Can anyone tell me what it's called / point me to more info about it / tell me that it's (trivially reducible to / nothing like) standard linear regression?

Standard linear regression has a known matrix X = x(i,j) and a known target vector Y = y(j), and seeks to find weights W = w(i) to best approximate X * W = Y.
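For reference, here is a minimal sketch of that standard setup, assuming X is stored with one row per observation j and one column per input variable i (so that X * W = Y lines up). The helper names `solve` and `least_squares` and the toy numbers are illustrative only; in practice a library routine such as `numpy.linalg.lstsq` would be the usual choice over hand-rolled normal equations.

```python
def solve(a, b):
    """Solve the square system a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest entry in this column for stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def least_squares(X, Y):
    """Return W minimising the squared error of X * W - Y via the normal equations."""
    n_vars = len(X[0])
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(n_vars)]
           for a in range(n_vars)]
    xty = [sum(row[a] * y for row, y in zip(X, Y)) for a in range(n_vars)]
    return solve(xtx, xty)


# Toy data generated from W = [3, -1] with no noise.
X = [[1, 0], [0, 1], [1, 1], [2, 1]]
Y = [3, -1, 2, 5]
W = least_squares(X, Y)
```

The normal equations assume the columns of X are linearly independent; otherwise the system has no unique solution.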

In my version, instead of knowing the values of the input variables (X), I know how much each contributes to the output. So I don't know x(i,j) but I kind of know x(i,j) * w(i), except that W isn't really a thing. And I know some structure on X: every value is either 0, or equal to every other value in its row. (I can tell those apart because the 0s contribute zero and the others contribute nonzero.) I want to find the best W to approximate X * W = Y, but that question will depend on what I want to do with the uncertainty in X, and I'm not sure about that.

I should probably avoid giving my specific scenario, so think widget sales. You can either sell a widget in a city or not. Sales of a widget will be well-correlated between cities: if a widget sells well in New York, it will probably sell well in Detroit and in Austin and so on, with the caveat that selling well in New York means a lot more sales than selling well in Austin. I have a list of previous widgets, and how much they sold in each city. Received wisdom is that a widget will sell about twice as much in New York as in Detroit, and a third more than in Austin, but I want to improve on the received wisdom.

So I'm told that a widget will sell 10 million, and that it will be sold in (list of cities). I want to come up with the best estimate for its sales in New York, its sales in Austin, etc.

Hopefully this is clear?
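One possible reading of the widget setup, sketched in code: assume sales of widget j in city i are roughly w(i) * p(j) wherever the widget is sold, with w(i) a per-city weight and p(j) a per-widget popularity. That rank-one model, the alternating least-squares fit, the function names `fit_city_weights` and `split_total`, and the sales figures are all assumptions made for illustration, not something stated in the thread. The weights are only determined up to scale, so they are normalised to sum to 1.

```python
def fit_city_weights(sales, iterations=200):
    """Fit per-city weights from past sales.

    sales: {widget: {city: units}}; a city missing from a widget's dict
    means the widget was not sold there. Assumes positive sales figures.
    """
    cities = sorted({c for row in sales.values() for c in row})
    w = {c: 1.0 for c in cities}
    for _ in range(iterations):
        # Best popularity p(j) for each widget given the current weights.
        p = {
            j: sum(s * w[c] for c, s in row.items()) / sum(w[c] ** 2 for c in row)
            for j, row in sales.items()
        }
        # Best weight w(i) for each city given those popularities.
        for c in cities:
            num = sum(row[c] * p[j] for j, row in sales.items() if c in row)
            den = sum(p[j] ** 2 for j, row in sales.items() if c in row)
            w[c] = num / den
        # Fix the arbitrary scale: weights sum to 1.
        total = sum(w.values())
        w = {c: v / total for c, v in w.items()}
    return w


def split_total(total, cities, w):
    """Split a forecast total across the listed cities in proportion to their weights."""
    denom = sum(w[c] for c in cities)
    return {c: total * w[c] / denom for c in cities}


# Made-up history that happens to match the received wisdom exactly.
history = {
    "A": {"New York": 40, "Detroit": 20, "Austin": 30},
    "B": {"New York": 20, "Detroit": 10},
    "C": {"Detroit": 4, "Austin": 6},
    "D": {"New York": 28, "Austin": 21},
}
weights = fit_city_weights(history)
estimate = split_total(10_000_000, ["New York", "Austin"], weights)
```

With real, noisy data the fitted ratios would differ from the received wisdom, and fitting in log space or weighting widgets by volume are equally plausible choices; which is "best" depends on the error measure you care about.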

Comment author: bogus 15 September 2015 10:25:57AM 2 points

Sounds like your problem is fitting a sparse matrix, i.e. where you want many entries to be 0. This is usually called compressed sensing, and it's non-trivial.