
Elo comments on Open Thread, Jul. 27 - Aug 02, 2015 - Less Wrong Discussion

5 Post author: MrMind 27 July 2015 07:16AM




Comment author: NancyLebovitz 01 August 2015 04:54:31PM 3 points

How to tell if a process is out of control: that weirdness might be random, but you should check in case it isn't.

Comment author: Elo 10 August 2015 12:15:48AM 0 points

I received feedback from some friends suggesting that this is not applicable to large datasets, i.e. big data. I play with my own quantified-self data sets of 100,000+ lines from time to time (think per-minute data at 1440 minutes a day, for a year and counting). Can you discuss this more (maybe in the next open thread)?

Comment author: Vaniver 10 August 2015 01:09:54AM 1 point

It shouldn't be too challenging to apply Nelson rules to 100k lines, but the point of statistical process control is continuous monitoring--if you weigh yourself every day, you would look at the two-week trend every day, for example. Writing a script that checks if any of these rules are violated and emails you the graph if that's true seems simple and potentially useful.
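A monitoring script along these lines could be sketched as follows. This is a minimal illustration covering only two of the eight Nelson rules (the graphing and emailing steps are omitted), not a complete implementation:

```python
import statistics

def nelson_flags(xs):
    """Check two Nelson rules against a series of measurements.

    Rule 1: a point more than 3 standard deviations from the mean.
    Rule 2: nine or more consecutive points on the same side of the mean.
    Returns a list of (rule_number, index) flags.
    """
    mean = statistics.mean(xs)
    sd = statistics.pstdev(xs)
    flags = []

    # Rule 1: any single point beyond 3 sigma.
    for i, x in enumerate(xs):
        if sd > 0 and abs(x - mean) > 3 * sd:
            flags.append((1, i))

    # Rule 2: a run of nine points all above (or all below) the mean.
    run, prev_side = 0, 0
    for i, x in enumerate(xs):
        side = 1 if x > mean else -1 if x < mean else 0
        run = run + 1 if side == prev_side and side != 0 else 1
        prev_side = side
        if run >= 9:
            flags.append((2, i))
    return flags
```

In a real deployment you would compute the mean and standard deviation from a reference (in-control) period rather than from the data being tested, and wire the output into whatever notification channel you prefer.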

Comment author: Douglas_Knight 14 August 2015 01:37:38AM 0 points

I think what Elo's friends mean is that the constants hard-coded into Nelson's rules reflect some assumption on sample size. With a big sample, you'll violate them all the time and it won't mean anything. But they are a good starting point for tuning the thresholds.

Comment author: Vaniver 14 August 2015 02:26:52AM 0 points

> I think what Elo's friends mean is that the constants hard-coded into Nelson's rules reflect some assumption on sample size. With a big sample, you'll violate them all the time and it won't mean anything. But they are a good starting point for tuning the thresholds.

If you have many parallel sensors, then yes, a flag that occurs 5% of the time due to noise will flag on at least one twentieth of your sensors. Elo's point, as I understood it, was that they have a long history--which is not relevant to the applicability of SPC.

Comment author: Douglas_Knight 14 August 2015 02:47:18AM 1 point

The long history is not relevant, but the frequency is. Most of Nelson's rules are 1/1000 events. If you don't expect trends to change more often than every 1000 measurements, that's too sensitive. I don't know what Elo is measuring every minute, but that probably is too sensitive, and most of the hits will be false positives. (Actually, many things will have daily cycles. If Nelson's rules notice them, that's interesting, but after removing such patterns, it will probably still be too sensitive.)
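To make the frequency point concrete, here is the back-of-the-envelope arithmetic for per-minute data, assuming a rule that fires by chance roughly once per 1000 in-control points (the approximate rate for several of Nelson's rules):

```python
# Expected spurious flags per day for an in-control process
# sampled once per minute, with a rule whose chance-level
# firing rate is about 1 in 1000 points.
points_per_day = 24 * 60       # 1440 minute-level measurements per day
false_positive_rate = 1 / 1000 # per-point chance of a spurious flag
expected_flags_per_day = points_per_day * false_positive_rate
```

At that rate you would see on the order of 1-2 spurious alarms every single day, which is why the default thresholds need retuning (or the data needs aggregating) before the rules are useful at minute resolution.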

Comment author: Vaniver 14 August 2015 01:32:42PM 0 points

I see what you mean--yes, if you're oversampling, you need to downsample / average / whatever to get to the reasonable frequency, otherwise you'll just be doing SPC on noise.
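A minimal sketch of that downsampling step, assuming evenly spaced minute readings with no gaps (a real dataset would need handling for missing minutes):

```python
def downsample(xs, window):
    """Average non-overlapping windows of readings.

    E.g. with window=1440, per-minute data becomes daily means,
    which is a more reasonable frequency for Nelson-rule checks.
    """
    return [sum(xs[i:i + window]) / window
            for i in range(0, len(xs) - window + 1, window)]

minute_data = [1.0] * (1440 * 3)        # three days of per-minute readings
daily_means = downsample(minute_data, 1440)
```

The SPC rules would then be run on `daily_means` (or hourly means, depending on how fast you expect real trends to move) rather than on the raw minute-level noise.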

Comment author: NancyLebovitz 10 August 2015 12:43:46AM 0 points

All I know about it is that the link looked like it was worth mentioning here. If you're interested in further discussion, you should bring it up yourself.