Four Short Stories on Error Checking

6 brilee 25 June 2012 04:58AM

 

(Cross-posted from my blag)

1. My microwave clock has been broken since a power outage. It now reads something like 7 hours behind the real time. I've been too lazy to fix it, but the bright side is that it's 7 hours behind, and not 15 minutes behind. If it were 15 minutes behind, then who knows - I might mistake it for the correct time, and end up 15 minutes late to an appointment.

 

2. During the early years of computing, scientists and engineers were responsible for running numerical simulations of various sorts. To do this, they needed randomized starting conditions. Von Neumann favored the "middle square" method. While this method was not a very good pseudo-random number generator, its speed made up for its shortcomings in those early days. Additionally, a useful property of the middle-square method was that when it fell into a short cycle, it was immediately obvious. While other methods may have had undetectable cycles of intermediate length, the middle-square method would invariably output legitimate pseudorandom numbers for some time, then fall into a cycle of length 1, 2 or 4. (1)
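To check the short-cycle claim rather than take it on faith, here is a minimal sketch in Python. It assumes the 4-digit variant of the method (square the number, zero-pad to 8 digits, keep the middle 4); the function names are made up for illustration and this is not the code linked in the notes.

```python
from collections import Counter

DIGITS = 4  # assumption: the 4-digit variant of the middle-square method


def middle_square(x, digits=DIGITS):
    """Square x, treat it as a 2*digits-digit number, return the middle digits."""
    return (x * x // 10 ** (digits // 2)) % 10 ** digits


def find_cycle(seed, digits=DIGITS):
    """Return (tail_length, cycle_length) for the sequence starting at seed."""
    seen = {}  # value -> step at which it first appeared
    x, step = seed, 0
    while x not in seen:
        seen[x] = step
        x = middle_square(x, digits)
        step += 1
    return seen[x], step - seen[x]


def cycle_length_counts(digits=DIGITS):
    """Tally, over every possible seed, the length of the cycle it ends in."""
    return Counter(find_cycle(s, digits)[1] for s in range(10 ** digits))


if __name__ == "__main__":
    # 2100 -> 4100 -> 8100 -> 6100 -> 2100 is a cycle of length 4.
    print(find_cycle(2100))
    # Which cycle lengths actually occur, and for how many seeds?
    print(cycle_length_counts())
```

Running `cycle_length_counts()` for a few values of `digits` (even values only, so that the middle is well defined) shows which cycle lengths actually turn up, which is the test the first note asks for.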

 

3. One day, I was playing with some traffic models. My goal was to be able to correctly model the behavior of a line of cars as they accelerated from a standstill when the light turned from red to green. (2) I had collected some actual data at an intersection, and was planning to test my model against the data, as well as to fit some parameters. I ran a short program to fit these parameters and plot the actual times vs. my program's predicted times. To my astonishment, I had almost a perfect fit! Upon deeper inspection, it turned out that my program had merely gotten the times right by coincidence - while the cars behaved nicely long enough to get to the intersection, afterwards, their velocities oscillated with exponentially growing amplitude. I would have missed this if I had not insisted on checking the raw numbers from the simulation. I recoded my simulation and got a worse fit, but at least it didn't blow up as it had before.

 

(This last story is fiction. Or so I hope.)

4. The US News and World Report was doing its annual ranking of universities. They had recently changed the weightings on some of the subscores. Upon running their algorithm, an unexpected candidate rose to the top - Caltech! (Zing.) They concluded that there must have been a mistake with their algorithm, readjusted their weightings, and reran their algorithm. This time, Harvard rose to the top, as it should have. Happy with their results, US News and World Report published their university rankings and raked in a lot of dough.

 

Notes:

(1) Test the middle-square story. Is it actually true that all cycles are short? My code, if you want to play with it.

(2) The time for the nth car to reach the intersection is about 2*n seconds

 

Startups as a Rationality Test Bed?

6 beoShaffer 22 January 2012 08:50PM

What attributes make a task useful for rationality verification?

After thinking it over, I believe I have identified three main components.  The first is that the task should be as grounded in the real world as possible. The second is that the task should involve a wide variety of subtasks, preferably ones that involve decision making and/or forecasting.  This will help ensure that any effect comes from general rationality, rather than from the rationality training helping with domain-specific skills.  The third is that there should be a clear measure of success for the task.

As I am not personally involved with the field I could be missing something important, but it seems like founding a successful startup would fulfill all three components.  I propose that investigating the effect of giving startup founders rationality training would be a good basis for an experiment. Unfortunately, I do not know if it would be feasible to run such an experiment in real life.  Thus, I am turning to the LW community to see if the people reading this have any suggestions.

-addendum 

I didn't go into details about exact experimental methods for a couple of reasons.   Partly, I assumed (apparently incorrectly) that it was obvious that any experiment for testing rationality would be conducted with the best experimental protocols we could manage.   But mostly, I thought it would be good to get feedback on the basic idea of rationality verification + startups ?= good before spending time going into detail about things like control groups, random assignment, etc.

I welcome suggestions along those lines, and given the attention this has received I will try to go back and add some of my own ideas when I have time, but I wanted to make clear that I wasn't intending this post as a detailed experimental design.

Also does anyone have any idea why the first part of this post has different spacing from the second?  It's not intentional on my part.

Rationality Verification Opportunity?

0 beoShaffer 15 December 2011 10:11PM

One of the challenges of rationality verification is that most people who are willing to contribute personal data for it are already familiar with the techniques involved.  This makes it difficult to tell if their performance on any form of rationality test is due to their training or their innate abilities.  Does the start of a new sequence present a way around this for that sequence's content?

I believe that it might, and will propose some ideas on how we can take advantage of these opportunities.  But first I would suggest that you try to think through the problem for yourself (I know this is slightly different from what is talked about in that post, but I think the principle holds).

 

 

Did you think through the general problem of rationality verification for new sequences before thinking of any solutions? Did you then think of your own solutions before getting your mind contaminated with mine?  If yes, good.  If no, not so good.

 

If we had good measures of general rationality that the same person could retake multiple times without losing reliability, we could simply ask LWers to take them at various intervals and see if they improved after reading the new sequence.  As that is not the case, I suspect we would have to create specific measures for each sequence.  It seems that most writers have a decent idea of what benefits they expect people to gain from their sequences, so perhaps they could try to come up with specific measures for the things that their sequences are supposed to improve.  Then, before running the sequence, they could put out a call for people to complete these measures and send them in.  They could then collect the data again from people who have read the completed sequence, preferably after they have had enough time to practice the material, but not so long that they have had too many other life changes. The necessity and viability of additional experimental controls will vary between sequences, but I think we will generally be fine with a simple before-and-after picture.

While I have some time and talent limitations, I would be willing to help with creating the measures, collecting and interpreting the data, and any other necessary steps.

I declare Crocker's rules on the content and style of this post.  This includes the title.