Many users may have noticed that Less Wrong experienced about 6 hours of downtime on 16/7/2011.

CAUSE: The server came under an unusual amount of load and started a new instance to load-balance the traffic.  Unfortunately, a bug in the script that starts new instances caused the new instance to run an inconsistent mix of old and new code.  The symptom seen by users was that any post with comments was inaccessible.

RESPONSE: A hotfix was deployed as soon as the problem was detected.  Unfortunately, it was a Saturday, so our response time was slower than we would have liked.  We have since implemented a proper fix for the particular bug that caused this problem.  We are also creating some extra monitoring probes so that we'll be notified promptly of any similar problems in the future.
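
For readers curious what such a probe could look like, here is a minimal Python sketch. It is an illustration only, not the monitoring code actually used on Less Wrong: the function names `fetch_status` and `failing_pages` and the example URL are hypothetical. The idea is to request a handful of pages, including at least one post that has comments, and report any page that does not answer with HTTP 200.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrives."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        # The server answered, but with an error status such as 500.
        return err.code
    except (URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def failing_pages(urls, fetch=fetch_status):
    """Return (url, status) pairs for every page that did not return 200."""
    results = [(url, fetch(url)) for url in urls]
    return [(url, status) for url, status in results if status != 200]


# Hypothetical usage; the URL below is a placeholder, not a real endpoint:
#   problems = failing_pages(["https://example.com/a-post-with-comments"])
#   if problems:
#       ...notify whoever is on call...
```

A probe like this would be run on a schedule, with a non-empty result triggering an alert. Passing `fetch` as a parameter keeps the checking logic testable without touching the network.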

Apologies for the inconvenience.

Thanks for writing this up!

Thanks for the response!

Thanks for letting us know :)