This is a linkpost for Martyn Thomas's What Really Happened in Y2K and its associated talk recording.

Some spoke of a terrible catastrophe coming inevitably.

They said the death toll could be in the millions, if not higher.

An alarm was raised, but few paid attention.

But soon it became too big to ignore.

It felt like the entire world was focused on the problem.

And then....nothing happened.

Or so they say.


Y2K often gets portrayed as a false crisis turned into a boondoggle. But, but in truth quite a lot of critical infrastructure failed due to the bug – and we should be glad it got caught in testing.

My career is in programming tools and software maintenance, and one of my mentors built program transformation tools in that era. I've known for a long time that Y2K was a real problem –  and one that didn't end in 2000, with many companies implementing hacks to put it off by 20 years.

Today, I watched Martyn Thomas's excellent talk What Really Happened in Y2K . Thomas knows what he's talking about: he was on the ground as the leader of consulting teams addressing Y2K, and is now a professor of software engineering. Never have I seen such a clear and detailed description of the many things that almost went wrong.


As I watched, something struck me:

Like many others here, I watched in March 2020 as COVID presented humanity a practice run in global coordination to avoid scientific catastrophe, and humanity received a solid F. Yet Y2K is an example of humanity succeeding against a similarly inevitable catastrophe,  with the IY2KCC (International Y2K Coordination Committee) recommending sound countermeasures in exactly the way the WHO didn't.  As I and many readers are concerned that humanity may soon face another international coordination problem in the race for AGI, this story serves as an example of history that we should wish to repeat.

And we should ask: What made Y2K different from COVID? Could it be just that it wasn't politicized? I was not reading blog posts op-eds in 1999 and don't know how strong the Y2K naysayers were, so I'm not even sure that's true.

Below are highlighted quotes from the article. Let's hope that preventing unaligned AGI is more like Y2K and less like COVID.
 

There were early warnings

 

Some early failures helped to raise the alarm. It was said that sometime in the late 1980s the UK supermarket Marks and Spencer rejected a delivery of tinned meat because the stock control system detected that it was nearly 90 years old. The expiry date, in 2000, was read as 1900. In 1992, 104 year old Mary Bandar of Winona, USA was invited to join an infant class for four-year-old children because she was born in “88”.

 

As Y2K drew nearer, the problems mounted

 

Many Y2K failures occurred in the 1990s and were then corrected. A typical example was an insurer that sent out renewals offering insurance cover with dates that ran from 1996 to 1900 rather than to 2000.

 

Failures occurred throughout the 1990s as systems first encountered security passes, library cards and other documents that had expiry dates beyond 2000. Even as late as December 1999, a Racal credit card system failed and caused problems
 

 LONDON — A Y2K-triggered failure in credit card swipe machines caused frustrating delays for thousands of retailers and customers trying to ring up purchases across Britain on Wednesday. The machines, manufactured by Racal Electronics and supplied by HSBC, one of Britain’s largest four banks, improperly rejected credit cards because of a failure to recognize the year 2000, a bank 7 spokeswoman said. Retailers claimed they have so far lost $5 million in sales due to the problem, are reportedly threatening to bring a class action lawsuit against the bank. The glitch occurred because some of the bank’s new swipe card terminals are programmed to look ahead four working days in processing merchant transactions to ensure they are registered within that time period.

Testing showed the impending danger

 

Chrysler Corp. shut down its Sterling Heights Assembly Plant in mid-1997 and turned the plant’s clocks to December 31, 1999. “We got a lot of surprises. Nobody could get in or out of the plant. The security system absolutely shut down and wouldn’t let anyone in or out. And you couldn’t have paid people because the time clock systems didn’t work.” noted Chrysler Chairman Robert Eaton

 


The UK’s Rapier anti-aircraft missile system. The Ministry of Defense admitted that the millennium bug could have left Britain vulnerable to air attack. It discovered that the Rapier anti-aircraft missile would have failed to retaliate. The problem was identified inside the field equipment which activates the missiles and it would have made the system inoperable.

Swedish nuclear plant - In mid-1998, a nuclear utility in Sweden decided to see what would happen if it switched the clocks in its reactor’s computers to read January 1, 2000. The response surprised utility officials, who had expected business as usual. The reactor’s computers couldn’t recognize the date (1/1/00) and shut down the reactor

Pressure mounted from above

 

By 1997 most audit firms had written to their audit clients advising that they would need to ensure that all their business critical systems and equipment had been certified to work throughout Y2K, so that the auditors would be able to approve the accounts on a continuing business basis. This drove a flurry of activity to get assurances from suppliers and to correct or replace systems and equipment.

 

10% of VISA swipe-card machines were found not to handle cards that expired after 1999. Visa asked member banks to stop issuing cards with 00 expiry dates as a temporary measurexxi. Visa then prepared to fine banks up to £10,000 per month until their systems became Y2K compliant

 

There were still real problems, but they didn't cascade 

1. 15 nuclear reactor shut-downs (in Spain, Ukraine, Japan and the USA)
2. Many credit card systems rejected valid cards.
[...]
4. There were power cuts in Hawaii and cable television feeds failed.
[...]
12. A customer at a New York State video rental store had a bill for $91,250, the cost of renting the movie 'The General's Daughter' for 100 years
[...]
24. The Pentagon had a self-inflicted Y2K mis-fix that resulted in complete loss of ability to process satellite intelligence data for 2.5 hours at midnight GMT on the year turnover, with the fix for that leaving only a trickle of data from 5 satellites for several days afterward.
[...]
29. Birth certificates for British new-borns were for 1900.
[...]
31. Various people received bills for cumulative interest since 1900.
32. At least one person was temporarily rich, for the same reason.

Why do some people think it all didn't matter?
 

There are two narratives to why the world did not end on Y2K: One is that we were saved by all the investment preventing it; the other is that it was nothing to begin with. The evidence presented in favor of the latter argument is that countries and companies that did not invest in it also experienced few problems. Thomas is firmly in the first camp. His counterargument? Latecomers rode on the backs of those who spent a fortune.
 


In 1996, cost estimates were $1 to $2 per line of business mainframe software code to make Y2K repairs. (A typical accounting program has hundreds of thousands of lines of code.) The cost was expected to rise to over $4 per line by 1998 as a shortage of skilled programmers developed. In fact, the cost gradually fell to only a few pennies per line because automated Y2K tools became extremely accurate and efficient at fixing the code. This had the fortunate result that those countries and companies that had started their remediation programmes late were able to make up the lost time.

 

  • The companies and countries that seemed to have started their Y2K programmes too late were rescued by the progress that was made by others – in getting suppliers to correct software and equipment and in developing high-productivity tools for remediating legacy software.
  • The feared domino effect of cascade failures in interconnected systems did not happen because the major supply chains contained the largest companies and these companies had given high priority to finding and correcting problems in their systems and in interfaces.

 

 


Some ideas for why we succeeded against Y2K, but not COVID

  • Y2K remediation was handled by IT and software specialists; COVID required remediation from everyone.
    • However, plenty of ordinary people required upgrades.
      • "Many PC software products were not Y2K compliant, including Microsoft Windows 95."
      • "In June 1998, the building company Bovis wrote to the owners of 870 buildings it had built since 1984 warning them that integral building systems – ranging from ventilation and heating to intruder alarms and connections to the electricity supply network – might fail. This was because many building systems use microprocessors to identify dates for switching machinery on and off and to alert maintenance staff to the need for servicing."
  • Y2K was non-adversarial: We could steadily fix problems and move towards a 100% Y2K-compliant world. If a bank sent its Y2K consultants home, there was no "Y2K-Delta" mutation that would bring the problems back.
    • Correspondingly, it had a definite end date. The closest analogue to "eternal lock down" fears was fears that similar scare-mongering would be used to sell endless unnecessary upgrades. (Some of that did happen. "Inevitably many suppliers saw this as a business opportunity and forced their customers to upgrade unnecessarily, in some cases refusing to certify as Y2K compliant equipment that they knew had never contained any form of clock.")
  • Because we could test systems by changing a date, individuals could see what would happen to their systems were they to do nothing. If politicians could have seen the lockdowns and death counts coming their way on March 1st, 2020, they may have acted differently.
New Comment
5 comments, sorted by Click to highlight new comments since:

Because we could test systems by changing a date, individuals could see what would happen to their systems were they to do nothing. If politicians could have seen the lockdowns and death counts coming their way on March 1st, 2020, they may have acted differently.

This seems like it must've been the decisive factor. Cheaply and quickly evaluating counterfactuals makes decision-making many times easier.

Based on what you wrote here, Y2K mitigation happened over the course of 3 years (1997-1999). That means corporate budgets had time to adjust, POs could be written, new developers hired or contracted, etc. Also, each system needed a few days (or weeks) of work, which could be done at any point in the 3 year span.

COVID mitigation decisions happened over a span of about 3 weeks in March 2020, and required everyone to continue applying those mitigations indefinitely.

Acid Rain / Ozone depletion were also solved, and also were "companies have to make a one-time investment at some point in a multi-year window" type solutions.

Greenhouse Gas accumulation is a mixed bag, but we are seeing that one-time investments (eg installing solar power plants) are happening, and indefinite changes (extracting CO2 in equal proportion to what is created as an additional cost of burning fuel) are not

"Many thousands of date problems were found in commercial data processing systems and corrected. (The task was huge – to handle the work just for General Motors in Europe, Deloitte had to hire an aircraft hangar and local hotels to house the army of consultants, and buy hundreds of PCs)."

Sounds like more than a few weeks.

They said the death toll could be in the millions, if not hire.

hire -> higher

The feared domino effect of cascade failures in interconnected systems did not happen because the major supply chains contained the largest companies and these co

This quote is cut off.

Thanks; fixed both.