I go to Amazon, search for “air conditioner”, and sort by average customer rating. There’s a couple pages of evaporative coolers (not what I’m looking for), one used window unit (?), and then this:
Average rating: 4.7 out of 5 stars.
However, this air conditioner has a major problem. Take a look at this picture:
Key thing to notice: there is one hose going to the window. Only one.
Why is that significant?
Here’s how this air conditioner works. It sucks in some air from the room. It splits that air into two streams, and pumps heat from one stream to the other - making some air hotter, and some air cooler. The cool air, it blows back into the room. The hot air, it blows out the window.
See the problem yet?
Air is blowing out the window. In order for the room to not end up a vacuum, air has to come back into the room from outside. In practice, houses are very not airtight (we don’t want to suffocate), so air from outside will be pulled in through lots of openings throughout the house. And presumably that air being pulled in from outside is hot; one typically does not use an air conditioner on cool days.
The actual effect of this air conditioner is to make the space right in front of the air conditioner nice and cool, but fill the rest of the house with hot outdoor air. Probably not what one wants from an air conditioner!
Ok, that’s amusing, but the point of this post is not physics-101 level case studies in how not to build an air conditioner. The real fact of interest is that this is apparently the top rated new air conditioner on Amazon. How does such a bad design end up so popular?
One aspect of the story, presumably, is fake reviews. That phenomenon is itself a rich source of insight, but not the point of this post, and definitely not enough to account for the popularity of this air conditioner. The reviews shown on the product page are all “verified purchase”, and mostly 5-stars. There are only 4 one-star reviews (out of 104). If most customers noticed how bad this air conditioner is, I do not think a 4.7 rating would be sustainable. Customers actually do like this air conditioner.
And hey, this air conditioner has a lot going for it! There’s wheels on the bottom, so it’s very portable. Setup is super easy - only one hose to the window, much less fiddly than those two-hose designs where you attach one hose and the other pops off.
Sure, the air conditioner has a major problem, but it’s not a major problem which most people will notice. They may notice that most of the house is still hot, but the space right in front of the air conditioner will be cool, so obviously the air conditioner is doing its job. Very few people will realize that the air conditioner is drawing hot air into the rest of the house. (Indeed, I saw zero reviews which mentioned that the air conditioner pulls hot air into the house - even the 1-star reviewers apparently did not realize why the air conditioner was so bad.)
[EDIT: several commenters seem to think that I'm claiming this air conditioner does not work at all, so I want to clarify that it will still cool down a room on net. If the air inside is all perfectly mixed together, it will still end up cooler with the air conditioner than without. The point is not that it doesn't work at all. The point is that it's stupidly inefficient in a way which I do not think consumers would plausibly choose over the relatively-low cost of a second hose if they recognized the problems.]
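To make the net-cooling point concrete, here is a rough energy-balance sketch in Python. It is a simplification under stated assumptions, not a model of any particular unit: it ignores humidity, assumes every kilogram of exhausted air is replaced by a kilogram of outdoor air, and uses made-up numbers for cooling power, airflow, and temperatures (in Celsius; a 5 °C gap is about 9 °F). The function name and parameters are hypothetical, and the output should not be read as an estimate of real-world efficiency loss.

```python
# Rough steady-state energy balance for a single-hose portable air conditioner.
# All numbers are illustrative assumptions, not measurements of any real unit.

AIR_CP = 1005.0  # J/(kg*K), approximate specific heat of air near room temperature


def single_hose_net_cooling(q_cool_w, exhaust_flow_kg_s, t_in_c, t_out_c):
    """Return (net_cooling_w, fraction_lost) for a single-hose unit.

    q_cool_w: heat the unit removes from the room air, in watts (assumed)
    exhaust_flow_kg_s: mass flow of air blown out the window (assumed)
    t_in_c, t_out_c: indoor and outdoor temperatures in Celsius
    """
    # Every kilogram exhausted is replaced by a kilogram of outdoor air,
    # which carries in extra heat relative to the indoor air it displaces.
    infiltration_w = exhaust_flow_kg_s * AIR_CP * (t_out_c - t_in_c)
    net_w = q_cool_w - infiltration_w
    return net_w, infiltration_w / q_cool_w


net, lost = single_hose_net_cooling(
    q_cool_w=2500.0, exhaust_flow_kg_s=0.1, t_in_c=27.0, t_out_c=32.0
)
print(f"net cooling: {net:.0f} W, lost to infiltration: {lost:.0%}")
```

In this sketch the net cooling stays positive as long as the infiltration term is smaller than the cooling power, which matches the claim that the unit still cools on net while wasting part of its output.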
Generalization
Major problems are only fixed when those problems are obvious. Problems which most people won’t notice (or won’t attribute correctly) tend to stick around. There’s no economic incentive to fix them.
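One way to picture the incentive claim is a toy model, sketched below in Python. Everything in it is an assumption made for illustration: a producer splits a fixed budget between real quality ("substance") and appearance ("polish"), the buyer's rating only picks up a fraction `visibility` of the substance, and square roots stand in for diminishing returns. The function names are hypothetical, and this is not a claim about how any real market works.

```python
import math


def rating(polish_share, visibility):
    """Buyer's rating of a product built with a fixed budget of 1.0.

    polish_share: fraction of the budget spent on looking good
    visibility: how much of the real quality the buyer can perceive (0 to 1)
    """
    substance_share = 1.0 - polish_share
    # Square roots model diminishing returns on each kind of spending.
    return visibility * math.sqrt(substance_share) + math.sqrt(polish_share)


def rating_maximizing_polish(visibility, steps=1000):
    """Grid-search the polish share that maximizes the buyer's rating."""
    shares = [i / steps for i in range(steps + 1)]
    return max(shares, key=lambda share: rating(share, visibility))


for visibility in (1.0, 0.5, 0.2):
    polish = rating_maximizing_polish(visibility)
    print(f"visibility={visibility}: substance share = {1.0 - polish:.2f}")
```

In this toy setup, the share of the budget that a rating-maximizing producer puts into substance shrinks as the substance becomes harder for the buyer to see, and it drops to zero when the buyer cannot see it at all.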
And in practice, there are plenty of problems which most people won’t notice. A few more examples:
- Most charities have pretty mediocre impact. But the actual impact is very-not-visible to the person making donations, so people keep donating. (Also people care about things besides impact, but nonetheless I doubt low-impact charities would survive if their ineffectiveness were generally obvious.)
- Medical research has a replication rate below 50%. But when the effect sizes are expected to be small anyways, it’s hard to tell whether it’s working, so doctors (and patients) keep using crap treatments.
- Based on my firsthand experience with the B2B software industry, success is mostly determined by how good the product looks to managers making the decision to purchase. Successful B2B software (think “enterprise software”) is usually crap, but has great salespeople and great dashboards for the managers.
… and presumably this extends to lots of other industries which I’m less familiar with.
Two points to highlight here:
- Regulation does not fix the problem, just moves it from the consumer to the regulator. A regulator will only regulate a problem which is obvious to the regulator. A regulator may sometimes have more expertise than a layperson, but even that requires that the politicians ultimately appointing people can distinguish real from fake expertise, which is hard in general.
- Waiting longer does not fix the problem. All those people who did not notice their air conditioner pulling hot air into the house will not start noticing if we just wait a few years. Problems do not automatically become obvious over time.
How Does This Relate To Takeoff Speeds?
There’s a common view that, as long as AI does not take off too quickly, we’ll have time to see what goes wrong and iterate on it. It's a view with a lot of intuitive outside-view appeal: AI will work just like other industries. We try stuff, see what goes wrong, fix it. It worked like that in all the other industries, presumably it will work like that in AI too.
The point of the air conditioner is that other industries do not, in fact, work like that. Other industries are absolutely packed with major problems which are not fixed because they’re not obvious. Even assuming that AI does not take off quickly (itself a dubious assumption at best), we should expect the same to be true of AI.
… But Won’t Big Problems Be Obvious?
Most industries have major problems which aren’t fixed because they’re not obvious. But these problems can only be so bad. If they were really disastrous, the disasters would be obvious. Why not expect the same from AI?
Because AI will eventually be far more capable than human industries. It will, by default, optimize way harder than human industries are capable of optimizing.
What does it look like, when the optimization power is turned up to 11 on something like the air conditioner problem? Well, it looks really good. But all the resources are spent on looking good, not on actually being good. It’s “Potemkin village world”: a world designed to look amazing, but with nothing behind the facade. Maybe not even any living humans behind the facade - after all, even generally-happy real humans will inevitably sometimes appear less-than-maximally “good”.
… But Isn’t Solving The Obvious Problems Still Valuable?
The nonobvious problems are the whole reason why AI alignment is hard in the first place.
Think about the “game tree” of alignment - the basic starting points, how they fail, what strategies address the failures, how those fail, etc. The most basic starting points are generally of the form “collect data from humans on which things are good/bad, then train something to do good stuff and avoid bad stuff”. Assuming such a strategy could be implemented efficiently, why would it fail? Well:
- In cases where humans label bad things as “good”, the trained system will also be selected to label bad things as “good”. In other words, the trained AI will optimize for things which look “good” to humans, even when those things are not very good.
- The trained system will likely end up implementing strategies which do “good”-labeled things in the training environment, but those strategies will not necessarily continue to do the things humans would consider “good” in other environments.
(Somewhat more detail on these failure modes here.) Optimizing for things which look “good” to humans obviously raises exactly the sort of failure which the air conditioner points to. Failure of systems to generalize in “good” ways is less centrally about obviousness, but note that if it were obvious that the system were going to generalize badly, this would also be a pretty easy issue to solve: just don’t deploy the system if it will generalize badly. Problem is, we can’t tell whether a system will do what we want in deployment just by looking at what it does in training; we can’t tell by looking at the system's behavior whether there’s problems in there.
Point is: problems which are highly visible to humans are already easy, from an alignment perspective. They will probably be solved by default. There’s not much marginal value in dealing with them. The value is in dealing with the problems which are hard to recognize.
Corollary: alignment is not importantly easier in slow-takeoff worlds, at least not due to the ability to iterate. The hard parts of the alignment problem are the parts where it’s nonobvious that something is wrong. That’s true regardless of how fast takeoff speeds are. And the ability to iterate does not make that hard part easier. Iteration mainly helps on the parts of the problem which were already easy anyway.
So I don't really care about takeoff speeds. The technical problems are basically similar either way.
... though admittedly I did not actually learn everything I need to know about takeoff speeds just from air conditioner ratings on Amazon. It took a lot of examples in different industries. Fortunately, there was no shortage of examples to hammer the idea into my head.
I still think the 25-30% estimate in my original post was basically correct. I think the typical SACC adjustment for single-hose air conditioners ends up being 15%, not 25-30%. I agree this adjustment is based on generous assumptions (5.4 degrees of cooling, whereas 10 seems like a more reasonable estimate). If you correct for that, you seem to get to more like 25-30%. The Goodhart effect is much smaller than this 25-30%; I still think 10% is plausible.
I admit that in total I’ve spent significantly more than 1.5 hours researching air conditioners :) So I’m planning to check out now. If you want to post something else, you are welcome to have the last word.
SACC for 1-hose AC seems to be 15% lower than similar 2-hose models, not 25-30%:
I agree the DOE estimate is too generous to 1-hose AC, though I think it’s <2x:
The SACC adjustment assumes 5.4 degrees of cooling on average, just as you say. I’d guess the real average use case, weighted by importance, is closer to 10 degrees of cooling. I’m skeptical the number is >10; e.g. 95 degree heat is quite rare in the US, and if it’s really hot you will be using real AC, not a cheap portable AC (you can’t really cool most rooms from 95->80 with these ACs, so such use cases can't be very common). Overall the DOE methodology seems basically reasonable up to a few degrees of error.
Still looks similar to my initial estimate:
I’d bet that the simple formula I suggested was close to correct. Apparently 85->80 degrees gives you 15% lower efficiency (11% is the output from my formula). 90->80 would be 20% on my formula but may be more like 30% (e.g. if the gap was explained by me overestimating exhaust temp).
So that seems like it's basically still lining up with the 25-30% I suggested initially, and it's for basically the same reasons. The main thing I think was wrong was me saying "see stats" when it was kind of coincidental that the top rated AC you linked was very inefficient in addition to having a single hose (or something, I don't remember what happened).
The Goodhart effect would be significantly smaller than that: