I found pair programming pretty useful when starting a new project from scratch, when changes are likely to be interdependent. It is then better to work at, let's say, 1.5x the performance of a single developer on one thing at a time than to work separately and then try to reconcile the changes. Knowledge transfer is also very important at this stage (you get more people with the same vision of the fundamentals).
This generalizes to other cases when there is a "narrow front" - when few things can be worked on in parallel without stepping on each other's toes.
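To make the trade-off concrete, here is a rough back-of-the-envelope sketch. It is a toy model, not a measurement: it assumes a pair works at 1.5x a single developer (the "let's say" figure above) and that two developers working separately each lose some fraction of their time to reconciling changes. The function names and the `sync_overhead` parameter are made up for illustration.

```python
# Toy model: when does a pair beat two developers working separately?
# All numbers are illustrative assumptions, not empirical results.

PAIR_SPEEDUP = 1.5  # assumed pair throughput, in units of one solo developer


def separate_throughput(n_devs: int, sync_overhead: float) -> float:
    """Combined throughput of n developers working separately, each losing
    a fraction `sync_overhead` of their time to reconciling changes."""
    return n_devs * (1.0 - sync_overhead)


def break_even_overhead(pair_speedup: float, n_devs: int = 2) -> float:
    """Sync overhead at which the pair and the separate developers tie."""
    return 1.0 - pair_speedup / n_devs


if __name__ == "__main__":
    print("break-even overhead:", break_even_overhead(PAIR_SPEEDUP))
    for overhead in (0.1, 0.25, 0.4):
        separate = separate_throughput(2, overhead)
        if separate > PAIR_SPEEDUP:
            verdict = "separate work wins"
        elif separate < PAIR_SPEEDUP:
            verdict = "pairing wins"
        else:
            verdict = "tie"
        print(f"overhead={overhead:.2f}: separate={separate:.2f}, "
              f"pair={PAIR_SPEEDUP:.2f} -> {verdict}")
```

Under these assumptions, pairing comes out ahead once reconciliation eats more than a quarter of each developer's time, which is plausible on a "narrow front" and much less so when the work splits cleanly.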
Even more generally, it seems there are three kinds of clear benefits:
1) Less change synchronization (fewer changes worked on at a time).
2) Knowledge transfer (see @FeepingCreature's answer).
3) Immediate, detailed review - probably fewer defects.
There is also the matter of raw throughput (how much time is required to make a specific change, assuming the rest of the code stays the same and ignoring the cost of syncing with any changes done in parallel). A naive baseline is that a pair has the throughput of a single developer (since they're working on one change at a time). Fortunately, it can be way better, because one person can focus on the details of the code while the other focuses on the slightly bigger picture and next steps, looks up the relevant facts in the documentation, etc. This eliminates a lot of context switching and limits the number of things that each developer needs to keep in working memory. Also, a lot of typos and other simple problems get caught immediately, so there is less debugging to do. It's not so clear what all of this adds up to.
I was able to find some studies about the topic, including a meta-analysis by Hannay et al. TL;DR: it depends on the situation, including how experienced the developers are and how complex the task is. It's clearly not a silver bullet, and generally it still seems to be a trade-off between person-hours spent and the quality of the produced software.
In my personal, subjective experience, the use case where pair programming shines most is onboarding and skill transfer. Basically, you have one person who needs to learn a skill and another who already has it, so you sit them next to each other and have the learner do a task that needs the skill, under the instruction of the person who has it. This is way more efficient than lectures and slightly more efficient than reading documentation, because all parts of the skill are taught, only relevant information is taught, and the instruction is necessarily in actionable form. Important conditions are that the learner is open to instruction and correction, and that the teacher avoids meandering into irrelevant information. Pair programming done like this relies on both parties being effective communicators.
(A related technique is mob programming, which is useful for collaborative design, assuming your code environment is high-level enough to keep pace with the discussion.)