R0 tells you how many others each infected person goes on to infect, on average. So R0 is in one sense the measure of contagiousness--but it only captures how contagious people with the disease are on average.
Consider two different diseases with the same R0, say R0 = 2, so each person infects 2 others on average. For the first disease, almost all patients infect exactly two others; for the second, plenty infect two, many infect one, and a much smaller number infect 10 or even more. The average is the same, but the distribution is very different.
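To make the contrast concrete, here is a tiny numpy sketch of my own (the specific values and weights are invented, chosen only so the means match): both toy diseases average 2 secondary cases, but their variances are very different.

```python
# Two made-up secondary-infection distributions with the same mean of 2.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Disease one: essentially everyone infects exactly two others.
disease_one = np.full(n, 2)

# Disease two: many infect one, plenty infect two, a few infect ten.
# The weights are arbitrary, picked so the mean still comes out to 2.
disease_two = rng.choice([1, 2, 10], size=n, p=[0.40, 0.55, 0.05])

print(disease_one.mean(), disease_one.var())  # 2.0 and 0.0
print(disease_two.mean(), disease_two.var())  # ~2.0 and ~3.6
```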
Given some other assumptions, the preprint discussed below shows that diseases like the first will ultimately infect many more people than diseases like the second, even though they have the same R0. So when predicting the final outbreak size, it is important to understand the distribution of secondary infections, not just its average. Contact tracing (finding out whom people with the disease came in contact with and checking whether those contacts end up infected) allows epidemiologists to do exactly that.
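As a sketch of what that tracing data buys you (the counts below are completely made up), the output is essentially a list of "secondary cases per traced case," and the whole empirical distribution falls out of it, not just the mean:

```python
import numpy as np

# Hypothetical contact-tracing output: for each traced case, the number
# of secondary cases that were linked back to them.
secondary_cases = np.array([0, 0, 1, 0, 2, 0, 1, 0, 0, 9, 0, 1, 3, 0, 0, 7])

print("estimated R0 (mean):", secondary_cases.mean())
print("variance:", secondary_cases.var(ddof=1))
# Full empirical distribution: how many traced cases produced 0, 1, 2, ... secondary cases.
print("counts by number of secondary cases:", np.bincount(secondary_cases))
```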
Someone on Reddit linked to this preprint paper arguing that the other moments of the secondary-infection distribution (variance, skewness, kurtosis) can overwhelm the mean (i.e., the R0) in predicting the number of people ultimately infected. With a high-variance, right-skewed, high-kurtosis distribution (loosely, one where relatively few "super-infectors" bring up the average), there are more chances for the outbreak to stochastically die out before those super-infectors get their chance to keep things going. The authors conclude that "higher moments of the distribution of secondary cases can lead a disease with a lower R0 to more easily invade a population and to reach a larger final outbreak size than a disease with a higher R0." I'm not positioned to evaluate all of their arguments, but as far as I could tell their reasoning made sense given the models they presented, which rest on assumptions that seemed fairly reasonable to this layperson.
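To get a feel for the stochastic die-out argument, here is a rough branching-process simulation of my own; it is not the paper's model, and the dispersion parameter k and the outbreak-size cap are arbitrary choices. Both toy diseases have R0 = 2, but one draws secondary cases from a Poisson distribution (variance equal to the mean) and the other from a much more right-skewed gamma-Poisson (negative binomial) mixture.

```python
# Rough branching-process sketch (not the paper's model): same R0,
# different spread in the number of secondary cases per case.
import numpy as np

rng = np.random.default_rng(1)
R0, k = 2.0, 0.3            # k is an arbitrary dispersion choice for illustration
N_SIMS, CAP = 2_000, 5_000  # treat reaching CAP total cases as "the outbreak took off"

def poisson_offspring(n):
    # Low-variance disease: Poisson secondary cases, variance equals the mean.
    return rng.poisson(R0, size=n)

def negbin_offspring(n):
    # High-variance disease: gamma-Poisson mixture (negative binomial).
    # Small k means many cases infect nobody and a few infect a lot.
    return rng.poisson(rng.gamma(shape=k, scale=R0 / k, size=n))

def fraction_dying_out(offspring):
    died = 0
    for _ in range(N_SIMS):
        active, total = 1, 1
        while active and total < CAP:
            new = int(offspring(active).sum())
            total += new
            active = new
        died += total < CAP
    return died / N_SIMS

print("low-variance (Poisson) outbreaks dying out:    ", fraction_dying_out(poisson_offspring))
print("high-variance (neg. binom.) outbreaks dying out:", fraction_dying_out(negbin_offspring))
```

With these made-up settings, the Poisson version dies out on its own in only about a fifth of runs, while the overdispersed version fizzles in most of them, despite the identical R0.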
The practical consequence of this is that effective contact tracing in the early stages of an outbreak (before too many so-called "community spread" cases) would provide invaluable epidemiological data.
Specifically, this is known as a hubness effect (as the number of dimensions increases, the distribution of the number of times an item appears among the k nearest neighbors of other items becomes increasingly right-skewed) and, under certain assumptions, should be related to the phenomenon of those frequent neighbors (the "hubs") lying closer to the centroid.
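For what it's worth, the effect is easy to reproduce numerically. The sketch below is my own toy check, not taken from any particular paper, and the sample size, k, and dimensions are arbitrary: as the dimension grows, the k-occurrence counts become more right-skewed, and the points that show up as neighbors most often tend to sit closer to the centroid (a more negative correlation).

```python
# Toy check of the hubness effect on random Gaussian data.
import numpy as np
from scipy.spatial.distance import cdist
from scipy.stats import skew

rng = np.random.default_rng(0)
n, k = 1_000, 10

for d in (3, 30, 300):
    X = rng.standard_normal((n, d))
    D = cdist(X, X)
    np.fill_diagonal(D, np.inf)             # a point is not its own neighbor
    knn = np.argsort(D, axis=1)[:, :k]      # indices of each point's k nearest neighbors
    k_occurrence = np.bincount(knn.ravel(), minlength=n)
    dist_to_centroid = np.linalg.norm(X - X.mean(axis=0), axis=1)
    corr = np.corrcoef(k_occurrence, dist_to_centroid)[0, 1]
    print(f"d={d:4d}  skew of k-occurrence={skew(k_occurrence):5.2f}  "
          f"corr(k-occurrence, distance to centroid)={corr:5.2f}")
```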