Regarding the analogy between city and rural people, I think something has been left out: the city nodes here don't just have more connections, they also have more transmissions. 4 connections that each transmit with probability 0.2 give 0.8 expected transmissions of culture. 8 connections that each transmit with probability 0.2 give 1.6. To keep the expected amount of cultural transmission the same, doubling connectedness like that would have to come with halving the transmission probability per edge, to 0.1.
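To make the arithmetic concrete, here is a minimal sketch. It assumes each edge transmits independently with the same probability, so the expected number of transmissions per node is just connections times per-edge probability. The function names are mine, not anything from the original post's simulation.

```python
import math


def expected_transmissions(connections: int, p_per_edge: float) -> float:
    """Expected number of transmissions from one node per step,
    assuming every edge fires independently with probability p_per_edge."""
    return connections * p_per_edge


def matching_probability(base_connections: int, base_p: float,
                         new_connections: int) -> float:
    """Per-edge probability that keeps expected transmissions unchanged
    when the number of connections changes."""
    return base_connections * base_p / new_connections


rural = expected_transmissions(4, 0.2)       # 4 edges at 0.2
city = expected_transmissions(8, 0.2)        # 8 edges at 0.2
adjusted_p = matching_probability(4, 0.2, 8)  # what the city would need

print(rural, city, adjusted_p)
```

So the "city" condition in the model is changing two things at once: degree and total expected exposure.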
The model as it exists applies well to something like seeing fashions in a crowded street, but it doesn't apply to every instance of cultural transmission, for instance, when a long conversation is required for the transmission to take place. When some degree of social consensus is required (for instance, if a person needs to hear a recommendation from more than one of their friends before they'll try a piece of media and then start recommending it to their friends as well, and if they have finite time for listening to media recommendations), cities would actually be much less hospitable to those memes, because they're less cliquish.
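A rough way to see the consensus point, under assumptions that are entirely mine: suppose a person only has time to listen to `attention` friends, chosen uniformly at random from their `degree` friends, and adopts only if at least `threshold` of the ones they listen to have already adopted. Then the chance of adopting is a hypergeometric tail. This is a toy calculation, not the model from the post.

```python
from math import comb


def prob_hears_enough(degree: int, adopters: int,
                      attention: int, threshold: int) -> float:
    """Probability that at least `threshold` adopters are among the
    `attention` friends sampled (without replacement) from `degree` friends."""
    attention = min(attention, degree)  # can't listen to more friends than exist
    total = comb(degree, attention)
    hits = 0
    for k in range(threshold, min(adopters, attention) + 1):
        # k adopters and (attention - k) non-adopters among those listened to
        hits += comb(adopters, k) * comb(degree - adopters, attention - k)
    return hits / total


# Small clique: 4 friends, 2 have adopted, time to hear from all 4.
clique = prob_hears_enough(degree=4, adopters=2, attention=4, threshold=2)

# City: 16 friends, the same 2 adopters, still only time for 4 of them.
city = prob_hears_enough(degree=16, adopters=2, attention=4, threshold=2)

print(clique, city)
```

With the same two adopting friends, the clique member is certain to hear from both, while the city dweller usually doesn't, which is the sense in which less cliquish networks are worse for memes that need repeated endorsement.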
An interactive blogpost by Kevin Simler on network dynamics, with a final section on academia and intellectual progress. I generally think careful exploration of small-scale simulations like this can be quite helpful for understanding difficult topics, and this post seems like quite a good execution of that approach.
Also some interesting comments on intellectual progress and academia (though I recommend reading the whole post):