In my social circles, I often tell a joke about the sheer number of cute kitten pictures on the internet ("somewhere in the world, a whole server farm is doing nothing but storing pictures of cute kittens"). Joking aside, there are thousands of data centers out there, and the world's total data storage capacity is measured in zettabytes.
How much of this storage do you think is actually occupied by cute kitten pictures? What would be an effective way to make a Fermi estimate?
ImageNet was constructed to match the WordNet hierarchy, and is not representative of the distribution of images stored online. I'd guess that cat pics are 10x to 10,000x overrepresented there.
I'd also be shocked if consumer images make up even 0.1% of all data stored; there's a huge volume of other, heavier datasets out there.
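For what it's worth, the guesses above can be chained into a rough Fermi estimate: total storage, times the share that is consumer images, times the cat share of a labeled dataset, divided by the overrepresentation factor. Here is a minimal Python sketch of that chain. The function names are made up for illustration, and every input is an assumption: 1 ZB stands in for "measured in zettabytes", 0.1% is the upper bound guessed above, and the 0.5% dataset cat share is a placeholder, not a measured ImageNet figure.

```python
import math

ZETTABYTE = 1e21  # bytes


def log_midpoint(low, high):
    """Geometric mean: the middle of a range on a log scale."""
    return math.sqrt(low * high)


def kitten_bytes(total_bytes, image_fraction, dataset_cat_share, overrep_range):
    """Return (low, mid, high) bytes of cat pictures.

    total_bytes       -- all data stored worldwide (assumption)
    image_fraction    -- share of that data that is consumer images (assumption)
    dataset_cat_share -- share of cat images in a labeled dataset (assumption)
    overrep_range     -- (min, max) factor by which the dataset overrepresents cats
    """
    lo_over, hi_over = overrep_range
    image_bytes = total_bytes * image_fraction
    # A larger overrepresentation factor means fewer real cat pictures.
    low = image_bytes * dataset_cat_share / hi_over
    high = image_bytes * dataset_cat_share / lo_over
    mid = image_bytes * dataset_cat_share / log_midpoint(lo_over, hi_over)
    return low, mid, high


# Illustrative inputs only; swap in your own guesses.
low, mid, high = kitten_bytes(
    total_bytes=1 * ZETTABYTE,
    image_fraction=1e-3,      # the 0.1% upper bound guessed in the thread
    dataset_cat_share=5e-3,   # placeholder, not a measured value
    overrep_range=(10, 1e4),  # the 10x to 10,000x guess from the thread
)
print(f"low {low:.1e} B, mid {mid:.1e} B, high {high:.1e} B")
```

Taking the log-scale midpoint instead of the arithmetic mean keeps the best guess from being dominated by the top of a range that spans three orders of magnitude. The result is only as good as the inputs, so the spread between the low and high values matters more than the midpoint.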