Different explanation from what I saw in the comments:
Maybe it isn't that cuteness causes us to care for children, but that it stops us from destroying all other life in the vicinity. Considering that a lot of males have an aggressive instinct (testosterone is connected with violent behaviors in both genders, but males are more likely to have high levels), what would uncivilized people with no sense of cuteness do to animal populations? I have practically no aggressive instincts myself, being female and having the stereotypically low testosterone, but here's how I think that might go:
They might think it's a good idea to practice hunting skills by killing everything in sight.
When they're angry, would they shoot the first thing they see, even if it's a baby deer? In contrast, if they go out into the forest a few times to shoot something in anger and encounter cute baby deer, which calm them down and make them feel bad for wanting to shoot the deer, this may condition them not to develop a habit of shooting things when angry.
When a cave person with no cuteness instinct feels ambitious, do they set out to kill everything they see for a week as a way of showing dominance over the jungle?
If a cave person with no sense of cuteness sees a family of bears, do they launch their spear at the mother, not caring whether all of the cubs die, or do they feel concern about orphaning cubs and wait to find a lone male? This is very important because if the cave person takes the first approach, their hunting practices will reduce the edible bear population substantially. With the second, the cave person has minimized their impact. (A few male bears can impregnate many females, meaning that the bears can reproduce at a similar pace even after losing most of the males, while fewer female bears will certainly mean less reproductive capacity for the bears.)
When a cave man meets a cave woman he finds sexy, does he catch every animal he can find to show her how good he is at catching animals? Do the other men kill even more in order to compete? If she has an instinctive respect for life, then lots of dead bunnies and baby deer will upset her. This may encourage them to channel their urges to compete for her into a "quality over quantity" strategy by finding one really good trophy instead.
I might question here whether cuteness was necessary if they had empathy. However, empathy is triggered by things like verbal explanations, tears and certain facial expressions - behaviors that animals are poorly equipped to produce. These cues would also be difficult to detect from the distance at which you'd start stalking an animal, and they would be very brief, since the animal would start running as soon as it noticed you, and after that all you'd see is its back. Cuteness also works even after the animal is dead - it can trigger "Oh no! I killed something cute!" remorse where empathy might not be triggered, because a lifeless face is unlikely to be expressive.
This might also explain why babies are less cute - we spend enough time close up to them to notice their facial expressions and empathize with them, and they have various advantages in being able to trigger specific empathetic reactions, so since empathy is frequently triggered, cuteness is less important.
Though, a much simpler explanation is also possible: maybe your notion of how common it is for humans to find animals cuter than babies is based on a biased sample. I bit, just now, without even thinking about it, because I agreed with your idea that most people find animals cuter, but then it dawned on me: maybe it's not that common for people to find babies less cute than animals. There could be other reasons our cuteness websites seem to focus on animals: parents don't like putting up pictures of babies for security reasons, they're concerned they'll look like braggarts, and on a website dominated by animals they don't want to upload pictures of a child because it makes the dehumanizing implication that their baby is just an amusing little animal.
Also, it could have to do with the relative supply of cute little animals and cute human babies. There are 300 million Americans, and probably only a few million of them are babies. By contrast, Americans probably have millions of cats and dogs and gerbils, etc., plus there must be many times more bunnies and squirrels and such that they might see on their lawns than there are human babies. Also, animals are cute for longer - many of them are cute as adults - whereas cute babies are, by definition, only cute while they're babies.
And we can add goofy words to animal pictures without worrying about it humiliating them, or take dozens of videos of them jumping into boxes (like with Maru cat, my personal favorite) without anyone worrying about it damaging their future reputations or near-term mental health. That we can take more pictures of animals and do more things with them may increase the ratio of animal pictures to baby pictures by quite a bit.
Come to think of it, how often do you see cute animals in real life vs. how often do you see cute babies? I work in front of a window that looks out into a garden. I see cute birds and squirrels all day. I probably see at least a hundred times as many cute animals as babies in my daily life. Multiply this by a lot because I think bumblebees are cute. When I see people carrying their babies around at the supermarket, I might smile at them, but I don't stare at them the way I might watch a cute birdie because that's rude. I spend a lot less time appreciating cute babies than animals for this reason.
If, when I think about whether animals or babies are cuter, my mind fills with images of cute animals and leaves me at a loss for cute baby images, maybe that's why.
These are all interesting ideas, but are they true? That is, is there any evidence to support any of them?
Daniel Dennett has advanced the opinion that the evolutionary purpose of the cuteness response in humans is to make us respond positively to babies. This does seem plausible. Babies are pretty cute, after all. It's a tempting explanation.
Here is one of the cutest baby pictures I found on a Google search.
And this is a bunny.
Correct me if I'm wrong, but the bunny is about 75,119 times cuter than the baby.
Now, bunnies are not evolutionarily important for humans to like and want to nurture. In fact, bunnies are edible. By rights, my evolutionary response to the bunny should be "mmm, needs a sprig of rosemary and thirty minutes on a spit". But instead, that bunny - and not the baby or any other baby I've seen - strikes the epicenter of my cuteness response, and being more baby-like along any dimension would not improve the bunny. It would not look better bald. It would not be improved with little round humanlike ears. It would not be more precious with thumbs, easier to love if it had no tail, more adorable if it were enlarged to weigh about seven pounds.
If "awwww" is a response designed to make me love human babies and everything else that makes me go "awwww" is a mere side effect of that engineered reaction, it is drastically misaimed. Other responses for which we have similar evolutionary psychology explanations don't seem badly targeted in this way. If they miss their supposed objects at all, at least it's not in most people. (Furries, for instance, exist, but they're not a common variation on human sexual interest - the most generally applicable superstimuli for sexiness look like at-least-superficially healthy, mature humans with prominent human sexual characteristics.) We've invested enough energy into transforming our food landscape that we can happily eat virtual poison, but that's a departure from the ancestral environment - bunnies? All natural, every whisker.1
It is embarrassingly easy to come up with evolutionary psychology stories that explain little segments of data and sound good to someone with a surface understanding of how evolution works. Why are babies cute? They have to be, so we'll take care of them. And then someone with a slightly better cause-and-effect understanding turns it right-side-up, as Dennett has, and then it sounds really clever. You can have this entire conversation without mentioning bunnies (or kittens or jerboas or any other adorable thing). But by excluding those items from a discussion that is, ostensibly, about cuteness, you do not have a hypothesis that actually fits all of the data - only the data that seems relevant to the answer that presents itself immediately.
Evo-psych explanations are tempting even when they're cheaply wrong, because the knowledge you need to construct ones that sound good to the educated is itself not cheap at all. You have to know lots of stuff about what "motivates" evolutionary changes, reject group selection, understand that the brain is just an organ, dispel the illusion of little XML tags attached to objects in the world calling them "cute" or "pretty" or anything else - but you also have to account for a decent proportion of the facts to not be steering completely left of reality.
Humans are frickin' complicated beasties. It's a hard, hard job to model us in a way that says anything useful without contradicting information we have about ourselves. But that's no excuse for abandoning the task. What causes the cuteness response? Why is that bunny so outrageously adorable? Why are babies, well, pretty cute? I don't know - but I'm pretty sure it's not the cheap reason, because evolution doesn't want me to nurture bunnies. Inasmuch as it wants me to react to bunnies, it wants me to eat them, or at least be motivated to keep them away from my salad fixings.
1. It is possible that the bunny depicted is a domestic specimen, but it doesn't look like it to me. In any event, I chose it for being a really great example; there are many decidedly wild animals that are also cuter than cute human babies.