The usual explanations and standard definitions of 'information', such as 'the resolution of uncertainty', are especially difficult to put into practice.
They presuppose knowledge that is itself composed of, or formed from, a large quantity of information, such as the concepts of 'uncertainty' and 'resolution'.
How does one know they've truly learned these concepts, necessary for recognizing information, without already understanding the nature of information?
This seems to produce a recursive problem, a.k.a. a 'chicken and egg' problem.
Additionally, the capability to recognize information and differentiate it from random noise must already exist in order to recognize and understand any definition of information, or indeed to understand any sentence at all. So the problem is recursive several times over.
Since, presumably, most members of this forum can understand sentences, how does this occur?
And since presumably no one could do so at birth, how does this capability arise in the intervening period from birth to adulthood?
Not necessarily. There seems to be a category error in your question: "minimizing prediction error" is a task, while "pattern matching" is an algorithm. Multiple algorithms can be used for the same task, particularly if you are only trying your best at an approximate solution, or if evolution is doing the best it can by blindly stumbling on whatever works. For example, if you have a good model, you can use it to minimize prediction error analytically.
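To make the task/algorithm distinction concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than a claim about how brains or evolution work: the data come from a made-up linear rule, "pattern matching" is stood in for by nearest-neighbour lookup, and the "good model" is a straight line fitted in closed form. Both predictors are scored on the same task, mean squared prediction error.

```python
# Task: predict y from x with low mean squared prediction error.
# Assumption: data follow a hypothetical noise-free rule y = 2x + 1.
train = [(float(x), 2.0 * x + 1.0) for x in range(10)]
test = [(x + 0.5, 2.0 * (x + 0.5) + 1.0) for x in range(9)]


def nearest_neighbour_predictor(data):
    """Pattern matching: answer with the y of the closest stored x."""
    def predict(x):
        return min(data, key=lambda pair: abs(pair[0] - x))[1]
    return predict


def least_squares_predictor(data):
    """Model-based: closed-form least-squares slope and intercept."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mean_squared_error(predict, data):
    """The shared task metric, independent of which algorithm predicts."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)


nn_error = mean_squared_error(nearest_neighbour_predictor(train), test)
ls_error = mean_squared_error(least_squares_predictor(train), test)
print("nearest neighbour:", nn_error)
print("least squares:   ", ls_error)
```

On this toy data the model-based predictor should come out with the lower error, because the assumed model matches the rule that generated the data. The point is only that two quite different algorithms are being judged on one and the same task.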