"Why can I have little information but still have to search a huge state space, why can't I go straight to the conclusion/action?"
There's no answer to this question; if you find one, you can collect your prize of 1 million dollars:
https://en.wikipedia.org/wiki/P_versus_NP_problem
Here is a video explaining it better: https://www.youtube.com/watch?v=YX40hbAHx3s
Thank you, I should have thought of it in that (time complexity) context. Time complexity is not just about how long something takes but also about the nature of the problem. Chess doesn't fit neatly into P or NP, but the question of complexity is certainly related.
Maybe my question is: Why can there be a heuristic that does fairly well and is nowhere near exponential? Even a simple count of the pieces left on the board usually says something that only a full search could prove.
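To make that concrete, here is a minimal sketch of such a heuristic, just a material count. The piece-list representation and the conventional piece values are assumptions for illustration, not anyone's actual engine:

```python
# Minimal material-count heuristic, assuming a position is given as a list of
# (piece, color) tuples; values are the conventional pawn-relative ones.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

def material_score(pieces):
    """Positive favors White, negative favors Black; kings are ignored."""
    score = 0
    for piece, color in pieces:
        value = PIECE_VALUES.get(piece, 0)
        score += value if color == "white" else -value
    return score

# Example: White has a rook against a knight and a pawn.
print(material_score([("R", "white"), ("N", "black"), ("P", "black")]))  # 5 - 3 - 1 = 1
```

This runs in time linear in the number of pieces, yet it correlates with an outcome that, strictly speaking, only an exponential search can prove.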
The current state of a game is small, but the branching of future states based on legal moves is pretty huge. According to https://www.chessprogramming.org/Chess_Position, there are as many as 10^46 reachable positions, though presumably real games prune that pretty heavily. In fact, we probably have storage and processing nowadays to build the complete lookup table if someone wanted to spend the money.
Basically, the rules are such that evaluating the tree of possible future states means solving each future state: finding the leaf outcome when the player takes the "best" branch at each node and the opponent takes the "worst". So a very large amount of the future space needs to be searched.
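That best-branch / worst-branch procedure is essentially minimax. Here is a minimal sketch; legal_moves, apply, is_terminal and evaluate are placeholder functions standing in for whatever game representation you use:

```python
# Minimal minimax sketch over a generic game tree. The four helper functions
# are assumed placeholders for a concrete game implementation.
def minimax(state, depth, maximizing, legal_moves, apply, is_terminal, evaluate):
    if depth == 0 or is_terminal(state):
        return evaluate(state)          # heuristic score at the frontier
    scores = (minimax(apply(state, m), depth - 1, not maximizing,
                      legal_moves, apply, is_terminal, evaluate)
              for m in legal_moves(state))
    # The player to move takes the "best" branch, the opponent the "worst".
    return max(scores) if maximizing else min(scores)
```

With branching factor b and depth d this visits on the order of b^d nodes, which is exactly where the exponential blow-up comes from.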
How simple-seeming rules can generate large future state-spaces is a fascinating and important topic. Compare with https://en.wikipedia.org/wiki/Mandelbrot_set - it's a single, simple calculation, z^2 + c, repeated many times to determine whether the iteration stays bounded for any given point. https://en.wikipedia.org/wiki/Chaos_theory makes for a fun rabbit-hole.
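For comparison, the whole Mandelbrot rule fits in a few lines; the iteration cap of 100 below is an arbitrary cutoff, and the |z| > 2 escape test is the standard one:

```python
# Escape-time test for the Mandelbrot set: iterate z <- z^2 + c and check
# whether |z| stays bounded (once |z| > 2, divergence is guaranteed).
def in_mandelbrot(c, max_iter=100):
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return False   # escaped, so c is outside the set
    return True            # still bounded after max_iter steps

print(in_mandelbrot(0j))      # True: the origin is in the set
print(in_mandelbrot(1 + 0j))  # False: 0, 1, 2, 5, 26, ... diverges
```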
Ok, let's go with chess. For that game there is an optimal balance between the tree search and the evaluation function. The search is exploratory. The evaluation is a score.
The evaluation can obviously predict the outcome of the search to some degree. Humans are very bad at searching, yet some can still win against computers.
The search is decompressing the information into something more easily evaluated by a computer. A human can do it with much less expansion. Is that just a matter of hardware, or is it because the information was there all along and just needed a "smarter" analysis?
I am a bit confused and thought I'd rather ask and discuss here before thinking about it for long. As usual I am trying to compartmentalize, structure, make distinctions.
My confusion was triggered by thinking about the evaluation function (a heuristic rating the certainty of winning/losing) in chess. Clearly everything it takes is right there on the board; in fact the game is already decided based on that state, assuming both players play to force the best possible outcome.
Why do we need to process data when the information is obviously already in the input? (Yes, I know one can belabor the distinction between data, information, knowledge, wisdom, understanding.)
Here are a few cases, as examples:
Chess: The game state takes only a few bytes to store, yet it takes a lot of processing to decide on a move (a rough bit count is sketched after this list).
Search: I know what I want but still have to go through a much bigger data set.
Statistical inference: I came up with a hypothesis and now I have to go through tons of data to see if it is consistent.
Automation: I am trying to control something and must derive actions from measurements.
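The bit count referenced in the chess case above, as a back-of-envelope calculation. This assumes a naive encoding of 64 squares, each holding one of 13 states (6 piece types in 2 colors, or empty), and ignores the handful of extra bits for castling rights, en passant and side to move:

```python
import math

# Naive upper bound on the size of a chess position: 64 squares x 13 states.
bits = 64 * math.log2(13)
print(f"{bits:.0f} bits ≈ {bits / 8:.0f} bytes")   # ~237 bits, ~30 bytes
```

So the input is a few dozen bytes, while the space of reachable positions is on the order of 10^46.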
I suppose any computation is just a replacement for a huge LUT, which proves that the information is already in the input, but then searching that LUT would still take compute.
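The same point in miniature: any pure function can be traded for a table keyed on its inputs, but the table has to be built (compute up front) and indexed (compute per query), and its size grows with the input space. A toy sketch, with a trivial stand-in for the expensive evaluation:

```python
# Trading computation for a lookup table.
def evaluate(state):
    return sum(state)            # stand-in for an expensive evaluation

states = [(a, b) for a in range(3) for b in range(3)]
lut = {s: evaluate(s) for s in states}   # precomputed table: 9 entries here,
                                         # ~10^46 entries for chess positions
print(lut[(2, 1)])                       # "going straight to the conclusion"
                                         # still costs a hash and a lookup
```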
There are transformations that just change the perspective, but do not discard or compress.
There is compression, where either I only care about the decisive aspects (filter / lossy) or there is redundancy in the input (lossless).
In the end it is always about deriving efficient actions to reach goals. For that, I suppose, we can make a one-size-fits-all scheme: Make and maintain a model. Measure the current state. Filter the measurement. Compute possible outcomes. Evaluate the possible outcomes' impact and likelihood. Decide on / search for the best action.
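As a minimal sketch of that scheme; every helper here (measure, filter_noise, candidate_actions, predict_outcomes, score) is a named placeholder to be filled in for a concrete problem, not a real API:

```python
# Generic measure -> filter -> predict -> evaluate -> decide loop.
def decide(model, measure, filter_noise, candidate_actions, predict_outcomes, score):
    state = filter_noise(measure())                         # measure + filter
    best_action, best_value = None, float("-inf")
    for action in candidate_actions(model, state):          # search over actions
        outcomes = predict_outcomes(model, state, action)   # list of (outcome, probability)
        value = sum(p * score(o) for o, p in outcomes)      # impact weighted by likelihood
        if value > best_value:
            best_action, best_value = action, value
    return best_action                                      # decide on the best action
```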
But what I find so confusing is the contrast between starting out with little information or a lot; that alone does not explain how much there is to compute. Why can I have little information but still have to search a huge state space? Why can't I go straight to the conclusion/action?
Obviously I am thinking about it wrong; there might not be an antagonism here but rather a confusion of stages.