Well, I dunno.
If I had a random C program print out 2, 3, 5, 7, 11, I would still assign a VERY low probability to it going on to print primes correctly up to a reasonably big number. Even more so for Turing machines or anything of that kind. Ditto for any natural process. Ditto for seeing those numbers in, I don't know, a child's doodling when the child hasn't learned the primes yet. The child might have invented primes, but that list is not remotely enough evidence.
If a human writer of a test gives me this sequence, I would guess that he knows of primes and is listing primes to see if I know of them too. But if he was never taught primes, I would not assume from this sequence alone that he invented them.
Kolmogorov complexity really does depend on the language. If we do it informally with human language, then 'primes' is an answer that is shorter than 'two three five seven eleven'. But then the complexity depends on the language and on expectations of what is more complex.
Consider the 'Petals Around the Rose' joke as a very extreme example. Most educated individuals just try to search some giant solution space and are blind to the dots themselves, seeing only numbers. Unless they have encountered something of this kind before (e.g. the other joke about counting loops in numbers), in which case they solve it quite easily.
edit: There is another, slightly less related example of how the distribution is different for humans. If you look at natural data, the first digit is most often 1 (Benford's law). If you look at human-forged data, the frequencies of first digits are much more equal, unless the person committing the forgery is very clever.
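To put rough numbers on the first-digit point: Benford's law gives leading digit d a probability of log10(1 + 1/d). Here is a small sketch comparing that with a perfectly flat 1/9 per digit. The flat column is an idealization of "much more equal", not measured forgery data, and the function name is just for illustration.

```python
import math

def benford(d):
    """Benford's law probability that the leading digit is d (1..9)."""
    return math.log10(1 + 1 / d)

# Flat distribution: an idealized naive forger who uses every
# leading digit equally often (an assumption, not real data).
flat = 1 / 9

for d in range(1, 10):
    print(d, round(benford(d), 3), round(flat, 3))
```

Under Benford's law the digit 1 leads roughly 30% of the time, against about 11% in the flat case, which is why overly even first digits look suspicious.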
All the cases in your first paragraph provide context. After the first few, the context essentially tells you whether it's possible for the sequence to be an enumeration of primes.
In the first few cases, of unknown computer programs, do you really think that the prime number hypothesis should be struck with a 40 decibel probability penalty? I'd love to bet with you. Lots and lots of money, as often as possible.
From
http://astrobio.net/pressrelease/4569/computers-that-think-like-humans
That's an awesome study.
I always thought the variations of the continue-the-series test (progressive matrices, number sequences, 'word A is to word B as word C is to ??', etc.) are very culturally biased. You solve those best and most easily by sharing the test maker's learning environment (and, for visual ones, the visual environment), as well as sharing neural architecture. That lets you pick the same choice as the test maker [edit: and do so easily and naturally]. And this research provides a very good demonstration.
Of course the ability to guess the same answer as the test maker, or to second-guess him, will correlate with intelligence, but so does e.g. height (via the effect of nutrition on both); perhaps we should add a 'what is your height' question to the IQ test and then let some giant robot score as a genius.
Note: one might think of sequence guessing as a task of minimizing Kolmogorov complexity. That's not quite so: the sequences are too short, shorter than the generators. Consider the sequence 2, 3, 5, 7, 11, ?. Obviously the answer on an IQ test would be 13 (primes). Good luck writing a prime-generating program that is simpler than the sequence itself, though [edit: I mean, simpler than a program which just prints those numbers followed by whatever garbage. Unless you have a language where 'print primes' is a basic command]. (And of course the length of the program will be very dependent on the machine being used.)
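A rough illustration of that length comparison, in Python. Source length in characters is only a crude stand-in for program length here (Kolmogorov complexity proper is defined relative to a fixed universal machine), and both snippets are hypothetical examples written for this comparison, not canonical shortest programs.

```python
import io
from contextlib import redirect_stdout

# Program 1: just print the five numbers literally.
literal_src = "print(2,3,5,7,11)"

# Program 2: actually test each number for primality by trial division.
primes_src = (
    "n=2\n"
    "while n<12:\n"
    " if all(n%d for d in range(2,n)):print(n)\n"
    " n+=1"
)

def run(src):
    """Execute source text and return the integers it printed."""
    buf = io.StringIO()
    with redirect_stdout(buf):
        exec(src, {})
    return [int(tok) for tok in buf.getvalue().split()]

# Both programs print the same five numbers...
same_output = run(literal_src) == run(primes_src)
# ...but the literal listing is the shorter program.
print(same_output, len(literal_src), len(primes_src))
```

The generator only starts to win once the sequence gets long enough that listing the numbers costs more characters than the primality test does, which is far beyond five terms in this language.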