The discount factor can mess things up - you'll meet someone again, but after how long?
Comments on "When Bayesian Inference Shatters"?
I recently ran across this post, which gives a lighter discussion of a recent paper on Bayesian inference ("On the Brittleness of Bayesian Inference"). I don't understand it, but I'd like to, and it seems like the sort of paper other people here might enjoy discussing.
I am not a statistician, and this summary is based on the blog post (I haven't had time to read the paper yet), so please discount it accordingly. The paper looks at how the prior and the underlying model affect the posterior distribution. Given a continuous distribution (or a discrete approximation of one) to be estimated from finitely many observations (of sufficiently high precision), and finite priors, the range of posterior estimates is the same as the range of the distribution being estimated. Given models that are arbitrarily close in the total variation metric (I'm not familiar with it, but my impression was that, at finite accuracy, such models produce the same observations with arbitrarily similar probability), you can get posterior estimates that are arbitrarily far apart (within the range of the distribution being estimated) from the same information. My impression is that implicitly relying on the arbitrary precision of a prior can give updates diametrically opposed to the ones you'd get from different, but arbitrarily similar, priors.
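For what it's worth, the total variation metric mentioned above has a simple closed form for discrete distributions. Here's a minimal sketch (the grid size and the tiny perturbation are made up purely for illustration, not taken from the paper):

```python
import numpy as np

def total_variation(p, q):
    # Total variation distance between two discrete distributions:
    # half the L1 distance, equivalently the largest difference in
    # probability the two distributions assign to any single event.
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return 0.5 * np.abs(p - q).sum()

# Two priors on a fine grid that are "arbitrarily close":
grid = 1000
p = np.full(grid, 1.0 / grid)   # uniform prior
q = p.copy()
q[0] += 1e-6                    # tiny perturbation,
q[-1] -= 1e-6                   # still sums to 1

print(total_variation(p, q))    # ~1e-06
```

The paper's claim, as I read the post, is that priors this close can nevertheless produce posterior estimates at opposite ends of the range being estimated.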
First, of course, I want to know whether my summary is accurate, misses the point, or is simply wrong.
Second, I'd be interested in hearing discussions of the paper in general and whether it might have any immediate impact on practical applications.
Some other areas of discussion that would interest me: I'm not entirely sure what 'sufficiently high precision' means here, and I have only a vague idea of the circumstances under which you'd be implicitly relying on the arbitrary precision of a prior. I'm also just generally interested in hearing what people more experienced/intelligent than I am might have to say.
I'm not sure I see your point. My reasoning was that if you meet the same person on average every thousand games in an infinite series of games, you'll end up meeting them an infinite number of times. Am I confusing the sample space with the event space?
Game Theory of the Immortals
I’m sure many others have put much more thought into this sort of thing -- at the moment, I’m too lazy to look for it, but if anyone has a link, I’d love to check it out.
Anyway, I ran into some interesting musings on game theory for immortal agents and I thought it was interesting enough to talk about.
Cooperation in games like the iterated Prisoner’s Dilemma depends in part on the probability of encountering the other player again. Axelrod (1981) gives the payoff for an unbroken sequence of cooperation as R/(1-p), where R is the payoff for mutual cooperation and p is a discount parameter that he interprets as the probability of the players meeting again (and recognizing each other, etc.). If you assume that both players keep playing for eternity in a randomly mixing, finite group of other players, then the probability of eventually encountering the other player again approaches 1, and the payoff for an extended period of cooperation approaches infinity.
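The closed form is just a geometric series; a quick sketch (R = 3 is the conventional reward payoff for mutual cooperation, an assumption here since no payoff matrix is fixed above):

```python
def cooperation_payoff(R, p):
    # Axelrod's discounted payoff for an unbroken run of mutual
    # cooperation: R + R*p + R*p**2 + ... = R / (1 - p).
    assert 0 <= p < 1
    return R / (1 - p)

# As p -> 1 (future meetings near-certain), the payoff grows without bound:
for p in (0.5, 0.9, 0.99, 0.999):
    print(p, cooperation_payoff(3, p))
```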
So, take a group of rational, immortal agents, in a prisoner’s dilemma game. Should we expect them to cooperate?
I realize there is no optimal strategy without reference to the other players’ strategies, and that the universe is not actually infinite in time, so this is not a perfect model on at least two counts, but I wanted to look at the simple case before adding complexities.
There's certainly mathematical models of rumors, which is a similar enough but not quite the same concept.
From memory, they're modeled much like epidemics, though I'm not sure how that relates to genetic drift and selection.
I seem to remember more elaborate techniques that I think were trying to capture genetic drift and selection, but I can't find them at the moment.
A quick google along the lines of "mathematical model meme propagation" does tend to pop up quite a few models. Here are two that seemed interesting: http://cogprints.org/531/1/mav.htm and http://cfpm.org/jom-emit/2000/vol4/kendal_jr&laland_kn.html
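For anyone curious what these epidemic-style models look like, here's a toy simulation in the spirit of the Daley–Kendall rumor model (a simplified variant: the meeting rule, population size, and step count are my own assumptions, not taken from either linked paper):

```python
import random

def rumor_sim(n=100, steps=5000, seed=0):
    # States: 'I' = ignorant (hasn't heard the rumor),
    #         'S' = spreader, 'R' = stifler (heard it, stopped telling).
    rng = random.Random(seed)
    state = ['S'] + ['I'] * (n - 1)      # one initial spreader
    for _ in range(steps):
        a, b = rng.sample(range(n), 2)   # two random people meet
        if state[a] != 'S':
            a, b = b, a                  # make `a` the spreader, if any
        if state[a] == 'S':
            if state[b] == 'I':
                state[b] = 'S'           # rumor spreads
            else:
                state[a] = 'R'           # met someone who knows: give up
    return state.count('I'), state.count('S'), state.count('R')

ignorant, spreaders, stiflers = rumor_sim()
print(ignorant, spreaders, stiflers)
```

Note what's missing, which is exactly the complaint below about memetics: nothing here mutates or is selected on content, so this captures transmission but not anything analogous to drift or selection.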
"Meme" is not a model, it's a reference class of models, many of which are informal. In order to talk about testing it, you must first zoom in.
Could you elaborate on that?
Memes?
"All models are wrong, but some are useful" — George E. P. Box
As a student of linguistics, I’ve run into the idea of a meme quite a lot. I’ve even looked into some of the proposed mathematical models for how they transmit across generations.
And it certainly is a compelling idea, not least because the potential for modeling cultural evolution alone is incredible. But while I was researching the idea (and admittedly, this was some time ago; I could well be out of date) I never once saw a test of the model. Oh, there were several proposed applications, and a few people were playing around with models borrowed from population genetics, but I saw no proof of concept.
This became more of a problem when I tried to make the idea pay rent. I don’t think anyone disputes that ideas, behaviors, etc. are transmitted across and within generations, or that these ideas, behaviors, etc. change over time. As I understand it, though, memetics argues that these ideas and behaviors change over time in a pattern analogous to the way that genes change.
The most obvious problem with this is that genes can be broken down into discrete units. What’s the fundamental unit of an idea? Of course, in a sense, we could think of the idea as discrete, if we look at the neural pattern it’s being stored as. This exact pattern is not necessarily transmitted through whatever channel(s) you’re using to communicate it; the pattern that forms in someone else’s brain could be different. But having a mechanism of reproduction isn’t as important as showing a pattern in the results of that reproduction: after all, Darwin had no mechanism, and yet we think of him as one of the key figures in discovering evolution.
But I haven’t seen evidence for the assertion that memes change through time the way genes do. I have seen anecdotes and examples of ideas and behaviors that have spread through a culture, but no evidence that the pattern is the same. I haven’t even seen a clear way of identifying a meme, observing its reproduction, or tracking its offspring. Not so much as a study on the change in frequency of memes in an isolated population. Memetics today has less evidence than Darwin did when he started out; at least Darwin could point to discrete entities that were changing.
Without this sort of evidence, all the concept of a meme gives me is that ideas and behaviors can get transmitted, and that they can change. And I don’t need a new concept for that. Every now and then I’ll run a search on memetics just to see if anyone’s tried to address these problems — after all, a model describing how the frequency of ideas change in a population could be extremely useful to me — but so far I’ve seen nothing, and I don’t usually have the time to run a truly thorough search.
If any of you have, and if you know of evidence for the concept, please send me a link.
This sounds like a map/territory confusion. "Intelligence" is a concept in the map, used to summarize the common correlations in success across domains. There is no assumption that fully general cross-domain optimizers exist; it's an empirical observation that most of the variance in performance across cognitive tasks happens along a single dimension. Contrast this with personality, where most of the variance is along five dimensions. We could talk about how each person reacts in each possible situation or "island", but most of this information can be compressed into five numbers.
We could always drill down and talk about more factors, e.g. fluid vs. crystallized intelligence, or math vs. verbal ability. More factors give us more predictive power, though each additional factor, when chosen well, is less useful than the last.
Though a single-factor model works well for humans, this isn't necessarily the case for more general minds. I suspect the broad concept of intelligence carves reality at its joints fairly well, but assuming so would be a mistake.
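The "most of the variance along a single dimension" claim can be illustrated with a toy factor model (the numbers here, 6 tests and a noise level of 0.5, are made up for the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate 500 people taking 6 cognitive tests.  Each score is a shared
# latent factor ("g") plus independent, test-specific noise.
n_people, n_tests = 500, 6
g = rng.normal(size=(n_people, 1))
scores = g + 0.5 * rng.normal(size=(n_people, n_tests))

# Eigen-decompose the covariance matrix: the fraction of variance along
# each principal axis shows how compressible the scores are.
eigvals = np.sort(np.linalg.eigvalsh(np.cov(scores, rowvar=False)))[::-1]
explained = eigvals / eigvals.sum()
print(explained.round(3))   # the first component carries most of the variance
```

The empirical claim about human test batteries is that they look like this simulation: one axis dominates. Whether an arbitrary mind's performance profile would compress this well is exactly the open question.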
Thanks for this! I've really found it helpful.
I suppose part of my confusion came from reading in Eysenck about the alarmingly large number of people who scored as prodigies but, over a longitudinal study, ended up living unhappy lives in janitor-level jobs. Eysenck deals with this by discussing correlations between intelligence and some more negative personality traits, but I would have expected great enough intelligence to invent routines that compensate for those traits. In any case, I think this points to my further confusion about how 'success' was being defined.
I'm also puzzled at the apparent disconnect between solving problems in one's own life and solving problems on paper.
The citations in this comment are new science, so please take them with at least a cellar of salt:
There are recent studies, especially into Wernicke's area, which seem to implicate alternate areas for linguistic processing : http://explore.georgetown.edu/news/?ID=61864&PageTemplateID=295 (they don't cite the actual study, but I think it might be here http://www.pnas.org/content/109/8/E505.full#xref-ref-48-1); and this study (http://brain.oxfordjournals.org/content/124/1/83.full) is also interesting.
Terrence Deacon's 'The Symbolic Species' also argues that Broca's area is not as consistent across individuals as the other subsections being discussed are; interpretations of Broca's area in particular are shaky (argues Deacon) because this region is immediately adjacent to the motor controls for the equipment needed to produce speech. I have seen no studies attempting to falsify this claim, though, so unless anyone knows of actual evidence for it, we can safely shuffle this one into the realm of hypothesis for now.
In any case, Wernicke's and Broca's areas may not be the best examples of specialization in brain regions; I think we have a much clearer understanding (as these things go) of the sensory processing areas.
Let's Talk About Intelligence
I'm writing this because, for a while, I have noticed that I am confused: particularly about what people mean when they say someone is intelligent. I'm more interested in a discussion here than actually making a formal case, so please excuse my lack of actual citations. I'm also trying to articulate my own confusion to myself as well as everyone else, so this will not be as focused as it could be.
If I had to point to a starting point for this state, I'd say it was in psych class, where we talked about research presented by Eysenck and Gladwell. Eysenck is very clear in defining intelligence as the ability to solve abstract problems, but not necessarily the motivation to do so. In many ways, this matches Yudkowsky's definition, where he talks about intelligence as a property we can ascribe to an entity, which lets us predict that the entity will be able to complete a task, without ourselves necessarily understanding the steps toward completion.
The central theme I'm confused about is the generality of the concept: are we really saying that there is a general algorithm or class of algorithms that will solve most or all problems to within a given distance from optimum?
Let me give an example. Depending on what test you use, an autistic person can score in the clinically retarded range, yet show 'islands' of remarkable ability, even up to genius level. The classic example is “Rain Man,” who is depicted as easily solving numerical problems most people don't even understand, but having trouble tying his shoes. This is usually an exaggeration (by no means are all autistic people savants), and these island skills are hardly limited to math. The interesting point, though, is that even someone with many such islands can have an abysmally low overall IQ.
Some tests correct for this – Raven's Progressive Matrices, for instance, gives you increasingly complex patterns to complete – and this tends to level out those islands and give an overall score that seems commensurate with the sheer genius that can be found in some areas.
What I find confusing is why we're correcting this at all. Certainly, we know that some people, given a task, can complete that task, and of course, depending on the person, this task can be unfathomably complex. But do we really have the evidence to say that, in general, this task does not depend on the person as well? Or, more specifically, on the algorithms they're running? Is it reasonable to say that a person runs an algorithm that will solve all problems within an efficiency x (with respect to processing time and optimality of the solution)? Or should we be looking closer for islands in neurological baselines as well?
Certainly, we could change the question and ask how efficient all the algorithms the person is running are, and from that give an average efficiency, which might serve as a decent rough estimate of how efficiently that person will solve a problem. For some uses, this is exactly the information we're looking for, and that's fine. But as a general property of the people we're studying, the measure seems insufficient.
If we're trying to predict specific behavior, it seems useful to be aware of whatever 'islands' exist – for instance, the common separation between algebraic and geometric approaches to math. In my experience, geometric explanations may fall completely flat with someone who takes an algebraic approach, but this is not predictive of what we might think of as the person's a priori probability of solving the problem: occasionally they solve it with no more than a few algebraic hints. Of course, this is hardly hard evidence, but I think it points to what I'm getting at.
Looking at the specific algorithm that's being used (or perhaps, the class of algorithm?) can be considerably more predictive of the outcome. Actually, I can't really say that, either: looking at what could be a distinct algorithm can be considerably more predictive of the outcome. There are numerous explanations for these observations, one of which is of course that these are all the same algorithm, just trained on different inputs, and perhaps even constrained or aided by changes in the local neural architecture (as some studies on neurological correlates of autism might suggest). But computational power alone seems insufficient if we're going to explain phenomena like the autistic 'islands'. A savant doesn't want for computational power – but in some areas, they can want for intelligence.
Here's where I start getting confused: the research I've seen assumes intelligence is a single trait which could be genetically, epigenetically, or culturally transmitted. When correlates of intelligence are looked for, from what I've seen, the correlates are for the 'average' intelligence score, and largely disregard the 'islands' of ability. As I've said, this can be useful, but it seems like answering some of these questions would be useful for a more general understanding of intelligence, especially going into the neurological side of things, whether that's in wetware or hardware.
Then again, there's a good chance I'm missing something: in which case, I'd appreciate some help updating my priors.
If you have a strong discount factor, then even if you meet the same person infinitely often, your gain is still bounded above (summing a geometric series), and can be much smaller than winning your current round.
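Concretely (using the standard PD payoffs R = 3 for mutual cooperation and T = 5 for defecting against a cooperator, an assumption since the thread doesn't fix a payoff matrix):

```python
def discounted_total(R, p, rounds=10_000):
    # Partial sum of R * p**k for k = 0..rounds-1.  Even over many
    # rounds this stays below the geometric-series bound R / (1 - p).
    return sum(R * p**k for k in range(rounds))

R, T, p = 3, 5, 0.3
total = discounted_total(R, p)
print(total, R / (1 - p))   # both ~4.286, which is less than T = 5
```

So with a strong discount (low p), the entire infinite stream of cooperation is worth less than one round of successful defection, no matter how many times you meet again.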
*face-palm* Ah yes. Thanks.