Comment author: TCB 04 September 2011 08:56:05AM 0 points [-]

I'm not actually convinced that negative examples are really necessary for learning empirical clusters in thingspace, especially if you're just trying to teach someone a subcategory of a big class they're already familiar with. If someone is already familiar with the concept of "bird" and you want to inform them that there is such a thing as blue jays, it may suffice to show them just a few instances of a blue jay (assuming you don't care whether they learn the terminology). Source: this super cool paper about one-shot learning using hierarchical Bayesian models: http://www.mit.edu/~rsalakhu/papers/MIT-CSAIL-TR-2010-052.pdf

Comment author: PhilGoetz 09 July 2011 08:33:27PM -1 points [-]

Concepts exist without words, since words are just one part of a concept, and people with left temporal brain damage can lose access to a word without losing access to the concept.

A "symbol" sometimes means something atomic, which concepts are not. We probably have no symbols in our brains, in this strict sense.

Comment author: TCB 09 July 2011 08:50:45PM 0 points [-]

Interesting point. I certainly agree that concepts/words are not actually atomic, or Platonic ideals, or anything like that. Concrete concepts, in particular, seem to correspond to "empirical clusters in thing-space", or probability distributions over classes of objects in the real world (though of course, even objects in the real world aren't really atomic).

Despite this, most people still view themselves as thinking symbolically, and many people believe themselves to be logical reasoning agents. After reading the first couple chapters of Jaynes, I am very convinced that the mind works probabilistically and does not actually deal with absolutes. Yet at the level of conscious reasoning, we seem to perceive the world in terms of symbolic absolutes. It seems like this could be either verbal or visual, but either way I have difficulty imagining conscious reasoning without symbols, even if more complicated clusters or probability distributions underlie those symbols at a subconscious level. I wonder why this is.

Comment author: XiXiDu 09 July 2011 07:08:48PM 0 points [-]

Has anyone developed techniques for thinking without words?

Has anyone developed techniques to do math without symbol manipulation?

Comment author: TCB 09 July 2011 07:36:03PM 3 points [-]

This article presents evidence that symbols exist in our minds independent of words. http://artksthoughts.blogspot.com/2009/07/concepts-cognition-and-anthropomorphism.html

Actually, it seems extremely unlikely that words would be required for symbolic thinking, considering that any animal advanced enough to base its actions on thought rather than pure reflex would need to have some kind of symbolic representation of the world.

Comment author: TCB 09 July 2011 07:17:02PM 3 points [-]

I did this a few years ago, but I'm not sure exactly how. I wanted to think less verbally because I worried that my thoughts were too constrained by words, which kept me at the very surface level of my consciousness and perhaps inhibited my access to deeper parts of my mind. I think that part of the transformation came about simply because I wanted it to (power of suggestion). It probably also helped that I started watching a lot more films and doing more math. I don't remember the exact process by which I transformed my thought-structure.

Something that I noticed was that my thoughts got much less verbal when I became more emotional. During a few particularly emotionally intense experiences this past year, I found myself less able to reason through thoughts verbally. At the time, I had an impression that my subconscious mind had taken over my thought process, closing my conscious mind out, and denying it the power of words that would let it interfere with the transformations going on inside me.

A few months ago, I decided to start thinking more verbally again, so that it would be easier to fulfill my dreams of being a novelist. I'll try to remember exactly how I did this, but the process is not wholly clear to me. I know that at some point I made a decision to "reprogram myself" to be more verbal, and I think that my desire to transform contributed significantly to the actual transformation. I also made a point of trying to express myself more verbally in my mind. One exercise I did involved looking at things in the world and trying to come up with eloquent verbal descriptions of them. I also started writing more in my journal, and reading a lot more. This wasn't a rigorous scientific experiment, and I didn't keep very careful track of the different things I was doing to reshape my thought-structure, but whatever I did worked, because I think very verbally now.

I'm not sure if this was helpful at all, but I figured I should comment here, since this is something I've actually done. I also don't know how similar these processes are between different people, or whether it matters that I'm female. Furthermore, I'll note that it was much easier to train myself to think verbally again than it was to make myself think less verbally; I was always a very verbal person growing up. Not sure if that's nature or nurture or both.

Comment author: JoshuaZ 06 July 2011 02:35:42AM 4 points [-]

I'm concerned that this sort of explanation could just as easily explain the exact opposite and so doesn't really give us much information. For example, one could imagine that in societies with less sexual repression, people are more likely to have sex or engage in sexual activity in a variety of circumstances and so have more opportunities to imprint on non-standard objects or situations. Moreover, the more open a society is about sex the more likely people are to hear about some fetish and decide to try it out just to see what it is like, and then get imprinted to it.

Comment author: TCB 06 July 2011 02:51:49AM 1 point [-]

Those are valid concerns. Regarding the first, that's why I emphasized the ritual component of sex in a repressed society. I suspect that such a society would have very strict rituals for sex: it must occur only at specific times in specific locations, and in the presence of certain stimuli. Some examples of stimuli are candles or lacy lingerie or dim lighting. An example of a time is night. I've heard lots of comments to the effect that having sex in the middle of the day would be strange, and that sex is strictly a nighttime activity. This could be classified as a "nighttime fetish", perhaps. The ritual component of sex would serve to highlight the ritual times/locations/stimuli, causing them to imprint more strongly than other, non-ritualized components of the sexual act.

Regarding your second objection, while that definitely seems like a possibility, the variations and experimentation would probably mean that no one thing would imprint strongly enough to become a fetish, because its presence wouldn't correlate strongly enough with the sexual act.

Comment author: TCB 06 July 2011 02:25:06AM 3 points [-]

Slightly off-topic thought regarding penny jars and fetish formation:

I've heard that fetishes are more prevalent in cultures where sex is repressed. I always wondered why this would be the case (assuming that it is in fact true). One explanation is associations: if people are raised to think sex is dirty, or that sex is a necessary but base bodily function akin to using the bathroom, then they might fetishize urine or excrement. And if people are raised to think that sex is beastly and animalistic, they might fetishize things that are related to animals and violence.

However, the penny jar experiment suggests another, more "innocent" explanation. Perhaps it's simply that, in sexually repressed cultures, people don't have sex very often, or they do it in a special ritualized setting. If this is the case, then accidents of that ritual setting might become associated with the sexual act itself. So, perhaps the neurons corresponding to the distinctive pink pillows on a lover's bed get wired up with the neurons that correspond to actually having sex. Then later, the pink pillows are enough to cause arousal, and perhaps in extreme cases pink pillows later become /required/ for arousal. This presumably wouldn't happen as much if the sex-location changed frequently, or if the setting was not seen as an important component of the ritual.

The first hypothesis seems to be a better explanation of things like poop fetishes, while the second hypothesis might better explain things like lacy pink lingerie. What do you think?

Also, it goes without saying that I enjoyed your article. =)

Comment author: [deleted] 04 July 2011 08:57:44AM 8 points [-]

For instance, suppose you believe that homogenizing milk makes it less healthy, and we should all drink our milk raw. (I picked this example because it's something decidedly non-mainstream that I believe with high probability.)

Homogenization or pasteurization? If it's the former, that's... unusual. I don't think I've seen anyone complain about that before. If it's the latter, I submit for your consideration that if you weren't aware of the difference between these two things (at least, not sufficiently aware to avoid mixing them up), then perhaps you don't know as much about them as you thought you did.

Comment author: TCB 04 July 2011 01:54:12PM 1 point [-]

Oops, you are right; I meant to type pasteurization! I also think that homogenizing milk is bad, but I believe that with lower probability. I'll edit my post, and thanks for the correction. =)

Are search engines perpetuating our biases?

17 TCB 04 July 2011 01:29AM

One of the great things about the internet is that there is a social group for almost every interest.  Pick an unusual hobby or ideology, and there is probably an online community centered around it.  This is especially wonderful for those of us who never quite fit in to mainstream society.

But there's also a downside to this aspect of the internet, which is that the more we immerse ourselves in these small online communities, the less exposure we get to the rest of the world.  And the less exposure we get to the rest of the world, the easier it is for us to hold onto false beliefs that the rest of the world rejects.  (Of course, it's also easier to hold onto true beliefs that the rest of the world rejects.)

For instance, suppose you believe that pasteurizing milk makes it less healthy, and we should all drink our milk raw.  (I picked this example because it's something decidedly non-mainstream that I believe with high probability.)  I'm fairly susceptible to social pressures, so at least for me, my belief in this proposition goes up when I'm hanging out with intelligent people who agree with me, and it goes down when I'm hanging out with intelligent people who look at me like I'm insane when I claim such a thing.  They don't need to state evidence in either direction to influence my belief-probability, though that certainly helps.  The important thing is that I think they're smart and therefore I trust their opinions.

Unsurprisingly, if I spend most of my time hanging out with normal, intelligent, scientifically-minded Americans, I start to question my beliefs regarding raw milk, but if I spend all my time on raw-milk-promoting websites, then my belief that raw milk is good for us is reaffirmed.

We like having our beliefs affirmed; it makes us happy when other people think we are right about things.  We'd rather seek out people who agree with us and can relate to our mindsets than seek out groups where everyone disagrees with us strongly.  This is normal and reasonable, and it's why all of us rationalists are hanging out here on LessWrong instead of lurking in creationist forums.  However, it does put us at risk of creating feedback loops: unusual ideas are proposed by people we respect, we affirm those ideas, others repeat those ideas, and their prevalence and repetition cause them to be repeated even more.  Many of those who disagree are hesitant to voice their disagreements for fear of rejection.  As a result, LessWrong perpetuates many ideas that the rest of the world considers somewhat odd.  Also, the rest of the world perpetuates many ideas that we at LessWrong consider extremely odd.

I'm not saying anything new here, I know.  Everything I've written so far has been discussed to death on LessWrong, and if I were less lazy this article would be full of links to the sequences.  If I recall correctly, the sequences recommend countering this problem by recognizing that we have these biases, and consciously trying to correct for them.

I try to do this, but I also tend to employ an additional solution to this problem.  Because I recognize that I'm easily influenced by others' beliefs, I make sure to expose myself to a myriad of different belief systems.  For instance, in politics, I read blogs by liberal feminist scientists as well as conservative anti-feminist traditionalists.  Since I respect the authors of all the blogs I read, and recognize that they are intelligent people who have thought deeply about their perspectives, I can't easily dismiss either perspective outright as lies spoken by a moron.  Since their beliefs differ so radically, I also can't just fall into the trap of believing everything I read.  So I'm forced to really think about the ideas, and question why their proponents believe them, and consolidate them (and other thoughts I might have) into my own coherent worldview.

Thus, I consider it important to be exposed to the ideas of people I disagree with.  Meeting intelligent people who think differently than I do keeps my mind open, and reminds me that there are things about the world that I don't know yet, and keeps me from overestimating the probability that my beliefs are true.

Unfortunately, search engines like Google are making it more difficult for me to do so.  About a week ago, I attended a lecture on information retrieval, and I was shocked to find out exactly how much our Google searches are customized to our own preferences.

Suppose John and Mary both Google something like "creationism".  Now suppose that John is an atheist who reads a lot of atheist forums, and Mary is a fundamentalist Christian who spends most of her time on Christian forums.  John's Google results might contain a lot of links to people on his favorite atheist website talking about how much creationism sucks, and Mary's Google results might contain a lot of links to her friends' blogs talking about how God created the earth.

In this example, John and Mary are both having their beliefs reaffirmed, because Google is presenting them with things they want to hear.  They will not be exposed to opposing viewpoints, and will be much less likely to change their minds about important issues.  In fact, their beliefs in their own viewpoints will probably grow stronger and stronger each time Google gives them back these results, and they will become less and less aware that another viewpoint exists.
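
To make the John and Mary example concrete, here is a toy sketch of how history-based re-ranking could produce that effect. Everything in it is an assumption made for illustration: the site names, the scores, the `boost` parameter, and the scoring rule itself are hypothetical, and this is not a description of how Google or any real search engine ranks results.

```python
from collections import Counter

def personalized_rank(results, visited_sites, boost=1.0):
    """Re-rank (site, base_score) pairs, boosting sites the user already visits.

    Hypothetical scoring rule: base relevance plus a bonus proportional to the
    share of the user's browsing history that the site accounts for.
    """
    visits = Counter(visited_sites)
    total = sum(visits.values()) or 1  # avoid dividing by zero for an empty history

    def score(item):
        site, base = item
        return base + boost * visits[site] / total

    return [site for site, _ in sorted(results, key=score, reverse=True)]

# Made-up results for the query "creationism", with made-up base relevance scores.
results = [
    ("neutral-encyclopedia.example", 0.9),
    ("atheist-forum.example", 0.6),
    ("christian-blog.example", 0.6),
]
john_history = ["atheist-forum.example"] * 5
mary_history = ["christian-blog.example"] * 5

print(personalized_rank(results, john_history))
print(personalized_rank(results, mary_history))
print(personalized_rank(results, []))  # no history: ordered by base score alone
```

Under this toy rule, the same query puts a different site first for John and for Mary, and neither of them can tell from the result page that anything was reordered.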

Of course, this might happen without Google filtering its search results.  John might deliberately avoid reading the views of creationists, or dismiss them outright as moronic, or not ever Google anything that might lead him to their webpages, because he is convinced of his beliefs and would rather have them affirmed than contradicted.  Since he would skip past the fundamentalist Christian blog results anyway, Google is doing him a service by ranking the stuff he cares about higher.

But at least for me, this Google filtering is a bad thing.  I want to see other webpages which present other viewpoints, instead of being led back to the same places over and over again.  And if Google doesn't show them to me when I search for them, and I don't realize that my Google search results are being customized, I might never realize there's something I'm missing, or go to look for it.

I'm probably making this sound more dire than it actually is.  Obviously, I can try other search terms, or just ignore websites I've already been to.  Or I can follow links on other websites and wander off into regions of the internet without the help of Google.  But I still have a visceral reaction against search engines customizing their results to fit my individual ideological preferences, because they are perpetuating my biases without giving me any direct control over which pieces of information I receive.

What do you guys think?

Comment author: jsteinhardt 21 May 2011 05:20:21AM 14 points [-]

I think LessWrong could use more posts on actual technical topics in machine learning, and this is a nice first step. It would be good if there was a sequence on it.

You might want to include the link to the Wikipedia table of conjugate priors in your post, and at least a mention of exponential families.

If you're a smart Bayesian agent, then, you'll pick p(theta) to be a conjugate prior

While conjugate priors can be very useful computationally, it might also be the case that your data is not well-modeled by the conjugate prior. (If you're using the naive Bayes model then this might not seem like a huge problem, but once you start trying to build hierarchical models using conjugate priors, you have more potential to run into problems.)

Comment author: TCB 21 May 2011 04:34:08PM 2 points [-]

I would love to see an LW sequence on machine learning! I imagine that LW would have a lot of interesting things to say about the philosophical aspects of ML in addition to the practical aspects.

I'm not sure I'd be qualified to contribute much to such a sequence, since I am just an undergrad, but I did have an outline in mind for an intuitive introduction to MLE and EM. If people would find that interesting, I could certainly post it on LW once it was written up!

I'm fairly inexperienced in ML, so all the models I've worked with are simple enough that they've had conjugate priors. (I think it's really cool that Dirichlet priors can be used for something as complicated as an HMM, but I guess the HMM is still just a whole bunch of multinomials.) I'm less familiar with hierarchical models. What is an example of a model for which it is difficult to use conjugate priors? The only hierarchical process I've heard about is the Dirichlet process, and I was under the impression (based on the name) that it involved Dirichlet priors somewhere; is this incorrect? I have been meaning to read about hierarchical models, so if you know of any good tutorials or papers on them, I would very much appreciate a link!

Comment author: Cyan 21 May 2011 03:33:49AM *  15 points [-]

This is a fantastic post! Well done.

That said, I have quibbles that relate to the philosophical import ascribed to the beta distribution:

  • the beta distribution is an excellent exemplar of the notion of the comparative weight of evidence in the prior vs the data, but the notion is much more general;
  • priors should ideally reflect the actual information at one's disposal, and thus should rarely actually be conjugate;
  • it's controversial to claim that alpha = beta = 1 expresses no prior knowledge; other proposals include the improper alpha = beta = 0 and Jeffreys' prior, alpha = beta = 0.5.

And one other complaint: using the notion of picking a "best" value of theta for prediction to motivate the subsequent discussion was a misstep. If prediction is the goal, then the Bayesian procedure is to formulate the joint distribution of theta and the as-yet-unobserved data and then treat theta as a nuisance parameter and integrate over it.
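
For the beta-Bernoulli case this integration has a closed form. The derivation below is a sketch under assumed notation that is not in the post itself: $D$ is the observed data with $h$ heads and $t$ tails, $n = h + t$, and $x_{n+1}$ is the next, as-yet-unobserved flip.

```latex
\begin{align*}
p(x_{n+1} = 1 \mid D)
  &= \int_0^1 p(x_{n+1} = 1 \mid \theta)\, p(\theta \mid D)\, d\theta \\
  &= \int_0^1 \theta \,\mathrm{Beta}(\theta;\ \alpha + h,\ \beta + t)\, d\theta \\
  &= \frac{\alpha + h}{\alpha + \beta + n}.
\end{align*}
% Plug-in alternative: use the posterior mode (MAP estimate) as the "best" theta,
% which is defined when \alpha + h > 1 and \beta + t > 1:
\[
\hat{\theta}_{\mathrm{MAP}} = \frac{\alpha + h - 1}{\alpha + \beta + n - 2}.
\]
```

With $\alpha = \beta = 1$ the integrated prediction is $(h + 1)/(n + 2)$, Laplace's rule of succession, while the plug-in MAP prediction is $h/n$. The two differ most when $n$ is small: after three heads in three flips, the plug-in prediction for another head is 1 and the integrated prediction is 4/5.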

In spite of the above criticisms, I consider this post yeoman's work -- it deserves more upvotes than I can give it.

Comment author: TCB 21 May 2011 03:55:41PM 0 points [-]

After rereading this, I agree with you that I emphasized the beta distribution too heavily. This wasn't my intention; I just picked it because it was the simplest conjugate prior I could find. In the next draft of this document, I'll make sure to stress that the beta distribution is just one of many great conjugate priors!

I am a bit confused about what the second point means. Do you mean that conjugate priors are insufficient for capturing the actual prior knowledge possessed?

I did not know that it was controversial to claim that alpha = beta = 1 expresses no prior knowledge! I think I still prefer alpha = beta = 1 to the other choices, since the uniform distribution has the highest entropy of any continuous distribution over [0,1]. What are the benefits of the other two proposals?
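
The entropy claim can be checked numerically. This is a minimal sketch using the closed-form differential entropy of a Beta(a, b) distribution; the function names are made up, and the digamma function is approximated by a finite difference of `math.lgamma` because the standard library does not provide one. The Haldane prior is left out because it is improper and has no well-defined entropy.

```python
import math

def digamma(x, h=1e-5):
    """Approximate digamma(x) as a central difference of math.lgamma."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

def beta_entropy(a, b):
    """Differential entropy (in nats) of a proper Beta(a, b) distribution."""
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)  # ln B(a, b)
    return (log_norm
            - (a - 1) * digamma(a)
            - (b - 1) * digamma(b)
            + (a + b - 2) * digamma(a + b))

for a, b in [(1, 1), (0.5, 0.5), (2, 2)]:
    print(f"Beta({a}, {b}): {beta_entropy(a, b):.4f} nats")
```

The uniform prior comes out at zero nats, and both Jeffreys' prior and the more peaked Beta(2, 2) come out negative, which is consistent with the uniform distribution having the highest entropy on [0,1]. Entropy is not the only criterion people use, though: Jeffreys' prior is usually motivated by invariance under reparameterization of theta rather than by maximum entropy.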

Your last complaint is something I was worried about when I wrote this. Part of why I wrote it like that was because I figured people would be more familiar with the MLE/MAP style of prediction. Thanks to your feedback, though, I think I'll change that in my next draft of this document.

Again, thank you so much for the detailed criticism; it is very much appreciated! =)
