
Teaching Bayesianism

3 JQuinton 08 June 2012 08:18PM

I've had a bit of success getting people to understand Bayesianism at parties and such. I'm posting the thought experiment I came up with to see whether it can be improved, or whether an entirely different thought experiment would be grasped more intuitively in that context:

Say there is a jar filled with dice. There are two types of dice in the jar: an ordinary 8-sided die numbered 1 through 8, and a trick die with a 3 on every face. The jar contains an even split of the two types. If a friend of yours grabbed a die from the jar at random, rolled it, and told you it came up 3, is it more likely that they grabbed the 8-sided die or the trick die?

I originally came up with this idea to explain falsifiability, which is why I didn't go with, say, the example in the better-known article on Bayesianism (i.e. any roll other than a 3 refutes the possibility that the trick die was picked), and to illustrate a hypothesis that explains too much contradictory data. So eventually I increase the number of sides on the die (say, a hypothetical 50-sided die), the types of dice in the jar (100-sided, 6-sided, trick die), and the distribution of dice in the jar (90% of the dice are 200-sided but a 3 is rolled, etc.). Again, I've been discussing this at parties where alcohol is flowing and cognition is impaired, yet people understand it; so if it works there, I figure it can be understood intuitively by many people.
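The jar example can be worked through explicitly. A minimal sketch of the Bayes computation, using the numbers from the post (50/50 prior, fair 8-sided die vs. a die with 3 on every face):

```python
# Prior: the jar has an even split of the two die types.
p_trick, p_fair = 0.5, 0.5

# Likelihood of rolling a 3 under each hypothesis.
likelihood_trick = 1.0   # the trick die always shows 3
likelihood_fair = 1 / 8  # the 8-sided die shows 3 one time in eight

# Bayes' theorem: P(trick | rolled 3).
evidence = p_trick * likelihood_trick + p_fair * likelihood_fair
posterior_trick = p_trick * likelihood_trick / evidence
print(posterior_trick)  # 8/9, about 0.889
```

So a single 3 already makes the trick die far more likely, and the posterior shifts further as the hypothetical dice gain more sides.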

Epistemic Utility Arguments for Probabilism [Link]

1 XiXiDu 26 September 2011 11:10AM

Stanford Encyclopedia of Philosophy

First published Fri Sep 23, 2011

In this entry, we explore a particular strategy that we might deploy when we wish to establish an epistemic norm such as Probabilism or Conditionalization. It is called epistemic utility theory, or sometimes cognitive decision theory. I will use the former. Epistemic utility theory is inspired by traditional utility theory, so let's begin with a quick summary of that.

Traditional utility theory (also known as decision theory) explores a particular strategy for establishing the norms that govern which actions it is rational for us to perform in a given situation. The framework for the theory includes states of the world, actions, and, for each agent, a utility function, which takes a state of the world and an action and returns a measure of the extent to which the agent values the outcome of performing that action at that world. We call this measure the utility of the outcome at the world.

[...] we might say that an agent ought to perform an action that has maximal expected utility, where the expected utility of an action is obtained by weighting its utility at each state of the world by the credence assigned to that state of the world, and summing. This norm is called Maximize Expected Utility.
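The norm quoted above is easy to make concrete. A minimal sketch (the states, actions, and numbers here are my own illustration, not from the SEP entry): expected utility weights the utility of an action at each state by the credence in that state, and the norm says pick the action with the largest sum.

```python
# Agent's credences over states of the world (must sum to 1).
credence = {"rain": 0.3, "sun": 0.7}

# utility(state, action): how much the agent values each outcome.
utility = {
    ("rain", "umbrella"): 5, ("rain", "no_umbrella"): -10,
    ("sun", "umbrella"): 2,  ("sun", "no_umbrella"): 8,
}

def expected_utility(action):
    """Credence-weighted sum of utilities across states."""
    return sum(credence[s] * utility[(s, action)] for s in credence)

best = max(["umbrella", "no_umbrella"], key=expected_utility)
# EU(umbrella) = 0.3*5 + 0.7*2 = 2.9; EU(no_umbrella) = 0.3*(-10) + 0.7*8 = 2.6
print(best)  # umbrella
```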

Link: plato.stanford.edu/entries/epistemic-utility/

An introduction to Bayesianism [links]

9 lukeprog 29 August 2011 07:06PM

Two recent papers in Philosophy Compass summarize the arguments for and against Bayesianism:

Easwaran, Bayesianism I: Introduction and Arguments in Favor

Easwaran, Bayesianism II: Applications and Criticisms

Philosophy Compass is a journal of review articles, my favorite reading material.

Kenny Easwaran got his PhD from UC Berkeley under formal epistemologist Branden Fitelson, did a post-doc at ANU, and is now an assistant professor at USC.

A potential problem with using Solomonoff induction as a prior

13 JoshuaZ 07 April 2011 07:27PM

There's a problem that has occurred to me that I haven't seen discussed anywhere: I don't think people actually want to assign zero probability to all hypotheses that are not Turing computable. Consider the following hypothetical: we come up with a theory of everything that seems to explain all the laws of physics, but there's a single open parameter (say, the fine structure constant). We compute a large number of digits of this constant, and someone notices that, when it is expressed in base 2, the nth digit seems to be 1 iff the nth Turing machine halts on the blank tape, under some fairly natural ordering of all Turing machines. If we confirm this for a large number of digits (not necessarily consecutive digits; obviously some of the 0s won't be confirmable), shouldn't we consider the hypothesis that the digits really are given by this simple but non-computable function? But if our prior assigns zero probability to all non-computable hypotheses, then this hypothesis must always be stuck with zero probability.

If the universe is finite, we could approximate this function with a function that instead asks "halts within K steps," where K is some large number, but intuitively this seems like a more complicated hypothesis than the original one.
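A toy illustration of that intuition (my own construction, not from the post): writing the cutoff K into the hypothesis "halts within K steps" costs roughly log2(K) extra bits of description, so for the astronomically large K the approximation would need, the bounded hypothesis is longer, and hence gets a lower complexity-weighted prior, than the unbounded "halts" hypothesis it approximates.

```python
import math

def extra_bits(K):
    """Bits needed just to write down the cutoff K."""
    return math.ceil(math.log2(K))

# The cost grows without bound as the cutoff grows.
for K in [10**6, 10**20, 10**100]:
    print(K, extra_bits(K))
```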

I'm not sure what a reasonable prior that handles this sort of thing would look like. We presumably don't want an uncountable set of hypotheses. It might make sense to use something like the set of hypotheses describable in Peano arithmetic.


Bayesianism in the face of unknowns

1 rstarkov 12 March 2011 08:54PM

Suppose I tell you I have an infinite supply of unfair coins. I pick one at random, flip it, and record the result. I've done this a total of 100 times, and every flip came up heads. I will pay you $1000 if the next flip is heads, and $10 if it's tails. Each unfair coin is otherwise entirely normal: its flips come up heads independently with some unknown probability p. This is all you know. How much would you pay to enter this game?

I suppose another way to phrase this question is "what is your best estimate of your expected winnings?", or, more generally, "how do you choose the maximum price you'll pay to play this game?"

Observe that the only information you have about the distribution from which I'm drawing my coins is those 100 outcomes. Importantly, you don't know how each coin's p is distributed across my supply of unfair coins. Can you reasonably assume a specific distribution for your calculation, and claim that it yields a better estimate than any other distribution would?
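One concrete way to see what an assumed distribution buys you, under a strong simplifying assumption the post leaves open: put a uniform prior on q, the average heads-probability of a freshly drawn coin. Since each observed flip used a fresh coin, each flip is effectively Bernoulli(q), so 100 heads in 100 flips gives a Beta(101, 1) posterior on q, and Laplace's rule of succession gives P(next flip is heads) = 101/102. This is a sketch of one possible model, not the answer the post asks for.

```python
from fractions import Fraction

heads, flips = 100, 100

# Rule of succession: uniform prior on q plus `heads` successes
# in `flips` trials gives predictive probability (heads+1)/(flips+2).
p_heads = Fraction(heads + 1, flips + 2)

# Expected winnings with the post's payoffs: $1000 on heads, $10 on tails.
expected = p_heads * 1000 + (1 - p_heads) * 10
print(float(p_heads), float(expected))  # about 0.9902 and about $990.29
```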

Most importantly, can one actually produce a "theoretically sound" expectation here? I.e. one that is calibrated so that, if you pay your expected winnings every time and we repeat this experiment many times, your average net winnings will be zero (assuming I use the same source of unfair coins each time).

I suspect that the best one can do here is produce a range of values with confidence intervals. So you're 80% confident that the price you should pay to break even in the repeated game is between A80 and B80, 95% confident it's between A95 and B95, etc.
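Under the uniform-prior model (my assumption, not the post's), intervals like these have a closed form: the Beta(101, 1) posterior has CDF F(q) = q^101, so a credible-interval endpoint is just q = p^(1/101), with no numerical integration needed. A minimal sketch:

```python
n = 100  # observed flips, all heads

def beta_quantile(p, n=n):
    """Quantile of a Beta(n+1, 1) posterior: solve q**(n+1) = p for q."""
    return p ** (1 / (n + 1))

# Lower endpoints of one-sided credible intervals for q (upper end is 1).
lo95 = beta_quantile(0.05)  # 95% credible: q above this value
lo80 = beta_quantile(0.20)  # 80% credible: q above this value
print(lo95, lo80)  # roughly 0.971 and 0.984
```

Note this gives an interval for the heads-probability q, not directly for the break-even price; the price interval would follow by plugging the endpoints into the payoff calculation.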

If this really is the best obtainable result, then what is a Bayesian to do with it to make a decision? Do you pick a price randomly from a specially crafted distribution that is 95% likely to produce a value between A95 and B95, etc.? Or is there a more "Bayesian" way?