Comment author: edanm 16 December 2012 04:03:48PM 0 points [-]

That's an interesting idea. Where can I read more about it? Any specific Pinker book?

Comment author: alexflint 26 December 2012 05:08:22PM 0 points [-]

This was from How The Mind Works: http://www.amazon.com/dp/1469228424

In response to Causal Universes
Comment author: alexflint 29 November 2012 10:43:34PM 0 points [-]

There would be no hypothesis in your hypothesis-space to describe the standard model of physics, where space is continuous, indefinitely divisible, and has complex amplitude assignments over uncountable cardinalities of points.

I'm not sure this is necessarily correct. We typically model quantum configurations as functions defined over a continuous domain, but it is still possible that quantum configurations could be represented by a finite set of numbers (more precisely: that all possible configurations of our universe could be expressed as f(x) for some arbitrary but fixed f and some finite vector x). This would follow if the amount of information in the universe is finite, since we know that information is neither created nor destroyed over time. In this case we could represent states of the universe as finite sets of numbers and draw causal arrows between those states over time. Of course, such a representation might be much less convenient than thinking about continuous wavefunctions etc.

Comment author: alexflint 01 November 2012 09:31:30PM 2 points [-]

These theorems, however, ignore the issue of computation --- while the best decision procedure may be Bayesian, the best computationally-efficient decision procedure could easily be non-Bayesian.

This raises another important point against Bayes, which is that the proper Bayesian interpretation may be very mathematically complex.

if we are trying to build a software package that should be widely deployable, we might want to use a frequentist method because users can be sure that the software will work as long as some number of easily-checkable assumptions are met.

I think these are the strongest reasons you've raised that we might want to deviate from pure Bayesianism in practice. We usually think of these (computation and understandability-by-humans) as irritating side issues, to be glossed over and mostly considered after we've made our decision about which algorithm to use. But in practice they often dominate all other considerations, so it would be nice to find a way to rigorously integrate these two desiderata with the others that underpin Bayesianism.

Comment author: alexflint 01 November 2012 09:23:32PM 2 points [-]

Support vector machines [2], which try to pick separating hyperplanes that minimize generalization error, are one example of this where the algorithm is explicitly trying to maximize worst-case utility.

Could you expand on this a little? I've always thought of SVMs as minimizing an expected loss (a sum of hinge losses) rather than any worst-case criterion. Are you referring to the "max min" in the dual QP? I'm interested in other interpretations...
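For concreteness, here is a minimal sketch of the primal objective I have in mind: a regularized sum of hinge losses, minimized by plain subgradient descent. The toy data, learning rate, and function name are illustrative assumptions on my part, not anything from the thread or a standard library.

```python
# Sketch of the primal SVM objective: lam * ||w||^2 + sum_i max(0, 1 - y_i (w.x_i + b)),
# minimized here by fixed-step subgradient descent on 2-D toy data.
def svm_subgradient(data, lam=0.01, lr=0.1, epochs=200):
    """data: list of ((x1, x2), y) pairs with y in {-1, +1}. Returns (w, b)."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), y in data:
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            if margin < 1:
                # Hinge loss is active: subgradient is 2*lam*w - y*x
                w[0] -= lr * (2 * lam * w[0] - y * x1)
                w[1] -= lr * (2 * lam * w[1] - y * x2)
                b += lr * y
            else:
                # Only the regularizer contributes
                w[0] -= lr * 2 * lam * w[0]
                w[1] -= lr * 2 * lam * w[1]
    return w, b
```

On a linearly separable toy set, the learned hyperplane classifies every point correctly; the point is just that nothing in this objective is a worst-case "max min" -- it is an average of per-example losses.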

Comment author: alexflint 01 November 2012 10:49:52AM 3 points [-]

I think you've missed an important piece of this picture, or perhaps have not emphasized it as much as I would. The real reason we can infer causation from correlation is that we have a prior that prefers simple explanations over complex ones, so when some observed frequencies can be explained by a compact (simple) Bayes net, we take the arrows in that Bayes net to represent causation.

A fully connected Bayes net (or equivalently, a causal graph with one hidden node pointing to all observed nodes) can represent any probability distribution whatsoever. Such a Bayes net can never be flat-out falsified. Rather it is our preference for simple explanations that sometimes gives us reason to infer structure in the world.
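The parameter-counting argument behind "a fully connected Bayes net can represent any distribution" can be sketched directly (the function names here are just illustrative): node i conditions on all i-1 predecessors, so it needs one conditional probability per parent configuration, and the totals match.

```python
# In a fully connected Bayes net over n binary variables, the i-th node
# (0-indexed) has i parents and therefore 2**i free conditional probabilities.
def fully_connected_params(n):
    """Free parameters of a fully connected Bayes net over n binary variables."""
    return sum(2 ** i for i in range(n))

def arbitrary_joint_params(n):
    """Free parameters of an unrestricted joint distribution over n binary variables."""
    return 2 ** n - 1

# The two counts agree for every n, which is why the fully connected net
# can fit any distribution and so can never be falsified by data alone.
for n in range(1, 8):
    assert fully_connected_params(n) == arbitrary_joint_params(n)
```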

This contradicts nothing you've said, but I guess I read this article as suggesting there is some fundamental rule that gives us a crisp method for extracting causation from observations, whereas I would look at it as a special case of inference-with-prior-and-likelihood, just like in other forms of Bayesian reasoning.

Comment author: alexflint 17 August 2012 03:34:28AM *  2 points [-]

I'm working for a mid-size startup and have been gathering insight into successful startups for a couple of years. Here is what I think is important.

Create value. Make sure your idea actually creates value in the world. Lots of value. It should conceivably be useful to many people, and of significant value to them. Value means your product is important enough that, if forced to, users would give up other things in exchange for it.

Don't focus on monetization. Startups are subject to all sorts of counter-intuitive economics; it's unrealistic to plan exactly how you will make money. Make sure you're creating value, and check that there's nothing that would prevent you from ever collecting any of that value. Then go back to creating value.

Iteration beats brilliance. The speed at which you iterate is more important than the brilliance of the initial idea. Trying out a product in the real market is an experiment: the feedback you receive entangles your startup with other players in the market. Each experiment steers you towards a local optimum. To win you need (1) to start in the general vicinity of a good local optimum and (2) rapid convergence to that optimum.

The quality of the team is key. Early-stage investors invest largely in the perceived quality of the team, and you should likewise invest your time alongside great people. An early-stage startup should never hire consultants (wrong incentives), and its founders should never live in different cities (bad communication). Entering into a startup is like a marriage: it's very hard to get out.

Choose investors cautiously. You're also "married" to your investors on the day you sign a term sheet. Pick ones that you trust, that share your goals, and that can help you in ways other than by providing capital.

Comment author: Kawoomba 28 June 2012 06:34:36PM 10 points [-]

This method of alternating moves in a branching tree matches both our intuitive thought processes during a chess game (“Okay, if I do this, then Black's going to do this, and then I'd do this, and then...”) and the foundation of the algorithms chess computers like Deep Blue use.

Could you include a reference to alpha-beta pruning, since that is precisely what you're describing? Some readers may be more familiar with that subject domain and appreciate explicitly linking game theory to an established search algorithm.

game theorists call this a gnash equilibrium

Did you mean "Nash equilibrium"? If it was a deliberate pun, you might want to indicate it for those who are looking up new concepts, this being an introductory series.

Comment author: alexflint 30 June 2012 05:41:53PM 4 points [-]

Could you include a reference to alpha-beta pruning, since that is precisely what you're describing? Some readers may be more familiar with that subject domain and appreciate explicitly linking game theory to an established search algorithm.

I think you mean minimax. Alpha-beta pruning is an optimization of minimax that prunes a branch as soon as a max (min) node's value is shown to be worse than an option the opposing player already has available higher up in the tree.
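To make the distinction concrete, here is a minimal sketch of minimax with the alpha-beta pruning step marked; representing the game tree as nested lists of leaf scores is an illustrative assumption.

```python
# Minimax with alpha-beta pruning. Leaves are plain numbers (static
# evaluations); internal nodes are lists of children.
def alphabeta(node, alpha, beta, maximizing):
    if not isinstance(node, list):
        return node  # leaf: return its static evaluation
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:  # pruning: the minimizer above will never allow this line
                break
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:  # pruning: the maximizer above already has a better option
                break
        return value
```

On the classic two-ply example `[[3, 5], [2, 9]]` with the maximizer to move, the root evaluates to max(min(3, 5), min(2, 9)) = 3, and the leaf 9 is never visited: once the second subtree's minimizer finds 2, it can do no better than the 3 already guaranteed.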

Comment author: alexflint 21 April 2012 04:38:38PM *  24 points [-]

The first essay is by far the best introduction to TDT-like reasoning that I've ever read. In fact this paragraph sums up the whole informal part of the idea:

This solution depends in no way on telepathy or bizarre forms of causality. It’s just that the statement I’ll choose C and then everyone will, though entirely correct, is somewhat misleadingly phrased. It involves the word choice, which is incompatible with the compelling quality of logic. Schoolchildren do not choose what 507 divided by 13 is; they figure it out. Analogously, my letter really did not allow choice; it demanded reasoning. Thus, a better way to phrase the voodoo statement would be this: If reasoning guides me to say C, then, as I am no different from anyone else as far as rational thinking is concerned, it will guide everyone to say C.

Hofstadter's comparison of "choice" and "reasoning" gets at the idea that people have decision routines rooted in physics, which can themselves be reasoned about, including recognizing that other people's routines are similar to one's own. I think this is really the core insight of the TDT idea.

And then the one-sentence summary:

Likewise, the argument "Whatever I do, so will everyone else do" is simply a statement of faith that reasoning is universal, at least among rational thinkers, not an endorsement of any mystical kind of causality.

Comment author: alexflint 13 April 2012 01:13:15AM 4 points [-]

Wow, I had no idea that we really really knew why CBT worked. Thank you for this post.

Comment author: selylindi 06 February 2012 09:28:59PM 0 points [-]

Clearly people who don't know about Occam's Razor, and people who explicitly reject it, still believe in the future. Just as clearly, we can use Occam's Razor or other principles in evaluating theories about what happened in the past. Your claim appears wholly unjustified. Was it just a vague hifalutin' metaphysical claim, or are there some underlying points that you're not bringing out?

Comment author: alexflint 08 February 2012 10:41:00PM 0 points [-]

People who don't know about Newtonian mechanics still believe that rocks fall downwards, but people who reject it explicitly will have a harder time reconciling their beliefs with the continued falling of rocks. It would be a mistake to reject Newtonian mechanics, then say "people who reject Newtonian mechanics clearly still believe that rocks fall", then to conclude that there is no problem in rejecting Newtonian mechanics. Similarly, if you reject Occam's razor then you need to replace it with something that actually fills the explanatory gap -- it's not good enough to say "well people who reject Occam's razor clearly still believe Occam's razor", and then just carry right on.
