All of adam_strandberg's Comments + Replies

Yes, thank you for writing this - I've been meaning to write something like it for a while, and now I don't need to! I initially brushed off Newcomb's Paradox as an edge case, and it took me much longer than I would have liked to realize how universal it is. A discussion of this type should be included with every introduction to the problem, to prevent people from treating it as just some pointless philosophical thought experiment.

As far as I can tell from the evidence given in the talk, contagious spreading of obesity is a plausible but not directly proven idea. Its plausibility comes from the more direct tests that he gives later in the talk, namely the observed spread of cooperation or defection in iterated games.

However, I agree that it's probably important not to talk too hastily about contagious obesity, because (a) they haven't done the more direct interventional studies that would show whether it's true, and (b) speculating about contentious social issues in public before ... (read more)

The Moire Eel - move your cursor around and see all the beautiful, beautiful moiré patterns.

Social Networks and Evolution: a great Oxford neuroscience talk. I will also shamelessly push this blog post that I wrote about the connection between the work in the lecture and Jared Diamond's thesis that agriculture was the worst mistake in human history.

0NancyLebovitz
A couple of minutes in, the podcast mentions the somewhat dubious idea that obesity spreads through social networks. Does this cast much doubt on the rest of the piece?

This is exactly what I was thinking the whole time. Is there any example of supposed "ambiguity aversion" that isn't explained by this effect?

Can you imagine a human being saying "I'm sorry, I'm too low-level to participate in this discussion"? There may be a tiny handful of people wise enough to try it.

This is precisely why people should be encouraged to do it more. I've found that the more you admit to a lack of ability where you don't have the ability, the more people are willing to listen to you where you do.

I also see interesting parallels to the relationship between skeptics and pseudoscience, where we replace skeptics -> rationalists, pseudoscience -> religion. Namely, ... (read more)

1) This is fantastic - I keep meaning to read more on how to actually apply Highly Advanced Epistemology to real data, and now I'm learning about it. Thanks!

2) This should be on Main.

3) Does there exist an alternative in the literature to the notation Pr(A = a)? I hadn't realized until now how little sense the use of the equal sign there makes. In standard usage, the equal sign refers either to literal equivalence (or isomorphism), as in functional programming, or to variable assignment, as in imperative programming. This operation is obviously not literal ... (read more)

1Bobertron
The "A=a" stands for the event that the random variable A takes on the value a. It's another notation for the set {ω ∈ Ω | A(ω) = a}, where Ω is your probability space and A is a random variable (a mapping from Ω to something else, often R^n). Okay, maybe you know that, but I just want to point out that there is nothing vague about the "A=a" notation. It's entirely rigorous.
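For what it's worth, the set-theoretic reading above can be made concrete in a few lines of code. A toy sketch with a finite sample space (the coin-flip example and all names here are mine, purely for illustration):

```python
from fractions import Fraction

# Toy probability space: Omega = outcomes of two fair coin flips,
# each outcome equally likely.
Omega = [("H", "H"), ("H", "T"), ("T", "H"), ("T", "T")]
P = {omega: Fraction(1, 4) for omega in Omega}

# A random variable is just a function from Omega to values:
# here, A counts the number of heads.
def A(omega):
    return sum(1 for flip in omega if flip == "H")

# "A = a" names the event {omega in Omega | A(omega) == a},
# and Pr(A = a) is the measure of that set.
def pr_A_equals(a):
    return sum(P[omega] for omega in Omega if A(omega) == a)

print(pr_A_equals(1))  # the event {(H,T), (T,H)} has probability 1/2
```

So the "=" in "A = a" is neither equivalence nor assignment; it's shorthand for a preimage of the random variable.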
9IlyaShpitser
I agree p(A = a) is imprecise. ---------------------------------------- Good notation for interventions has to permit easy nesting and conflicts for [ good reasons I don't want to get into right now ]. do(.) actually isn't very good for this reason (and I have deprecated it in my own work). I like various flavors of the potential outcome notation, e.g. Y(a) to mean "response Y under intervention do(a)". Ander uses Y^a (with a superscript) for the same thing. With potential outcomes we can easily express things like "what would happen to Y if A were forced to a, and M were forced to whatever value M attained had A been instead forced to a' ": Y(a,M(a')). You can't even write this down with do(.).

That is the general approach I've been taking on the issue so far- basically I'm interested in learning about consciousness, and I've been going about it by reading papers on the subject.

However, part of the issue that I have is that I don't know what I don't know. I can look up terms that I don't know that show up in papers, but in the literature there are presumably unspoken inferences being made based on "obvious" information.

Furthermore, since I have a bias toward novelty or flashiness, I may miss things that blatantly and obviously contradic... (read more)

0ChristianKl
That's not true. At the moment the rate of new questions is 2.7 per day. That's still low, but enough to make it worth posting there. Just go ahead and ask your questions.
0John_Maxwell
I don't know about neuro/cog sci in particular, but you might try Quora or http://en.wikipedia.org/?title=Wikipedia:Reference_desk/Science

(How many different DAGs are possible if you have 600 nodes? Apparently, >2^600.)

Naively, I would expect it to be closer to 600^600 (the number of possible directed graphs with 600 nodes).

And in fact, it is some complicated thing that seems to scale much more like n^n than like 2^n: http://en.wikipedia.org/wiki/Directed_acyclic_graph#Combinatorial_enumeration
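For a concrete handle on how fast this grows, the exact labeled-DAG count satisfies a known inclusion-exclusion recurrence (Robinson's formula, the one behind the enumeration discussed on that Wikipedia page). A quick sketch, assuming I've transcribed the recurrence correctly:

```python
from functools import lru_cache
from math import comb

# Number of labeled DAGs on n nodes, via Robinson's recurrence
# (inclusion-exclusion over the k nodes with no incoming edges):
#   a(n) = sum_{k=1}^{n} (-1)^(k+1) * C(n, k) * 2^(k*(n-k)) * a(n-k)
@lru_cache(maxsize=None)
def dag_count(n):
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * dag_count(n - k)
        for k in range(1, n + 1)
    )

# First few values: 1, 1, 3, 25, 543, 29281, ...
for n in range(1, 6):
    print(n, dag_count(n))
```

The count grows roughly like 2^(n^2/2): already far more than 2^n, and for large n far more than n^n as well.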

6gwern
It appears I've accidentally nerdsniped everyone! I was just trying to give an idea that it was really, really big. (I had done some googling for the exact answer, but the formulas all seemed rather complicated, and rather than try for an exact answer and get it wrong, I just gave a lower bound.)
topynate100

There's an asymptotic approximation in the OEIS: a(n) ~ n!2^(n(n-1)/2)/(M*p^n), with M and p constants. So log(a(n)) = O(n^2), as opposed to log(2^n) = O(n), log(n!) = O(n log(n)), log(n^n) = O(n log(n)).

5IlyaShpitser
If we allow cycles, then there are three possibilities for an edge between a pair of vertices in a directed graph: no edge, or an arrow in either direction. Since a graph of n vertices has n choose 2 pairs, the total number of DAGs of n vertices has an upper bound of 3^(n choose 2). This is much smaller than n^n. edit: the last sentence is wrong. ---------------------------------------- Gwern, thanks for writing more, I will have more to say later.

I wrote a blog post describing the article, talking about criticisms of Crick and Koch's theory, and describing related research involving salvia:

http://the-lagrangian.blogspot.com/2014/07/epilepsy-consciousness-and-salvia.html

Enjoy.

The 100 Questions link is really nice - I particularly liked this question: "How random are synaptic events? And why (both from a functional as well as from a biophysical point of view)?" I am not sure why this question hadn't already occurred to me, but I'm glad I have it now.

2Shmi
It works, but intermittently. I suspect issues at Science Direct. Abstract, just in case: The paper itself is paywalled; if someone feels like posting the full text some place accessible, by all means.

Even better than that is this series of blog posts, which talks about color identification across languages, the way that color-space is in a sense "optimally" divided by basic color words, and how children develop a sense for naming colors:

http://www.wired.com/wiredscience/2012/06/the-crayola-fication-of-the-world-how-we-gave-colors-names-and-it-messed-with-our-brains-part-i/ http://www.wired.com/wiredscience/2012/06/the-crayola-fication-of-the-world-how-we-gave-colors-names-and-it-messed-with-our-brains-part-ii/

Also, this from his summary of Nietzsche's "Thus Spoke Zarathustra":

Humanity isn't an end, it's a fork in the road, and you have two options: "Animal" and "Superman". For some reason, people keep going left, the easy way, the way back to where we came from. Fuck 'em. Other people just stand there, staring at the signposts, as if they're going to come alive and tell them what to do or something. Dude, the sign says fucking "SUPERMAN". How much more of a clue do these assholes want?

I am deeply confused by your statement that the complete class theorem only implies that Bayesian techniques are locally optimal. If for EVERY non-Bayesian method there's a better Bayesian method, then the globally optimal technique must be a Bayesian method.

1jsteinhardt
There is a difference between "the globally optimal technique is Bayesian" and "a Bayesian technique is globally optimal". In the latter case, we still have to choose from an infinitely large family of techniques (one for each choice of prior). Bayes doesn't help me know which of these I should choose. In contrast, there are frequentist techniques (e.g. minimax) that will give me a full prescription of what I ought to do. Those techniques can in many (but not all) cases be interpreted in terms of a prior, but "choose a prior and update" wasn't the advice that led me to that decision; rather, it was "play the minimax decision rule". As I said in my post:
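A toy illustration of the minimax-vs-Bayes contrast in the reply above: for a finite decision problem, "play the minimax decision rule" is a complete prescription that never mentions a prior. A sketch (the actions, states, and loss numbers are all made up):

```python
# Toy decision problem: loss[action][state] is the loss incurred
# by taking that action when that state of nature obtains.
loss = {
    "act_1": {"state_A": 0, "state_B": 10},
    "act_2": {"state_A": 4, "state_B": 4},
}

# Minimax: pick the action whose worst-case loss is smallest.
def minimax_action(loss):
    return min(loss, key=lambda a: max(loss[a].values()))

# Bayes: pick the action minimizing expected loss under a prior.
def bayes_action(loss, prior):
    return min(
        loss,
        key=lambda a: sum(prior[s] * loss[a][s] for s in prior),
    )

print(minimax_action(loss))  # act_2: worst case 4, versus 10 for act_1

# The minimax rule coincides with the Bayes rule under a sufficiently
# pessimistic ("least favorable") prior, here one concentrated on state_B:
print(bayes_action(loss, {"state_A": 0.0, "state_B": 1.0}))  # act_2
```

Note that `minimax_action` takes no prior argument at all, while `bayes_action` is a family of rules indexed by the prior, which is exactly the point being made.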

In section 8.1, your example of the gambler's ruin postulates that both agents have the same starting resources, but this is exactly the case in which the gambler's ruin doesn't apply. That might be worth changing.

According to Wikipedia, there are at least 4 groups currently working on LFTRs, one of which is China: http://en.wikipedia.org/wiki/LFTR#Recent_developments

0Eliezer Yudkowsky
Right. They're hiring 150 PhD students and it's still supposed to take 20 years. This seems like a prime instance of the We Can't Do Anything Effect.

Even a few years of delay can make a big difference if you are in the middle of a major war. If Galston hadn't published his results and they weren't found until a decade or two later, the US probably wouldn't have used Agent Orange in Vietnam. Similarly with chlorine gas in WWI, atomic bombs in WWII, etc. Granted, delaying the invention doesn't necessarily make the overall outcome better. If the atomic bomb wasn't invented until the 1950s and we didn't have the examples of Hiroshima and Nagasaki, then the US or USSR would probably have been more likely to use them against each other.

2A1987dM
Huh. I had never thought about that from that angle.
2Desrtopa
For that matter, if we hadn't used the atom bombs on Hiroshima and Nagasaki, we would have gone ahead with the land invasion, resulting in far more fatalities. When wars are fought until a decisive victory, a huge technological edge may serve to decrease the death toll, since the side at a disadvantage will be more easily persuaded to give up.

Even if the BB and the psychic are in causally disconnected parts of your model, their having the same probability of being correlated with the card doesn't imply that the Causal Markov Condition is broken. In order to show that, you would need to specify all of the parent nodes of the BB in your model, calculate the probability of it being correlated with the card, and then see whether having knowledge of the psychic would change your probability for the BB. Since all known physics is local in nature, I can't think of anything that would imply this is ... (read more)

0pragmatist
I'm having trouble parsing this comment. You seem to be granting that the BB's state is correlated with the top card (I'm assuming this is what you mean by "having the same probability"), that there is no direct causal link between the BB and the psychic, and that there are no common causes, but saying that this still doesn't necessarily violate the CMC. Am I interpreting you right? If I'm not, could you tell me which one of those premises does not hold in my example? If I am interpreting you correctly, then you are wrong. The CMC entails that if X and Y are correlated, X is not a cause of Y, and Y is not a cause of X, then there are common causes of X and Y such that the variables are independent conditional on those common causes.
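The screening-off condition in the reply above is easy to check exactly on a toy common-cause model. A sketch (the variables and the 1/5 flip probability are arbitrary choices of mine):

```python
from itertools import product
from fractions import Fraction

# Common cause Z; X and Y each copy Z but flip with probability 1/5,
# independently of each other. So X and Y are correlated marginally,
# but the CMC says they should be independent conditional on Z.
pZ = {0: Fraction(1, 2), 1: Fraction(1, 2)}
flip = Fraction(1, 5)

def p_given_z(v, z):
    # P(X = v | Z = z); same for Y by symmetry.
    return 1 - flip if v == z else flip

# Exact joint distribution over (x, y, z).
joint = {
    (x, y, z): pZ[z] * p_given_z(x, z) * p_given_z(y, z)
    for x, y, z in product((0, 1), repeat=3)
}

def p(pred):
    return sum(pr for xyz, pr in joint.items() if pred(*xyz))

# Marginally, X and Y are NOT independent:
pxy = p(lambda x, y, z: x == 1 and y == 1)
px = p(lambda x, y, z: x == 1)
py = p(lambda x, y, z: y == 1)
print(pxy == px * py)  # False: correlated

# Conditional on Z = 1, they ARE independent (screened off):
pz1 = p(lambda x, y, z: z == 1)
pxy_z = p(lambda x, y, z: x == 1 and y == 1 and z == 1) / pz1
px_z = p(lambda x, y, z: x == 1 and z == 1) / pz1
py_z = p(lambda x, y, z: y == 1 and z == 1) / pz1
print(pxy_z == px_z * py_z)  # True: screened off
```

The disputed question in this thread is whether a common cause with this screening-off property always exists, not whether screening off holds once you have one.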

Can you provide an example? I would claim that for any model in which you have a mathematical truth as a node in a causal graph, you can replace that node by whatever series of physical events caused you to believe that mathematical truth.

2Peterdjones
Why would you want a mathematical truth on a causal graph? Are the transition probabilities ever going to be less than 1.0?
3Eugine_Nier
I add 387+875 to get 1262; from this I can conclude that anyone else doing the same computation will get the same answer, despite never having interacted with them.

The CMC is not strictly violated in physics as far as we know. If you specify the state of the universe for the entire past light cone of some event, then you uniquely specify the event. The example that you gave of the rock shooting out of the pond indeed does not violate the laws of physics - you simply shoved the causality under the rug by claiming that the edge of the pond fluctuated "spontaneously". This is not true: the fluctuation of the pond's edge was completely specified by the past light cone of that event. This is the sense in which the ... (read more)

2pragmatist
I meant "spontaneous" in the ordinary thermodynamic sense of spontaneity (like when we say systems spontaneously equilibrate, or that spontaneous fluctuations occur in thermodynamic systems), so no violation of microphysical law was intended. Spontaneous here just means there is no discernible macroscopic cause of the event. Now it is true that everything that happened in the scenario I described was microscopically determined by physical law, but this is not enough to satisfy the CMC. What we need is some common cause account of the macroscopic correlation that leads to a coherent inward-directed wave, and simply specifying that the process is law-governed does not provide such an account.

I guess you could just say that the common cause is the initial conditions of the universe, or something like that. If that kind of move is allowed, then the CMC is trivially satisfied for every correlation. But when people usually appeal to the CMC they intend something stronger than this. They're usually talking about a spatially localized cause, not an entire spatial hypersurface.

If you allow entire hypersurfaces as nodes in your graph, you run into trouble. In a deterministic world, any correlation between two properties isn't just screened off by the contents of past hypersurfaces, it's also screened off by the contents of future hypersurfaces. But a future hypersurface can't be a common cause of the correlated properties, so we have a correlation screened off by a node that doesn't d-separate the correlated variables. This doesn't violate the CMC per se, but it does violate the Faithfulness Condition, which says that the only conditional independencies in nature are the ones described by the CMC. If the Faithfulness Condition fails, then the CMC becomes pretty useless as a tool for discerning causation from correlation. The lessons of Eliezer's posts would no longer apply. So to rule out radical failure of the Faithfulness Condition in a deterministic setting, we have to

An omniscient agent could still describe a causal structure over the universe - it would simply be deterministic (which is a special case of a probabilistic causal structure). For instance, consider a being that knew all the worldlines of all particles in the universe. It could deduce a causal structure by re-describing these worldlines as a particular solution to a local differential equation. The key difference between causal and acausal descriptions is whether or not they are local.

I think it makes more sense to say that this test rules out ideas that can't actually be tested as hypotheses. An idea can only be tested by observation once it is formulated as a causal network. Once it's formulated as a testable hypothesis, you can simply discard this epiphenomenal example by Solomonoff induction.