This claim doesn't make much sense from the outset. Look at your specific example of transistors. In 1965, an electronics magazine wanted to figure out what would happen over time with electronics/transistors, so they called up an expert: Gordon Moore, the director of research at Fairchild Semiconductor. Moore proceeded to coin Moore's law and told them the doubling would continue for at least a decade, probably more. Moore wasn't an outsider, he was an expert.
You then generalize from an incorrect anecdote.
I'm not sure the connotation of the term (i.e. a black person being successful at anything is so shocking it's entertainment value all on its own) makes the statement any better. Especially when discussing, say, one of the most important American musicians of all time (among others).
I thought the heuristic was "if I think I passed the hotel, I was going too fast to notice. I better slow down so I see it when I come up on it, or so I might recognize a landmark/road that indicates I went too far." We slow down not because we are splitting the difference between turning around and continuing on. We slow down to make it easier to gather more information, a perfectly rational response.
Sure, not 100% unique to academia, there are also industrial research environments.
My PhD was in physics, and there were lots of examples: weird tricks for aligning optics benches, semi-classical models that gave good order-of-magnitude estimates despite a lack of rigour, which estimates from the literature were trustworthy (and which were garbage). Biophysics and materials science labs have all sorts of rituals around sample and culture growth and preparation. Many were voodoo, but there were good reasons for a lot of them as well.
Even tr...
In STEM fields, there is a great deal of necessary knowledge that simply is not in journals or articles, and is carried forward as institutional knowledge passed around among grad students and professors.
Maybe someday someone clever will figure out how to disseminate that knowledge, but it simply isn't there yet.
No, the important older theories led to better theories.
Newton's gravitational physics made correct predictions of limited precision, and Newton's laws led to the development of Navier-Stokes, kinetic theories of gases, etc. Even phlogiston led to the discovery of oxygen and the modern understanding of oxidation. You don't have to be 100% right to make useful predictions.
Vitalism, on the other hand, like astrology, didn't lead anywhere useful.
But quantum theory also makes correct predictions, and mainstream physics does not en masse advocate quackery. Vitalism never worked, and it led the entire medical community to advocate actively harmful quackery for much of the 19th century.
No, vitalism wasn't just a dead end, it was a wrong alley that too many people spent time wandering down. Vital theories were responsible for a lot of the quack ideas of medical history.
I don't think that is true? There is a huge contingent of evangelicals (last I checked, a bit under half of Americans believe in creationism), and it only takes a few non-creationist but religious Christians to get to a majority.
There is a lot of statistical literature on optimal experimental design, and it's used all the time. Years ago at Intel, we spent a lot of time on optimal design of quality control measurements, and I have no doubt a lot of industrial scientists in other companies spend their time thinking about such things.
The problem is that information is a model-dependent concept (derivatives of the log-likelihood depend on the likelihood), so if your prior isn't fairly strong, there isn't a lot of improvement to be had. A lot of science is exploratory, trying to optimize ...
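To make the model-dependence concrete, here is the standard definition of Fisher information, which the usual optimality criteria (D-optimality, etc.) are built from; note that it is defined entirely in terms of the assumed likelihood:

```latex
% Fisher information of a parameter \theta under an assumed likelihood f(x;\theta):
I(\theta) \;=\; \mathbb{E}\!\left[\left(\frac{\partial}{\partial\theta}\log f(X;\theta)\right)^{\!2}\right]
\;=\; -\,\mathbb{E}\!\left[\frac{\partial^{2}}{\partial\theta^{2}}\log f(X;\theta)\right]
```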
I don't understand the improvement you think is possible here. In a lot of cases, the math isn't the problem; the theory is known. The difficulty is usually finding a large enough sample size, etc.
You'd think so, but office hours and TA sections without attendance grades are very sparsely attended.
How hard your quals are depends on how well you know your field. I went to a top 5 physics program, and everyone passed their qualifying exams, roughly half of whom opted to take the qual their first year of grad school. Obviously, we weren't randomly selected though.
Fellowships are a crapshoot that depend on a lot of factors outside your control, but getting funding is generally pretty easy in the sciences. When you work as an "RA" you are basically just doing your thesis research. TAing can be time consuming, but literally no one cares if...
I don't think medicine is a junk investment when you consider the opportunity cost, at least in the US.
Consider my sister, a fairly median medical school graduate in the US. After 4 years of medical school (plus her undergrad) she graduated with 150k in debt (at 6% or so). She then did a residency for 3 years making 50k a year, give or take. After that she became an attending with a starting salary of $220k. At younger than 30, she was in the top 4% of salaries in the US.
The opportunity cost is maybe ~$45k × 4 years = $180k, plus the direct cost of $150k or so....
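A rough back-of-the-envelope version of that comparison, using only the numbers above and deliberately ignoring taxes, loan interest, and raises (all simplifying assumptions on my part):

```python
# Rough sketch of the opportunity-cost comparison using the figures quoted above.
# Simplifying assumptions: no taxes, no interest accrual, flat salaries.
med_school_debt = 150_000      # direct cost of medical school
forgone_salary  = 45_000 * 4   # ~4 years of an assumed alternative salary
# (residency pay of ~50k/year roughly matches the alternative, so call it a wash)

attending_pay   = 220_000      # starting attending salary
alternative_pay = 45_000       # assumed alternative career salary

upfront_cost = med_school_debt + forgone_salary   # ~330k
annual_gap   = attending_pay - alternative_pay    # ~175k per year afterwards
print(f"upfront cost ~${upfront_cost:,}, recouped in ~{upfront_cost / annual_gap:.1f} years")
```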
I don't see how Eliezer is correct here. Conservation of energy just isn't tied to the deep structure of quantum mechanics in the way Harry suggests. It's not related to unitarity, so you can't do weird non-unitary things.
Hold on- aren't you saying the choice of experimental rule is VERY important (i.e. double blind vs. not double blind, etc.)?
If so you are agreeing with VAuroch. You have to include the details of the experiment somewhere. The data does not speak for itself.
My point was only that nothing in the axioms prevents macroscopic superposition.
The part that is new compared to Cromwell's rule is that Yudkowsky doesn't want to give probability 1 to logical statements (e.g. "53 is a prime number").
Because he doesn't want to treat 1 as a probability, you can't expect complete sets of events to have total probability 1, despite them being tautologies. Because he doesn't want probability 0, how do you handle the empty set? How do you assign probabilities to statements like "A and B" where A and B are logically exclusive (the coin lands heads AND the coin lands tails)?
Removing 0 and 1 from the math of probability breaks most of the standard manipulations. Again, it's best to just say "be careful with 0 and 1 when working with odds ratios."
I think the issue at hand is that 0 and 1 aren't special cases at all, but are essential for the math of probability theory to work (try to construct a probability measure in which no subset has probability 1 or 0).
This is absolutely necessary for the mathematical idea of probability, and EY seems to be confusing "are 0 and 1 probabilities relevant to Bayesian agents?" with "are 0 and 1 probabilities?" (yes, they are, unavoidably, and not as a special case!).
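For reference, 1 and 0 show up immediately in the standard (Kolmogorov) axioms; you can't set up a probability measure without them:

```latex
% Kolmogorov axioms for a probability measure P on a sample space \Omega:
P(A) \ge 0, \qquad P(\Omega) = 1, \qquad
P\!\left(\textstyle\bigcup_i A_i\right) = \sum_i P(A_i) \quad \text{for disjoint } A_i,
% from which P(\emptyset) = 0 and P(A^c) = 1 - P(A) follow immediately.
```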
So there are obviously a lot of different things you could mean by "Copenhagen" or "in the back of a lot of Copenhagenist minds," but the way it's usually used by physicists nowadays is to mean "the von Neumann axioms," because that is what is in 90+% of the textbooks.
There is nothing in Copenhagen that forbids macroscopic superposition. The experimental results on macroscopic superposition in SQUIDs are usually calculated in terms of Copenhagen (as are almost all experimental results).
How are you defining territory here? If the territory is 'reality' the only place where quantum mechanics connects to reality is when it tells us the outcome of measurements. We don't observe the wavefunction directly, we measure observables.
I think the challenge of MWI is to make the probabilities a natural result of the theory, and there has been a fair amount of active research trying and failing to do this. RQM side steps this by saying "the observables are the thing, the wavefunction is just a map, not territory."
How would this affect a frequentist?
It doesn't; the frequentist is already working with the sampling distribution. That is how frequentism works.
I was mainly trying to convince you that nothing's actually wrong with having 33% false positive rate in contrived cases.
I mean, it's not "wrong," but if you care about false positive rates and there is a method that has a 5% false positive rate, wouldn't you want to use that instead?
No, there's a limit on that as well. See http://www.ejwagenmakers.com/2007/StoppingRuleAppendix.pdf
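If I'm remembering the argument correctly, the bound comes from the fact that under a simple null (with a proper prior under the alternative) the Bayes factor is a nonnegative martingale with expectation 1, so a maximal (Ville-type) inequality applies no matter what stopping rule you use:

```latex
% Under a simple null H_0, E_{H_0}[\mathrm{BF}_{10}(x_{1:n})] = 1 for every n,
% and \mathrm{BF}_{10} is a nonnegative martingale, so
P_{H_0}\!\left(\sup_{n}\, \mathrm{BF}_{10}(x_{1:n}) \ge k\right) \;\le\; \frac{1}{k}
% e.g. at most a 1/3 chance of ever reaching a Bayes factor of 3, under any stopping rule.
```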
I can check my simulation for bugs. I don't have the referenced textbook to check the result being suggested.
It is my thesis that every so-called optional stopping paradox can be converted into a form without optional stopping, and those will be clearer as to whether the problem is real or not.
The first part of this is trivially true. Replace the original distribution with the sampling distribution from the stopped problem, and it's no longer a st...
I think this is problem dependent.
In simulation, with a coin flip I start to asymptote to around 20%, but estimating the mean of a normal distribution (with the null being 0) with fixed variance I keep climbing indefinitely. If you are willing to sample literally forever, it seems like you'd be able to convince the Bayesian that the mean is not 0 with an arbitrary Bayes factor. So for a large enough N in a sample, I expect you can get a factor of 3 for 99/100 of the Bayesians in cages (so long as that last Bayesian is really, really sure the value is 0).
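Here's a minimal sketch of the normal-mean version of the simulation I mean, assuming a N(0,1) prior on the mean under the alternative so the Bayes factor has a closed form; the prior width, threshold, and maximum sample size are arbitrary choices of mine:

```python
import numpy as np

rng = np.random.default_rng(0)

def bf10(s, n, tau2=1.0):
    """Closed-form Bayes factor for H1: mu ~ N(0, tau2) vs H0: mu = 0,
    given n observations from N(mu, 1) whose running sum is s."""
    return np.exp(tau2 * s**2 / (2 * (n * tau2 + 1))) / np.sqrt(n * tau2 + 1)

def ever_fooled(n_max=5000, threshold=3.0):
    """Generate data under the null and ask: does BF10 ever cross the threshold?"""
    x = rng.normal(size=n_max)          # the true mean really is 0
    s = np.cumsum(x)                    # running sums, one per possible stopping point
    n = np.arange(1, n_max + 1)
    return bool(np.any(bf10(s, n) >= threshold))

runs = 1000
fooled = sum(ever_fooled() for _ in range(runs))
print(f"fraction of null runs that ever hit BF >= 3: {fooled / runs:.2f}")
```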
But it doesn't change the results if we switch and say we fool 33% of the Bayesians with Bayes factor of 3. We are still fooling them.
Before I analyse this case, can you clarify whether the hypothesis happens to be true, false, or chosen at random? Also give these Bayesians' priors, and perhaps an example of the rule you'd use.
Again, the prior doesn't matter; they are computing Bayes factors. We are talking about Bayes factors. Bayes factors. The prior. Doesn't. Matter.
Let's say the null is true, but the frequentist mastermind has devised some data-generating process (let's say he has infinite data at his disposal) that can produce evidence in favor of the competing hypothesis at a Bayes factor of 3, 99% of the time.
I'm saying that all inferences are still correct. So if your prior is correct/well calibrated, then your posterior is as well. If you end up with 100 studies that all found an effect for different things at a posterior of 95%, 5% of them should be wrong.
But that is based on the posterior.
When I ask for clarification, you seem to be doing two things:
I think 2 is uncontroversial, other than if you have a perfect prior why do any experiment at all? But i...
That paper only calculates what happens to the Bayes factor when the null is true. There's nothing that implies the inference will be wrong.
That is the practical problem for statistics (the null is true, but the experimenter desperately wants it to be false). Everyone wants their experiment to be a success. The goal of this particular form of p-hacking is to increase the chance that you get a publishable result. The goal of the p-hacker is to increase the probability of type 1 error. A publication rule based on Bayes factors instead of p-values is still susceptible to optional stopping.
You seem to be saying that a rule based on posteriors would not be susceptible to such hacking?
It depends only on the prior. I consider all these "stopping rule paradoxes" disguised cases where you give the Bayesian a bad prior, and the frequentist formula encodes a better prior.
Then you are doing a very confusing thing that isn't likely to give much insight. Frequentist inference and Bayesian inference are different, and it's useful to at least understand both ideas (even if you reject frequentism).
Frequentists are bounding their error with various forms of the law of large numbers; they aren't coherently integrating evidence. So sa...
In practice, what p-hacking is about is convincing the world of an effect, so you are trying to create bias toward any data looking like a novel effect. Stopping rules/data peeking accomplish this just as much for Bayesian as for frequentist inference (though if the frequentist knows about the stopping rule, they can adjust in a way that Bayesians can't), which is my whole point.
Whether or not the Bayesian calibration is overall correct depends not just on the Bayes factor but the prior.
Reminds me of this bit from a Wasserman paper http://ba.stat.cmu.edu/journal/2006/vol01/issue03/wasserman.pdf
...van Nostrand: Of course. I remember each problem quite clearly. And I recall that on each occasion I was quite thorough. I interrogated you in detail, determined your model and prior and produced a coherent 95 percent interval for the quantity of interest.
Pennypacker: Yes indeed. We did this many times and I paid you quite handsomely.
van Nostrand: Well earned money I’d say. And it helped win you that Nobel.
Pennypacker: Well they retracted the Nobel a
...What makes Bayesian "lose" in the cases proposed by Mayo and Simonsohn isn't the inference, it's the scoring rule. A Bayesian scores himself on total calibration, "number of times my 95% confidence interval includes the truth" is just a small part of it. You can generate an experiment that has a high chance (let's say 99%) of making a Bayesian have a 20:1 likelihood ratio in favor of some hypothesis. By conservation of expected evidence, the same experiment might have 1% chance of generating close to a 2000:1 likelihood ratio against t
If you look at the paper, what you call optional stopping is what the authors called "data peeking."
In their simulations, the authors first took a sample of 20 and computed the Bayes factor, and then could selectively continue to add data up to 30 (stopping when they reached an "effect" or 30 samples). The paper's point is that this does skew the Bayes factor (it doubles the chances of getting a Bayes factor > 3).
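A self-contained sketch of that peeking schedule, under the null; I don't have the paper's exact test in front of me, so this uses a simple normal-known-variance Bayes factor with an assumed N(0,1) prior on the mean as a stand-in:

```python
import numpy as np

rng = np.random.default_rng(1)

def bf10(x, tau2=1.0):
    """Bayes factor for H1: mu ~ N(0, tau2) vs H0: mu = 0, data assumed ~ N(mu, 1).
    Stand-in for whatever test the authors actually used."""
    n, s = len(x), float(np.sum(x))
    return np.exp(tau2 * s**2 / (2 * (n * tau2 + 1))) / np.sqrt(n * tau2 + 1)

def peeking_trial(threshold=3.0):
    """Start at n=20, then add one observation at a time up to n=30,
    stopping as soon as BF10 crosses the threshold. Data generated under the null."""
    x = list(rng.normal(size=20))
    while True:
        if bf10(np.array(x)) > threshold:
            return True
        if len(x) >= 30:
            return False
        x.append(rng.normal())

def fixed_trial(threshold=3.0):
    """Fixed n=20, no peeking."""
    return bf10(rng.normal(size=20)) > threshold

trials = 10_000
print("with peeking :", sum(peeking_trial() for _ in range(trials)) / trials)
print("fixed n = 20 :", sum(fixed_trial() for _ in range(trials)) / trials)
```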
It is true that optional stopping won't change Bayes rule updates (which is easy enough to show). It's also true that optional stopping does affect frequentist tests (different sampling distributions). The broader question is "which behavior is better?"
p-hacking is when statisticians use optional stopping to make their results look more significant (by not reporting their stopping rule). As it turns out, you can in fact "posterior hack" Bayesians - http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2374040
Edit: Also Deborah Mayo's Er...
The existence of the Higgs is one of the rare bits of physics that doesn't average out under renormalization.
The reason is that the Higgs is deeply related to the overall symmetry of the whole Standard Model: you start with the symmetry group SU(2)xU(1), and then the Higgs breaks the symmetry so you end up with just a U(1) symmetry. What the theory predicts is relationships between the Higgs and the W and Z bosons, but not the absolute scale. The general rule is that RG flow respects symmetries, but other stuff gets washed out.
This is why the prediction was...
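For concreteness, the breaking pattern and the sort of relation that is fixed (at tree level), as opposed to the absolute mass scale:

```latex
% Electroweak symmetry breaking driven by the Higgs:
SU(2)_L \times U(1)_Y \;\longrightarrow\; U(1)_{\mathrm{EM}}
% Tree-level relation between the W and Z masses, set by the weak mixing angle:
m_W \;=\; m_Z \cos\theta_W
```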
I think I'm communicating a little poorly. So start with atomic-level physics: it's characterized by energy scales of 13.6 eV or so. Making measurements at that scale will tell you a lot about atomic-level physics, but it won't tell you anything about lower-level physics; there is an infinite number of lower-level theories that will be compatible with your atomic theory (which is why you don't need the mass of the top quark to calculate the hydrogen energy levels; conversely, you can't find the mass of the top quark by measuring those levels)....
The point of RG is that "higher level" physics is independent of most "lower level" physics. There are infinitely many low level theories that could lead to a plane flying.
There are infinitely many lower level theories that could lead to quarks behaving as they do, etc. So 1. you can't deduce low level physics from high level physics (i.e. you could never figure out quarks by making careful measurements of tennis balls), and 2. you can never know if you have truly found the lowest level theory (there might be a totally different theory ...
The whole point of the renormalization group is that lower level models aren't more accurate, the lower level effects average out.
The multiple levels of reality are "parallel in a peculiar way" governed by RG. It might be "more complex" but it's also the backbone of modern physics.
Heck, climate scientists aren't even that forthcoming about basic facts. They'll mention that CO2 is a greenhouse gas, but avoid any more technical questions. For example, I only recently found out that (in the absence of other factors or any feedback) temperature is a logarithmic function of CO2 concentration.
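Specifically, the commonly quoted simplified fit (due to Myhre et al.) for the no-feedback forcing, with λ standing in for whatever sensitivity parameter converts forcing to temperature:

```latex
% Radiative forcing from a change in CO2 concentration (no feedbacks):
\Delta F \;\approx\; 5.35 \,\ln\!\left(\frac{C}{C_0}\right)\ \mathrm{W\,m^{-2}},
\qquad \Delta T \;\approx\; \lambda\, \Delta F
```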
So it seems like you've never cracked open a climate/atmospheric science textbook? Because that is pretty basic info. It seems like you're determined to be skeptical despite not really spending much time learning about the state of the science. ...
It's quite a bit more general than Lagrangian mechanics. You can extend it to any functional that takes functions between two manifolds to complex numbers.
General question- does combining the 2013 and 2014 survey make sense, given that we expect a lot of overlap (same people surveyed)?
Also, why treat EA as a latent variable when it was directly surveyed? Shouldn't we just combine by saying that if you answered Yes to the EA question, you are EA?
Well, one thing that definitely changed was my approach to the coursework. I started taking a lot of notes as a memory aid, but then when I worked through problems I relied on what I remembered and refused to look things up in the textbook or my notes. This forced me to figure out how to solve problems in ways that made sense to me. It was really slow going at first, but I slowly built up my own bag of tricks.
Glancing at the data, it looks like at several ages the median EA gives 0, as does the median non-EA. You might want to separate the 0 set from everything else and then answer two questions:
1. What percentage of EAs/non-EAs donate any money at all?
2. When they do give, how much do EAs give and how much do non-EAs give?
I think this makes more sense than what is happening now; the lines don't seem to fit the data very well.
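A sketch of the split I have in mind, assuming one row per respondent with hypothetical column names `is_ea` and `donation` (those names are mine, not from the survey):

```python
import pandas as pd

def donation_summary(df: pd.DataFrame) -> pd.DataFrame:
    """df is assumed to have hypothetical columns:
       is_ea    -- True/False answer to the EA question
       donation -- amount donated (0 for non-donors)"""
    grouped = df.groupby("is_ea")["donation"]
    return pd.DataFrame({
        # Question 1: what fraction donate anything at all?
        "share_donating": grouped.apply(lambda d: (d > 0).mean()),
        # Question 2: among those who give, how much do they give (median)?
        "median_given_by_donors": grouped.apply(lambda d: d[d > 0].median()),
    })
```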
Can you give an example of the level where things suddenly become more difficult?
As I said in another post, I struggled quite a bit with early calculus classes, but breezed through later "more difficult" classes that built on them.
Also, I disagree with the math and stats thing. Many of the best statisticians I know have strong grounding in mathematics, as do many of the best data scientists I know.
Of course he would have gotten an A. The difference between being good and bad at math is whether you need to "spent all waking hours talking about calculus" to get an A.
Extrapolating from one course is silly. I worked like a demon to earn mediocre grades (low Bs) in both calc 1 and physics 1, but somewhere towards the end of my freshman year of college something fell into place for me. By my first year of grad school I was breezing through quantum field theory and math-methods-for-string-theory courses with minimal effort.
I worked it out back of the envelope, and the probability of being kidnapped when you blink is only 1/5^^^5.
Both numbers seem basically arbitrarily small (probability 0).
Since the planet has so many distinct people, and they blink more than once a day, you are essentially asserting that on that planet, multiple people are kidnapped and tortured for more than 50 years, several times a day.
However, 3^^^3 is just so unimaginably enormous that blinking for even the tiniest fraction of a second increases the probability that you will be captured by a madman during that blink and tortured for 50 years by more than 1/3^^^3.
This seems pretty unlikely to be true.
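For reference, the ^^^ here is Knuth's up-arrow notation, where each extra arrow iterates the previous operation:

```latex
3\uparrow 3 = 3^{3} = 27, \qquad
3\uparrow\uparrow 3 = 3^{3^{3}} = 3^{27} \approx 7.6\times 10^{12}, \qquad
3\uparrow\uparrow\uparrow 3 = 3\uparrow\uparrow\,(3\uparrow\uparrow 3)
% i.e. a power tower of 3s roughly 7.6 trillion levels tall.
```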
I think the bigger issue with MetaMed is that they are trying to fill a role that already exists in the market. UpToDate is a bigger, more ambitious company (instead of generating per-patient reports, they have a huge database of research summaries curated by actual experts for each disease) that almost all academic medical centers already pay for.
It wouldn't have made a lot of sense to predict any doublings for transistors in an integrated circuit before 1960, because I think that is when they were invented.