My government name is Mack Gallagher. I am an underfunded "alignment" "researcher". Crocker's Rules. DM me if you'd like to fund my posts.
Try using Marginalia [ I'm unaffiliated ] instead of Google for stuff other than currency conversion and the hours of local businesses.
Newton's laws of motion are already laws of futurology.
"The antithesis is not so heterodox as it sounds, for every active mind will form opinions without direct evidence, else the evidence too often would never be collected."
I already know, upon reading this sentence [source], that I'm going to be quoting it constantly.
It's too perfect a rebuttal to the daily-experienced circumstance of people imagining that things - ideas, facts, heuristics, truisms - that are obvious to the people they consider politically "normal" [e.g., 2024 politically-cosmopolitan Americans, or LessWrong] must be, or have been, obvious to everyone at their cognitive intelligence level, at all times and in all places -
- or the converse, that what seems obvious to the people they consider politically "normal" must be true.
Separately from how pithy it is, regarding the substance of the quote: it strikes me hard that of all people remembered by history who could have said this, the one who did was R.A. Fisher. You know, the original "frequentist"? I'd associated his having originated the now-endemic tic of "testing for statistical significance" with a kind of bureaucratic indifference to unfamiliar, "fringe" ideas, which I'd assumed he'd shared.
But the meditation surrounding this quote is a paean to the mental process of "asking after the actual causes of things, without assuming that the true answers are necessarily contained within your current mental framework".
"That Charles Darwin accepted the fusion or blending theory of inheritance, just as all men accept many of the undisputed beliefs of their time, is universally admitted. [ . . . ] To modern readers [the argument from the variability within domestic species] will seem a very strange argument with which to introduce the case for Natural Selection [ . . . ] It should be remembered that, at the time of the essays, Darwin had little direct evidence on [the] point [of whether variation existed within species] [ . . . ] The antithesis is not so heterodox as it sounds, for every active mind will form opinions without direct evidence, else the evidence too often would never be collected."
This comes on the heels of me finding out that Jakob Bernoulli, the ostensible great-granddaddy of the frequentists, believed himself to be using frequencies to study probabilities, and was only cast in the light of history as having discovered that probabilities really "were" frequencies.
"This result [Jakob Bernoulli's discovery of the Law of Large Numbers in population statistics] can be viewed as a justification of the frequentist definition of probability: 'proportion of times a given event happens'. Bernoulli saw it differently: it provided a theoretical justification for using proportions in experiments to deduce the underlying probabilities. This is close to the modern axiomatic view of probability theory." [ Ian Stewart, Do Dice Play God, pg 34 ]
Bernoulli:
"Both [the] novelty [ of the Law of Large Numbers ] and its great utility combined with its equally great difficulty can add to the weight and value of all the other chapters of this theory. But before I convey its solution, let me remove a few objections that certain learned men have raised. 1. They object first that the ratio of tokens is different from the ratio of diseases or changes in the air: the former have a determinate number, the latter an indeterminate and varying one. I reply to this that both are posited to be equally uncertain and indeterminate with respect to our knowledge. On the other hand, that either is indeterminate in itself and with respect to its nature can no more be conceived by us than it can be conceived that the same thing at the same time is both created and not created by the Author of nature: for whatever God has done, God has, by that very deed, also determined at the same time." [ Jakob Bernoulli's "The Art of Conjecturing", translated by Edith Dudley Sylla ]
It makes me wonder how many great names modern "frequentism" can even accurately count among its endorsers.
Edit:
Fisher on the philosophy of probability [ PLEASE click through, it's kind of a take-your-breath-away read if you're familiar with the modern use of "p-values" ]:
"Now suppose there were knowledge a priori of the distribution of μ. Then the method of Bayes would give a probability statement, probably a different one. This would supersede the fiducial value, for a very simple reason. If there were knowledge a priori, the fiducial method of reasoning would be clearly erroneous because it would have ignored some of the data. I need give no stronger reason than that. Therefore, the first condition [of employing the frequentist definition of probability] is that there should be no knowledge a priori.
[T]here is quite a lot of continental influence in favor of regarding probability theory as a self-supporting branch of mathematics, and treating it in the traditionally abstract and, I think, fruitless way [ . . . ] Certainly there is grave confusion of thought. We are quite in danger of sending highly-trained and highly intelligent young men out into the world with tables of erroneous numbers under their arms, and with a dense fog in the place where their brains ought to be. In this century, of course, they will be working on guided missiles and advising the medical profession on the control of disease, and there is no limit to the extent to which they could impede every sort of national effort."
[ R.A. Fisher, 1957 ]
I made the Pascal's triangle smaller, good idea.
Thank you!
Thank you for your kind comment! I disagree with the johnswentworth post you linked; it's misleading to frame NN interpretability as though we started out having any graph with any labels, weird-looking labels or not. I have sent you a DM.
While writing a recent post, I had to decide whether to mention that Nicolaus Bernoulli had written his letter posing the St. Petersburg problem specifically to Pierre Raymond de Montmort, given that my audience and I probably have no other shared semantic anchor for Pierre's existence, and he doesn't visibly appear elsewhere in the story.
I decided Yes. I think the idea of awarding credit to otherwise-silent muses in general is interesting.
Footnote to my impending post about the history of value and utility:
After Pascal's and Fermat's work on the problem of points, and Huygens's work on expected value, the next major work on probability was Jakob Bernoulli's Ars conjectandi, written between 1684 and 1689 and published posthumously by his nephew Nicolaus Bernoulli in 1713. Ars conjectandi made three important contributions to probability theory:
[1] The concept that expected experience is conserved, or that probabilities must sum to 1.
Bernoulli generalized Huygens's principle of expected value in a random event as
$$E = \frac{p_1 x_1 + p_2 x_2 + \cdots + p_n x_n}{p_1 + p_2 + \cdots + p_n}$$
[ where $p_i$ is the probability of the $i$th outcome, and $x_i$ is the payout from the $i$th outcome ]
and said that, in every case, the denominator - i.e. the probabilities of all possible events - must sum to 1, because only one thing can happen to you
[ making the expected value formula just
$$E = p_1 x_1 + p_2 x_2 + \cdots + p_n x_n$$
with normalized probabilities! ]
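A minimal sketch of this normalized expected-value computation [ in Python; the probabilities and payouts here are made-up numbers for illustration, not anything from Bernoulli ]:

```python
# Expected value with normalized probabilities: E = p_1*x_1 + ... + p_n*x_n.
# Hypothetical probabilities and payouts, chosen only for illustration.
probabilities = [0.5, 0.3, 0.2]  # must sum to 1 - "only one thing can happen to you"
payouts = [10, 4, 0]

assert abs(sum(probabilities) - 1.0) < 1e-9

expected_value = sum(p * x for p, x in zip(probabilities, payouts))
print(expected_value)  # 0.5*10 + 0.3*4 + 0.2*0 = 6.2
```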
[2] The explicit application of strategies starting with the binomial theorem [ known to ancient mathematicians as the triangle pattern studied by Pascal
and first successfully analyzed algebraically by Newton ] to combinatorics in random games [which could be biased] - resulting in e.g. [ the formula for the number of ways to choose k items of equivalent type, from a lineup of n [unique-identity] items ] [useful for calculating the expected distribution of outcomes in many-turn fair random games, or random games where all more-probable outcomes are modeled as being exactly twice, three times, etc. as probable as some other outcome],
written as $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$ :
[ A series of random events [a "stochastic process"] can be viewed as a zig-zaggy path moving down the triangle, with the tiers as events, [whether we just moved LEFT or RIGHT] as the discrete outcome of an event, and the numbers as the relative probability density of our current score, or count of preferred events.
When we calculate $\binom{n}{k}$, we're calculating one of those relative probability densities. We're thinking of $n$ as our total long-run number of events, and $k$ as our target score, or count of preferred events.
We calculate $\binom{n}{k}$ by first "weighting in" all possible orderings of the $n$ events, by taking $n!$, and then by "factoring out" all $k!$ possible orderings of ways to achieve our chosen W condition [since we always take the same count of W-type outcomes as interchangeable], and "factoring out" all $(n-k)!$ possible orderings of our chosen L condition [since we're indifferent between those too].
[My explanation here has no particular relation to how Bernoulli reasoned through this.] ]
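As a sanity check on the path-counting picture above [ my own illustration, not Bernoulli's or Stewart's ], here's a short Python sketch that builds rows of Pascal's triangle by adding adjacent entries and compares them against $n!/(k!\,(n-k)!)$:

```python
from math import factorial

def binomial(n, k):
    # "Weight in" all n! orderings, then "factor out" the k! interchangeable
    # W-type outcomes and the (n-k)! interchangeable L-type outcomes.
    return factorial(n) // (factorial(k) * factorial(n - k))

def pascal_row(n):
    # Row n of Pascal's triangle, built by summing adjacent entries of row n-1.
    row = [1]
    for _ in range(n):
        row = [a + b for a, b in zip([0] + row, row + [0])]
    return row

for n in range(6):
    assert pascal_row(n) == [binomial(n, k) for k in range(n + 1)]
print(pascal_row(5))  # [1, 5, 10, 10, 5, 1]
```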
Bernoulli did not stop with $\binom{n}{k}$ and discrete probability analysis, however; he went on to analyze probabilities [in games with discrete outcomes] as real-valued, resulting in the Bernoulli probability distribution.
[3] The empirical "Law of Large Numbers", which says that, after you repeat a random game for many turns and tally up the outcomes, the final tally will approach the number of turns times the expected distribution of outcomes in a single turn. E.g. if a die is biased to roll
a 6 40% of the time
a 5 25% of the time
a 4 20% of the time
a 3 8% of the time
a 2 4% of the time, and
a 1 3% of the time
then after 1,000 rolls, your counts should be "close" to
6: .4*1,000 = 400
5: .25*1,000 = 250
4: .2*1,000 = 200
3: .08*1,000 = 80
2: .04*1,000 = 40
1: .03*1,000 = 30
and even "closer" to these ideal ratios after 1,000,000 rolls
- which Bernoulli brought up in the fourth and final section of the book, in the context of analyzing sociological data and policymaking.
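A quick simulation of the biased die above [ my own sketch, using Python's random.choices; the observed counts will only ever be "close" to the expected ones, which is exactly the Law of Large Numbers at work ]:

```python
import random
from collections import Counter

# The biased die from the example above: face -> probability.
weights = {6: 0.40, 5: 0.25, 4: 0.20, 3: 0.08, 2: 0.04, 1: 0.03}

for n_rolls in (1_000, 1_000_000):
    rolls = random.choices(list(weights), weights=list(weights.values()), k=n_rolls)
    counts = Counter(rolls)
    for face, p in weights.items():
        print(f"{n_rolls:>9,} rolls | face {face}: observed {counts[face]:>7,}, expected {round(p * n_rolls):>7,}")
```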
One source: "Do Dice Play God?" by Ian Stewart
[ Please DM me if you would like the author of this post to explain this stuff better. I don't have much idea how clear I am being to a LessWrong audience! ]
This is one of those subject areas that'd be unfortunately bad to get into publicly. If you or any other individual wants to grill me on this, feel free to DM me or contact me by any of the above methods and I will take disclosure case by case.
This is a just ask.
Also, even though it's not locally rhetorically convenient [ whereas making an isolated demand for rigor of people making claims like "scaling has hit a wall [therefore AI risk is far]" that are inconvenient for AInotkilleveryoneism is locally rhetorically convenient for us ], we should demand the same specificity of people who claim that "scaling works", so that we end up with a correct world-model and so that people who just want to build AGI see that we are fair.
Crossposting a riddle from Twitter:
Karl Marx writes in 1859 on currency debasement and inflation:
This paradox has an explanation, which resolves everything such that it stops feeling unnatural and in fact feels neatly inevitable in retrospect. I'll post it as soon as I have a paycheck to "tell the time by" again.
Until then, I'm curious whether anyone* can give the answer.
*who hasn't already heard it from me on a Discord call - this isn't very many people and I expect none of them are commenters here