Carlos Javier Gil Bellosta

Comments

Note that Shannon, three years earlier, had already trained possibly the first ever "LLM". It could generate text such as

THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE CHARACTER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE LETTERS THAT THE TIME OF WHO EVER TOLD THE PROBLEM FOR AN UNEXPECTED.

See [_A Mathematical Theory of Communication_](https://people.math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf).
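For anyone curious, Shannon's word-level approximations are easy to reproduce. Below is a minimal sketch of a second-order word approximation in the spirit of his paper (the one that produced samples like the text quoted above); the toy corpus and the function names are my own illustrative choices, not Shannon's.

```python
import random
from collections import defaultdict

def build_bigram_model(text):
    """Record word-bigram transitions; duplicates preserve empirical frequencies."""
    words = text.upper().split()
    transitions = defaultdict(list)
    for current_word, next_word in zip(words, words[1:]):
        transitions[current_word].append(next_word)
    return transitions

def generate(transitions, start, n_words=20):
    """Sample a chain of words, each drawn given only its predecessor."""
    word = start
    output = [word]
    for _ in range(n_words - 1):
        followers = transitions.get(word)
        if not followers:
            break
        word = random.choice(followers)
        output.append(word)
    return " ".join(output)

# Toy corpus; Shannon worked from the statistics of ordinary English text.
corpus = "the head of the attack on the writer is the method for the problem of the time"
model = build_bigram_model(corpus)
print(generate(model, "THE"))
```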

There is a paper by Taleb on the differences between probabilities and betting prices, the latter treated as "binary options". It can be found [here](https://arxiv.org/abs/1907.11162). Of course, it applies when bets are made with real money and there is the option of not betting at all and instead investing in some interest-bearing safe asset.
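A minimal version of the interest-rate point, setting risk premia aside and assuming a single settlement date $T$ and a constant risk-free rate $r$: a contract paying 1 if $A$ occurs should trade at about

$$\text{price} \approx e^{-rT}\, P(A),$$

so for $r > 0$ the quoted price sits below the probability even when beliefs are held fixed.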

I like to see the issues presented in this post from the exploration-exploitation trade-off perspective (yes, the multi-armed bandit and all that).
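For concreteness, here is a minimal epsilon-greedy sketch of that trade-off; the arm payouts and the value of epsilon are illustrative assumptions, not anything taken from the post.

```python
import random

def epsilon_greedy(true_means, epsilon=0.1, n_rounds=10_000):
    """Explore a random arm with probability epsilon; otherwise exploit the best arm so far."""
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    total_reward = 0.0
    for _ in range(n_rounds):
        if random.random() < epsilon:
            arm = random.randrange(n_arms)                            # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])      # exploit
        reward = random.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]     # running mean
        total_reward += reward
    return estimates, total_reward

print(epsilon_greedy([0.2, 0.5, 0.9]))
```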

This counterexample saga reminds me of Lakatos's _Proofs and Refutations_. You have a result that is "essentially true", but you can still find some "counterexamples" to it by conveniently stressing the "obvious setting" in which the result was originally formulated. Note, in any case, that even though Euler has been "refuted", he is still credited with his original formula V - E + F = 2 (for a cube: 8 - 12 + 6 = 2).

I would only like to note that in the conception of probability of Jaynes, Keynes, and others, it makes no sense to talk about P(A). They all assume that probabilities do not happen in a void and that you are always conditioning on some previous knowledge, B. So they would always write P(A|B) where other authors/schools just write P(A).
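In Jaynes's notation, for instance, even the basic product and sum rules carry the background information $X$ explicitly:

$$P(AB \mid X) = P(A \mid X)\,P(B \mid AX), \qquad P(A \mid X) + P(\bar{A} \mid X) = 1.$$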

What I find most shocking about all this exponential-vs-linear discussion is how easily it traps us in a [necessarily false] dichotomy. As a mathematician, I am surprised that the alternative to an exponential curve should be a line (why not a polynomial curve in between?).
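The gap between the two alternatives is genuinely large: for any fixed degree $k > 1$ and constants $b, c > 0$,

$$a + bt \;\ll\; t^k \;\ll\; e^{ct} \qquad \text{as } t \to \infty,$$

so there is a whole family of polynomial growth curves sitting strictly between a line and an exponential.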

The article mentions Lazard's levelized cost of energy report. I reverse-engineered the spreadsheets on which the report's results are based and put them online here, in case somebody wants to recreate the scenarios, stress them differently, or create new ones.

I believe that what Jaynes does is quite standard: start with a minimalistic set of axioms (or principles, or whatever) and work your way to the intuitive results later on. Euclidean geometry is just like that!

I just skimmed over the details of the proofs (and I am a mathematician by training!). I did not read Jaynes for such details; I simply assume that if they were wrong, somebody would have already reported it. The meaty part is elsewhere.

I believe this entry could have been written in much more general terms, i.e., asking why one should use [Gaussian] approximations at all nowadays. There is one answer: to get general, asymptotic results. But in practice, given the current state of computers and statistical software, there is little point in using approximations, particularly as they do not work for small samples such as the ones you mention. And, in practice, we need to deal with small samples as well.

The general advice would then be: if you need to model some random phenomenon, use the tools that allow you to model it best. If beta, Poisson, gamma, etc. distributions seem more adequate, just do not use normal approximations at all.
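As a concrete illustration of the small-sample point, here is a minimal sketch (using scipy, assuming a version recent enough to have `binomtest`, and with made-up numbers) comparing the normal-approximation interval for a proportion with the exact interval derived from the binomial distribution itself:

```python
from scipy import stats

# Made-up small sample: 3 successes out of 10 trials.
successes, n = 3, 10
p_hat = successes / n

# Normal (Wald) approximation: p_hat +/- 1.96 * sqrt(p_hat * (1 - p_hat) / n)
se = (p_hat * (1 - p_hat) / n) ** 0.5
wald = (p_hat - 1.96 * se, p_hat + 1.96 * se)

# Exact (Clopper-Pearson) interval based on the binomial distribution.
exact = stats.binomtest(successes, n).proportion_ci(confidence_level=0.95)

print(f"Wald:  ({wald[0]:.3f}, {wald[1]:.3f})")
print(f"Exact: ({exact.low:.3f}, {exact.high:.3f})")
```

At this sample size the two intervals differ visibly, which is exactly why the approximation buys you nothing once the exact computation is a one-liner.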

First, I want to dispute the statement that 50% is uninformative. It can be very informative depending on the value of the outcomes. E.g., if I am analyzing transactions looking for fraud, a 50% predicted probability of fraud is "very informative": most fraudulent transactions may have predicted fraud probabilities much, much lower than that.

Second, it is true that beliefs about probabilities need not be "sharp". The Bayesian approach to the problem (which is in fact the very problem that Bayes originally discussed!) would require you to provide a distribution of your "expected" probabilities (I deliberately avoid the terms "prior" and "subjective" here). Such a distribution could be more or less concentrated. The beta distribution could be used to encode such uncertainty; in fact, it is the canonical distribution for doing so. The question would remain how to operationalize it in a prediction market, particularly from the UX point of view.
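A minimal sketch of that encoding with scipy (all parameter values are illustrative): a Beta(a, b) distribution over the unknown probability, more or less concentrated depending on (a, b), and updated by conjugacy after observing outcomes.

```python
from scipy import stats

# A fairly concentrated belief that the probability is near 0.5.
a, b = 20, 20
belief = stats.beta(a, b)
print(belief.mean(), belief.interval(0.95))   # ~0.5, roughly (0.35, 0.65)

# A much vaguer belief with the same mean.
vague = stats.beta(2, 2)
print(vague.mean(), vague.interval(0.95))     # ~0.5, roughly (0.09, 0.91)

# Conjugate update after observing 7 "yes" outcomes out of 10.
posterior = stats.beta(a + 7, b + 3)
print(posterior.mean(), posterior.interval(0.95))
```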
