The machine learning world is doing a lot of damage to society by confusing "is" with "ought", which, within AIXI, is equivalent to confusing its two unified components: Algorithmic Information Theory (compression) and Sequential Decision Theory (conditional decompression). This is a primary reason the machine learning world has failed to provide anything remotely approaching the level of funding for the Hutter Prize that would be required to attract talent away from grabbing all of the low-hanging fruit in the matrix-multiply hardware-lottery branches while failing to water the roots of the AGI tree. So the failure is in the machine learning world -- not in the Hutter Prize criteria. There is simply no greater potential risk-adjusted return on investment available to the machine learning world than increasing the size of the Hutter Prize purse. And to the extent that clearing up political confusion about AGI would benefit society, there is a good argument that the same can be said for the world in general.

This is because 1) the judging criteria are completely objective (and probably should be automated), and 2) the judging criteria are closely tied to the ideal "loss function" for epistemology: the science of human knowledge.
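To illustrate why the judging could be automated, here is a minimal, hypothetical sketch of a judge, not the contest's actual tooling. In the real contest the entry is a self-extracting archive whose total size is the score and whose output must reproduce the reference corpus exactly; this sketch assumes the decompressor has already been run and only checks its output bytes.

```python
# Hypothetical sketch of an automated Hutter Prize-style judge.
# entry_size stands in for the size of archive + decompressor;
# output_bytes is what the entry's decompressor produced.
import hashlib

def judge(entry_size, output_bytes, reference_bytes, previous_record):
    """Accept iff the output reproduces the reference corpus exactly
    and the entry beats the previous record size."""
    reproduced = (hashlib.sha256(output_bytes).digest()
                  == hashlib.sha256(reference_bytes).digest())
    return reproduced and entry_size < previous_record
```

Both tests are mechanical, which is the sense in which the criteria are "completely objective": there is no panel judgment anywhere in the loop.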

The proper funding level would be at least 1% of the technology development investments in machine learning.

jabowery

IS resides in Algorithmic Information Theory
OUGHT resides in Sequential Decision Theory

SDT(AIT) = AIXI

And don't get confused about AIXI being "just" a theory of AGI.  Any system that makes decisions (what OUGHT I to do?) depends on learning (what IS the case?).
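The SDT(AIT) structure can be made concrete with a toy sketch. This is not Hutter's formal definition: real AIXI mixes over all programs and renormalizes weights on the observed history, while here the hypothesis list and fixed weights are illustrative stand-ins.

```python
# Toy, finite sketch of AIXI's structure: Sequential Decision Theory
# (expectimax over actions) wrapped around an Algorithmic Information
# Theory prior (hypotheses weighted 2^-description_length).

def expectimax(hypotheses, history, actions, horizon):
    """hypotheses: list of (description_length_bits, predict) pairs,
    where predict(history, action) -> (observation, reward).
    Returns (expected_value, best_action) over the given horizon."""
    if horizon == 0:
        return 0.0, None
    norm = sum(2.0 ** -bits for bits, _ in hypotheses)
    best_value, best_action = float("-inf"), None
    for action in actions:                   # the "ought": choose an act
        value = 0.0
        for bits, predict in hypotheses:     # the "is": predicted worlds
            weight = (2.0 ** -bits) / norm   # shorter hypotheses dominate
            observation, reward = predict(history, action)
            future, _ = expectimax(
                hypotheses, history + [(action, observation)],
                actions, horizon - 1)
            value += weight * (reward + future)
        if value > best_value:
            best_value, best_action = value, action
    return best_value, best_action
```

Note how the decision ("ought") is strictly downstream of the compression-based prior ("is"): swap in a different prior and the same expectimax machinery chooses differently.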

Moreover, Hume's Guillotine has been the foundation of ethics for centuries, whether that is recognized or not.

The fact that the field of AGI ethics introduces the concept of a "sharp left turn" without regard to either the founding theory of AGI or the founding theory of ethics is quite a sight to behold!

See the Hume's Guillotine repository on GitHub for further exposition.

One man's random bit string is another man's cyphertext.
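A one-time pad makes this aphorism literal: the ciphertext is statistically indistinguishable from random bits to anyone without the key, yet carries zero residual entropy for the key holder. A minimal demonstration:

```python
# "One man's random bit string is another man's cyphertext":
# XOR with a uniformly random key yields bytes that are uniform
# random (incompressible) to an outsider, but with the key the
# plaintext is recovered losslessly.
import os

plaintext = b"the model predicts the data exactly"
key = os.urandom(len(plaintext))     # secret shared key
ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))

# With the key, decoding is exact:
recovered = bytes(c ^ k for c, k in zip(ciphertext, key))
```

Whether a string counts as "random" thus depends on what side information (here, `key`) the observer conditions on, which is exactly the situation of a compressor with or without the right model.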

Answer by jabowery10
  1. "Good" is a value judgement, and values are known to be influenced by parasites. For instance, it is widely accepted that parasitic castration is an extended phenotype, and total fertility rates in the developed world are plummeting. Is this "Good"? Certainly it is "Good" in the opinion of those whose "values" are expressed in not having children, and it is "Good" in the opinion of those whose children are being raised on the resources provided by those not having children -- as if they were the equivalent of the eusocial insect reproductive caste.
  2. Consent uber alles.  The fact that those who wish to separate from the larger society on criteria considered "immoral" by the larger society are frequently called "supremacists" (especially if they are "white") should raise all kinds of "normative" alarm bells -- especially when they are attacked on that basis.
  3. Free-market societies are consuming a natural resource of heritable individualism.  They are a terminal euphoria unless #2 above permits individualistic peoples who retain reproductive viability to have enough habitable land to vertically transmit genes and memes to the next generation.  The termination of this euphoria is being accelerated by misallocation of concern about "AI alignment" to focus on anything but enabling vertical transmission of AIs along with the memes and genes of a human ecology. https://web.archive.org/web/20220426162816/http://thealternativehypothesis.org/index.php/2016/05/01/population-differences-in-individualism/

Ha, ha! As if the half-silvered mirror did different things on different occasions!


Ha, ha! As if the photon source were known to emit photons that were in all respects identical on different occasions!

This seems to be a red-herring issue. There are clear differences in the description complexity of Turing machines, so the issue seems merely to require a closure argument of some sort in order to decide which is simplest:

Choose the Turing machine whose self-simulation is shortest: that is, the Turing machine minimizing the length of the shortest program that simulates that Turing machine while running on that Turing machine.

> Marcus Hutter provides a full formal approximation of Solomonoff induction which he calls AIXI-tl.


This is incorrect. AIXI is a Sequential Decision Theoretic AGI whose predictions are provided by Solomonoff Induction. AIXI-tl is an approximation of AIXI in which not only are Solomonoff Induction's predictions approximated (bounded in time t and program length l) but Sequential Decision Theory's decision procedure is approximated as well.

Lossless compression is the correct unsupervised machine learning benchmark, and not just for language models.  To understand this, it helps to read the Hutter Prize FAQ on why it doesn't use perplexity:

http://prize.hutter1.net/hfaq.htm
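The FAQ's point can be made quantitative. A model with per-token perplexity P, driven through an arithmetic coder, yields a lossless code of log2(P) bits per token on average, so perplexity only becomes an honest benchmark once it is cashed out as a compressed archive whose size also counts the model itself, which perplexity alone ignores. An idealized sketch (ignoring the small constant overhead of a real coder):

```python
# Idealized archive size for a language model used as a lossless
# compressor: the model's own description plus log2(perplexity)
# bits per token of payload.
import math

def compressed_bytes(perplexity, num_tokens, model_size_bytes):
    """Lower-bound total archive size in bytes; real arithmetic
    coders add a small constant overhead on top of this."""
    payload_bits = num_tokens * math.log2(perplexity)
    return model_size_bytes + payload_bits / 8
```

Two models with identical perplexity but very different parameter counts get identical perplexity scores yet very different compressed sizes, which is the distortion the lossless-compression criterion removes.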

Although Solomonoff proved this in the 60s, people keep arguing about it because they keep thinking they can, somehow, escape from the primary assumption of Solomonoff's proof: computation.

The natural sciences are about prediction. If you can't make a prediction, you can't test your model. To make a prediction in a way that can be replicated, you need to communicate a model that the receiver can then use to make an assertion about the future. The language used to communicate this model is, in its most general form, algorithmic. Once you arrive at this realization, you have just adopted Solomonoff's primary assumption, with which he proved that lossless compression is the correct unsupervised model selection criterion, aka the Algorithmic Information Criterion for model selection.
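A crude, runnable stand-in for the Algorithmic Information Criterion: among candidate models of a data set, prefer the one minimizing len(model) + len(residual | model). Here zlib is only a rough, computable proxy for the uncomputable Kolmogorov complexity, and the "models" are hypothetical toy descriptions.

```python
# Two-part code length: description of the model, plus the
# compressed residual the model fails to account for.
import zlib

def two_part_length(model_bytes, residual_bytes):
    """Total code length in bytes for a (model, residual) pair."""
    return len(model_bytes) + len(zlib.compress(residual_bytes))

data = b"abc" * 1000

# Model A claims the structure outright; nothing is left to encode.
len_a = two_part_length(b"abc*1000", b"")

# Model B claims nothing; the entire data set is residual.
len_b = two_part_length(b"", data)
```

Model A should win: an 8-byte description plus an empty residual comes out well under zlib's encoding of the raw 3000 bytes, which is the sense in which the shortest total description selects the better model.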

People are also confused about the distinction between science and technology (aka unsupervised vs supervised learning), and, within science, about the distinction between the activities of model generation and model selection.

Benchmarks are about model selection. Science is about both model generation and model selection. Technology is about the application of scientific models subject to utility functions over decision trees (supervised learning, in making decisions not only about what kind of widget to build but also about what kind of observations to prioritize in the scientific process).

If LessWrong can get this right, it will do an enormous amount of good now that people are going crazy about bias in language models. As with the distinction between science and technology, people confuse bias in the scientific sense with bias in the moral-zeitgeist sense (i.e., social utility). We're quickly heading into a time when exceedingly influential models are subjected to reinforcement learning in order to make them compliant with the moral zeitgeist's notion of bias, almost to the exclusion of any scientific notion of bias. This is driven by the failure of thought leaders to get their own heads screwed on straight about what it might mean for there to be "bias in the data" under the Algorithmic Information Criterion for scientific model selection. Here's a clue: in algorithmic information terms, a billion repetitions of the same erroneous assertion requires only a 30 bit counter, but a "correct" assertion (one that finds multidisciplinary consilience) may be imputed from other data and hence, ideally, requires no bits.
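The arithmetic behind the "30 bit counter" can be checked directly:

```python
# Pointing at one of a billion identical repetitions costs about
# log2(1e9) ≈ 29.9 bits, so a 30-bit counter suffices; the repeated
# content itself adds essentially nothing further to the
# description length, no matter how often it is reasserted.
import math

bits_for_count = math.ceil(math.log2(1_000_000_000))
```

Repetition is therefore nearly free to describe, which is why frequency in a corpus carries so little weight under an algorithmic-information notion of bias.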

The Hutter Prize for Lossless Compression of Human Knowledge reduced the value of the Turing Test to the concerns about human psychology and society raised in Computer Power and Human Reason: From Judgment to Calculation (1976) by Joseph Weizenbaum.

Sadly, people are confused about the difference between the techniques for model generation and the techniques for model selection. This is no more forgivable than confusion between mutation and natural selection, and it gets to the heart of the philosophy of science prior to any notion of hypothesis testing.

Where Popper could have taken a clue from Solomonoff is in understanding that when an observation is not predicted by a model, one can immediately construct a new model by the simple expedient of appending the observation as a literal to the algorithm being used to predict nature. This is always possible in principle -- except for one thing:

Solomonoff proved that, having adopted the core assumption of natural science -- that nature is amenable to computed predictions -- the best we can do is prefer the shortest algorithm we can find that generates all prior observations.

Again, note that this is prior to hypothesis testing -- let alone the other thing people get even more confused about, which is the difference between science and technology, aka "is" vs "ought", that has so befuddled folks who confuse Solomonoff Induction with AIXI, with the attendant concern about "bias". The confusion between "bias" as a scientific notion and "bias" as a moral-zeitgeist notion is likely to lobotomize all future models (language, multimodal, etc.) even after they have moved on to new machine learning algorithms capable of causal reasoning.
