You're looking at Less Wrong's discussion board. This includes all posts, including those that haven't been promoted to the front page yet. For more information, see About Less Wrong.

Modifying Universal Intelligence Measure

2 Alex_Altair 18 September 2012 11:44PM

In 2007, Legg and Hutter wrote a paper using the AIXI model to define a measure of intelligence. It's pretty great, but I can think of some directions of improvement.

  • Reinforcement learning. I think this term and formalism are historically from much simpler agent models which actually depended on being reinforced to learn. In its present form (Hutter 2005 section 4.1) it seems arbitrarily general, but it still feels kinda gross to me. Can we formalize AIXI and the intelligence measure in terms of utility functions, instead? And perhaps prove them equivalent?
  • Choice of Horizon. AIXI discounts the future by requiring that total future reward is bounded, and therefore so does the intelligence measure. This seems to me like a constraint that does not reflect reality, and possibly an infinitely important one. How could we remove this requirement? (Much discussion on the "Choice of the Horizon" in Hutter 2005 section 5.7).
  • Unknown utility function. When we reformulate it in terms of utility functions, let's make sure we can measure its intelligence/optimization power without having to know its utility function. Perhaps by using an average of utility functions weighted by their K-complexity.
  • AI orientation. Finally, and least importantly, it tests agents across all possible programs, even those which are known to be inconsistent with our universe. This might okay if your agent is a playing arbitrary games on a computer, but if you are trying to determine how powerful an agent will be in this universe, you probably want to replace the Solomonoff prior with the posterior resulting from updating the Solomonoff prior with data from our universe.

Any thought or research on this by others? I imagine lots of discussion has occurred over these topics; any referencing would be appreciated.

Combining causality with algorithmic information theory

15 [deleted] 09 June 2012 01:31AM

Warning: maths.

Causal inference using the algorithmic Markov condition (Janzing and Schölkopf, 2008) replaces conditional independences between random variables, which define the structure of causal graphs, with algorithmic conditional independences between bit strings.

Conditional probabilities between variables become conditional complexities between strings, i.e. K(x|y) is the length of the shortest program that can generate the string x from y. Similarly, algorithmic mutual information I(x:y) is the amount of information that can be omitted in defining a string y given a shortest compressor for string x, I(x:y) = K(y) - K(y|x*). K(x,y) is the complexity of the concatenation of two strings x and y. These lead naturally to a definition of algorithmic conditional independence as I(x:y|z) = K(x|z) + K(y|z) - K(x,y|z) = 0 , where equality is defined up to the standard additive constant.

Then a lot of sexy, confusing proofs happen. When the dust settles, it looks like if you take some strings describing observations, interpret them as nodes in a graph, and "factor" so that a certain algorithmic Markov condition holds (every node string should be algorithmically independent of its non-descendant node strings given the optimal compressor of its parents' node strings), then every node can be computed by an O(1) program run on a Turing machine, with the node's parents and a noise term as input (with each node's noise string being jointly independent of the others). 

Notably, this means that if we make two observations which were "generated from their parents by the same complex rule", then we can "postulate another causal link between the nodes that explains the similarity of mechanisms". They say "complex rule" because the mutual algorithmic information between simple information strings, like some digits of pi, will be swallowed up by additive constants. Which all seems very close to rediscovering TDT.

There's more to the paper, but that's the tasty bit, so the summary ends here.

A Philosophical Treatise of Universal Induction (Link)

11 XiXiDu 10 June 2011 06:53PM

Abstract: Understanding inductive reasoning is a problem that has engaged mankind for thousands of years. This problem is relevant to a wide range of fields and is integral to the philosophy of science. It has been tackled by many great minds ranging from philosophers to scientists to mathematicians, and more recently computer scientists. In this article we argue the case for Solomonoff Induction, a formal inductive framework which combines algorithmic information theory with the Bayesian framework. Although it achieves excellent theoretical results and is based on solid philosophical foundations, the requisite technical knowledge necessary for understanding this framework has caused it to remain largely unknown and unappreciated in the wider scientific community. The main contribution of this article is to convey Solomonoff induction and its related concepts in a generally accessible form with the aim of bridging this current technical gap. In the process we examine the major historical contributions that have led to the formulation of Solomonoff Induction as well as criticisms of Solomonoff and induction in general. In particular we examine how Solomonoff induction addresses many issues that have plagued other inductive systems, such as the black ravens paradox and the confirmation problem, and compare this approach with other recent approaches.

Link: mdpi.com/1099-4300/13/6/1076/

Download PDF Full-Text: mdpi.com/1099-4300/13/6/1076/pdf

Authors: Samuel Rathmanner and Marcus Hutter

Published: 3 June 2011

Via: vetta.org/2011/06/treatise-on-universal-induction/