New forum for MIRI research: Intelligent Agent Foundations Forum

orthonormal

Today, the Machine Intelligence Research Institute is launching a new forum for research discussion: the Intelligent Agent Foundations Forum! It's already been seeded with a bunch of new work on MIRI topics from the last few months.

We've covered most of the (what, why, how) subjects on the forum's new welcome post and the How to Contribute page, but this post is an easy place to comment if you have further questions (or if, maths forbid, there are technical issues with the forum instead of on it).

But before that, go ahead and check it out!

(Major thanks to Benja Fallenstein, Alice Monday, and Elliott Jin for their work on the forum code, and to all the contributors so far!)

EDIT 3/22: Jessica Taylor, Benja Fallenstein, and I wrote forum digest posts summarizing and linking to recent work (on the IAFF and elsewhere) on reflective oracle machines, on corrigibility, utility indifference, and related control ideas, and on updateless decision theory and the logic of provability, respectively! These are pretty excellent resources for reading up on those topics, in my biased opinion.

But before that, go ahead and check it out!

(Major thanks to Benja Fallenstein, Alice Monday, and Elliott Jin for their work on the forum code, and to all the contributors so far!)

Learning how to create even a simple recommendation engine whose output is constrained by the values of its creators would be a large step forward and would help society today.

I think something showing how to do value learning on a small scale like this would be on topic. It might help to expose the advantages and disadvantages of algorithms like inverse reinforcement learning.

I also agree that, if there are more practical applications of AI safety ideas, this will increase interest and resources devoted to AI safety. I don't really see those applications yet, but I will look out for them. Thanks for bringing this to my attention.

it is demonstrably not the case in history that the fastest way to develop a solution is to ignore all practicalities and work from theory backwards

I don't have a great understanding of the history of engineering, but I get the impression that working from the theory backwards can often be helpful. For example, Turing developed the basics of computer science before sufficiently general computers existed.

My current impression is that solving FAI with a hypercomputer is a fundamentally easier problem that solving it with a bounded computer, and it's hard to say much about the second problem if we haven't made steps towards solving the first one. On the other hand, I do think that concepts developed in the AI field (such as statistical learning theory) can be helpful even for creating unbounded solutions.

AIXI showed that all the complexity of AGI lies in the practicalities, because the pure uncomputable theory is dead simple but utterly divorced from practice.

I would really like it if the pure uncomputable theory of Friendly AI were dead simple!

Anyway, AIXI has been used to develop more practical algorithms. I definitely approach many FAI problems with the mindset that we're going to eventually need to scale this down, and this makes issues like logical uncertainty a lot more difficult. In fact, Paul Christiano has written about tractable logical uncertainty algorithms, which is a form of "scaling down an intractable theory". But it helped to have the theory in the first place before developing this.

an ignore-all-practicalities theory-first approach is useless until it nears completion

Solutions that seem to work for practical systems might fail for superintelligence. For example, perhaps induction can yield acceptable practical solutions for weak AIs, but does not necessarily translate to new contexts that a superintelligence might find itself in (where it has to make pivotal decisions without training data for these types of decisions). But I do think working on these is still useful.

My current trajectory places the first AGI at 10 to 15 years out, and the first self-improving superintelligence shortly thereafter. Will MIRI have practical results in that time frame?

I consider AGI in the next 10-15 years fairly unlikely, but it might be worth having FAI half-solutions by then, just in case. Unfortunately I don't really know a good way to make half-solutions. I would like to hear if you have a plan for making these.

I don't have a great understanding of the history of engineering, but I get the impression that working from the theory backwards can often be helpful. For example, Turing developed the basics of computer science before sufficiently general computers existed.

The first computer was designed by Babbage who was mostly interested in practical applications (although admitedly it was never built.) 100 years later Konrad Zuse developed the first working computer and was also for practical purposes. I'm not sure if he was even aware of Turing's work.

Not that Tu... (read more)

2[anonymous]11y

One way to fix the lack of historical perspective is to actively involve engineers and their projects into the MIRI research agenda, rather than specifically excluding them. Regarding your example, Turing hardly invented computing. If anything that honor probably goes to Charles Babbage who nearly a century earlier designed the first general computation devices, or to the various business equipment corporations that had been building and marketing special purpose computers for decades after Babbage and prior to the work of Church and Turing. It is far, far easier to provide theoretical backing to a broad category of devices which are already known to work than to invent out of whole cloth a field with absolutely no experimental validation. [...] The first statement is trivially true: everything is easier on a hypercomputer. But who cares? we don't have hypercomputers. The second statement is the real meat of the argument -- that "it's hard to say much about the [tractable FAI] if we haven't made steps towards solving the [uncomputable FAI]." While on the surface that seems like a sensible statement, I'm afraid your intuition fails you here. Experience with artificial intelligence has shown that there does not seem to be any single category of tractable algorithms which provides general intelligence. Rather we are faced with a dizzying array of special purpose intelligences which in no way resemble general models like AIXI, and the first superintelligences are likely to be some hodge-podge integration of multiple techniques. What we've learned from neuroscience and modern psychology basically backs this up: the human mind at least achieves its generality from a variety of techniques, not some easy-to-analyze general principle. It's looking more and more likely that the tricks we will use to actually achieve general intelligence will not resemble in the slightest the simple unbounded models for general intelligence that MIRI currently plays with. It's not unreas

53

New forum for MIRI research: Intelligent Agent Foundations Forum

53

53

53

New forum for MIRI research: Intelligent Agent Foundations Forum

53

53