The Third Alternative

Eliezer Yudkowsky

There was a result (from Pieter Abbeel's lab?) a couple of years ago showing that pretraining a model on language would lead to improved sample efficiency in some nominally-totally-unrelated RL task.

Pretrained Transformers as Universal Computation Engines
From the abstract:

We investigate the capability of a transformer pretrained on natural language to generalize to other modalities with minimal finetuning – in particular [...] a variety of sequence classification tasks spanning numerical, computation, vision, and protein fold prediction

4 Rohin Shah 3y

That's the one, thanks!

Lies Told To Children

glazgogabgolab 3y 170

Given your perspective, you may enjoy "Lies Told To Children: Pinocchio", which I found posted here.

Personally I think I'd be fine with the bargain, but having read that alternative continuation, I think I better understand how you feel.

[Intro to brain-like-AGI safety] 1. What's the problem & Why work on it now?

glazgogabgolab 3y Ω030

Oops, strangely enough I just wasn't thinking about that possibility. It's obvious now, but I assumed that SL vs RL would be a minor consideration, despite the many words you've already written on reward.

[Intro to brain-like-AGI safety] 1. What's the problem & Why work on it now?

glazgogabgolab 3y Ω010

Hey Steve, I might be wrong here, but I don't think Jon's question was about which architectures you'd be talking about. I think he was asking how to classify something as Brain-like-AGI for the purposes of your upcoming series.

The way I read your answer makes it sound like the safety considerations you'll be discussing depend more on whether the NTM is trained via SL or RL rather than whether it neatly contains all your (soon to be elucidated) Brain-like-AGI properties.

Though that might actually have been what you meant, so I probably should have asked for clarification before I presumptuously answered Jon for you.

3 Steven Byrnes 3y

I'm confused; this statement makes it sound like "whether it's trained via SL or RL" is NOT a possible candidate for a "brain-like-AGI property". Why can't it be? Or maybe I'm reading too much into your wording.

[Intro to brain-like-AGI safety] 1. What's the problem & Why work on it now?

glazgogabgolab 3y Ω010

If I'm reading your question right, I think the answer is:

I’m going to make a bunch of claims about the algorithms underlying human intelligence, and then talk about safely using algorithms with those properties. If our future AGI algorithms have those properties, then this series will be useful, and I would be inclined to call such an algorithm "brain-like".

I.e., the distinction depends on whether or not a given architecture has some properties Steve will mention later. Given Steve's work, these are probably the key properties of "A learned population of Compositional Generative Models + A largely hardcoded Steering Subsystem".

5 Steven Byrnes 3y

OK that's fair, I didn't really answer Jon's question. So: What makes an AI algorithm brain-like for present purposes? Probably the biggest single thing is whether it's an actor-critic model-based RL algorithm. If yes, there's a good chance that some of the things I talk about in this series will be at least somewhat applicable to it. If no, probably not. But I'm not too sure and wouldn't want to make any promises. "Actor-critic model-based RL" is AFAIK a big diverse group of a thousand different models that work in a thousand different ways. Probably all of them have some kinds of safety-relevant differences from what I'm gonna talk about. But it's hard for me to say anything in general.

How I'm thinking about GPT-N

glazgogabgolab 3y 20

Regarding "posts making a bearish case" against GPT-N, there's Steve Byrnes' "Can you get AGI from a transformer?".

I was just in the middle of writing a draft revisiting some of his arguments, but in the meantime, one claim that might be of particular interest to you is: "...[GPT-N type models] cannot take you more than a couple steps of inferential distance away from the span of concepts frequently used by humans in the training data".

1 delton137 3y

oo ok, thanks, I'll take a look. The point about generative models being better is something I've been wanting to learn about, in particular.


All of glazgogabgolab's Comments + Replies