JonahSinick comments on Tiling Agents for Self-Modifying AI (OPFAI #2) - Less Wrong
Comments (260)
The paper is meant to be interpreted within an agenda of "Begin tackling the conceptual challenge of describing a stably self-reproducing decision criterion by inventing a simple formalism and confronting a crisp difficulty"; not as "We think this Godelian difficulty will block AI", nor "This formalism would be good for an actual AI", nor "A bounded probabilistic self-modifying agent would be like this, only scaled up and with some probabilistic and bounded parts tacked on". If that's not what you meant, please clarify.
Ok, that is what I meant, so your comment has helped me better understand your position.
Why do you think that ... is cost-effective relative to other options on the table?
For "other options on the table," I have in mind things such as spreading rationality, building the human capital of people who care about global welfare, increasing the uptake of important information into the scientific community, and building transferable skills and connections for later use.
Personally, I feel like that kind of metawork is very important, but that somebody should also be doing something that isn't just metawork. If there's nobody making concrete progress on the actual problem that we're supposed to be solving, there's a major risk of the whole thing becoming a lost purpose, as well as of potentially-interested people wandering off to somewhere where they can actually do something that feels more real.
From inside MIRI, I've been able to feel this one viscerally as genius-level people come to me and say "Wow, this has really opened my eyes. Where do I get started?" and (until now) I've had to reply "Sorry, we haven't written down our technical research agenda anywhere" and so they go back to machine learning or finance or whatever because no, they aren't going to learn 10 different fields and become hyper-interdisciplinary philosophers working on important but slippery meta stuff like Bostrom and Shulman.
Yes, that's a large de-facto part of my reasoning.
I think that in addition to this being true, it is also how it looks from the outside -- at least, it's looked that way to me, and I imagine many others who have been concerned about SI focusing on rationality and fanfiction are coming from a similar perspective. It may be the case that without the object-level benefits, the boost to MIRI's credibility from being seen to work on the actual technical problem wouldn't justify the expense of doing so, but whether or not it would be enough to justify the investment by itself, I think it's a really significant consideration.
[ETA: Of course, in the counterfactual where working on the object problem actually isn't that important, you could try to explain this to people and maybe that would work. But since I think that it is actually important, I don't particularly expect that option to be available.]
Yes. I've had plenty of conversations with people who were unimpressed with MIRI, in part because the organization looked like it was doing nothing but idle philosophy. (Of course, whether that was the true rejection of the skeptics in question is another matter.)
I understand your position, but believe that your concerns are unwarranted, though I don't think that this is obvious.
If I gave you a list of people who in fact expressed interest but then, when there were no technical problems for them to work on, "wandered off to somewhere where they can actually do something that feels more real," would you change your mind? (I may not be able to produce such a list, because I wasn't writing down people's names as they wandered away, but I might be able to reconstruct it.)
Sounds like me two years ago, before I committed to finishing my doctorate. Oops.
Well, I'm not sure the "oops" is justified, given that two years ago, I really couldn't help you contribute to a MIRI technical research program, since it did not exist.
No, the oops is on me for not realizing how shallow "working on something that feels more real" would feel after the novelty of being able to explain what I work on to laypeople wore off.
Ah, I see.
I don't doubt you; I have different reasons for believing Kaj's concerns to be unwarranted:
It's not clear to me that offering people problems in mathematical logic is a good way to get people to work on Friendly AI problems. I think that the mathematical logic work is pretty far removed from the sort of work that will be needed for friendliness.
I believe that people who are interested in AI safety will not forget about AI safety entirely, independently of whether they have good problems to work on now.
I believe that people outside of MIRI will organically begin to work on AI safety without MIRI's advocacy when AI is temporally closer.
Mathematical logic problems are FAI problems. How are we going to build something self-improving that can reason correctly without having a theory of what "reasoning correctly" (ie logic) even looks like?
Based on what?
I'll admit I don't know that I would settle on mathematical logic as an important area of work, but EY being quite smart, working on this for ~10 years, and being able to convince quite a few people who are in a position to judge on this is good confirmation of the plausible idea that work in reflectivity of formal systems is a good place to be.
If you do have some domain knowledge that I don't have that makes stable reflectivity seem less important and puts you in a position to disagree with an expert (EY), please share.
People can get caught in other things. Maybe without something to work on now, they get deep into something else and build their skills in that and then the switching costs are too high to justify it. Mind you there is a steady stream of smart people, but opportunity costs.
Also, MIRI may be burning reputation capital by postponing actual work such that there may be fewer interested folks in the future. This could go either way, but it's a risk that should be accounted for.
(I for one (as a donor and wannabe contributor) appreciate that MIRI is getting these (important-looking) problems available to the rest of us now)
How will they tell? What if it happens too fast? What if the AI designs that are furthest along are incompatible with stable reflection? Hence MIRI working on strategic questions like "how close are we, how much warning can we expect" (Intelligence Explosion Microecon), and "What fundamental architectures are even compatible with friendliness" (this Lob stuff).
See my responses to paper-machine on this thread for (some reasons) why I'm questioning the relevance of mathematical logic.
I don't see this as any more relevant than Penrose's views on consciousness, which I recently discussed. Yes, there are multiple people who are convinced, but there may be spurious correlations which are collectively driving their interests. Some that come to mind are
Also, I find Penrose more impressive than all of the involved people combined. (This is not intended as a slight – rather, the situation is that Penrose's accomplishments are amazing.)
The idea isn't plausible to me, again, for reasons that I give in my responses to paper-machine (among others).
No, my reasons are at a meta-level rather than an object level, just as most members of the Less Wrong community (rightly) believe that Penrose's views on consciousness are very likely wrong without having read his arguments in detail.
This is possible, but I don't think that it's a major concern.
Note this as a potential source of status quo bias.
Place yourself in the shoes of the creators of the early search engines, online book store, and social network websites. If you were in their positions, would you feel justified in concluding "if we don't do it then no one else will"? If not, why do you think that AI safety will be any different?
I agree that it's conceivable that it could happen too fast, but I believe that there's strong evidence that it won't happen within the next 20 years, and 20 years is a long time for people to become interested in AI safety.
Why?
I say something about this here.
Okay; why specifically isn't mathematical logic the right domain?
EDIT: Or, to put it another way, there's nothing in the linked comment about mathematical logic.
The question to my mind is why is mathematical logic the right domain? Why not game theory, or solid state physics, or neural networks? I don't see any reason to privilege mathematical logic – a priori it seems like a non sequitur to me. The only reason that I give some weight to the possibility that it's relevant is that other people believe that it is.
People keep saying that. I don't understand why "planning fallacy" is not a sufficient reply. See also my view on why we're still alive.
I agree that my view is not a priori true and requires further argumentation.
BTW, I spent a large fraction of the first few months of 2013 weighing FAI research vs. other options before arriving at MIRI's 2013 strategy (which focuses heavily on FAI research). So it's not as though I think FAI research is obviously the superior path, and it's also not as though we haven't thought through all these different options, and gotten feedback from dozens of people about those options, and so on.
Also note that MIRI did, in fact, decide to focus on (1) spreading rationality, and (2) building a community of people who care about rationality, the far future, and x-risk, before turning its head to FAI research: see (in chronological order) the Singularity Summit, Less Wrong and CFAR.
But the question of which interventions are most cost effective (given astronomical waste) is a huge and difficult topic, one that will require thousands of hours to examine properly. Building on Beckstead and Bostrom, I've tried to begin that examination here. Before jumping over to that topic, I wonder: do you now largely accept the case Eliezer made for this latest paper as an important first step on an important sub-problem of the Friendly AI problem? And if not, why not?
My comments were addressed at Eliezer's paper specifically, rather than MIRI's general strategy, or your own views.
Sure – what I'm thinking about is cost-effectiveness at the margin.
Based on Eliezer's recent comments, my impression is that Eliezer is not making such a case, and is rather making a case for the paper being of sociological/motivational value. Is your understanding different?
No, that's not what I've been saying at all.
I'm sorry if this seems rude in some sense, but I need to inquire after your domain knowledge at this point. What is your level of mathematical literacy and do you have any previous acquaintance with AI problems? It may be that, if we're to proceed on this disagreement, MIRI should try to get an eminent authority in the field to briefly confirm basic, widespread, and correct ideas about the relevance of doing math to AI, rather than us trying to convince you of that via object-level arguments that might not be making any sense to you.
By 'the relevance of math to AI' I don't mean mathematical logic, I mean the relevance of trying to reduce an intuitive concept to a crisp form. In this case, like it says in the paper and like it says in the LW post, FOL is being used not because it's an appropriate representational fit to the environment... though as I write this, I realize that may sound like random jargon on your end... but because FOL has a lot of standard machinery for self-reflection of which we could then take advantage, like the notion of Godel numbering or ZF proving that every model entails every tautology... which probably doesn't mean anything to you either. But then I'm not sure how to proceed; if something can't be settled by object-level arguments then we probably have to find an authority trusted by you, who knows about the (straightforward, common) idea of 'crispness is relevant to AI' and can quickly skim the paper and confirm 'this work crispifies something about self-modification that wasn't as crisp before' and testify that to you. This sounds like a fair bit of work, but I expect we'll be trying to get some large names to skim the paper anyway, albeit possibly not the Early Draft for that.
Quick Googling suggests someone named "Jonah Sinick" is a mathematician in number theory. It appears to be the same person.
I really wish Jonah had mentioned that some number of comments ago; there are a lot of arguments I don't even try to use unless I know I'm talking to one of the mathematical literati.
It's mentioned explicitly at the beginning of his post Mathematicians and the Prevention of Recessions, strongly implied in The Paucity of Elites Online, and the website listed under his username and karma score is http://www.mathisbeauty.org.
Ok, I look forward to better understanding :-)
I have a PhD in pure math, I know the basic theory of computation and of computational complexity, but I don't have deep knowledge of these domains, and I have no acquaintance with AI problems.
Yes, this could be what's most efficient. But my sense is that our disagreement is at a non-technical level rather than at a technical level.
My interpretation of
was that you were asserting only very weak confidence in the relevance of the paper to AI safety, and that you were saying "Our purpose in writing this was to do something that could conceivably have something to do with AI safety, so that people take notice and start doing more work on AI safety." Thinking it over, I realize that you might have meant "We believe that this paper is an important first step on a technical level." Can you clarify here?
If the latter interpretation is right, I'd recur to my question about why the operationalization is a good one, which I feel that you still haven't addressed, and which I see as crucial.
...
Do you not see that what Luke wrote was a direct response to your question?
There are really two parts to the justification for working on this paper: 1) Direct FAI research is a good thing to do now. 2) This is a good problem to work on within FAI research. Luke's comment gives context explaining why MIRI is focusing on direct FAI research, in support of 1. And it's clear from what you list as other options that you weren't asking about 2.
It sounds like what you want is for this problem to be compared on its own to every other possible intervention. In theory that would be the rational thing to do to ensure you were always doing the most cost-effective work on the margin. But that only makes sense if it's computationally practical to do that evaluation at every step.
What MIRI has chosen to do instead is to invest some time up front coming up with a strategic plan, and then follow through on that. This seems entirely reasonable to me.
If the probability is too small, then it isn't worth it. The activities that I mention plausibly reduce astronomical waste to a nontrivial degree. Arguing that you can do better than them requires an argument that establishes the expected impact of MIRI Friendly AI research on AI safety above a nontrivial threshold.
Which question?
Sure, I acknowledge this.
I don't think that it's computationally intractable to come up with better alternatives. Indeed, I think that there are a number of concrete alternatives that are better.
I wasn't disputing this. I was questioning the relevance of MIRI's current research to AI safety, not saying that MIRI's decision process is unreasonable.
The one I quoted: "Why do you think that ... is cost-effective relative to other options on the table?"
Yes, you have a valid question about whether this Lob problem is relevant to AI safety.
What I found frustrating as a reader was that you asked why Eliezer was focusing on this problem as opposed to other options such as spreading rationality, building human capital, etc. Then when Luke responded with an explanation that MIRI had chosen to focus on FAI research, rather than those other types of work, you say, no I'm not asking about MIRI's strategy or Luke's views, I'm asking about this paper. But the reason Eliezer is working on this paper is because of MIRI's strategy!
So that just struck me as sort of rude and/or missing the point of what Luke was trying to tell you. My apologies if I've been unnecessarily uncharitable in interpreting your comments.
I read Luke's comment differently, based on the preliminary "BTW." My interpretation was that his purpose in making the comment was to give a tangentially related contextual remark rather than to answer my question. (I wasn't at all bothered by this – I'm just explaining why I didn't respond to it as if it were intended to address my question.)
Ah, thanks for the clarification.
The way I'm using these words, my "this latest paper as an important first step on an important sub-problem of the Friendly AI problem" is equivalent to Eliezer's "begin tackling the conceptual challenge of describing a stably self-reproducing decision criterion by inventing a simple formalism and confronting a crisp difficulty."
Ok. I disagree that the paper is an important first step.
Because Eliezer is making an appeal based on psychological and sociological considerations, spelling out my reasoning requires discussion of what sorts of efforts are likely to impact the scientific community, and whether one can expect such research to occur by default. Discussing these requires discussion of psychology, sociology and economics, partly as related to whether the world's elites will navigate the creation of AI just fine.
I've described a little bit of my reasoning, and will be elaborating on it in detail in future posts.
I look forward to it! Our models of how the scientific community works may be substantially different. To consider just one particularly relevant example, consider what the field of machine ethics looks like without the Yudkowskian line.
I agree that Eliezer has substantially altered the field of machine ethics. My view here is very much contingent on the belief that elites will navigate the creation of AI just fine, which, if true, is highly nonobvious.
That sounds like a very long conversation if we're supposed to be giving quantitative estimates on everything. The qualitative version is just that this sort of thing can take a long time, may not parallelize easily, and can potentially be partially factored out to academia, and so it is wise to start work on it as soon as you've got enough revenue to support even a small team, so long as you can continue to scale your funding while that's happening.
This reply takes for granted that all astronomical benefits bottleneck through a self-improving AI at some point.
Thanks for clarifying your position.
My understanding based on what you say is that the research in your paper is intended to spearhead a field of research, rather than to create something that will be directly used for friendliness in the first AI. Is this right?
If so, our differences are about the sociology of the scientific, technological and political infrastructure rather than about object level considerations having to do with AI.
Sounds about right. You might mean a different thing from "spearhead a field of research" than I do, my phrasing would've been "Start working on the goddamned problem."
From your other comments I suspect that you have a rather different visualization of object-level considerations to do with AI and this is relevant to your disagreement.
Ok. I think that MIRI could communicate more clearly by highlighting this. My previous understanding had been that MIRI staff think that by default, one should expect to need to solve the Lob problem in order to build a Friendly AI. Is there anything in the public domain that would have suggested otherwise to me? If not, I'd suggest writing this up and highlighting it.
AFAIK, the position is still "need to 'solve' Lob to get FAI", where 'solve' means find a way to build something that doesn't have that problem, given that all the obvious formalisms do have such problems. Did EY suggest otherwise?
See my response to EY here.
By default, if you can build a Friendly AI you can solve the Lob problem. That working on the Lob Problem gets you closer to being able to build FAI is neither obvious nor certain, but everything has to start somewhere...
EDIT: Moved the rest of this reply to a new top-level comment because it seemed important and I didn't want it buried.
http://lesswrong.com/lw/hmt/tiling_agents_for_selfmodifying_ai_opfai_2/#943i
For readers who want to read more about this point, see FAI Research as Effective Altruism.
Other options on the table are not mutually exclusive. There is a lot of wealth and intellectual brain power in the world, and a lot of things to work on. We can't and shouldn't all work on one most important problem. We can't all work on the thousand most important problems. We can't even agree on what those problems are.
I suspect Eliezer has a comparative advantage in working on this type of AI research, and he's interested in it, so it makes sense for him to work on this. It especially makes sense to the extent that this is an area no one else is addressing. We're only talking about an expenditure of several careers and a few million dollars. Compared to the world economy, or even compared to the non-profit sector, this is a drop in the bucket.
Now if instead Eliezer was the 10,000th smart person working on string theory, or if there was an Apollo-style government-funded initiative to develop an FAI by 2019, then my estimate of the comparative advantage of MIRI would shift. But given the facts as they are, MIRI seems like a plausible use of the limited resources it consumes.
If Eliezer feels that this is his comparative advantage then it's fine for him to work on this sort of research — I'm not advocating that such research be stopped. My own impression is that Eliezer has comparative advantage in spreading rationality and that he could have a bigger impact by focusing on doing so.
I'm not arguing that such research shouldn't be funded. The human capital question is genuinely more dicey, insofar as I think that Eliezer has contributed substantial value through his work on spreading rationality, and my best guess is that the opportunity cost of not doing more is large.