Kaj_Sotala comments on Tiling Agents for Self-Modifying AI (OPFAI #2) - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (260)
Ok, that is what I meant, so your comment has helped me better understand your position.
Why do you think that
is cost-effective relative to other options on the table?
For "other options on the table," I have in mind things such as spreading rationality, building the human capital of people who care about global welfare, increasing the uptake of important information into the scientific community, and building transferable skills and connections for later use.
Personally, I feel like that kind of metawork is very important, but that somebody should also be doing something that isn't just metawork. If there's nobody making concrete progress on the actual problem that we're supposed to be solving, there's a major risk of the whole thing becoming a lost purpose, as well as of potentially-interested people wandering off to somewhere where they can actually do something that feels more real.
From inside MIRI, I've been able to feel this one viscerally as genius-level people come to me and say "Wow, this has really opened my eyes. Where do I get started?" and (until now) I've had to reply "Sorry, we haven't written down our technical research agenda anywhere" and so they go back to machine learning or finance or whatever because no, they aren't going to learn 10 different fields and become hyper-interdisciplinary philosophers working on important but slippery meta stuff like Bostrom and Shulman.
Yes, that's a large de-facto part of my reasoning.
I think that in addition to this being true, it is also how it looks from the outside -- at least, it's looked that way to me, and I imagine many others who have been concerned about SI focusing on rationality and fanfiction are coming from a similar perspective. It may be the case that without the object-level benefits, the boost to MIRI's credibility from being seen to work on the actual technical problem wouldn't justify the expense of doing so, but whether or not it would be enough to justify the investment by itself, I think it's a really significant consideration.
[ETA: Of course, in the counterfactual where working on the object problem actually isn't that important, you could try to explain this to people and maybe that would work. But since I think that it is actually important, I don't particularly expect that option to be available.]
Yes. I've had plenty of conversations with people who were unimpressed with MIRI, in part because the organization looked like it was doing nothing but idle philosophy. (Of course, whether that was the true rejection of the skeptics in question is another matter.)
I understand your position, but believe that your concerns are unwarranted, though I don't think that this is obvious.
If I gave you a list of people who in fact expressed interest but then, when there were no technical problems for them to work on, "wandered off to somewhere where they can actually do something that feels more real," would you change your mind? (I may not be able to produce such a list, because I wasn't writing down people's names as they wandered away, but I might be able to reconstruct it.)
Sounds like me two years ago, before I committed to finishing my doctorate. Oops.
Well, I'm not sure the "oops" is justified, given that two years ago, I really couldn't help you contribute to a MIRI technical research program, since it did not exist.
No, the oops is on me for not realizing how shallow "working on something that feels more real" would feel after the novelty of being able to explain what I work on to laypeople wore off.
Ah, I see.
I don't doubt you: I have different reasons for believing Kaj's concerns to be unwarranted:
It's not clear to me that offering people problems in mathematical logic is a good way to get people to work on Friendly AI problems. I think that the mathematical logic work is pretty far removed from the sort of work that will be needed for friendliness.
I believe that people who are interested in AI safety will not forget about AI safety entirely, independently of whether they have good problems to work on now.
I believe that people outside of MIRI will organically begin to work on AI safety without MIRI's advocacy when AI is temporally closer.
Mathematical logic problems are FAI problems. How are we going to build something self-improving that can reason correctly without having a theory of what "reasoning correctly" (ie logic) even looks like?
Based on what?
I'll admit I don't know that I would have settled on mathematical logic as an important area of work myself, but EY being quite smart, having worked on this for ~10 years, and having convinced quite a few people who are in a position to judge is good confirmation of the plausible idea that work on the reflectivity of formal systems is a good place to be.
If you do have some domain knowledge that I don't have that makes stable reflectivity seem less important and puts you in a position to disagree with an expert (EY), please share.
People can get caught up in other things. Maybe without something to work on now, they get deep into something else, build their skills in that, and then the switching costs are too high to justify coming back. Mind you, there is a steady stream of smart people, but there are opportunity costs.
Also, MIRI may be burning reputation capital by postponing actual work such that there may be less interested folks in the future. This could go either way, but it's a risk that should be accounted for.
(I for one (as a donor and wannabe contributor) appreciate that MIRI is getting these (important-looking) problems available to the rest of us now)
How will they tell? What if it happens too fast? What if the AI designs that are furthest along are incompatible with stable reflection? Hence MIRI working on strategic questions like "how close are we, and how much warning can we expect?" (Intelligence Explosion Microeconomics) and "what fundamental architectures are even compatible with friendliness?" (this Löb stuff).
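For readers who haven't met the "Löb stuff": the obstacle the Tiling Agents paper wrestles with can be stated compactly. This is a standard statement of Löb's theorem, not anything specific to the post above; the notation ($T$, $\Box_T$) is the usual one for a theory extending Peano Arithmetic with its provability predicate.

```latex
% Löb's theorem: for any sentence \varphi,
%   if T proves "provable(\varphi) implies \varphi", then T proves \varphi.
\[
T \vdash \bigl(\Box_T \ulcorner \varphi \urcorner \rightarrow \varphi\bigr)
\quad\Longrightarrow\quad
T \vdash \varphi
\]
% Consequence for self-trust: an agent reasoning in T cannot prove the
% soundness schema \Box_T(\varphi) \rightarrow \varphi for arbitrary \varphi
% without thereby proving every \varphi (i.e., being inconsistent).
% So the naive way for an agent to license a successor running the same
% proof system ("whatever it proves is true") is blocked.
```

This is why "stable reflection" is treated as a mathematical-logic problem rather than an engineering detail: the naive tiling argument runs straight into the theorem.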
See my responses to paper-machine on this thread for (some reasons) why I'm questioning the relevance of mathematical logic.
I don't see this as any more relevant than Penrose's views on consciousness, which I recently discussed. Yes, there are multiple people who are convinced, but there may be spurious correlations which are collectively driving their interests. Some that come to mind are
Also, I find Penrose more impressive than all of the involved people combined. (This is not intended as a slight – rather, the situation is that Penrose's accomplishments are amazing.)
The idea isn't plausible to me, again, for reasons that I give in my responses to paper-machine (among others).
No, my reasons are at a meta-level rather than an object level, just as most members of the Less Wrong community (rightly) believe that Penrose's views on consciousness are very likely wrong without having read his arguments in detail.
This is possible, but I don't think that it's a major concern.
Note this as a potential source of status quo bias.
Place yourself in the shoes of the creators of the early search engines, online bookstores, and social networking sites. If you were in their positions, would you feel justified in concluding "if we don't do it, then no one else will"? If not, why do you think that AI safety will be any different?
I agree that it's conceivable that it could happen too fast, but I believe that there's strong evidence that it won't happen within the next 20 years, and 20 years is a long time for people to become interested in AI safety.
Why?
I say something about this here.
Okay; why specifically isn't mathematical logic the right domain?
EDIT: Or, to put it another way, there's nothing in the linked comment about mathematical logic.
The question to my mind is why is mathematical logic the right domain? Why not game theory, or solid state physics, or neural networks? I don't see any reason to privilege mathematical logic – a priori it seems like a non sequitur to me. The only reason that I give some weight to the possibility that it's relevant is that other people believe that it is.
AIs do reasoning. If you can't see the relevance of logic to reasoning, I can't help.
Further, do you have some other domain of inquiry that has higher expected return? I've seen a lot of stated meta-level skepticism, but no strong arguments either on the meta level (why should MIRI be as uncertain as you) or the object level (are there arguments against studying logic, or arguments for doing something else).
Now I imagine it seems to you that MIRI is privileging the mathematical logic hypothesis, but as above, it looks to me rather obviously relevant such that it would take some evidence against it to put me in your epistemic position.
(Though strictly speaking given strong enough evidence against MIRI's strategy I would go more towards "I don't know what's going on here, everything is confusing" rather than your (I assume) "There's no good reason one way or the other")
You seem to be taking a position of normative ignorance (I don't know and neither can you), in what looks like the face of plenty of information. I would expect rational updating exposed to such information to yield a strong position one way or the other or epistemic panic, not calm (normative!) ignorance.
Note that to take a position of normative ignorance, you have to believe not only that you have seen no evidence, but that there is no evidence. I'm seeing normative ignorance and no strong reason to occupy such a position, so I'm confused.
AIs have to come to conclusions about the state of the world, where "world" also includes their own being. Model theory is the field that deals with such things formally.
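For concreteness, the central notion of model theory is the satisfaction relation between a structure and a sentence. The notation below is standard; the arithmetic example is just an illustration, not drawn from the thread.

```latex
% M \models \varphi : the structure M (a domain together with
% interpretations of the language's symbols) makes sentence \varphi true.
\[
\mathcal{M} \models \varphi
\]
% Example: the natural numbers with their usual order satisfy
% "every element has a strict upper bound":
\[
(\mathbb{N}, <) \models \forall x\, \exists y\; (x < y)
\]
```

The relevance claimed above is that an agent's beliefs about "the world, including itself" are exactly assertions of this shape: a formal system inside the agent, interpreted in a structure that contains the agent.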
These could be relevant, but it seems to me that the "mind of an AI" is an emergent phenomenon of the underlying solid-state physics, where "emergent" here means "technically explained by, but intractable to study as such." Game theory and model theory are intrinsically joined at the hip, and no comment on neural networks.
People keep saying that. I don't understand why "planning fallacy" is not a sufficient reply. See also my view on why we're still alive.
I agree that my view is not a priori true and requires further argumentation.