Restrictions that are hard to hack
A putative new idea for AI control; index here.
Very much in the spirit of "if you want something, you have to define it, then code it, rather than assuming you can get if for free through some other approach."
Difficult children
Suppose you have a child, that you sent to play in their room. You want them to play quietly and silently, so you want them:
"I'll be checking up on you!"
The child, however, has modelled you well, and knows that you will look in briefly at midnight and then go away. The child has two main options:
- Play quietly the whole time.
- Be as noisy as they want, until around 23:59, then be totally quiet for two minutes, then go back to being noisy.
We could call the first option obeying the spirit of the law, and the second obeying the letter.
Rationalist fiction: a Slice of Life IN HELL
"If you're sent to Hell for that, you wouldn't have liked it in Heaven anyway."
This phrase inspired in me the idea of a Slice of Life IN HELL story. Basically, the strictest interpretation of the Abrahamic God turns out to be true, and, after Judgment Day, all the sinners (again, by the strictest standards), the pagans, the atheists, the gays, the heretics and so on end up in Hell, which is to say, most of humanity. Rather than a Fire and Brimstone torture chamber, this Hell is very much like earthly life, except it runs on Murphy's Law turned Up To Eleven ("everything that can go wrong, will go wrong"), and you can't die permanently, and it goes on forever. It's basically Life as a videogame, set to Maximum Difficulty, and real pain and suffering.
Our stories would focus actually decent, sympathetic people, who are there for things like following the wrong religion, or having sex outside missionary-man-on-woman, lack of observance of the daily little rituals, or even just being lazy. They manage to live more-or-less decently because they're extremely cautious, rational, and methodical. Given that reality is out to get them, this is a constant uphill battle, and even the slightest negligence can have a terrible cost. Thankfully, they have all the time in eternity to learn from their mistakes.
This could be an interesting way to showcase rationalist principles, especially those regarding safety and planning, in a perpetual Worst Case Scenario environment. There's ample potential for constant conflict, and sympathetic characters whom the audience can feel they really didn't deserve their fate. The central concept also seems classically strong to me: defying Status Quo and cruel authorities by striving to be as excellent as one can be, even in the face of certain doom.
What do you guys think? There's lots of little details to specify, and there are many things that I believe should be marked as "must NOT be specified". Any help, ideas, thoughts are very welcome.
What makes you different from Tim Ferriss?
Do not read this if you don't know anything about this Tim Ferriss person
I suspect anyone here is less different from Tim Ferriss than they'd like to be able to justifiably claim (see here, here, here, here).
I don't mean Tim the Result. Results are clouded by what has been brought to attention in one of the 2009/2010 rationality quotes here
Were it possible to trace the succession of ideas in the mind of Sir Isaac Newton, during the time that he made his greatest discoveries, I make no doubt but our amazement at the extent of his genius would a little subside. But if, when a man publishes his discoveries, he either through a design, or through habit, omit the intermediary steps by which he himself arrived at them, it is no wonder that his speculations confound them, and that the generality of mankind stand amazed at his reach of thought. If a man ascend to the top of a building by the help of a common ladder, but cut away most of the steps after he has done with them, leaving only every ninth of tenth step, the view of the ladder, in the condition which he has pleased to exhibit it, gives us a prodigious, but unjust view of the man who could have made use of it. But if he had intended that any body should follow him, he should have left the ladder as he constructed it, or perhaps as he found it, for it might have been a mere accident that threw it in his way... I think that the interests of science have suffered by the excessive admiration and wonder with which several first rate philosophers are considered, and that an opinion of the greater equality of mankind, in point of genius, and power of understanding, would be of real service in the present age." - Joseph Priestly, The History and present State of Electricity
I mean Tim the method.
The varieties of achievements he's done are behaviourally distinct from living normal life. They are not so complicated to learn though.
I invite you to ask the following question: What is one thing he's done I haven't that probably I could do, and what is the explanation I invented to myself for not having done it? Do I truly believe this explanation? Think for a minute before reading more
When I ask this to friends who read some of his stuff, I see three kinds of answers:
This is impossible for anyone who doesn't have property X (where X is always a fixed characteristic, like place of birth, blondness, impeccable genetic motivation)
We have very different values, and there is no point in trying that about which I don't care - interestingly, with every new book, there are more interests on the table to be considered "not my values", but no one suddenly came to me and said: Wow, finally he cares about throwing knives! I have reason to try after all. Are my friends values narrowing in proportion to Tim's expansions?
There are a lot of people who don't want to have more money, learn languages, work less, or travel a lot, but there are much fewer people who besides all of those don't want to exercise effectively, learn quickly, improve their sex lives, throw knives, memorize card decks, program, dance tango, become an angel investor, be famous, write books, cook well, get thinner, read quicker, contact interesting people, outsource boring stuff and so on...
The third kind is personal attack. People claim he has property E, which makes him Evil, and his evil either is proof of the falsity of his accomplishments, or is proof that emulating Tim means you are a dark creature who shall not pass through the gates of heaven. The most interesting E's are "He's a brilliant marketing man, selling profitable lies, but marketing is Evil." "He doesn't understand survivor bias, and how lucky he was, and has not read outliers to know it takes min4000 hours to get good at stuff" "He's a good looking ivy league blonde, this makes him evil" (this girl probably had in mind Nietzsche's lamb morality, from Genealogy of Morals).
What is one thing he's done you haven't that probably you could do, and what is the explanation you invented to yourself for not having done it? Do you truly believe this explanation? Would your best rationalist friend truly believe that explanation?
Rationality, Singularity, Method, and the Mainstream
Upon reading this, my immediate response was:
What does this have to do with the Singularity Institute's purpose? You're the Singularity Institute, not the Rationality Institute.
I can see that, if you have a team of problem solvers, having a workshop or a retreat designed to enhance their problem-solving skills makes sense. But as described, there's no indication that graduates of the Boot Camp will then go on to tackle conceptual problems of AI design or tactics for the Singularity.
What seems to be happening is that, instead of making connections to people who know about cognitive neuroscience, decision theory, and the theory of algorithms, there is a drive to increase the number of people who share a particular subjective philosophy and subjective practice of rationality - perhaps out of a belief that the discoveries needed to produce Friendly AI won't be made by people who haven't adopted this philosophy and this practice.
I find this a little ominous for several reasons:
It could be a symptom of mission creep. The mission, as I recall, was to design and code a Friendly artificial intelligence. But "produc[ing] formidable rationalists" sounds like it's meant to make the world better in a generalized way, by producing people who can shine the light of rationality into every dark corner, et cetera. Maybe someone should be doing this, but it's potentially a huge distraction from the more important task.
Also, I'm far more impressed by the specific ideas Eliezer has come up with over the years - the concept of seed AI; the concept of Friendly AI; CEV; TDT - than by his ruminations about rationality in the Sequences. They're interesting, yes. It's also interesting to hear Feynman talk about how to do science, or to read Einstein's reflections on life. But the discoveries in physics which complemented those of Einstein and Feynman weren't achieved by people who studied their intellectual biographies and sought to reproduce their subjective method; they were achieved by other people of high intelligence who also studied the physical world.
It may seem at times that the supposed professionals in the FAI-relevant fields I listed above are terminally obtuse, for having to failed to grasp their own relevance to the FAI problem, or the schema of the solution as proposed by SIAI. That, and the way that people working in AI are just sleepwalking towards the creation of superhuman intelligence without grasping that the world won't get a second chance if they get machine intelligence very right but machine values very wrong - all of that could reinforce the attitude that to have any chance of succeeding, SIAI needs to have a group of people who share a subjective methodology, and not just domain expertise.
However, I think we are rapidly approaching a point where a significant number of people are going to understand that the "intelligence explosion" will above all be about the utility function dominating that event. There have been discussions about how a proto-friendly AI might try to infer the human utility-function schema, how to do so without creating large numbers of simulated persons who might be subjected to cognitive vivisection, and so forth. But I suspect that will never happen, at least not in this brute-force fashion, in which whole adult brains might be scanned, simulated, modified and so on, for the purpose of reverse-engineering the human decision architecture.
My expectation is that the presently small fields of machine ethics and neuroscience of morality will grow rapidly and will come into contact, and there will be a distributed research subculture which is consciously focused on determining the optimal AI value system in the light of biological human nature. In other words, there will be human minds trying to answer this question long before anyone has the capacity to direct an AI to solve it. We should expect that before we reach the point of a Singularity, there will be a body of educated public opinion regarding what the ultimate utility function or decision method (for a transhuman AI) should be, deriving from work in those fields which ought to be FAI-relevant but which have yet to engage with the problem. In other words, they will be collectively engaging with the problem before anyone gets to outsource the necessary research to AIs.
The conclusion I draw from this for the present is that there needs to be more preparation for this future circumstance, and less attempt to spread a set of methods intended just to facilitate generalized rationality. People who want to see Friendly AI created need to be ready to talk with researchers in those other fields, who never attended "Rationality Boot Camp" but who will nonetheless be independently coming to the threshold of thinking about the FAI problem (perhaps under a different name) and developing solutions to it. When the time comes, there will be a phase transition in academia and R&D, from ignoring the problem to wanting to work on it. The creation of ethical artificial minds is not going to be the work of one startup or one secret military project, working in isolation from mainstream intellectual culture; nor is it a mirage that will hang on the horizon of the future forever. It will happen because of that phase transition, and tens of thousands of people will be working on it, in one way or another. That doesn't mean they all get to be relevant or right, but there will be a pre-Singularity ferment that develops very quickly, and in which certain specific understandings of the people who did labor in isolation on this problem for many years will be surpassed and superseded. People will have ingrained assumptions about the answer to subproblem X or subproblem Y - assumptions to which one will have grown accustomed due to the years of isolation spent trying to solve all subproblems at once - and one must be ready for these answer-schemas to be junked when the time finally arrives that the true experts in that area deign to turn their attention to the subproblem in question.
One other observation about "lessons in rationality". Luke recently posted about LW's philosophy as being just a form of "naturalism" (i.e. materialism), a view that has already been well-developed by mainstream philosophy, but it was countered that these philosophers have few results to show for their efforts, even if they get the basics right. I think the crucial question, regarding both LW's originality and its efficacy, concerns method. It has been demonstrated that there is this other intellectual culture, the naturalistic sector of analytic philosophy, which shares a lot of the basic LW worldview. But are there people "producing results" (or perhaps just arriving at opinions) in a way comparable to the way that opinions are being produced here? For example, Will Sawin suggested that LW's epistemic method consists of first imagining how a perfectly rational being would think about a problem. As a method of rationality, this is still very "subjective" and "intuitive" - it's not as if you're plugging numbers into a Bayesian formula and computing the answer, which remains the idealized standard of rationality here.
So, if someone wants to do some comparative scholarship regarding methods of rationality that already exist out there, an important thing to recognize is that LW's method or practice, whatever it is, is a subjective method. I don't call it subjective in order to be derogatory, but just to point out that it is a method intended to be used by conscious beings, whose practice has to involve conscious awareness, whether through real-time reflection or after-the-fact analysis of behavior and results. The LW method is not an algorithm or a computation in the normal sense, though these non-subjective epistemological ideas obviously play a normative and inspirational role for LW humans trying to "refine their rationality". So if there is "prior art", if LW's methods have been anticipated or even surpassed somewhere, it's going to be in some tradition, discipline, or activity where the analysis of subjectivity is fairly advanced, and not just one where some calculus of objectivities, like probability theory or computer science, has been raised to a high art.
For that matter, the art of getting the best performance out of the human brain won't just involve analysis; not even analysis of subjectivity is the whole story. The brain spontaneously synthesizes and creates, and one also needs to identify the conditions under which it does so most fluently and effectively.
= 783df68a0f980790206b9ea87794c5b6)
Subscribe to RSS Feed
= f037147d6e6c911a85753b9abdedda8d)