Donated $500!
FYI, this is not what the word "corrigibility" means in this context. (Or, at least, it's not how we at MIRI have been using it, and it's not how Stuart Russell has been using it, and it's not a usage that I, as one of the people who originally brought that word into the AI alignment space, endorse.) We use the phrase "utility indifference" to refer to what you're calling "corrigibility", and we use the word "corrigibility" for the broad vague problem that "utility indifference" was but one attempt to solve.
By analogy, imagine people groping around in the dark attempting to develop probability theory. They might call the whole topic the topic of "managing uncertainty," and they might call specific attempts things like "fuzzy logic" or "multi-valued logic" before eventually settling on something that seems to work pretty well (which happened to be an attempt called "probability theory"). We're currently reserving the "corrigibility" word for the analog of "managing uncertainty"; that is, we use the "corrigibility" label to refer to the highly general problem of developing AI algorithms that cause a system to (in an intuitive sense) reason without incentives to deceive/manipulate, and to reason (vaguely) as if it's still under construction and potentially dangerous :-)
Imagine a world where humans somehow achieved jet-propelled flight before developing a firm understanding of calculus or celestial mechanics.
No need to imagine it. Rockets have been around since at least the 10th century.
In a world like that, what work would be needed in order to safely transport humans to the Moon?
Pretty much the same work that was needed in order to transport humans to the Moon at all.
Note how humans didn't manage to fly rockets to the Moon, or even to use them as really effective weapons, until they figured out calculus, celestial mechanics, and a ton of other stuff.
By your analogy, one of the main criticisms of doing MIRI-style AGI safety research now is that it's like 10th-century Chinese philosophers doing Saturn V safety research based on what they knew about fire arrows.
This is a fairly common criticism, yeah. The point of the post is that MIRI-style AI alignment research is less like this and more like Chinese mathematicians researching calculus and gravity, which is still difficult, but much easier than attempting to do safety engineering on the Saturn V far in advance :-)
I don't claim that it developed skill and talent in all participants, nor even in the median participant.
And yet you called it "a resounding success". Does that mean that you're focusing on the crème de la crème, the top tier of the participants, while being less concerned with what's happening in lower quantiles?
Yes, precisely. (Transparency illusion strikes again! I had considered it obvious that the default outcome was "a few people are nudged slightly more towards becoming AI alignment researchers someday", and that the outcome of "actually cause at least one very talented person to become AI alignment researcher who otherwise would not have, over the course of three weeks" was clearly in "resounding success" territory, whereas "turn half the attendees into AI alignment researchers" is in I'll-eat-my-hat territory.)
Thanks for writing this up!
As a participant, I think the claim that MSFP was a resounding success is a little strong. It's not at all clear to me that anyone gained new skills by attending (at least, I don't feel like I did), as distinct from learning about new ideas, using their existing skills, becoming convinced of various positions, and making social connections (which are more than enough to explain the new hires). To me it was an interesting experiment whose results I find hard to evaluate.
I don't claim that it developed skill and talent in all participants, nor even in the median participant. I do stand by my claim that it appears to have had drastic good effects on a few people though, and that it led directly to MIRI hires, at least one of which would not have happened otherwise :-)
$250 plus a vote to have winter fundraiser right after the bonus season :)
Thanks! :-p It's convenient to have the 2015 fundraisers end before 2015 ends, but we may well change the way fundraisers work next year.
Donation sent.
I've been very impressed with MIRI's output this year, to the extent that I am able to judge. I don't have the domain-specific ability to evaluate the papers, but there is a sustained frequency of material being produced. I've also read much of the thinking around VAT, related open problems, and definitions of concepts like "foreseen difficulties"... the language and framework for carving up the AI safety problem has really moved forward.
Thanks! Our languages and frameworks definitely have been improving greatly over the last year or so, and I'm excited to see where we go now that we've pulled a sizable team together.
There are an awful lot of ideas in this comment thread, but many of them have been proposed before. Without leadership, nothing's going to happen, and as I understand it the leaders of LW have left. Nate's been contacted? OK, but does he have decision-making power? Is he an appropriate leader to have it? Will he use it? Well, I hope so, but the first step is a deliberate move to take ownership and end the headlessness.
I have the requisite decision-making power. I hereby delegate Vaniver to come up with a plan of action, and will use what power I have to see that that plan gets executed, so long as the plan seems unlikely to do more harm than good (but regardless of whether I think it will work). Vaniver and the community will need to provide the personpower and the funding, of course.
$1000. (With an additional $1000 because of private, non-employer matching.)
Thanks! And thanks again for your huge donation in the summer; I was not expecting more.
Clicking the "Donate now" button under "PayPal or Credit Card" does not seem to do anything other than refresh the page.
(Browser: Firefox 48.0, OS: Ubuntu)
Huh, thanks for the heads up. If you use an ad-blocker, try pausing that and refreshing. Meanwhile, I'll have someone look into it.