I just saw another comment implying that immigration was good because it increased GDP. Over the years, I've seen many similar comments in the LW / transhumanist / etc bubble claiming that increasing a country's population is good because it increases its GDP. These are generally used in support of increasing either immigration or population growth.
It doesn't, however, make sense. People have attached a positive valence to certain words, then moved those words into new contexts. They did not figure out what they want to optimize and do the math.
I presume they want to optimize wealth or productivity per person. You wouldn't try to make Finland richer by absorbing China. Its GDP would go up, but its GDP per person would go way down.
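The Finland–China point is just arithmetic. A minimal sketch, using illustrative round numbers (order-of-magnitude only, not exact statistics):

```python
# Illustrative round numbers only (roughly: Finland ~5.5M people, ~$300B GDP;
# China ~1.4B people, ~$18T GDP). The point is the arithmetic, not the figures.
finland = {"pop": 5.5e6, "gdp": 0.3e12}
china = {"pop": 1.4e9, "gdp": 18e12}

merged_gdp = finland["gdp"] + china["gdp"]
merged_pop = finland["pop"] + china["pop"]

print(f"Finland GDP per person: ${finland['gdp'] / finland['pop']:,.0f}")
print(f"Merged GDP per person:  ${merged_gdp / merged_pop:,.0f}")
# Total GDP rises ~60x, while GDP per person falls from roughly $55,000
# to roughly $13,000 -- optimizing the total and the per-person average
# point in opposite directions here.
```

Whether the merged total is sixty times larger is irrelevant if the quantity you actually care about is wealth per person.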
Crossposted at the Intelligent Agent Forum
It's occurred to me that there is a framework where we can see all "indifference" results as corrective rewards, both for the utility function change indifference and for the policy change indifference.
Imagine that the agent has reward R0 and is following policy π0, and we want to change it to having reward R1 and following policy π1.
Then the corrective reward we need to pay it, so that it doesn't attempt to resist or cause that change, is simply the difference between the two expected values:

V(R0|π0) − V(R1|π1),

where V is the agent's own valuation of the expected reward, conditional on the policy.
This explains why off-policy reward-based agents are already safely interruptible: since we change the policy, not the reward, R0=R1. And since off-policy agents have value estimates that are indifferent to the policy followed, V(R0|π0)=V(R1|π1), and the compensatory rewards are zero.
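The argument above can be sketched in a few lines. This is a toy illustration, assuming we can query the agent's own value estimates V(R|π); the function names and numbers are hypothetical, not from the post:

```python
# Toy sketch of the corrective ("indifference") reward: pay the agent the
# difference between what it expected under (R0, pi0) and what it now
# expects under (R1, pi1). All names and values here are illustrative.

def corrective_reward(V, R0, pi0, R1, pi1):
    """Compensation making the agent indifferent to the (R0,pi0) -> (R1,pi1) change."""
    return V(R0, pi0) - V(R1, pi1)

# An off-policy agent's value estimate depends on the reward but not on the
# policy argument, so interruption (which changes only the policy) leaves it unchanged.
def V_off_policy(R, pi):
    return {"R0": 10.0, "R1": 10.0}[R]

# Interrupting changes only the policy, so R0 == R1 and the compensation is zero:
print(corrective_reward(V_off_policy, "R0", "pi0", "R0", "pi_interrupted"))  # 0.0
```

With a policy-dependent value estimate the difference would generally be nonzero, which is exactly the case where a compensatory reward is needed.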
Crossposted at the Intelligent Agent Forum
I'll try to clarify what I was doing with the AI truth setup in a previous post. First I'll explain the nature of the challenge, and then how the setup tries to solve it.
The nature of the challenge is to have an AI give genuine understanding to a human. Getting the truth out of an AI or Oracle is not that hard, conceptually: you get the AI to report some formal property of its model. The problem is that that truth can be completely misleading, or, more likely, incomprehensible.
The Center for Human-Compatible AI (CHCAI) and the Machine Intelligence Research Institute (MIRI) are looking for talented, driven, and ambitious technical researchers for a summer research internship.
CHCAI is a research center based at UC Berkeley with PIs including Stuart Russell, Pieter Abbeel and Anca Dragan. CHCAI describes its goal as "to develop the conceptual and technical wherewithal to reorient the general thrust of AI research towards provably beneficial systems".
MIRI is an independent research nonprofit located near the UC Berkeley campus with a mission of helping ensure that smarter-than-human AI has a positive impact on the world.
CHCAI's research focus includes work on inverse reinforcement learning and human-robot cooperation (link), while MIRI's focus areas include task AI and computational reflection (link). Both groups are also interested in theories of (bounded) rationality that may help us develop a deeper understanding of general-purpose AI agents.
1. Fill in the form here: https://goo.gl/forms/bDe6xbbKwj1tgDbo1
2. Send an email to email@example.com with the subject line "AI safety internship application", attaching your CV, a piece of technical writing on which you were the primary author, and your research proposal.
The research proposal should be one to two pages in length. It should outline a problem you think you can make progress on over the summer, and some approaches to tackling it that you consider promising. We recommend reading over CHCAI's annotated bibliography and the concrete problems agenda as good sources for open problems in AI safety, if you haven't previously done so. You should target your proposal at a specific research agenda or a specific adviser’s interests. Advisers' interests include:
• Andrew Critch (CHCAI, MIRI): anything listed in CHCAI's open technical problems; negotiable reinforcement learning; game theory for agents with transparent source code (e.g., "Program Equilibrium" and "Parametric Bounded Löb's Theorem and Robust Cooperation of Bounded Agents").
• Daniel Filan (CHCAI): the contents of "Foundational Problems," "Corrigibility," "Preference Inference," and "Reward Engineering" in CHCAI's open technical problems list.
• Dylan Hadfield-Menell (CHCAI): application of game-theoretic analysis to models of AI safety problems (specifically by people who come from a theoretical economics background); formulating and analyzing AI safety problems as CIRL games; the relationships between AI safety and principal-agent models / theories of incomplete contracting; reliability engineering in machine learning; questions about fairness.
This application does not bind you to work on your submitted proposal. Its purpose is to demonstrate your ability to make concrete suggestions for how to make progress on a given research problem.
Who we're looking for:
This is a new and somewhat experimental program. You’ll need to be self-directed, and you'll need to have enough knowledge to get started tackling the problems. The supervisors can give you guidance on research, but they aren’t going to be teaching you the material. However, if you’re deeply motivated by research, this should be a fantastic experience. Successful applicants will demonstrate examples of technical writing, motivation and aptitude for research, and produce a concrete research proposal. We expect most successful applicants will either:
• have or be pursuing a PhD closely related to AI safety;
• have or be pursuing a PhD in an unrelated field, but currently pivoting to AI safety, with evidence of sufficient knowledge and motivation for AI safety research; or
• be an exceptional undergraduate or masters-level student with concrete evidence of research ability (e.g., publications or projects) in an area closely related to AI safety.
Program dates are flexible, and may vary from individual to individual. However, our assumption is that most people will come for twelve weeks, starting in early June. The program will take place in the San Francisco Bay Area. Basic living expenses will be covered. We can’t guarantee that housing will be all arranged for you, but we can provide assistance in finding housing if needed. Interns who are not US citizens will most likely need to apply for J-1 intern visas. Once you have been accepted to the program, we can help you with the required documentation.
The deadline for applications is March 1. Applicants should hear back about decisions by March 20.
Razib summarized my entire cognitive biases talk at the Singularity Summit 2009 as saying: "Most people are stupid."
Hey! That's a bit unfair. I never said during my talk that most people are stupid. In fact, I was very careful not to say, at any point, that people are stupid, because that's explicitly not what I believe.
And in the closing sentence of my talk on cognitive biases and existential risk, I did not say that humanity was devoting more resources to football than to existential risk prevention because we were stupid.
There's an old joke that runs as follows:
A motorist is driving past a mental hospital when he gets a flat tire.
He goes out to change the tire, and sees that one of the patients is watching him through the fence.
Nervous, trying to work quickly, he jacks up the car, takes off the wheel, puts the lugnuts into the hubcap -
And steps on the hubcap, sending the lugnuts clattering into a storm drain.
The mental patient is still watching him through the fence.
The motorist desperately looks into the storm drain, but the lugnuts are gone.
The patient is still watching.
The motorist paces back and forth, trying to think of what to do -
And the patient says,
"Take one lugnut off each of the other tires, and you'll have three lugnuts on each."
"That's brilliant!" says the motorist. "What's someone like you doing in an asylum?"
"I'm here because I'm crazy," says the patient, "not because I'm stupid."
I'm interested in hearing from everyone who reads this.
Who is checking LW's Discussion area and how often?
1. When you check, how much voting or commenting do you do compared to reading?
2. Do you bother clicking through to links?
3. Do you check using a desktop or a smartphone? Do you just visit the website in your browser, or use an RSS something-or-other?
4. Also, do you know of other places that have more schellingness for the topics you think this place is centered on? (Or used to be centered on?) (Or should be centered on?)
I would ask this in the current open thread except that structurally it seems like it needs to be more prominent than that in order to do its job.
If you have very, very little time to respond or even think about the questions, I'd appreciate it if you just responded with "Ping" rather than clicking away.
Benja, Eliezer, and I have published a new technical report, in collaboration with Stuart Armstrong of the Future of Humanity Institute. This paper introduces Corrigibility, a subfield of Friendly AI research. The abstract is reproduced below:
As artificially intelligent systems grow in intelligence and capability, some of their available options may allow them to resist intervention by their programmers. We call an AI system "corrigible" if it cooperates with what its creators regard as a corrective intervention, despite default incentives for rational agents to resist attempts to shut them down or modify their preferences. We introduce the notion of corrigibility and analyze utility functions that attempt to make an agent shut down safely if a shutdown button is pressed, while avoiding incentives to prevent the button from being pressed or cause the button to be pressed, and while ensuring propagation of the shutdown behavior as it creates new subsystems or self-modifies. While some proposals are interesting, none have yet been demonstrated to satisfy all of our intuitive desiderata, leaving this simple problem in corrigibility wide-open.
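One way to see the flavor of the utility-function proposals the abstract alludes to is a utility-indifference toy model. This is a sketch with made-up numbers, not the paper's actual construction: add a compensatory term to the shutdown utility so the agent's expected utility is the same whether or not the button is pressed, removing its incentive to cause or prevent the press.

```python
# Toy utility-indifference sketch (illustrative only, not the paper's construction):
# a correction theta equalizes expected utility across button states, so the
# agent gains nothing by manipulating the shutdown button.

E_U_normal = 8.0    # made-up expected utility if the agent keeps operating
E_U_shutdown = 2.0  # made-up expected utility if it shuts down when pressed

theta = E_U_normal - E_U_shutdown  # compensatory term

def utility(button_pressed):
    return E_U_shutdown + theta if button_pressed else E_U_normal

print(utility(False), utility(True))  # 8.0 8.0 -- indifferent to the button
```

As the abstract notes, constructions in this spirit are interesting but have not been shown to satisfy all the desiderata at once (for instance, indifference alone does not ensure the shutdown behavior propagates to subsystems).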