CarlShulman comments on Safety Culture and the Marginal Effect of a Dollar - Less Wrong

Post author: jimrandomh | 09 June 2011 03:59AM | 23 points


Comment author: CarlShulman | 09 June 2011 07:58:16AM | 12 points

I agree that there is a lot of room for more and better academic work on this topic to reduce existential risk (including through other channels, like academic research into AI safety strategies and influence on other actors such as large corporations and governments), but as I said at the minicamp, I think the assumptions of this model systematically lead to overestimates of the effectiveness of this channel (EDIT: and would lead to overestimates of other strategies as well, including the "FAI team in a basement" strategy, as I mention in my comment below).

One of the primary reasons for concern about AI risk is the likelihood of tradeoffs between safety and speed of development. Commercial or military competition makes it plausible that quite extensive tradeoffs along these lines will be made, so that reckless (or self-deceived) projects are more likely to succeed first than more cautious ones. So the "random selection" assumption (that the winning project is an even draw from all projects) disproportionately favors safety.
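To illustrate the bias, here is a toy Monte Carlo sketch (all numbers are hypothetical, chosen only for illustration): if reckless projects enjoy even a modest speed advantage, the chance that a cautious project finishes first falls well below its share of the population, which is what the "random selection" assumption would predict.

```python
import random

def first_to_finish(n_cautious=3, n_reckless=7, speedup=1.5, trials=100_000):
    """Toy model: each project's completion time is drawn from an
    exponential distribution; reckless projects develop faster by
    the factor `speedup`. Returns the fraction of trials in which a
    cautious project finishes first."""
    wins = 0
    for _ in range(trials):
        t_cautious = min(random.expovariate(1.0) for _ in range(n_cautious))
        t_reckless = min(random.expovariate(speedup) for _ in range(n_reckless))
        if t_cautious < t_reckless:
            wins += 1
    return wins / trials

# Under "random selection", cautious projects would win 3/10 = 30% of
# the time; with a 1.5x speed advantage for reckless projects, the
# analytic answer is 3 / (3 + 7 * 1.5) ~= 22%.
print(first_to_finish())
```

With exponential completion times the win probability has a closed form (rate of cautious projects divided by total rate), so the simulation is only a sanity check; the point is how quickly the cautious share erodes as `speedup` grows.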

The assumption that safety-conscious researchers always succeed in making any AI they produce safe is also fairly heroic, and a substantial upward bias. There may be some cheap and simple safety measures that any safety-conscious AI project can take without significant sacrifice, but we shouldn't assign high probability to the problem being that easy. And if safety turns out to be that easy, why wouldn't any group sophisticated enough to build AI start taking such precautions once it became apparent they were making real progress?

As folks discussed when this idea was first presented, if "concern for safety" means halting a high-risk project to pursue a lower-risk design, then unless almost all researchers are affected, this just produces a modest expected delay until someone unconcerned succeeds.
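The "modest expected delay" point can be made concrete with a toy calculation (again, all parameter values are hypothetical): if projects succeed independently at some exponential rate, removing the cautious minority only slightly lengthens the expected wait until the first remaining project succeeds.

```python
def expected_delay(n_total=10, n_halting=3, rate=0.1):
    """Toy model: n_total independent projects each succeed at an
    exponential rate (per year). If n_halting cautious projects stop,
    the expected time until the first *remaining* project succeeds
    rises from 1/(n_total*rate) to 1/((n_total - n_halting)*rate).
    Returns the extra years of delay."""
    before = 1 / (n_total * rate)
    after = 1 / ((n_total - n_halting) * rate)
    return after - before

# With 10 projects and 3 halting, the first success is delayed by only
# ~0.43 years against a ~1-year baseline wait. The delay stays modest
# unless n_halting approaches n_total.
print(round(expected_delay(), 2))
```

The delay blows up only as `n_halting` approaches `n_total`, which matches the claim that safety-motivated halting helps little unless almost all researchers are affected.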

Comment author: whpearson | 09 June 2011 12:45:24PM | 5 points

> The assumption that safety-conscious researchers always succeed in making any AI they produce safe is also fairly heroic and a substantial upward bias.

What makes the SIAI team that will be assembled any different?

Comment author: CarlShulman | 09 June 2011 04:41:54PM | 4 points

I think many of the same assumptions also lead to overestimates of the odds that an SIAI team would succeed in creating safe AI. In general, some features that I would expect to conduce to safety, and that could differ across scenarios, include:

  • Internal institutions and a social epistemology that make it possible for a project to slow down, or even double back, upon discovering a powerful but overly risky design, rather than automatically barreling ahead out of social inertia or releasing the data so that others do the same
  • The relative role of different inputs (researchers of different ability levels, abundant computing hardware, neuroscience data, etc.) in designing AI, with some patterns of input favoring greater understanding by designers of the likely behavior of their systems
  • Dispersion of project success, i.e. how long after one project finds the basis of a design one can expect other projects not to reach the same point; the history of nuclear weapons suggests this can be modestly large under some development scenarios (the first five nuclear powers succeeded in 1945, 1949, 1952, 1960, and 1964), although near-simultaneous development is also common in science and technology
  • The type of AI technology: whole brain emulation looks like it could be relatively easier to control initially, by solving social coordination problems without developing new technology, while de novo AGI architectures may vary hugely in the difficulty of specifying decision algorithms with the needed precision

Some shifts along these dimensions do seem plausible given sufficient resources and priority for safety (and suggest, to me, that there is a large spectrum of safety investments to be made beyond simply caring about safety).

Comment author: whpearson | 11 June 2011 07:29:52PM | 1 point

Another factor to consider is the permeability of the team: how likely it is to leak information to the outside world.

However, if a team is completely impermeable, it becomes hard for external entities to evaluate the other factors above.

Does SIAI have procedures/structures in place to shift funding between the internal team and more promising external teams if they happen to arise?

Comment author: CarlShulman | 12 June 2011 12:19:45AM | 1 point

Most potential funding exists in the donor cloud, which can reallocate resources easily enough; SIAI does not have large reserves or an endowment that would be encumbered by its nonprofit status. Ensuring that the donor cloud is sophisticated and well-informed contributes to that flexibility, but I'm not sure what other procedures you have in mind. Formal criteria to identify more promising outside work to recommend?

Comment author: whpearson | 12 June 2011 11:14:14AM | 0 points

> Formal criteria to identify more promising outside work to recommend?

I think that might help. In this matter it all seems to come down to trust.

  • People doing outside work have to trust that SIAI will look at their work and may be supportive. Without formal guidelines, they might suspect that their work will be judged subjectively and negatively, due to a potential conflict of interest over funding.

  • SIAI also needs to be trusted not to leak information from other projects as it evaluates them; having a formal, vetted, well-known evaluation team might help with that.

  • The donor cloud needs to trust SIAI to look at the work and make a good decision about it, not one based just on monkey instincts. Formal criteria might help instill that trust.

SIAI doesn't need all this now, as there aren't any projects that need evaluating. However, it is something to think about for the future.

Comment author: timtyler | 09 June 2011 09:38:01PM | 1 point

I don't think the SIAI has much experience writing code or programming machine learning applications.

Superficially, that makes them less likely to know what they are doing, and more likely to make mistakes and screw up.

Comment author: CarlShulman | 09 June 2011 09:52:58PM | 4 points

> I don't think the SIAI has much experience writing code or programming machine learning applications.

Eliezer's FAI team currently consists of two people: himself and Marcello Herreshoff. Whatever its probability of success, most of it would seem to come from actually recruiting enough high-powered folk for a team. Certainly he thinks so; hence his focus on Overcoming Bias, and then the rationality book, as tools to recruit a credible team.

> Superficially, that makes them less likely to know what they are doing, and more likely to make mistakes and screw up.

Sure, ceteris paribus, although coding errors seem less likely than architectural screwups to result in catastrophic harm rather than in the AI simply not working.