Posts

Sorted by New

Wiki Contributions

Comments

Sorted by
LTM50

One method of keeping humans in key industrial processes might be expanding credentialism. Individuals remaining control even when the majority of the thinking isn't done by them has always been a key part of any hierarchical organisation.

Legally speaking, certain key tasks can only be performed by qualified accountants, auditors, lawyers, doctors, elected officials and so on. 

It would not be good for short term economic growth. However, legally requiring that certain tasks be performed by people with credentials machines are not eligible for might be a good (though absolutely not perfect) way of keeping humans in the loop. 

LTM10

Broadly agree, in that most safety research expands control over systems and our understanding of them, which can be abused by a bad actor.

This problem is encountered by for-profit companies, where profit is on the lines instead of catastrophe. They too have R&D departments and research directions which have the potential for misuse. However, this research is done inside a social environment (the company) where it is only explicitly used to make money.

To give a more concrete example, improving self-driving capabilities also allows the companies making the cars to intentionally make them run people down if they so wish. The more advanced the capabilities, the more precise they can be in deploying their pedestrian-killing machines onto the roads. However, we would never expect this to happen as this would clearly demolish the profitability of a company and result in the cessation of these activities.

AI safety research is not done in this kind of environment at present whatsoever. However, it does seem to me that these kinds of institutions that carefully vet research and products, only releasing them when they remain beneficial, are possible. 

LTM20

Really fascinating stuff! I have a (possibly answered) question about how using expert updates on other expert prediction might be valuable.

You discuss the negative impacts of allowing experts to aggregate themselves, or viewing one another's forecasts before initially submitting their own. Might there be value in allowing experts to submit multiple times, each time seeing the submitted predictions of a previous round? The final aggregation scheme would be able to not only assign a credence to each expert, but also gain a proxy for what credence the experts give to one another. In a more practical scenario where experts will talk if not collude, this might give better insight into how expert predictions are being created.

Thanks for taking the time to distill this work into a more approachable format - it certainly made the thesis more manageable!