Isn’t it fair to say that the model plus the selection mechanism is maximizing and wanting the reward? If the selection mechanism is subject to market forces, corporate or political schemes, or is just mechanical in some way other than “a human is explicitly making these choices,” it is likely to eventually drift in a direction that doesn’t align with human welfare.
Corporations already generate tons of negative externalities. They churn out metric tons of plastic, pollute, exhaust resources, and destroy ecosystems. Governments often work with them to enforce various things, e.g. intellectual property for Monsanto, or engineering crops with high-fructose corn syrup, or overusing antibiotics on factory farms, which can lead to superbugs. Or overfishing. Or starting wars.
None of these things are really “aligned” with humans. Humans are in fact told that as individuals they can make a meaningful difference by going vegan, recycling, exercising, and so on. But if the corporations are the ones producing metric tons of plastic, conserving plastic straws and bags isn’t the solution. The problem is upstream, and shaming individuals into doing this or that just distracts from getting together to solve systemic problems.
My point is that “the machine” is already not necessarily selecting for alignment with humans on a macro scale. So the fact that the model parameters are selected by “the machine” doesn’t mean the model will somehow end up being good for humans. Yes, it will if humans are the customer. But if you are not the customer, you’re the product (e.g. a cow in a factory farm).
Anyway … this is a bit like John Searle’s Chinese room argument.