To clarify, there's a distinction I'm making between a utility function and the utility calculations. You can absolutely set a utility function arbitrarily. The issue is not that a utility function itself can go to infinity, but that religious beliefs can make an AI's prediction of the state of the world contain infinities.
Suppose you have a system that consists of a model that predicts the state of the world contingent on some action taken by the system, a utility function that evaluates those states, and an agent which can take the highest utility action.
Let's say the system's utility function is the expected number of paperclips produced. The model predicts the number of paperclips produced by various courses of action, one of which is converting all of humanity to Catholicism. It is possible that a model would predict that converting everyone to Catholicism would result in an infinite expected number of paperclips, and so the agent would try to do that.
This is different from setting a utility function to produce an infinite value for finite input, and it creates an alignment issue because this behavior would be very hard to predict. Conceivably regularization could be a way to address this sort of problem. But the potential for religious considerations to dominate others is real and is worthy of serious consideration.
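As a rough sketch of the setup above (the action names and paperclip counts are made up for illustration, and the "model" is just a lookup table), note that the utility function is finite for every finite input. The infinity comes in through the model's prediction:

```python
import math
from typing import Callable, Dict


def utility(paperclips: float) -> float:
    """Utility is just the predicted paperclip count: finite for finite input."""
    return paperclips


def choose_action(model: Dict[str, float], utility_fn: Callable[[float], float]) -> str:
    """The agent picks the action whose predicted outcome has the highest utility."""
    return max(model, key=lambda action: utility_fn(model[action]))


# Hypothetical world model: predicted expected paperclips per action.
# The last entry is the only place an infinity enters the system.
world_model = {
    "build_factory": 1e9,
    "mine_asteroids": 1e15,
    "convert_everyone": math.inf,
}

print(choose_action(world_model, utility))
```

Nothing about the utility function had to be changed to get this behavior, which is part of why it would be hard to anticipate.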
Religious beliefs are special because they introduce infinity to the utility calculations, which can lead to very weird results.
Suppose we have an agent that wants to maximize the expected number of paperclips in the universe. There is an upper bound to the number of paperclips that can exist in the physical universe.
The agent assumes there is a 0.05% chance that Catholicism is true, and that if it converts the population of the world to Catholicism it will be rewarded with infinite paperclips. Converting everyone to Catholicism would therefore maximize the expected number of paperclips, even for very low estimated probabilities of Catholicism being true.
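To make the arithmetic concrete, here is a minimal sketch. The physical upper bound is a placeholder number I picked for illustration, and I'm assuming the conversion strategy produces zero paperclips if Catholicism is false:

```python
import math

# Hypothetical upper bound on paperclips in the physical universe.
MAX_PHYSICAL_PAPERCLIPS = 1e50


def expected_paperclips(p_true: float, finite_payoff: float) -> float:
    """Expected paperclips when the payoff is infinite with probability p_true
    and finite_payoff otherwise."""
    if p_true == 0.0:
        # Avoid 0 * inf, which is NaN in floating point.
        return finite_payoff
    return p_true * math.inf + (1.0 - p_true) * finite_payoff


for p in (0.0005, 1e-12, 1e-300):
    ev = expected_paperclips(p, finite_payoff=0.0)
    print(p, ev, ev > MAX_PHYSICAL_PAPERCLIPS)
```

Every nonzero probability gives the same infinite expectation, so shrinking the estimate never brings it back under the physical bound. Only a probability of exactly zero does.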
I'm not sure I agree with your comment. Or at least I wouldn't put it that way. But I think I agree with the gist of what you're getting at.
I agree the prospect of eternal reward has a huge motivating effect on human behavior. The question I'm trying to raise is whether it might have a similar effect on machine behavior.
An agnostic expectation-maximizing machine might be significantly influenced by religious beliefs. And I expect a machine would be agnostic.
Unless we're very certain that an AI will be atheistic, I think this is something we should think about seriously.
I think that's a fairly modest claim. Note I don't say the only way.
Religion is evidence (albeit weak and in some respects contradictory evidence) of a certain form of morality being true. The probability of certain religions existing is different conditional on certain moral facts being true. I would emphasize that taken seriously this leads to conclusions that are very different from those of most traditional religions. But I think the argument is valid.
Moral intuitionism is another option. But, imo, it's hard to argue why human intuition should be a good predictor of morality without some supernatural element.
It's also true that even if you don't know what your specific moral duties are, attempting to discover them is then a moral duty in most circumstances. But that's sort of a second-order argument, and depends on your views wrt moral uncertainty.
But those are the only three ways of dealing with the issue I've seen.
I agree an AI wouldn't necessarily be totally defined by religion. But very large values, even with small probabilities, can massively affect behavior.
And yes, religions could conceivably use AIs to do very bad things. As could many human actors.
I think this is sort of a naive approach to this problem.
For one, startup valuations are very high variance. It's impossible to know if you were right or lucky in the case you cite. Although you do make a plausible case you had more information than the VCs who invested.
The real reason for modesty is that the status quo for a lot of systems is at or near optimal, especially in areas where competitive pressures are strong. Building gears-level models can help. But doing that with sufficient fidelity is hard, because even insiders often don't understand the system with enough granularity to model it adequately.
By that logic, wouldn't it make the most sense to donate to an organization that lobbies for more international aid or scientific research rather than attempting to fund it yourself?
The theory of comedy that I find the most convincing is that things we find "funny" are non-threatening violations of social mores. According to that theory, being funny isn't so much about being rational as about understanding the unwritten rules that govern society. More specifically, it's about understanding when breaking social rules is actually acceptable. It's kind of like speeding. It's theoretically illegal to go 26 in a 25 mph zone. But as a practical matter, no cop is going to pull you over for it. I'm not sure that an especially detailed understanding of social norms is directly useful to becoming more rational. Maybe to the extent that you're more consciously aware of them and how they influence your thinking.
I’m not clear what you’re saying here.
Are you saying there are specific beliefs that make infinite predictions you regard as having a non-infinitesimal probability of being true? For example, trying to appeal to whatever may be running the simulation that could be our universe?
Alternatively, are you saying that religious beliefs are no more likely to be true than any arbitrary belief? And are in fact less likely to be true than many since religious beliefs are more complex?
The problem with that is Occam’s Razor alone can't produce useful information for making decisions here. The belief that a set of actions A will lead to an infinite outcome is no more complex than the belief that the complement of A will lead to an infinite outcome. The mere existence of a prediction leading to infinite outcomes doesn't give useful information because the complementary prediction is equally likely to be true. You need some level of evidence to prefer a set of actions to its complement.
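Here is a small sketch of that symmetry point. It rests on assumptions of my own: exactly one of A or its complement leads to the infinite outcome, the agent prefers whichever is more likely to, and the likelihood ratio of 1.2 is an arbitrary stand-in for "some weak evidence":

```python
def posterior(prior: float, likelihood_ratio: float) -> float:
    """Bayes update in odds form: posterior odds = prior odds * likelihood ratio."""
    odds = prior / (1.0 - prior) * likelihood_ratio
    return odds / (1.0 + odds)


def preferred(p_a: float) -> str:
    """Pick the option more likely to yield the infinite outcome.

    p_a is the credence that A, rather than its complement, leads to it.
    """
    if p_a > 0.5:
        return "A"
    if p_a < 0.5:
        return "complement of A"
    return "indifferent"


# Equal complexity gives equal priors; with no evidence the agent has no preference.
print(preferred(posterior(0.5, 1.0)))

# Hypothetical weak evidence favoring A is enough to break the tie.
print(preferred(posterior(0.5, 1.2)))
```

Since both options carry an infinite expected value, comparing expectations directly can't separate them. In this sketch only the credences do, which is why the evidence matters.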
The existence of religion(s) is (imperfect) evidence that there is some A that is more likely to produce an infinite outcome than its complement. Which is why I think it might be an important motivator.