Comment author: D_Alex 11 May 2012 06:31:18AM 1 point [-]

I am saddened by the amount of downvotes on Alerus's well written and provocative posts. He made a positive contribution to the discussion, and should not be discouraged, IMO.

Comment author: Alerus 11 May 2012 04:09:01PM *  0 points [-]

Thanks, I appreciate that. I have no problem with people disagreeing with me, since confronting disagreement is how people (myself included) grow. However, I was taken aback by the amount of downvoting I received merely for disagreeing with people here, and by the fact that merely choosing to respond to people's arguments effectively guaranteed even more downvotes. That this system is tied to how much you can participate in the community made it more concerning to me. At least on the discussion board side of the site, I expected downvoting to be reserved for posts that derail topics, flame, ignore the arguments presented to them, etc., not for posts with which one disagrees. As someone who does academic research in AI, I thought this could be a fun, lively online community in which to discuss that, but having my discussion board posting privileges removed because people did not agree with things I said (and the main post didn't even assert anything; it asked for feedback), I've reconsidered. I'm glad to see that not everyone here thinks this was an appropriate use of downvoting, but I feel the community at large has spoken about how it uses it, and when this thread ends I'll probably be moving on.

Thanks for your support though, I do appreciate that.

Comment author: DanielLC 10 May 2012 12:26:10AM 0 points [-]

Even if it wasn't, if the gain from adding a person were less than the drop in well-being of others, it wouldn't be beneficial unless the AI were able, without prevention, to create many more such people.

Do you honestly think a universe the size of ours can only support six billion people before reaching the point of diminishing returns?

We're operating under the assumption that the AI's methods of value manipulation are limited to what we can do ourselves, in which case rewiring is not something we can do to any great effect.

If you allow it to use the same tools but better, it will be enough. If you don't, it's likely to only try to do things humans would do, on the basis that they're not smart enough to do what they really want done.

Comment author: Alerus 10 May 2012 01:29:23AM *  0 points [-]

Do you honestly think a universe the size of ours can only support six billion people before reaching the point of diminishing returns?

That's not my point. The point is that people aren't going to be happy if an AI starts making people who are easier to maximize for, solely because they're easier to maximize for. The very fact that we are discussing hypotheticals in which doing so is considered a problem suggests it would register as a problem to us.

If you allow it to use the same tools but better, it will be enough. If you don't, it's likely to only try to do things humans would do, on the basis that they're not smart enough to do what they really want done.

You seem to be trying to break the hypothetical assumption on the grounds that I have not specified complete criteria that would prevent an AI from rewiring the human brain. I'm not interested in finding a set of rules that would prevent an AI from rewiring humans' brains (and I never tried to provide any; that's why it's called an assumption), because I'm not posing that as a solution to the problem. I made this assumption to try to generate discussion of all the problems where it will break down, since discussion typically seems to stop at "it will rewire us". Asserting "yeah, but it would rewire us because you haven't strongly specified how it couldn't" isn't relevant to what I'm asking, since I'm trying to get specifically at what it could do besides that.

Is friendly AI "trivial" if the AI cannot rewire human values?

-5 Alerus 09 May 2012 05:48PM

I put "trivial" in quotes because there are obviously some exceptionally large technical achievements that would still need to occur to get here. But suppose we had an AI with a utilitarian utility function of maximizing subjective human well-being (meaning well-being is not something as simple as a physical sensation of "pleasure" and depends on the mental facts of each person), and let us also assume the AI can model this "well" (let's say at least as well as the best of us can deduce another person's values for their well-being). Finally, we will also assume that the AI does not possess the ability to manually rewire the human brain to change what a human values. In other words, the AI's ability to manipulate another person's values is limited to what we as humans are capable of today. Given all this, is there any concern we should have about making this AI? Would it succeed in being a friendly AI?

One argument I can imagine for why this fails friendly AI is the AI would wire people up to virtual reality machines. However, I don't think that works very well, because a person (except Cypher from the Matrix) wouldn't appreciate being wired into a virtual reality machine and having their autonomy forcefully removed. This means the action does not succeed in maximizing their well-being.

But I am curious to hear what arguments exist for why such an AI might still fail as a friendly AI.

Comment author: Alerus 08 May 2012 11:51:46PM 0 points [-]

So I think my basic problem here is that I'm not familiar with this construct for decision making or why it would be favored over others. Specifically, why make logical rules about which actions to take? Why not take an MDP value-learning approach, where the agent chooses whichever action has the highest predicted utility? If the estimate is bad, it's simply updated, and if that situation arises again, the agent might choose a different action as a result of the latest update.
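To make the value-learning idea above concrete, here is a minimal toy sketch (not from the original discussion; the two-action setup, rewards, and parameter values are all illustrative assumptions): the agent greedily picks the action with the highest estimated utility, occasionally explores, and updates its estimate toward the observed reward, so an initially overestimated action gets corrected.

```python
import random

random.seed(0)
q = {"a": 1.0, "b": 0.0}            # initial (wrong) utility estimates
true_reward = {"a": 0.0, "b": 1.0}  # the world's actual payoffs
alpha, epsilon = 0.5, 0.1           # learning rate, exploration rate

for _ in range(200):
    if random.random() < epsilon:
        action = random.choice(list(q))   # occasional exploration
    else:
        action = max(q, key=q.get)        # greedy: highest predicted utility
    # Shrink the estimation error toward the observed reward.
    q[action] += alpha * (true_reward[action] - q[action])

# The overestimate of "a" decays and "b" ends up preferred.
```

The exploration term matters: a purely greedy agent could keep picking "a" forever, since its decaying estimate never drops below "b"'s untouched initial value.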

Comment author: Alerus 08 May 2012 04:19:32PM 4 points [-]

I feel like the suggested distinction between Bayes and science is somewhat forced. Before I knew of Bayes, I knew of Occam's razor and its incredible role in science. I had always been under the impression that science favored simpler hypotheses. If it is suggested that we don't see people rigorously adhering to Bayes' theorem when developing hypotheses, the answer to why is not that science doesn't value the simpler hypotheses suggested by Bayes and priors, but that determining the simplest hypothesis is incredibly difficult in many cases. And this difficulty is acknowledged in the post. As such, I don't see science as diverging from Bayes; the way it's practiced is just a consequence of the admitted difficulty of finding the correct priors and determining the space of hypotheses.
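The "Bayes favors simpler hypotheses" point can be illustrated with a toy calculation (my own illustrative numbers, not from the post): a simple model concentrates its probability mass on few outcomes, while a flexible model spreads it over many, so on data the simple model predicts well, the posterior odds favor it even with equal priors.

```python
# A "simple" model predicts one outcome sharply; a "complex" model hedges
# across 10 possible outcomes. Suppose the observed data is the outcome
# the simple model predicted.
p_data_given_simple = 0.9        # simple model concentrates its mass here
p_data_given_complex = 1.0 / 10  # complex model spreads mass over 10 outcomes
prior = 0.5                      # equal prior belief in each model

posterior_odds = (p_data_given_simple * prior) / (p_data_given_complex * prior)
# posterior_odds is about 9: the simpler hypothesis is favored roughly 9:1.
```

This is the automatic Occam's razor built into Bayesian model comparison; the hard part, as noted above, is writing down the hypothesis space and priors in realistic cases.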

Comment author: momothefiddler 08 May 2012 03:11:57AM 0 points [-]

Hm. If people have approximately-equivalent utility functions, does that help them all accomplish their utility better? If so, it makes sense to have none of them value stealing (since having all value stealing could be a problem). In a large enough society, though, the ripple effect of my theft is negligible. That's beside the point, though.

"Avoid death" seems like a pretty good basis for a utility function. I like that.

Comment author: Alerus 08 May 2012 03:00:22PM 1 point [-]

Yeah I agree that the ripple effect of your personal theft would be negligible. I see it as similar to littering. You do it in a vacuum, no big deal, but when many have that mentality, it causes problems. Sounds like you agree too :-)

Comment author: momothefiddler 07 May 2012 08:05:35PM 0 points [-]

I'm not saying I can change to liking civil war books. I'm saying if I could choose between A) continuing to like scifi and having fantasy books, or B) liking civil war books and having civil war books, I should choose B, even though I currently value scifi>stats>civil war. By extension, if I could choose A) continuing to value specific complex interactions and having different complex interactions, or B) liking smiley faces and building a smiley-face maximizer I should choose B even though it's counterintuitive. This one is somewhat more plausible, as it seems it'd be easier to build an AI that could change my values to smiley faces and make smiley faces than it would be to build one that works toward my current complicated (and apparently inconsistent) utility function.

I don't think society-damaging actions are "objectively" bad in the way you say. Stealing something might be worse than just having it, due to negative repercussions, but that just changes the relative ordering. Depending on the value of the thing, it might still be higher-ordered than buying it.

Comment author: Alerus 07 May 2012 08:47:58PM 0 points [-]

Right, so if you can choose your utility function, then it's better to choose one that can be better maximized. Interestingly, though, if we ever had this capability, I think we could reduce the problem by using an unbiased utility function. That is, explicit preferences (such as liking math versus history) would be removed, and instead we'd work with a more fundamental utility function. For instance, death is pretty much a universal stop point, since you cannot gain any utility if you're dead, regardless of your function. This would in a sense be the basis of your utility function. We also find that death is better avoided when society works together and develops new technology. Your actions then might be dictated by what you are best at doing to facilitate the functioning and growth of society. This is why I brought up society-damaging actions as being potentially objectively worse. You might be able to come up with specific instances of actions that we associate with being society-damaging that seem okay, such as specific instances of stealing, but then they aren't really society-damaging in the grand scheme of things. That said, I think as a rule of thumb stealing is bad in most cases due to the ripple effects of living in a society in which people do that, but that's another discussion. The point is that there may be objectively better choices even if you have no explicit preferences for things (or you can choose your preferences).

Of course, that's all conditioned on whether you can choose your utility function. For our purposes for the foreseeable future, that is not the case and so you should stick with expected utility functions.

Comment author: Alerus 07 May 2012 08:16:50PM 3 points [-]

It's hard for me to gauge your audience, so maybe this wouldn't be terribly useful, but a talk outlining logical fallacies (especially lesser-known ones) and why they are fallacies seems like it would have a high impact since I think the layperson commits fallacies quite frequently. Or should I say, I observe people committing fallacies more often than I'd like :p

Comment author: Alerus 07 May 2012 03:48:56PM 5 points [-]

Hi! So I've actually already made a few comments on this site, but had neglected to introduce myself so I thought I'd do so now. I'm a PhD candidate in computer science at the University of Maryland, Baltimore County. My research interests are in AI and Machine Learning. Specifically, my dissertation topic is on generalization in reinforcement learning (policy transfer and function approximation).

Given this, AI is obviously my biggest interest, but as a result, my study of AI has led me to applying the same concepts to human life and reasoning. Lately, I've also been thinking more about systems of morality and how an agent should reach rational moral conclusions. My knowledge of existing work in ethics is not profound, but my impression is that most systems are at too high a level to make concrete (my metric is whether we could implement a system in an AI; if we cannot, then it's probably too high-level for us to reason strongly with it ourselves). Even desirism, which I've examined at least somewhat, seems a bit too high-level, but is perhaps closer to the mark than others (to be fair, I may just not know enough about it). In response to these observations, I've been developing my own system of morality that I'd like to share here in the near future to receive input.

Comment author: Alerus 07 May 2012 02:22:27PM *  0 points [-]

I disagree with the quoted part of the post. Science doesn't reject your Bayesian conclusion (provided it is rational); it's simply unsatisfied by the fact that it's a probabilistic conclusion. That is, probabilistic conclusions are never knowledge of truth. They are estimations of the likelihood of truth. Science will look at your Bayesian conclusion and say, "99% confident? That's good! But let's gather more data and raise the bar to 99.9%." Science is the constant pursuit of knowledge. It will never reach it, but it will demand we never stop trying to get closer.

Beyond that, I think in a great many cases (not all) there are also inherent problems in using explicit Bayesian (or other) reasoning for models of reality, because we simply have no idea what the space of hypotheses could be. As such, the best Bayesian reasoning can ever do in this context is give an ordering of models (e.g., this model is better than that model), not definitive probabilities. This doesn't mean science rejects correct Bayesian reasoning, for the reason previously stated, but it would mean that you can't get definitive probabilistic conclusions from Bayesian reasoning in the first place in many contexts.
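The ordering-without-probabilities point can be sketched concretely (a toy coin-flip example of my own, not from the post): with the models we happen to have written down, we can compute unnormalized scores (likelihood times prior) and rank them, but turning those scores into true posterior probabilities would require knowing the full hypothesis space.

```python
from math import comb

# Observed data: 8 heads in 10 flips.
heads, flips = 8, 10

def likelihood(p):
    # Binomial likelihood of the data under a fixed-bias coin model.
    return comb(flips, heads) * p**heads * (1 - p)**(flips - heads)

# Two candidate models with equal priors. The true hypothesis space may
# contain many models we haven't thought of, so these scores are
# unnormalized: they support a ranking, not absolute probabilities.
models = {"fair (p=0.5)": 0.5, "biased (p=0.8)": 0.8}
scores = {name: likelihood(p) * 0.5 for name, p in models.items()}

ranking = sorted(scores, key=scores.get, reverse=True)
# Dividing by sum(scores.values()) would only yield a genuine posterior
# if these two models exhausted the hypothesis space.
```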
