Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

Comment author: Liso 27 November 2014 06:25:27AM 0 points [-]

It seems that the unfriendly AI is in a slightly unfavourable position. First, it has to preserve the information content of its utility function or other value representation, in addition to the information content possessed by the friendly AI.

There are two sorts of unsafe AI: one which cares and one which doesn't care.

The ignorant one is fastest - it only calculates the answer and doesn't care about anything else.

Friend and enemy both have to analyse additional things...

An FAI calculates and analyses possible dangers because it doesn't want to cause harm.

Comment author: wedrifid 27 November 2014 06:22:15AM 1 point [-]

Wow. I want the free money too!

Comment author: Unknowns 27 November 2014 06:05:45AM 0 points [-]

I think there is good reason to think coming up with an actual VNM representation of human preferences would not be a very good approximation. On the other hand as long as you don't program an AI in that way -- with an explicit utility function -- then I think it is unlikely to be dangerous even if it does not have exactly human values. This is why I said the most important thing is to make sure that the AI does not have a utility function. I'm trying to do a discussion post on that now but something's gone wrong (with the posting).

I thought you could map an unbounded function to a bounded one to produce the same behavior, but actually you may be right that this is not really possible since you have to multiply your utilities by probabilities. I would have to think about that more.
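The worry about multiplying utilities by probabilities can be made concrete with a toy example (numbers of my own invention, not from the discussion): a monotone rescaling of an unbounded utility into a bounded one can flip the ranking of lotteries, so it need not produce the same behavior.

```python
# Sketch: rescaling an unbounded utility u(x) = x into a bounded
# v(x) = x / (1 + x) preserves the ordering of outcomes but not
# the ordering of lotteries (expected utilities).
def u(x):
    return x            # unbounded utility

def v(x):
    return x / (1 + x)  # bounded rescaling, approaches 1 from below

# Lottery A: 10 for certain. Lottery B: 50% chance of 0, 50% chance of 30.
eu_A, eu_B = u(10), 0.5 * u(0) + 0.5 * u(30)
ev_A, ev_B = v(10), 0.5 * v(0) + 0.5 * v(30)

# Under u, B is preferred (15 > 10); under v, A is preferred (~0.909 > ~0.484).
```

So an agent maximizing the bounded version genuinely behaves differently, which is consistent with the doubt expressed above.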

Comment author: Artaxerxes 27 November 2014 05:56:28AM 0 points [-]

Calico is hiring.

Might be worth looking into.

Comment author: AlexSchell 27 November 2014 05:56:24AM 0 points [-]

Oops, shouldn't have assumed you're talking about genetics :)

Still, if you're talking about character in a causally neutral sense, it seems that you need to posit character traits that hardly change within a person's lifetime. Here I admit that the evidence for rapid institutional effects is weaker than the evidence for institutional effects in general.

(Re: Hong Kong, Singapore - no, I do mean those cities specifically. Their economic outcomes differ strikingly from culturally and genetically similar neighbors because of their unique histories.)

Comment author: Liso 27 November 2014 05:49:07AM 0 points [-]

The other question is: what happens once you know the AI has desire for subversion?

There are plenty of people cooperating with and supporting dictators and mafias for selfish reasons. We could expect the same in this area.

"I will support dangerous AI because I will be richer, more powerful, or safer then."

Comment author: Liso 27 November 2014 05:43:09AM 0 points [-]

I don't understand the negative point. Could it be explained, please?

Is it because his position equals what he declares and not what he really thinks?

Or because Bostrom's mimicry (= for some reason preferring a negative, catastrophic, alarmist position) is not supposed?

Comment author: Liso 27 November 2014 05:28:43AM *  0 points [-]

- Expose it to limited data about the world, or no data about the real world
- Provide it little or no information about its operators or the society they live in
- Provide it little or no information about its own hardware or software architecture

When I read "Oracle AI" I realized one possibility which is probably still overlooked.


Erasing data on physical media is not perfect. A zero bit which was previously nonzero could be read slightly more slowly or more quickly. That could allow an SAI to see shadows of past data, which could lead to a phenomenon similar to dreams.

Comment author: AlexMennen 27 November 2014 04:53:33AM 0 points [-]

It's true that humans do not have utility functions, but I think it still can make sense to try to fit a utility function to a human that approximates what they want as well as possible, since non-VNM preferences aren't really coherent. It's a good point that it is pretty worrying that the best VNM approximation to human preferences might not fit them all that closely though.

a bounded function that behaves in a similar way by approaching a limit (if it didn't behave similarly it would not treat anything as having infinite value.)

Not sure what you mean by this. Bounded utility functions do not treat anything as having infinite value.
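For what it's worth, a minimal sketch of the kind of bounded function under discussion (an illustration I'm adding, not anything from the thread): U(x) = 1 - e^(-x) keeps increasing but approaches a limit of 1, so no outcome is assigned infinite, or even maximal, value.

```python
import math

def bounded_utility(x):
    # Approaches 1 as x grows but never reaches it, so no outcome,
    # however large, is treated as infinitely valuable.
    return 1 - math.exp(-x)

# Strictly increasing in x, yet always strictly below the bound of 1.
values = [bounded_utility(x) for x in (1, 5, 10)]
```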

Comment author: Azathoth123 27 November 2014 04:49:46AM 0 points [-]

You're leaving out that he left Latin America to get away from those problems

But do they understand what caused them?

also that a lot of immigrants want to become real Americans (or whichever country they're moving to).

I'd be more comfortable with an immigration policy that explicitly screened for something like this.

Comment author: Azathoth123 27 November 2014 04:46:47AM 0 points [-]

Many Western societies have seen pretty dramatic productivity-enhancing institutional changes in the last few hundred years that aren't explicable in terms of changes in genetic makeup.

Who said anything about genetics?

Hong Kong, Singapore, and South Korea seem to make a pretty strong case for a huge independent effect of institutions.

Korea is. China (I assume this is what you mean by Hong Kong and Singapore) is evidence against.

Comment author: Vulture 27 November 2014 04:42:32AM *  0 points [-]

Average article quality is almost certainly going down, but the main driving force is probably mass-creation of stub articles about villages in Eastern Europe, plant genera, etc. Of course, editors are probably spread more thinly even among important topics as well. A lot of people seem to place the blame for any and all of Wikipedia's problems on bureaucracy, but to a regular editor such criticisms often seem foreign, like they're talking about a totally different website. True, there's a lot of formalities, but they're mostly invisible, and a reasonably intelligent person can probably pick up the important customs quite quickly. In the past 6 months of relatively regular editing, I can't say I remember ever interacting involuntarily with any kind of bureaucratic process or individual (I occasionally putter around the deletion nominations for fun, but that's just to satisfy my need for conflict). Writing an article (for example), especially if it's any good, is virtually never going to get you ensnared in some kind of Kafkaesque editorial process. Such things seem to operate mainly for the benefit of people who enjoy inflicting such things on each other (e.g., descending hierarchies of committees for dealing with mod drama).

It's late, so hopefully the above makes some modicum of sense.

Comment author: Azathoth123 27 November 2014 04:42:30AM *  0 points [-]

It could be any number of things. Including the one I take it you're looking for, namely some genetic inferiority on the part of the people in country A.

Not necessarily, my argument goes through even if it's memetic.

The people who move from country A to country B may be atypical of the people of country A, in ways that make them more likely overall to be productive in country B.

Your only response to this has been a handwavy dismissal, to the effect that that might have been true once but now immigration is too easy so it isn't any more. How about some evidence?

How about some yourself? Note that simply saying that something may happen is not a reason to ignore the prior that it won't. I responded to your only argument about the prior. Also, look at the way the immigrants are in fact behaving; I believe it involves lots of riots and creating neighborhoods that the police are afraid to go into.

Comment author: Azathoth123 27 November 2014 04:41:33AM 0 points [-]

True, but it may very well lead to importing the reason country X has the government it does.

Comment author: Larks 27 November 2014 04:15:49AM 0 points [-]

You might perhaps like to edit out the username from this comment now.

Comment author: Larks 27 November 2014 04:15:39AM 0 points [-]

You might perhaps like to edit out the username from this comment now.

Comment author: Unknowns 27 November 2014 04:10:30AM 0 points [-]

No, I wasn't saying that all utility functions are unbounded. I was making two points in that paragraph:

1) An AI that values something infinitely will not have anything remotely like human values, since human beings do not value anything infinitely. And if you describe this AI's values with a utility function, it would either be an unbounded function, or a bounded function that behaves in a similar way by approaching a limit (if it didn't behave similarly it would not treat anything as having infinite value.)

2) If you program an AI with an explicit utility function, in practice it will not have human values, because human beings are not made with an explicit utility function, just as if you program an AI with a GLUT, in practice it will not engage in anything like human conversation.

Comment author: Azathoth123 27 November 2014 04:08:28AM 0 points [-]

Unfortunately online women get attacked more easily and more nasty than a lot of men.

What I've heard is that men are more likely to get attacked (makes sense given where they hang out), it's just that women are more likely to make a big deal of it.

Comment author: Azathoth123 27 November 2014 03:46:26AM 0 points [-]

there's a persistent market distortion because investment profits are undertaxed

What do you mean? Corporate tax rates, at least in the US, are higher than personal tax rates.

Comment author: Azathoth123 27 November 2014 03:44:29AM 0 points [-]

Yeh, that's why I stopped reading xkcd.

Comment author: Azathoth123 27 November 2014 03:38:06AM *  0 points [-]

Are you familiar with the various online automatic rant generators?

Comment author: Slider 27 November 2014 03:37:58AM 0 points [-]

What people permit is more inclusive and vague than what they want, and doesn't even aim, in the same sense, to further a person's goals. There is also a problem that people could accept a fate they don't want. Whether that is the human being self-unfriendly or the AI being unfriendly is a matter of debate. But still, it's a form of unfriendliness.

Comment author: Slider 27 November 2014 03:33:53AM 0 points [-]

If you don't know that you are missing something, or have reason to believe this to be the case, you are unsure about whether you are a dummy when it comes to AI or not. Not knowing whether you should discuss AI is different from knowing not to discuss AI.

Comment author: Slider 27 November 2014 03:18:46AM 0 points [-]

I had a similar prompt for knowledge seeking in wanting to figure out how the math supports or doesn't support "converging worlds" or "mangled worlds". The notion of a converging world is also probably a noteworthy intuitive reference point in thought-space. You could have a system that is in a quantum indeterministic state, each state having a different interaction, so that the futures of the states are identical. At that point you can drop the distinguishing of the worlds and just say that two worlds have become one. Now there is a possibility that a state left alone first splits and then converges, or that it does both at the same time. There would be a middle part that would not be able to be "classified", which in these theories would be represented by two worlds in different configurations (and waves in more traditional models).

Sometimes I have stumbled upon an argument about whether, if many worlds creates extra worlds, that forms a kind of growing block ontology (such as the flat splitters in the sequence post). Well, if the worlds also converge, that could keep the amount of "ontology stuff" constant, or able to vary in both directions.

I stumbled upon the fact that |psi(x)|^2 was how you calculated the evolution of a quantum state, which was like taking a second power and then essentially taking a square root by only caring about the magnitude and not the phase of the complex value. For a double slit with L being left and R being right, it resulted in P(L+R) = <L|L>^2 + C<L|R><R|L> + <R|R>^2 (where C was either 1, 2 or sqrt(2); I don't remember and didn't understand which). The squarings in the sum, I found, were claimed to be the classical equivalent of the two options. The interference fringes would be great and appear where the middle term was strong. I also found that you could read <x|y> as something like "obtain x if the situation was/is y". Getting L when the particle went L is thus very ordinary and all. You can also note that the squarings have the same form as the evolution of a pure state. However, I didn't find anything on whether the middle term was interpretable or not. If you try to put it into words, it looks a lot like "probability of getting L when the situation was R", which seems very surprising if it could be anything other than zero. But then again, I don't know what imaginary protoprobabilities are. Because it's a multiplication of two "chains of events", it's clear you can't single out the "responsible party"; it can be a contribution from both. I somehow suspect this correlates with the fact that if your "base" is |L>, then the |R>|L> base doesn't apply, i.e. you can't know the path taken and still get interference. I get that many worlds posits the R world and the L world, but it seems there is a bizarre combination world also involved. One way I, in my brute naivety, think it might be going on is that the particle started in the L world but then "crossed over" to the R world. If worlds in contact can exchange particles, it might seem as if particles "mysteriously jumped", while the jumping would be loosely related to where the particle was.
They would have continuous trajectories when tracked within the multiverse, but they get confused for each other in the single worlds.
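The middle-term arithmetic above can be checked numerically. Treating the two paths as complex amplitudes (the specific values here are invented for illustration and not normalized), the total probability is the two "classical" squared terms plus a cross term, which is exactly the part responsible for interference fringes:

```python
# Hypothetical amplitudes for the left and right paths of a double slit.
a_L = complex(0.6, 0.0)
a_R = complex(0.4, 0.4)

# Probability of the combined state: squared magnitude of the summed amplitudes.
p_total = abs(a_L + a_R) ** 2

# The two "classical" terms, as if the paths were exclusive options:
p_classical = abs(a_L) ** 2 + abs(a_R) ** 2

# The cross (interference) term; it vanishes when the relative phase
# makes the product purely imaginary, and can be negative.
cross_term = 2 * (a_L.conjugate() * a_R).real

# p_total equals p_classical + cross_term.
```

Changing the phase of `a_R` while keeping its magnitude fixed moves `cross_term` between positive and negative, which is the numerical counterpart of bright and dark fringes.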

However, I was unable to grasp the intuition of how bras and kets work or what they mean. I pushed the strangeness into wavefunctions but was unable to reach my goal.

It still seems mysterious to me how the single photon state turns into two distinct L and R. I could imagine the starting state "doing a full loop", being a kind of spiral where the direction the photon is travelling is a superposition of it travelling in each particular direction, with each direction differing from its neighbour by the phase of the protoprobability, and their magnitudes summing to 1. That way, if the photon has probability one at L, it can't have probability 1 at R, as the real part of the protoprobability at R can't be 1, since it is known to differ in phase. I know these intuitions are not well founded; I know the construction of them is known to be unsafe. However, intuitive pictures are easier for me to work with, even if it means needing to reimagine them rather than just have them in the right configuration (if somebody knows a more representative way to think about it, please tip me off about it).

I am also using a kind of guess that you can take a protoprobability, strip it of its imaginary parts, and get a "single world view", and I am using a view of having 2 time dimensions: a second, additional clock makes the phases of the complex values sweep forward (or sweep equal surface areas) even if the "ordinary clock time" stays still. The indeterminacy under this time would be that a being unable to measure the meta-time would be ignorant of what part of the cycle the world is in. Thus you would be ignorant of the phases, but the phases would "resonate". I am assuming one could turn this into an equivalent view where the imaginary component would just select a spatial world in a 1-time multiverse (in otherwise totally real-part-only worlds).

I don't have a known better understanding, but I have a bunch of different understandings of unknown fitness.

Comment author: SteveG 27 November 2014 02:57:34AM 0 points [-]

People have complex sets of goals, tendencies, and instincts. There has never been any entity brought into existence so far which is a utility maximizer.

That renders us dangerous if we become too powerful, but we are not useless if our powers are checked.

We really might not wish an AI to be an explicit utility maximizer. Oddly, starting with that design actually might not generate the most utility.

Comment author: Azathoth123 27 November 2014 02:55:00AM -2 points [-]

I happen to think it's quite likely that there are good explanations for the phenomena you cite that don't include "women are intrinsically more biased against cryonics than men"

I think the explanation is that women are intrinsically more conformist than men, and since cryonics is currently unusual and perceived as weird, well.

so it would be a bit daft to assume that that one possibility explains all the variance.

The rule of thumb is that 20% of the causes are responsible for 80% of the effect.

Comment author: lukeprog 27 November 2014 02:52:23AM 2 points [-]

I worry about the phrase "provably aligned with human values."

Comment author: Punoxysm 27 November 2014 02:50:41AM *  0 points [-]

Edited for clarity. Though terms get diluted all the time.

Maybe "Talebian" would be more appropriate.

Comment author: Slider 27 November 2014 02:13:24AM 0 points [-]

I failed to do basic googling. They are sorry for the fate but don't revert any official decision.

Comment author: Timo 27 November 2014 02:07:29AM 0 points [-]

And people never learn to take the possibility of bad things seriously... If it's that bad, it can't possibly actually happen.

Comment author: Wes_W 27 November 2014 01:44:51AM 2 points [-]

First, there's the political problem: if you can build agent AI and just choose not to, this doesn't help very much when someone else builds their UFAI (which they want to do, because agent AI is very powerful and therefore very useful). So you have to get everyone on board with the plan first. Also, having your superintelligent oracle makes it much easier for someone else to build an agent: just ask the oracle how. If you don't solve Friendliness, you have to solve the incentives instead, and "solve politics" doesn't look much easier than "solve metaethics."

Second, the distinction between agents and oracles gets fuzzy when the AI is much smarter than you. Suppose you ask the AI how to reduce gun violence: it spits out a bunch of complex policy changes, which are hard for you to predict the effects of. But you implement them, and it turns out that they result in drastically reduced willingness to have children. The population plummets, and gun violence deaths do too. "Okay, how do I reduce per capita gun violence?", you ask. More complex policy changes; this time they result in increased pollution which disproportionately depopulates the demographics most likely to commit gun violence. "How do I reduce per capita gun violence without altering the size or demographic ratios of the population?" Its recommendations cause a worldwide collapse of the firearms manufacturing industry, and gun violence plummets, along with most metrics of human welfare.

If you have to blindly implement policies you can't understand, you're not really much better off than letting the AI implement them directly. There are some things you can do to mitigate this, but ultimately the AI is smarter than you. If you could fully understand all its ideas, you wouldn't have needed to ask it.

Does this sound familiar? It's the untrustworthy genie problem again. We need a trustworthy genie, one that will answer the questions we mean to ask, not just the questions we actually ask. So we need an oracle that understands and implements human values, which puts us right back at the original problem of Friendliness!

Non-agent AI might be a useful component of realistic safe AI development, just as "boxing" might be. Seatbelts are a good idea too, but it only matters if something has already gone wrong. Similarly, oracle AI might help, but it's not a replacement for solving the actual problem.

Comment author: Manfred 27 November 2014 01:41:40AM *  0 points [-]

This actually works if you condition every probability, including the probability of the parent nodes, on the observed information. For example, option one is that you start with all options possible in the marble game and then observe that the result was not Heads and White. Option two is that you determine the marble color causally, in a way that never even has the possibility of White when Heads. And these two options result in different probabilities.

This really reinforces how the information about how a node's value is causally generated is different from observed information about that node.
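The two options can be sketched numerically. The setup below is a hypothetical version of the marble game (a fair coin plus an independent fair White/Black marble; these particular numbers are my own, chosen for illustration), and it shows the two routes to "never Heads and White" disagreeing about the probability of Heads:

```python
from itertools import product

# Joint prior over (coin, marble): fair and independent.
prior = {(coin, marble): 0.25
         for coin, marble in product(("Heads", "Tails"), ("White", "Black"))}

# Option one: keep the full joint, then OBSERVE "not (Heads and White)"
# and renormalize.
observed = {k: p for k, p in prior.items() if k != ("Heads", "White")}
z = sum(observed.values())
p_heads_observed = sum(p for (coin, _), p in observed.items()
                       if coin == "Heads") / z

# Option two: the marble color is CAUSALLY generated so that Heads never
# yields White; nothing about the coin itself changes, so it stays fair.
p_heads_causal = 0.5

# Option one gives P(Heads) = 1/3; option two gives P(Heads) = 1/2.
```

Same surface fact ("Heads-and-White never happens"), different probabilities, which is the distinction between observed information and information about how a node's value is causally generated.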

Comment author: polymathwannabe 27 November 2014 01:36:24AM 0 points [-]
Comment author: Artaxerxes 27 November 2014 01:34:04AM 0 points [-]

Calico, the aging research company founded by Google, is hiring.

Comment author: Azathoth123 27 November 2014 01:29:51AM 0 points [-]

I wonder if Eliezer will now attempt to disassociate himself from Hanson like he has the NRx's.

Comment author: Slider 27 November 2014 01:27:56AM 0 points [-]

Studying computers, I have run into Turing's name occasionally. When I actually looked up the papers he wrote that seeded the concepts that carry his name, they made for a very refreshing read. To me they stand the test of time well. I knew that Turing committed suicide, and that it had to do with him being a homosexual. Now I have learned of suggestions that official institutions might have had a helping hand in that, and that there will be no official apology.

Turing was quite young and what he produced was pretty good stuff. I would have been really excited to read what he would have written if he had been in the field five times as long. Shortening that lifespan over something as silly as homosexuality inflamed me with a big anger emotion.

You can add to your list of reasons why we don't have the singularity yet the item "not tolerant enough".

Comment author: satt 27 November 2014 01:24:32AM 1 point [-]

Are the upvotes this account is receiving here done by actual lesswrong users (who, frankly, ought to be ashamed of themselves) or has Azathoth123 created sockpuppets to vote itself up?

I've suspected Azathoth123 of upvoting their own comments with sockpuppets since having this argument with them. (If I remember rightly, their comments' scores would sit between -1 & +1 for a while, then abruptly jump up by 2-3 points at about the same time my comments got downvoted.)

Moreover, Azathoth123 is probably Eugine_Nier's reincarnation. They're similar in quite a few ways (political views, spelling errors, mannerisms) and Azathoth123 started posting prolifically roughly when Eugine_Nier got banned.

Comment author: Azathoth123 27 November 2014 01:24:30AM -1 points [-]

(See actual political discussions of "real rape".)

Which is extremely idiotic and mostly seems to consist of feminists attempting to get away with further and further expanding the definition of "rape" while keeping the word's connotations the same.

Comment author: Azathoth123 27 November 2014 01:19:50AM 1 point [-]

The sacred value of a woman's control over her own body is still violated.

Would you accept the same argument for cuckoldry violating the sacredness value of the marriage? If so then wasn't Hanson comparing two sacredness violations? If not how do you decide which sacredness values to accept?

Comment author: blacktrance 27 November 2014 01:09:35AM *  0 points [-]

But presumably you don't get utility from switching as such, you get utility from having A, B, or C, so if you complete a cycle for free (without me charging you), you have exactly the same utility as when you started, and if I charge you, then when you're back to A, you have lower utility.
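The money-pump arithmetic here can be made explicit (the numbers are invented for illustration): a full cycle of paid switches returns you to the same state with strictly less.

```python
# Hypothetical setup: states A, B, C are each worth 10 utilons to you,
# but you are willing to pay a 1-utilon fee for each preferred switch.
utility_of_state = {"A": 10, "B": 10, "C": 10}
fee_per_switch = 1

fees_paid = 0
state = "A"
for next_state in ("B", "C", "A"):  # one full cycle, back to A
    fees_paid += fee_per_switch
    state = next_state

# Back at A: same state-utility as at the start, minus the fees.
net_utility = utility_of_state[state] - fees_paid
```

So with charged switches the cycle ends at the starting state with lower net utility, exactly the situation described above; with free switches, `fees_paid` stays 0 and utility is unchanged.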

Comment author: skeptical_lurker 27 November 2014 01:08:01AM 0 points [-]

2) Gays aren't monogamous. One obvious way to see this is to note how much gay culture is based around gay bathhouses. Another way is to image search pictures of gay pride parades.

I think the stereotype is that male gays are promiscuous while lesbians are the opposite. Given this, would you be in favour of letting lesbians adopt?

Comment author: skeptical_lurker 27 November 2014 12:57:14AM *  0 points [-]

When I went to school (not that long ago) there was no mention of homosexuality in school sex education by law, and there was homophobic bullying. Even the most liberal teacher said "if two pupils were in a gay relationship, we'd cross that bridge when we came to it". Despite this, some of my school friends were gay, and plenty of people of my age are gay.

Do you have a link to back up this claim of schools teaching children to "find out if they're trans"? There's a difference between preaching tolerance and preaching advocacy.

Comment author: Liso 27 November 2014 12:48:40AM 0 points [-]

I am afraid that we have not precisely defined the term "goal". And I think we need to.

I am trying to analyse this term.

Do you think that today's computers have goals? I don't think so (but probably we have different understandings of this term). Are they useless? Do cars have goals? Are they without action and reaction?

Probably I could more precisely describe my idea in another way: in Bostrom's book there are goals and subgoals. Goals are ultimate, petrified and strengthened; subgoals are particular, flexible and temporary.

Could we think of an AI without goals but with subgoals?

One possibility could be for it to have its "goal centre" externalized in a human brain.

Could we think of an AI as a tabula rasa, a pure void at the beginning, after creation? Or could an AI not exist without hardwired goals?

If it could be void - would a goal be imprinted with the first task?

Or with first task with word "please"? :)

About utility maximizers - a human (or animal) brain is not useless even though it does not grow without limit. And there is some tradeoff between gain and energy consumption.

We have to, or at least could, think about balanced processes. A one-dimensional, one-directional, unbalanced utility function seems to have doom as its default outcome. But is it the only choice?

How did nature do it? (I am not talking about evolution but about DNA encoding.)

Balance between "intelligent" neural tissues (SAI) and "stupid" non-neural (humanity). :)

Probably we have to see the difference between a purpose and a B-goal (a goal in Bostrom's understanding).

If a machine has to solve an arithmetic equation, it has to solve it, not destroy 7 planets to do it most perfectly.

I have a feeling that if you say "do it", Bostrom's AI hears "do it maximally perfectly".

If you tell it "tell me how much 2+2 is (and do not destroy anything)", then it will destroy a planet to be sure that nobody could stop it from answering how much 2+2 is.

I have a feeling that Bostrom thinks there is implicitly a void AI at the beginning, and in the next step an AI with an ultimate, unchangeable goal. I am not sure if that is plausible. And I think that we need a good definition or understanding of the term goal to know whether it is plausible.

Comment author: polymathwannabe 27 November 2014 12:42:17AM 1 point [-]

there are people who engage in gay sex and who claim to be "really" the opposite gender

People who sleep with their own sex do not necessarily identify as homosexuals, and definitely not all homosexuals identify as transgender. They are not the same phenomenon, they must not be confused, and the fact that you confuse them reveals a lot about your suitability to have this discussion.

There are also people who rape, people who believe in creationism and people who believe themselves to be "spiritually" some animal.

No valid argument exists to equate homosexuality per se with, respectively, violating others' autonomy, being ridiculously misinformed, or having a psychiatric disorder.

Comment author: DefectiveAlgorithm 27 November 2014 12:38:19AM *  1 point [-]

Even leaving aside the matters of 'permission' (which lead into awkward matters of informed consent) as well as the difficulties of defining concepts like 'people' and 'property', define 'do things to X'. Every action affects others. If you so much as speak a word, you're causing others to undergo the experience of hearing that word spoken. For an AGI, even thinking draws a minuscule amount of electricity from the power grid, which has near-negligible but quantifiable effects on the power industry which will affect humans in any number of different ways. If you take chaos theory seriously, you could take this even further. It may seem obvious to a human that there's a vast difference between innocuous actions like those in the above examples and those that are potentially harmful, but lots of things are intuitively obvious to humans and yet turn out to be extremely difficult to precisely quantify, and this seems like just such a case.

Comment author: satt 27 November 2014 12:37:45AM 0 points [-]

Wikipedia is more comprehensive now than in 2008, but I speculate that its average article quality might be lower, because of (1) competent editors being spread more thinly, and (2) the gradual entrenchment of a hierarchy of Wikipedia bureaucrats who compensate for a lack of expertise with pedantry and rules lawyering.

(I may be being unfair here? I'm going by faint memories of articles I've read, and my mental stereotype of Wikipedia, which I haven't edited regularly in years.)

Comment author: Azathoth123 27 November 2014 12:30:57AM -2 points [-]

This user seems to be spreading an agenda of ignorant bigotry against homosexuality and polyamory.

Do you have a counterargument to go with your insults? Also, while you're on the subject, could you define what you mean by "bigotry" and why it's a bad thing? In my experience these days it usually means "he's using a Bayesian prior based on a category I don't like".

Or is this simply the kind of comment you now need to occasionally make to keep the Australian thought police off your case? If so, I'd like you to know that I sympathize with your position and hope Australia decides to re-embrace free speech.

Comment author: Azathoth123 27 November 2014 12:20:40AM -2 points [-]

Sexual diversity is real

In what sense? It's real in the sense that there are people who engage in gay sex and who claim to be "really" the opposite gender. There are also people who rape, people who believe in creationism and people who believe themselves to be "spiritually" some animal.

That doesn't mean we should endorse their behaviors or take their claims at face value.

Comment author: MaximumLiberty 26 November 2014 11:45:20PM 0 points [-]

I'm a lawyer, over 20 years out from law school. I took the LSAT cold, so I'm not a good candidate for your questions. I've always liked taking tests and always did well on standardized ones. I did well on the LSAT.

The reason I am responding is to add a bit of information. Lawyers talk, among ourselves and to law students, about what it means to "think like a lawyer." It is a topic of fairly serious debate in jurisprudence for a number of reasons. One is that lawyers have a lot of power in American society. There are issues of justification and effects there. Another is the underlying sense that we really do think differently from most people. We see it in our everyday lives and it sparks our curiosity. There are many other reasons.

So, it makes me wonder what the MRI images would show when comparing lawyers' brains to comparable non-lawyers brains.

Comment author: Dallas 26 November 2014 11:24:54PM -1 points [-]

Your updates to your blog as of this post seem to replace "Less Wrong", or "MIRI", or "Eliezer Yudkowsky", with the generic term "AI risk advocates".

This just sounds more insidiously disingenuous.

Comment author: MaximumLiberty 26 November 2014 11:14:41PM 0 points [-]

I'm a super-dummy when it comes to thinking about AI. I rightly leave it to people better equipped and more motivated than me.

But, can someone explain to me why a solution would not involve some form of "don't do things to people or their property without their permission"? Certainly, that would lead to a sub-optimal use of AI in some people's opinions. But it would completely respect the opinions of those who disagree.

Recognizing that I am probably the least AI-knowledgeable person to have posted a comment here, I ask, what am I missing?

Comment author: Transfuturist 26 November 2014 11:12:45PM 0 points [-]

And 3 utilons. I see no cost there.

Comment author: DanielFilan 26 November 2014 10:54:46PM 0 points [-]

I think that the point of the original post was to basically say "humans can justify doing things that our systems 1 want to do, by saying that they're for the good of everyone". With this in mind, you would expect people to be more OK with eating meat and animal products than they should be, because our systems 1 don't really care about people far away from us who we have never met, meaning that way 2 doesn't apply. I do think that there's a logical equivalence between your way 1 and way 2, which is why I think that utilitarianism is the correct theory of morality, but I don't think that they're psychologically equivalent at all.

Comment author: Capla 26 November 2014 10:28:16PM 0 points [-]

I was just reading though the Eliezer article. I'm not sure I understand. Is he saying that my computer actually does have goals?

Isn't there a difference between simple cause and effect and an optimization process that aims at some specific state?

Comment author: Jiro 26 November 2014 10:17:51PM 1 point [-]

Now they have no incentive to blackmail you, and you are safe, even if they do exist!

How does that work if they precommit to blackmail even when there is no incentive (which benefits them by making the blackmail more effective)?

Comment author: wedrifid 26 November 2014 10:05:59PM 3 points [-]

2) Gays aren't monogamous. One obvious way to see this is to note how much gay culture is based around gay bathhouses. Another way is to image search pictures of gay pride parades.

This user seems to be spreading an agenda of ignorant bigotry against homosexuality and polyamory. It doesn't even temper the hostile stereotyping with much pretense of just referring to trends in the evidence.

Are the upvotes this account is receiving here done by actual lesswrong users (who, frankly, ought to be ashamed of themselves) or has Azathoth123 created sockpuppets to vote itself up?

Comment author: Kawoomba 26 November 2014 10:05:37PM 6 points [-]

AI value alignment problem / AI goal alignment problem

Incidentally, rephrasing and presenting the problem in this manner is how MIRI could possibly gain additional traction in the private sector and its associated academia. As self-modifying code gains more popularity, there will be plenty of people encountering just this hurdle, i.e. "but how can I make sure that my modified agent still optimizes for X?" Establishing that as a well-delineated subfield, unrelated to the whole "fate of humanity" thing, could both prompt/shape additional external research and lay the groundwork for the whole "the new goals post-modification may be really damaging" point, reducing the inferential distance to one of scope alone.

A company making a lot of money off of their self-modifying stock-brokering algorithms* (a couple of years down the line), in a perpetual tug-of-war with their competitor's self-modifying stock-brokering algorithms* will be quite interested in proofs that their modified-beyond-recognition agent will still try to make them a profit.

I imagine that a host of expert systems, medium term, will increasingly rely on self-modification. Now, compare: "There is an institute concerned with the friendliness of AI, in the x-risk sense" versus "an institute concerned with preserving AI goals across modifications as an invariant which I can use to keep the money flowing" in terms of industry attractiveness.

Even if both map to the same research activities at MIRI, modulo the "what values should the AI have?" question, which in large part isn't a technical question anyway and brings us into the murky maelstrom between game theory / morality / psychology / personal dogmatism. Just a branding suggestion, since the market on the goal alignment question in the concrete economic sense can still be captured. Could even lead to some Legg-itimate investments.

* Using algorithm/tool/agent interchangeably, since they aren't separated by more than a trivial modification.

Comment author: Jiro 26 November 2014 09:54:33PM -1 points [-]

"How could it" means "how could it always result in", not "how could it in at least one case". Giving examples of how it could do it in at least one case is trivial (consider the case where refusing to be blackmailed results in humanity being killed off for some unlikely reason, and humanity, being killed off, can't build an AI).

Comment author: wedrifid 26 November 2014 09:51:48PM 0 points [-]

This is the gist of the AI Box experiment, no?

No. Bribes and rational persuasion are fair game too.

Comment author: wedrifid 26 November 2014 09:46:08PM 0 points [-]

To quote someone else here: "Well, in the original formulation, Roko's Basilisk is an FAI

I don't know who you are quoting but they are someone who considers AIs that will torture me to be friendly. They are confused in a way that is dangerous.

The AI acausally blackmails people into building it sooner, not into building it at all.

It applies to both - causing itself to exist at a different place in time or causing itself to exist at all. I've explicitly mentioned elsewhere in this thread that merely refusing blackmail is insufficient when there are other humans who can defect and create the torture-AI anyhow.

You asked "How could it?". You got an answer. Your rhetorical device fails.

Comment author: Brillyant 26 November 2014 09:32:38PM 0 points [-]

I see.

by threating or hypnotising a human

This is the gist of the AI Box experiment, no?

Comment author: Dr_Manhattan 26 November 2014 09:30:06PM 0 points [-]

Why does the compromise have to be a function of simplified values? I don't think I implied that.

Comment author: nkh 26 November 2014 09:27:53PM 1 point [-]

I like the idea, and I especially like the idea of safely observing treacherous turns. But, a few failure modes might be:

  1. If the AI wreaks havoc on the planet before it manages to get access to the self-termination script, humans aren't left in very good shape, even if the AI ends up switched off afterward. (This DOES seem unlikely, since presumably getting the script would be easy enough that it would not first require converting the planet to computronium or whatever, but it's a possibility.)

  2. A sufficiently intelligent AI would probably read the script, realize that the script's execution will result in its own termination, and plan accordingly by putting other mechanisms in place to reactivate itself afterward--all so it could continue to run the script again and again. Then it would also have instrumental reasons to safeguard itself against interruption through some of the same "bad for humanity" strategies that a pi calculator might use. Maybe this could be fixed by making the final goal be "run SELF-TERMINATE.sh once and only once"... but I feel like that's susceptible to the same problems as telling Clippy "only make 32 paperclips, don't just make them indefinitely".

Comment author: gwern 26 November 2014 09:27:00PM 1 point [-]

I have enough pills for the following two months or so; I'll continue taking it and see whether my response improves.

Also try reducing the dose. 1mg pills should be easy to split into quarters, thirds, and halves.

Comment author: Viliam_Bur 26 November 2014 09:21:14PM 2 points [-]

Why can't we program hard stops into AI, where it is required to pause and ask for further instruction?

If the AI is aware of the pauses, it can try to eliminate them (if the pauses are triggered by a circumstance X, it can find a clever way to technically avoid X), or to make itself receive the "instruction" it wants to receive (e.g. by threatening or hypnotising a human, or by doing something that technically counts as human input).

Comment author: AlexMennen 26 November 2014 09:09:55PM 3 points [-]

if various human beings have diverging values, there is no way for the AI to be aligned with both.

Yes, it is trivially true that an AI cannot perfectly optimize for one person's values while simultaneously perfectly optimizing for a different person's values. But, by optimizing for some combination of each person's values, there's no reason the AI can't align reasonably well with all of them unless their values are rather dramatically in conflict.

In particular, as I said elsewhere, human beings do not value anything infinitely. Any AI that does value something infinitely will not have human values, and it will be subject to Pascal's Muggings. Consequently, the most important point is to make sure that you do not give an AI any utility function at all, since if you do give it one, it will automatically diverge from human values.

Are you claiming that all utility functions are unbounded? That is not the case. (In fact, if you only consider continuous utility functions on a complete lottery space, then all utility functions are bounded. http://lesswrong.com/lw/gr6/vnm_agents_and_lotteries_involving_an_infinite/)

Comment author: nkh 26 November 2014 09:04:13PM *  0 points [-]

Seems to me an AI without goals wouldn't do anything, so I don't see it as being particularly dangerous. It would take no actions and have no reactions, which would render it perfectly safe. However, it would also render the AI perfectly useless--and it might even be nonsensical to consider such an entity "intelligent". Even if it possessed some kind of untapped intelligence, without goals that would manifest as behavior, we'd never have any way to even know it was intelligent.

The question about utility maximization is harder to answer. But I think all agents that accomplish goals can be described as utility maximizers regardless of their internal workings; if so, that (together with what I said in the last paragraph) implies that an AI that doesn't maximize utility would be useless and (for all intents and purposes) unintelligent. It would simply do nothing.

Comment author: JStewart 26 November 2014 08:58:55PM *  5 points [-]

This has been proposed before, and on LW is usually referred to as "Oracle AI". There's an entry for it on the LessWrong wiki, including some interesting links to various discussions of the idea. Eliezer has addressed it as well.

See also Tool AI, from the discussions between Holden Karnofsky and LW.

Comment author: RichardKennaway 26 November 2014 08:57:24PM 1 point [-]

No, that's your inner Nazgul.

Comment author: ChristianKl 26 November 2014 08:54:04PM 1 point [-]

In general what this community is about is having good arguments for doing what you do. As such it usually makes sense if a person who advocates some practices makes the case for the practice instead of simply posting a link.

In this case, did you follow that program? What results did you get?

Comment author: Capla 26 November 2014 08:36:28PM 0 points [-]

This may be a naive question, which has a simple answer, but I haven't seen it. Please enlighten me.

I'm not clear on why an AI should have a utility function at all.

The computer I'm typing this on doesn't. It simply has input-output behavior. When I hit certain keys it reacts in certain, very complex ways, but it doesn't decide. It optimizes, but only when I specifically tell it to do so, and only on the parameters that I give it.

We tend to think of world-shaping GAI as an agent with its own goals, which it seeks to implement. Why can't it be more like a computing machine in a box? We could feed it questions, like "given this data, will it rain tomorrow?", or "solve this protein folding problem", or "which policy will best reduce gun violence?", or even "given these specific parameters and definitions, how do we optimize for human happiness?" For complex answers like the last of those, we could then ask the AI to model the state of the world that results from following the policy. If we see that it leads to tiling the universe with smiley faces, we know that we made a mistake somewhere (that wasn't what we were trying to optimize for), and adjust the parameters. We might even train the AI over time, so that it learns how to interpret what we mean from what we say. When the AI models a state of the world that actually reflects our desires, then we implement its suggestions ourselves, or perhaps only then hit the implement button, by which the AI takes the steps to carry out its plan. We might even use such a system to check the safety of future generations of the AI. This would slow recursive self-improvement, but it seems it would be much safer.

Comment author: drethelin 26 November 2014 08:28:50PM 0 points [-]

That technology pretty much exists already; it's just extremely under-advertised for various reasons.

Comment author: JoshuaFox 26 November 2014 08:23:34PM *  2 points [-]

Anyone want to comment on a pilot episode of a podcast "Rationalists in Tech"? Please PM or email me. I'll ask for your feedback and suggestions for improvement on a 30-minute audio interview with a leading technologist from the LW community. This will allow me to plan an even better series of further interviews with senior professionals, consultants, founders, and executives in technology, mostly in software.

  • Discussion topics will include the relevance of CfAR-style techniques to the career and daily work of a tech professional; career tips aimed at LWer technologists; and the rationality-related products and services of some interviewees;

  • The goal is to show LessWrongers in the tech sector that they have a community of like-minded people. Often engineers, particularly those just starting out, have heard of the value of networking, but don't know where they can find people who they can and should connect to. Similarly, LWers who are managers or owners are always on the lookout for talent. This will highlight some examples of other LWers in the sector as an inspiration for networking.

Comment author: Capla 26 November 2014 08:17:51PM 0 points [-]

You are right, but much of the fitness game is motivation, and we are tribal organisms. Being part of a community to which one relates, that pushes you to be better, is a huge benefit.

Maybe this is a solved problem, but I think there might be at least one person here with whom it resonates, and to whom it could provide substantial value.

Comment author: Unknowns 26 November 2014 07:58:10PM -1 points [-]

Yes, this would happen if you take an unbounded function and simply map it to a bounded function without actually changing it. That is why I am suggesting admitting that you really don't have an infinite capacity for caring, and describing what you care about as though you did care infinitely is mistaken, whether you describe this with an unbounded or with a bounded function. This requires admitting that scope insensitivity, after a certain point, is not a bias, but just an objective fact that at a certain point you really don't care anymore.
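(A quick numeric sketch of the point under discussion, using made-up utilities: a monotone bounded transform of an unbounded utility function preserves the ranking of sure outcomes, but once you multiply utilities by probabilities, expected-utility rankings over lotteries can flip -- so the mapped function describes a different agent, not the same behavior.)

```python
# Sketch with hypothetical numbers: mapping an unbounded utility function
# through a monotone bounded transform does not preserve choices over
# lotteries, because expected utility weights utilities by probabilities.

def expected_utility(lottery, u):
    """lottery: list of (probability, outcome) pairs; u: utility function."""
    return sum(p * u(x) for p, x in lottery)

u_unbounded = lambda x: x            # original, unbounded utility
u_bounded = lambda x: x / (1 + x)    # monotone map of the same values into [0, 1)

sure_thing = [(1.0, 10)]             # certainty of a modest outcome
gamble = [(0.5, 0), (0.5, 1000)]     # coin flip on a huge outcome

# Under the unbounded function, the gamble wins (EU 500 vs 10)...
assert expected_utility(gamble, u_unbounded) > expected_utility(sure_thing, u_unbounded)
# ...but under the bounded transform the sure thing wins (~0.50 vs ~0.91).
assert expected_utility(gamble, u_bounded) < expected_utility(sure_thing, u_bounded)
```

So "just rescale it into a bounded range" genuinely changes what the agent does, which is why the choice between bounded and unbounded caring is substantive rather than notational.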

Comment author: blacktrance 26 November 2014 07:56:19PM 0 points [-]

What empirical claims do you consider yourself to be making about the jumble of interacting systems that is the human cognitive architecture when you say that the sole 'actual' terminal value of a human is pleasure?

That upon ideal rational deliberation and when having all the relevant information, a person will choose to pursue pleasure as a terminal value.

Comment author: maxikov 26 November 2014 07:55:57PM 0 points [-]

being spoken by "figures wearing black robes, and speaking in a dry, whispering voice, and they are actually withered beings who touched the Stone of Evil"

Isn't that what my inner Quirrellmort is supposed to be?

Comment author: blacktrance 26 November 2014 07:53:27PM 0 points [-]

I might be perfectly happy with the expenditure per utility shift.

That's exactly the problem - you'd be happy with the expenditure per shift, but every time a full cycle was made, you'd be worse off. If you start out with A and $10, pay me a dollar to switch to B, another dollar to switch to C, and a third dollar to switch to A, you'd end up with A and $7, worse off than you started, despite being satisfied with each transaction. That's the cost of inconsistency.

Comment author: solipsist 26 November 2014 07:52:24PM 1 point [-]

Yeah, I follow. I'll bring up another wrinkle (which you may already be familiar with): Suppose the objective you're maximizing never equals or exceeds 20. You can reach 19.994, 19.9999993, 19.9999999999999995, but never actually reach 20. Then even though your objective function is bounded, you will still try to optimize forever, and may resort to increasingly desperate measures to eke out another .000000000000000000000000001.
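(To make that wrinkle concrete with a toy objective of my own, not anything from the thread: bounded above does not mean the maximum is attained, so the optimizer always has a strictly better next step available.)

```python
# Hypothetical bounded objective: approaches 20 as effort n grows,
# but no finite n ever reaches it.
def objective(n):
    return 20 - 1.0 / n

assert objective(10) < objective(11) < 20   # monotone increasing, bounded above
# For every effort level there is a strictly better one, so a maximizer
# of this bounded function never has a reason to stop:
assert all(objective(n) < objective(n + 1) for n in range(1, 10**4))
```

Boundedness caps how much the agent can care, but it doesn't by itself give the agent a stopping point.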

Comment author: Dahlen 26 November 2014 07:41:02PM 0 points [-]

My third update on curcumin (previous: one, two): I haven't been taking it very regularly lately. This is not unusual for me; as my focus on self-improvement waxes and wanes, so does my likelihood of continuing a practice that either is productive in itself or causes me to be more productive.

Mostly, I've been too depressed to take my pills. My antidepressant pills.

The thing is, the antidepressant effect is not so powerful as to leave one in a basically neutral mood absolutely regardless of what's happening. It can help you withstand minor stress without being emotionally affected; and if you're the kind of person who slips into a bad mood suddenly and without any exterior cause, it will probably help. But if sufficiently unfortunate events happen to you -- say, your mother is in hospital with cancer, your father commits suicide, the love of your life rejects you and then kisses someone else right in front of you, all of your friends turn out to be fake, you have to survive on two dollars for the following week, and you find yourself having to drop out of college -- all at the same time -- then I'm sorry to say that curcumin ain't gonna do shit for you.

I'll be saving my current supply for a time when things go better for me and I can actually focus on the goals for which I chose to take curcumin pills in the first place. Or maybe I'll just take them on days when I don't expect much to be happening, to improve baseline mood.


In a similar vein, I started experimenting with melatonin. I never had any problem sleeping when tired, and only very recently I experienced a couple of episodes of insomnia, but after reading gwern's sterling recommendation of it I thought I should give it a shot.

I took it thrice so far. The first two times I took 2 pills of 1 mg melatonin each, and I was surprised to see they actually worsened my problem. I got stuck in the unpleasant state of being extremely tired (from the melatonin) and still unable to sleep. I am not sure whether this was because of melatonin or in spite of it. The third time I only took 1 mg, and slept very well for a very long time -- but then again I had amassed a very large sleep debt in the previous few days, and probably would have slept well no matter what. Needless to say, the data I've gathered so far is inconclusive. I have enough pills for the following two months or so; I'll continue taking it and see whether my response improves.
