Comment author: DittoDevolved 02 October 2016 04:13:27PM *  0 points [-]

Hi, new here.

I was wondering if I've interpreted this correctly:

'For a true Bayesian, it is impossible to seek evidence that confirms a theory. There is no possible plan you can devise, no clever strategy, no cunning device, by which you can legitimately expect your confidence in a fixed proposition to be higher (on average) than before. You can only ever seek evidence to test a theory, not to confirm it.'

Does this mean that it is impossible to prove the truth of a theory? Because the only evidence that can exist is evidence that falsifies the theory, or supports it?

For example, something people know about gravity and objects under its influence is that on Earth objects will accelerate at something like 9.81ms^-2. If we dropped a thousand different objects and observed their acceleration, and found it to be 9.81ms^-2, we would have a thousand pieces of evidence supporting the theory, and zero pieces to falsify the theory. We all believe that 9.81 is correct, and we teach that it is the truth, but we can never really know, because new evidence could someday appear that challenges the theory, correct?

Thanks

Comment author: nshepperd 03 October 2016 04:50:52AM 0 points [-]

"For a true Bayesian, it is impossible to seek evidence that confirms a theory"

The important part of the sentence here is seek. This isn't about falsificationism, but about the fact that no experiment you can do can confirm a theory without having some chance of falsifying it too. So any observation can only provide evidence for a hypothesis if a different outcome could have provided the opposite evidence.

For instance, suppose that you flip a coin. You can seek to test the theory that the result was HEADS, by simply looking at the coin with your eyes. There's a 50% chance that the outcome of this test would be "you see the HEADS side", confirming your theory (P(HEADS | you see HEADS) ~ 1). But this only works because there's also a 50% chance that the outcome of the test would have shown the result to be TAILS, falsifying your theory (P(HEADS | you see TAILS) ~ 0). And in fact there's no way to measure the coin so that one outcome would be evidence in favour of HEADS (P(HEADS | measurement) > 0.5), without the opposite result being evidence against HEADS (P(HEADS | ¬measurement) < 0.5).
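The coin example can be checked numerically. The following is a minimal sketch, not anything from the original post: the function names and the 0.99 "eyesight reliability" figure are assumptions chosen for illustration. It computes the posterior after each possible outcome of a yes/no measurement and then averages those posteriors, weighted by how likely each outcome is.

```python
def posterior(prior, p_obs_given_h, p_obs_given_not_h):
    """Bayes' rule. Returns (P(H | obs), P(obs))."""
    p_obs = prior * p_obs_given_h + (1 - prior) * p_obs_given_not_h
    return prior * p_obs_given_h / p_obs, p_obs


def expected_posterior(prior, p_pos_given_h, p_pos_given_not_h):
    """Average the posterior over both outcomes of a yes/no measurement.

    Returns (expected posterior, posterior if positive, posterior if negative).
    """
    post_pos, p_pos = posterior(prior, p_pos_given_h, p_pos_given_not_h)
    # The negative outcome has the complementary likelihoods.
    post_neg, p_neg = posterior(prior, 1 - p_pos_given_h, 1 - p_pos_given_not_h)
    return p_pos * post_pos + p_neg * post_neg, post_pos, post_neg


# Hypothetical numbers: a fair coin, and eyes that read the coin correctly
# 99% of the time.
expected, if_heads_seen, if_tails_seen = expected_posterior(0.5, 0.99, 0.01)
print(if_heads_seen, if_tails_seen, expected)
```

Seeing HEADS pushes the posterior above the prior and seeing TAILS pushes it below, but the probability-weighted average of the two posteriors comes back to the prior. Changing the likelihoods changes how far each outcome moves you, not the average.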

Comment author: Jiro 26 April 2015 05:55:39PM 0 points [-]

Variation on this:

An oracle comes up to you and tells you that you will give it a thousand dollars. This oracle has done this many times and every time it has told people this the people have given the oracle a thousand dollars. This oracle, like the other one, isn't threatening you. It just goes around finding people who will give it money. Should you give the oracle money?

Comment author: nshepperd 15 January 2016 06:41:43AM *  0 points [-]

Under UDT: pay iff you need human contact so much that you'd spend $1000 to be visited by a weird oracle who goes around posing strange decision theory dilemmas.

Comment author: jacob_cannell 23 June 2015 08:07:50PM *  12 points [-]

Thanks, I was waiting for at least one somewhat critical reply :)

Specifically, I think you fail to address the evidence for evolved modularity:

* The brain uses spatially specialized regions for different cognitive tasks.
* This specialization pattern is mostly consistent across different humans and even across different species.

The ferret rewiring experiments, the tongue based vision stuff, the visual regions learning to perform echolocation computations in the blind, this evidence together is decisive against the evolved modularity hypothesis as I've defined that hypothesis, at least for the cortex. The EMH posits that the specific cortical regions rely on complex innate circuitry specialized for specific tasks. The evidence disproves that hypothesis.

Damage to or malformation of some brain regions can cause specific forms of disability (e.g. face blindness). Sometimes the disability can be overcome but often not completely.

Sure. Once you have software loaded/learned into hardware, damage to the hardware is damage to the software. This doesn't differentiate the two hypotheses.

In various mammals, infants are capable of complex behavior straight out of the womb. Human infants only exhibit very simple behaviors and require many years to reach full cognitive maturity; therefore the human brain relies more on learning than the brains of other mammals, but the basic architecture is the same, thus this is a difference of degree, not kind.

Yes - and I described what is known about that basic architecture. The extent to which a particular brain relies on learning vs innate behaviour depends on various tradeoffs such as organism lifetime and brain size. Small brained and short-living animals have much less to gain from learning (less time to acquire data, less hardware power), so they rely more on innate circuitry, much of which is encoded in the oldbrain and the brainstem. This is all very much evidence for the ULH. The generic learning structures - the cortex and cerebellum - generally grow in size with larger organisms and longer lifespans.

This has also been tested via decortication experiments and confirms the general ULH - rabbits rely much less on their cortex for motor behavior, larger primates rely on it almost exclusively, cats and dogs are somewhere in between, etc.

This evidence shows that the cortex is general purpose, and acquires complex circuitry through learning. Recent machine learning systems provide further evidence in the form of - this is how it could work.

For all the speculation, there is still no clear evidence that the brain uses anything similar to backpropagation.

As I mentioned in the article, backprop is not really biologically plausible. Targetprop is, and there are good reasons to suspect the brain is using something like targetprop - as that theory is the latest result in a long line of work attempting to understand how the brain could be doing long range learning. Investigating and testing the targetprop theory and really confirming it could take a while - even decades. On the other hand, if targetprop or some variant is proven to work in a brain-like AGI, that is something of a working theory that could then help accelerate neuroscience confirmation.

There seems to be a trend in AI where for any technique that is currently hot there are people who say: "This is how the brain works. We don't know all the details, but studies X, Y and Z clearly point in this direction." After a few years and maybe an AI (mini)winter the brain seems to work in another way...

I did not say deep learning is "how the brain works". I said instead the brain is - roughly - a specific biological implementation of a ULH, which itself is a very general model which also will include any practical AGIs.

I said that DL helps indirectly confirm the ULH of the brain, specifically by showing how the complex task specific circuitry of the cortex could arise through a simple universal learning algorithm.

Computational modeling is key - if you can't build something, you don't understand it. To the extent that any AI model can functionally replicate specific brain circuits, it is useful to neuroscience. Period. Far more useful than psychological theorizing not grounded in circuit reality. So computational neuroscience and deep learning (which really is just the neuroscience inspired branch of machine learning) naturally have deep connections.

Some of the most successful deep learning approaches, such as modern convnets for computer vision, rely on quite un-biological features such as weight sharing and rectified linear units

Biological plausibility was one of the heavily discussed aspects of RELUs.

From the abstract:

"While logistic sigmoid neurons are more biologically plausible than hyperbolic tangent neurons, the latter work better for training multi-layer neural networks. This paper shows that rectifying neurons are an even better model of biological neurons and yield equal or better performance than hyperbolic tangent networks in spite of . . "

Weight sharing is unbiological: true. It is also an important advantage that von Neumann (time-multiplexed) systems have over biological (non-multiplexed) ones. The neuromorphic hardware approaches largely cannot handle weight sharing. Of course convnets still work without weight sharing - it just may require more data and/or better training and regularization. It is interesting to speculate how the brain deals with that, as is comparing the details of convnet learning capability vs bio-vision. I don't have time to get into that at the moment, but I did link to at least one article comparing convnets to bio vision in the OP.
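To give a rough sense of what weight sharing buys, here is a small sketch that counts weights for one layer with and without it. The layer dimensions are made up for illustration, biases are ignored, and "unshared" here means a locally connected layer where every output position gets its own private copy of the filter bank.

```python
def conv_output_size(input_size, kernel_size, stride=1):
    """Output width/height of an unpadded ('valid') convolution."""
    return (input_size - kernel_size) // stride + 1


def shared_weights(kernel_size, c_in, c_out):
    """Weights in a convolutional layer: one filter bank reused everywhere."""
    return kernel_size * kernel_size * c_in * c_out


def unshared_weights(input_size, kernel_size, c_in, c_out, stride=1):
    """Weights in a locally connected layer: a separate filter bank per position."""
    out = conv_output_size(input_size, kernel_size, stride)
    return out * out * shared_weights(kernel_size, c_in, c_out)


# Hypothetical layer: 32x32 RGB input, sixteen 5x5 filters, stride 1.
shared = shared_weights(5, 3, 16)
unshared = unshared_weights(32, 5, 3, 16)
print(shared, unshared, unshared // shared)
```

For this toy layer the unshared version needs one filter bank per output position, so the weight count grows by a factor equal to the number of output positions (28 x 28 here). That is the sense in which dropping weight sharing plausibly demands more data or stronger regularization.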

"Deep learning" is a quite vague term anyway,

Sure - so just taboo it then. When I use the term "deep learning", it means something like "the branch of machine learning which is more related to neuroscience" (while still focused on end results rather than emulation).

Perhaps most importantly, deep learning methods generally work in supervised learning settings and they have quite weak priors: they require a dataset as big as ImageNet to yield good image recognition performances

Comparing two learning systems trained on completely different datasets with very different objective functions is complicated.

In general though, CNNs are a good model of fast feedforward vision - the first 150ms of the ventral stream. In that domain they are comparable to biovision, with the important caveat that biovision computes a larger and richer output parameter map than almost any CNN. Most CNNs (there are many different types) are more narrowly focused, but also probably learn faster because of advantages like weight sharing. The amount of data required to train the CNN up to superhuman performance on narrow tasks is comparable to or less than that required to train a human visual system up to high performance. (but again the cortex is doing something more like transfer learning, which is harder)

Past 150 ms or so and humans start making multiple saccades and also start to integrate information from a larger number of brain regions, including frontal and temporal cortical regions. At that point the two systems aren't even comparable, humans are using more complex 'mental programs' over multiple saccades to make visual judgements.

Of course, eventually we will have AGI systems that also integrate those capabilities.

days of continuous simulated gameplay on the ATARI 2600 emulator to obtain good scores

That's actually extremely impressive - superhuman learning speed.

Therefore I would say that deep learning methods, while certainly interesting from an engineering perspective, are probably not very much relevant to the understanding of the brain, at least given the current state of the evidence.

In that case, I would say you may want to read up more on the field. If you haven't yet, check out the original sparse coding paper (over 3000 citations), to get an idea of how crucial new computational models have been for advancing our understanding of cortex.

Comment author: nshepperd 07 September 2015 04:32:22AM 0 points [-]

The ferret rewiring experiments, the tongue based vision stuff, the visual regions learning to perform echolocation computations in the blind, this evidence together is decisive against the evolved modularity hypothesis as I've defined that hypothesis, at least for the cortex. The EMH posits that the specific cortical regions rely on complex innate circuitry specialized for specific tasks. The evidence disproves that hypothesis.

It seems a little strange to treat this as a triumphant victory for the ULH. At the most, you've shown that the "fundamentalist" evolved modularity hypothesis is false. You didn't really address how the ULH explains this same evidence.

And there are other mysteries in this model, such as the apparent universality of specific cognitive heuristics and biases, or of various behaviours like altruism, deception, sexuality that seem obviously evolved. And, as V_V mentioned, the lateral asymmetry of the brain's functionality vs the macroscopic symmetry.

Otherwise, the conclusion I would draw from this is that both theories are wrong, or that some halfway combination of them is true (say, "universal" plasticity plus a genetic set of strong priors somehow encoded in the structure).

Comment author: nshepperd 29 August 2015 05:01:19AM 12 points [-]

I donated $10. I can't afford not to.

Comment author: DanielLC 21 April 2015 09:24:48PM 7 points [-]

I've come up with an interesting thought experiment I call oracle mugging.

An oracle comes up to you and tells you that either you will give them a thousand dollars or you will die in the next week. They refuse to tell you which. They have done this many times, and everyone has either given them money or died. The oracle isn't threatening you. They just go around and find people who will either give them money or die in the near future, and tell them that.

Should you pay the oracle? Why or why not?

Comment author: nshepperd 22 July 2015 04:00:14AM *  0 points [-]

This is essentially just another version of the smoking lesion problem, in that there is no connection, causal or otherwise, between the thing you care about and the action you take. Your decision theory has no specific effect on your likelihood of dying, that being determined entirely by environmental factors that do not even attempt to predict you. All you are paying for is to determine whether or not you get a visit from the oracle.

ETA: Here's a UDT game tree (see here for an explanation of the format) of this problem, under the assumption that the oracle visits everyone meeting his criteria, and uses exclusive-or:

ETA2: More explanation: the colours are states of knowledge. Blue = oracle asks for money, Orange = they leave you alone. Let's say the odds of being healthy are α. If you Pay the expected reward is α(-1000) + (1-α) DEATH; if you Don't Pay the expected reward is α 0 + (1-α) DEATH. Clearly (under UDT) paying is worse by a term of -1000α.
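The comparison above can be written out as a short calculation. This is only a sketch of the two expected-reward expressions given in the comment; the numeric value assigned to DEATH and the choice of α are placeholders, since the original leaves both symbolic.

```python
def expected_reward(alpha, pay, death_utility, cost=1000.0):
    """Expected reward of a policy (pay / don't pay) in the oracle mugging.

    alpha         -- probability of being healthy (not about to die)
    pay           -- True for the policy "pay the oracle when asked"
    death_utility -- utility assigned to dying (placeholder value)
    """
    # If healthy, the paying policy loses the money; if doomed, you die either
    # way, because the policy has no influence on whether you are healthy.
    healthy_outcome = -cost if pay else 0.0
    return alpha * healthy_outcome + (1 - alpha) * death_utility


alpha = 0.9            # assumed odds of being healthy
death = -1_000_000.0   # assumed (arbitrary) disutility of death

difference = expected_reward(alpha, True, death) - expected_reward(alpha, False, death)
print(difference)  # approximately -1000 * alpha
```

The DEATH term appears identically in both policies and cancels in the difference, which is why its exact value does not matter to the conclusion: paying is worse by 1000α whatever utility you assign to dying.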

Comment author: TheAncientGeek 18 May 2015 09:17:46AM *  1 point [-]

Loosemore, Yudkowsky, and myself are all discussing AIs that have a goal misaligned with human values that they nevertheless find motivating.

If that is supposed to be a universal or generic AI, it is a valid criticism to point out that not all AIs are like that.

If that is supposed to be a particular kind of AI, it is a valid criticism to point out that no realistic AIs are like that.

You seem to feel you are not being understood, but what is being said is not clear.

1 Whether or not "superintelligent" is a meaningful term in this context

"Superintelligence" is one of the clearer terms here, IMO. It just means more than human intelligence, and humans can notice contradictions.

This comment seems to be part of a concern about "wisdom", assumed to be some extraneous thing an AI would not necessarily have. (No one but Vaniver has brought in wisdom.) The counterargument is that compartmentalisation between goals and instrumental knowledge is an extraneous thing an AI would not necessarily have, and that its absence is all that is needed for contradictions to be noticed and acted on.

2 Whether we should expect generic AI designs to recognize misalignments, or whether such a realization would impact the goal the AI pursues.

It's an assumption, that needs justification, that any given AI will have goals of a non trivial sort. "Goal" is a term that needs tabooing.

Neither Yudkowsky nor I think either of those are reasonable to expect--as a motivating example, we are happy to subvert the goals that we infer evolution was directing us towards in order to better satisfy "our" goals.

While we are anthropomorphising, it might be worth pointing out that humans don't show behaviour patterns of relentlessly pursuing arbitrary goals.

I suspect that Loosemore thinks that viable designs would recognize it, but agrees that in general that recognition does not have to lead to an alignment

Loosemore has put forward a simple suggestion, which MIRI appears not to have considered at all, that on encountering a contradiction, an AI could lapse into a safety mode, if so designed.

3 ...sees cleverness and wisdom as closely tied together

You are paraphrasing Loosemore to sound less technical and more handwaving than his actual comments. The ability to sustain contradictions in a system that is constantly updating itself isn't a given... it requires an architectural choice in favour of compartmentalisation.

Comment author: nshepperd 18 May 2015 09:45:52AM *  3 points [-]

All this talk of contradictions is sort of rubbing me the wrong way here. There's no "contradiction" in an AI having goals that are different to human goals. Logically, this situation is perfectly normal. Loosemore talks about an AI seeing its goals are "massively in contradiction to everything it knows about <BLAH>", but... where's the contradiction? What's logically wrong with getting strawberries off a plant by burning them?

I don't see the need for any kind of special compartmentalisation; information about "normal use of strawberries" is already inert facts with no caring attached by default.

If you're going to program in special criteria that would create caring about this information, okay, but how would such criteria work? How do you stop it from deciding that immortality is contradictory to "everything it knows about death" and refusing to help us solve aging?

Comment author: Richard_Loosemore 14 May 2015 03:20:26PM 1 point [-]

Not really.

The problem is that nshepperd is talking as if the term "intention" has some exact, agreed-upon technical definition.

It does not. (It might have in nshepperd's mind, but not elsewhere)

I have made it clear that the term is being used, by me, in a non-technical sense, whose meaning is clear from context.

So, declaring that the AI "does not have good intentions" is neither here nor there. It makes no difference if you want to describe the AI that way or not.

Comment author: nshepperd 15 May 2015 04:10:55AM *  2 points [-]

That would be fine if you and everyone else who tries to argue on this side of the debate did not then proceed to conclude, from the statement that the AI has "good intentions", that it is making some sort of "error" when it fails to act on our cries that "doing X isn't good!" or "doing X isn't what we meant!".

Saying an AI has "good intentions" strongly implies that it cares about what is good, which is, y'know, completely false for a pleasure maximiser. (No-one is claiming that FAI will do evil things just because it's clever, but a pleasure maximiser is not FAI.)

You can't use words any way you like.

Comment author: Richard_Loosemore 12 May 2015 01:50:31PM 0 points [-]

This comment is both rude and incoherent (at the same level of incoherence as your other comments). And it is also pedantic (concentrating as it does on meanings of words, as if those words were being used in violation of some rules that ... you just made up).

Sorry to say this but I have to choose how to spend my time, in responding to comments, and this does not even come close to meriting the use of my time. I did that before, in response to your other comments, and it made no impact.

Comment author: nshepperd 12 May 2015 03:38:05PM 4 points [-]

Equivocation is hardly something I just made up.

Here's an exercise to try. Next time you go to write something on FAI, taboo the words "good", "benevolent", "friendly", "wrong" and all of their synonyms. Replace the symbol with the substance. Then see if your arguments still make sense.

Comment author: Richard_Loosemore 05 May 2015 12:59:04PM *  3 points [-]

You ask

What is your probability estimate that an AI would be a psychopath

and you give me a helpful hint:

(Hint: All computer systems produced until today are psychopaths by this definition.)

Well, first please note that ALL artifacts at the present time, including computer systems, cans of beans, and screwdrivers, are psychopaths because none of them are DESIGNED to possess empathy. So your hint contains zero information. :-)

What is the probability that an AI would be a psychopath if someone took the elementary step of designing it to have empathy? Probability would be close to 0, assuming the designers knew what empathy was, and knew how to design it.

But your question was probably meant to target the situation where someone built an AI and did not bother to give it empathy. I am afraid that that is outside the context we are examining here, because all of the scenarios talk about some kind of inevitable slide toward psychopathic behavior, even under the assumption that someone does their best to give the AI an empathic motivation.

But I will answer this: if someone did not even try to give it empathy, that would be like designing a bridge and not even trying to use materials that could hold up a person's weight. In both cases the hypothetical is not interesting, since designing failure into a system is something any old fool could do.

Your second remark is a classic mistake that everyone makes in the context of this kind of discussion. You mention that the phrase "benevolence toward humanity" means "benevolence" as defined by the computer code.

That is incorrect. Let's try, now, to be really clear about that, because if you don't get why it is incorrect we might waste a lot of time running around in circles. It is incorrect for two reasons. First, because I was consciously using the word to refer to the normal human usage, not the implementation inside the AI. Second, it is incorrect because the entire issue in the paper is that there is a discrepancy between the implementation inside the AI and normal usage, and that discrepancy is then examined in the rest of the paper. By simply asserting that the AI may believe, "correctly", that benevolence is the same as violence toward people, you are pre-empting the discussion.

In the remarks you make after that, you are reciting the standard line contained in all the scenarios that the paper is addressing. That standard line is analyzed in the rest of the paper, and a careful explanation is given for why it is incoherent. So when you simply repeat the standard line, you are speaking as if the paper did not actually exist.

I can address questions that refer to the arguments in the paper, but I cannot say anything if you only recite the standard line that is demolished in the course of the paper's argument. So if you could say something about the argument itself...

Comment author: nshepperd 12 May 2015 08:04:01AM *  2 points [-]

This is an absolutely blatant instance of equivocation.

Here's the sentence from the post:

[believes that benevolence toward humanity might involve forcing human beings to do something violently against their will.]

Assume that "benevolence" in that sentence refers to "benevolence as defined by the AI's code". Okay, then justification of that sentence is straightforward: The fact that the AI does things against the human's wishes provides evidence that the AI believes benevolence-as-defined-by-code to involve that.

Alternatively, assume that "benevolence" there refers to, y'know, actual human benevolence. Then how do you justify that claim? Observed actions are clearly insufficient, because actual human benevolence is not programmed into its code, benevolence-as-defined-by-code is. What makes you think the AI has any opinions about actual human benevolence at all?

You can't have both interpretations.

(As an aside, I do disapprove of Muehlhauser's use of "benevolence" to refer to mere happiness maximisation. "Apparently benevolent motivations" would be a better phrase. If you're going to use it to mean actual human benevolence then you can certainly complain that the FAQ appears to assert that a happiness maximiser can be "benevolent", even though it's clearly not.)

Comment author: Richard_Loosemore 11 May 2015 11:35:10PM 1 point [-]

This seems a little pedantic, so I thought about not replying (my usual policy), but I guess I will.

The paper is all about the precise nature of the distinction between

wants to make us happy

and

actually doing what is good

Most commenters got that straight away. The paper examines a particular issue within that contrast, but even so, that is clearly the topic of the paper. You, on the other hand, seem very, very keen to tell me that those two things are, arguably, different. Thank you, but since that is what the paper is about, you can safely assume that I got that.

Without exception, everyone so far who has read the paper understood that in the sentence that I wrote, which you quote:

It has good intentions (it wants to make us happy) but the programming to implement that laudable goal has had unexpected ramifications, and as a result the Nanny AI has decided to force all human beings to have their brains connected to a dopamine drip.

.... the phrase "good intentions" was being used as a colloquial paraphrase for the parenthetical clarification "it wants to make us happy". My intention (no pun intended) was clearly NOT to use the phrase "good intentions" in any technical sense, but to give a normal-usage summary of an idea. The two phrasings are supposed to say the same thing, and that thing is what you summarize with the words:

wants to make us happy

By contrast, the other part of my sentence, where I say

.... but the programming to implement that laudable goal has had unexpected ramifications, and as a result the Nanny AI has decided to force all human beings to have their brains connected to a dopamine drip.

.... was universally understood to refer to the other side of the distinction that is at the heart of the paper, namely (in your words):

actually doing what is good

I can't help but notice that TODAY there is a new article on the Future of Life Institute website written by Nathan Collins, whose title is:

Artificial Intelligence: The Danger of Good Intentions

with the subtitle:

Why well-intentioned AI could pose a greater threat to humanity than malevolent cyborgs

So my question to you is: why are you so smart in the absolute precision of your word usage, but everyone else is so "careless"?

Comment author: nshepperd 12 May 2015 03:19:46AM 1 point [-]

Well, I do take issue with even people at FLI describing UFAI as having "good intentions". It disguises a challengeable inductive inference. It certainly sounds less absurd to claim that an AI with a pleasure maximisation goal is likely to connect brains to dopamine drips, than that one with "good" intentions would do so. Even if you then assert that you were only using "good" in a colloquial sense, and you actually meant "bad" all along.
