Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

Comment author: Alexandros 27 November 2016 10:40:52AM *  65 points [-]

Hi Anna,

Please consider a few gremlins that are weighing down LW currently:

  1. Eliezer's ghost -- He set the culture of the place, his posts are central material, he has punctuated its existence with his explosions (and refusal to apologise), and then upped and left the community without actually acknowledging that his experiment (well kept gardens etc) has failed. As far as I know he is still the "owner" of this website, retains ultimate veto on a bunch of stuff, etc. If that has changed, there is no clarity on who the owner is (I see three logos on the top banner, is it them?), who the moderators are, or who is working on it in general. I know tricycle are helping with development, but a part-time team is only marginally better than no-team, and at least no-team is an invitation for a team to step up.

  2. the no politics rule (related to #1) -- We claim to have some of the sharpest thinkers in the world, but for some reason shun discussing politics. Too difficult, we're told. A mindkiller! This cost us Yvain/Scott who cited it as one of his reasons for starting slatestarcodex, which now dwarfs LW. Oddly enough I recently saw it linked from the front page of realclearpolitics.com, which means that not only has discussing politics not harmed SSC, it may actually be drawing in people who care about genuine insights in this extremely complex space that is of very high interest.

  3. the "original content"/central hub approach (related to #1) -- This should have been an aggregator since day 1. Instead it was built as a "community blog". In other words, people had to host their stuff here or not have it discussed here at all. This cost us Robin Hanson on day 1, which should have been a pretty big warning sign.

  4. The codebase -- this website carries tons of complexity related to the reddit codebase. Weird rules about responding to downvoted comments have been implemented in there, and nobody can make heads or tails of it. Use something modern, and make it easy to contribute to (telescope seems decent these days).

  5. Brand rust. Lesswrong is now kinda like myspace or yahoo. It used to be cool, but once a brand takes a turn for the worse, it's really hard to turn around. People have painful associations with it (basilisk!). It needs burning of ships, clear focus on the future, and as much support as possible from as many interested parties, but only to the extent that they don't dilute the focus.

In the spirit of the above, I consider Alexei's hints that Arbital is "working on something" to be a really bad idea, though I recognise the good intention. Efforts like this need critical mass and clarity, and diffusing yet another wave of people wanting to do something about LW with vague promises of something nice in the future (that still suffers from problem #1 AFAICT) is exactly what I would do if I wanted to maintain the status quo for a few more years.

Any serious attempt at revitalising lesswrong.com should focus on defining ownership and plan clearly. A post by EY himself recognising that his vision for lw 1.0 failed and passing the baton to a generally-accepted BDFL would be nice, but I'm not holding my breath. Further, I am fairly certain that LW as a community blog is bound to fail. Strong writers enjoy their independence. LW as an aggregator-first (with perhaps ability to host content if people wish to, like hn) is fine. HN may have degraded over time, but much less so than LW, and we should be able to improve on their pattern.

I think if you want to unify the community, what needs to be done is the creation of a hn-style aggregator, with a clear, accepted, willing, opinionated, involved BDFL, input from the prominent writers in the community (scott, robin, eliezer, nick bostrom, others), and for the current lesswrong.com to be archived in favour of that new aggregator. But even if it's something else, it will not succeed without the three basic ingredients: clear ownership, dedicated leadership, and as broad support as possible to a simple, well-articulated vision. Lesswrong tried to be too many things with too little in the way of backing.

Comment author: nshepperd 27 November 2016 07:06:01PM 14 points [-]

I think you're right that wherever we go next needs to be a clear Schelling point. But I disagree on some details.

  1. I do think it's important to have someone clearly "running the place". A BDFL, if you like.

  2. Please no. The comments on SSC are for me a case study in exactly why we don't want to discuss politics.

  3. Something like reddit/hn involving humans posting links seems ok. Such a thing would still be subject to moderation. "Auto-aggregation" would be bad however.

  4. Sure. But if you want to replace the karma system, be sure to replace it with something better, not worse. SatvikBeri's suggestions below seem reasonable. The focus should be on maintaining high standards and certainly not encouraging growth in new users at any cost.

  5. I don't believe that the basilisk is the primary reason for LW's brand rust. As I see it, we squandered our "capital outlay" of readers interested in actually learning rationality (which we obtained due to the site initially being nothing but the sequences) by doing essentially nothing about a large influx of new users interested only in "debating philosophy" who do not even read the sequences (Eternal November). I, personally, stopped commenting almost completely quite a while ago, because doing so is no longer rewarding.

Comment author: DittoDevolved 02 October 2016 04:13:27PM *  0 points [-]

Hi, new here.

I was wondering if I've interpreted this correctly:

'For a true Bayesian, it is impossible to seek evidence that confirms a theory. There is no possible plan you can devise, no clever strategy, no cunning device, by which you can legitimately expect your confidence in a fixed proposition to be higher (on average) than before. You can only ever seek evidence to test a theory, not to confirm it.'

Does this mean that it is impossible to prove the truth of a theory? Because the only evidence that can exist is evidence that falsifies the theory, or supports it?

For example, something people know about gravity and objects under its influence is that on Earth objects will accelerate at something like 9.81ms^-2. If we dropped a thousand different objects and observed their acceleration, and found it to be 9.81ms^-2, we would have a thousand pieces of evidence supporting the theory, and zero pieces to falsify the theory. We all believe that 9.81 is correct, and we teach that it is the truth, but we can never really know, because new evidence could someday appear that challenges the theory, correct?


Comment author: nshepperd 03 October 2016 04:50:52AM 0 points [-]

"For a true Bayesian, it is impossible to seek evidence that confirms a theory"

The important part of the sentence here is seek. This isn't about falsificationism, but about the fact that no experiment you can do can confirm a theory without having some chance of falsifying it too. So any observation can only provide evidence for a hypothesis if a different outcome could have provided the opposite evidence.

For instance, suppose that you flip a coin. You can seek to test the theory that the result was HEADS, by simply looking at the coin with your eyes. There's a 50% chance that the outcome of this test would be "you see the HEADS side", confirming your theory (P(HEADS | you see HEADS) ~ 1). But this only works because there's also a 50% chance that the outcome of the test would have shown the result to be TAILS, falsifying your theory (P(HEADS | you see TAILS) ~ 0). And in fact there's no way to measure the coin so that one outcome would be evidence in favour of HEADS (P(HEADS | measurement) > 0.5), without the opposite result being evidence against HEADS (P(HEADS | ¬measurement) < 0.5).
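The coin example can be checked numerically. Below is a minimal sketch, assuming a hypothetical noisy measurement described by two made-up rates (`p_pos_given_h` and `p_pos_given_not_h`, neither of which appears in the comment above). It illustrates that the probability-weighted average of the two possible posteriors equals the prior, so if one outcome pushes P(HEADS) above the prior, the other must push it below.

```python
def posteriors(prior, p_pos_given_h, p_pos_given_not_h):
    """Return (P(pos), P(H | pos), P(H | neg)) via Bayes' rule.

    Assumes 0 < P(pos) < 1 so both outcomes are possible.
    """
    p_pos = prior * p_pos_given_h + (1 - prior) * p_pos_given_not_h
    p_neg = 1 - p_pos
    post_pos = prior * p_pos_given_h / p_pos
    post_neg = prior * (1 - p_pos_given_h) / p_neg
    return p_pos, post_pos, post_neg


def expected_posterior(prior, p_pos_given_h, p_pos_given_not_h):
    """Average the posterior over both outcomes, weighted by their probability."""
    p_pos, post_pos, post_neg = posteriors(prior, p_pos_given_h, p_pos_given_not_h)
    return p_pos * post_pos + (1 - p_pos) * post_neg


if __name__ == "__main__":
    # Illustrative numbers only: a fair coin and an imperfect look at it.
    prior = 0.5
    p_pos, post_pos, post_neg = posteriors(prior, 0.8, 0.3)
    print("P(H | pos) =", post_pos)
    print("P(H | neg) =", post_neg)
    print("expected posterior =", expected_posterior(prior, 0.8, 0.3))
```

Whatever rates are plugged in, the expected posterior comes back to the prior (up to floating-point rounding), which is the sense in which you cannot plan to end up more confident on average.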

Comment author: Jiro 26 April 2015 05:55:39PM 0 points [-]

Variation on this:

An oracle comes up to you and tells you that you will give it a thousand dollars. This oracle has done this many times and every time it has told people this the people have given the oracle a thousand dollars. This oracle, like the other one, isn't threatening you. It just goes around finding people who will give it money. Should you give the oracle money?

Comment author: nshepperd 15 January 2016 06:41:43AM *  0 points [-]

Under UDT: pay iff you need human contact so much that you'd spend $1000 to be visited by a weird oracle who goes around posing strange decision theory dilemmas.

Comment author: jacob_cannell 23 June 2015 08:07:50PM *  12 points [-]

Thanks, I was waiting for at least one somewhat critical reply :)

Specifically, I think you fail to address the evidence for evolved modularity:

  * The brain uses spatially specialized regions for different cognitive tasks.

  * This specialization pattern is mostly consistent across different humans and even across different species.

The ferret rewiring experiments, the tongue based vision stuff, the visual regions learning to perform echolocation computations in the blind, this evidence together is decisive against the evolved modularity hypothesis as I've defined that hypothesis, at least for the cortex. The EMH posits that the specific cortical regions rely on complex innate circuitry specialized for specific tasks. The evidence disproves that hypothesis.

Damage to or malformation of some brain regions can cause specific forms of disability (e.g. face blindness). Sometimes the disability can be overcome but often not completely.

Sure. Once you have software loaded/learned into hardware, damage to the hardware is damage to the software. This doesn't differentiate the two hypotheses.

In various mammals, infants are capable of complex behavior straight out of the womb. Human infants only exhibit very simple behaviors and require many years to reach full cognitive maturity; therefore the human brain relies more on learning than the brain of other mammals, but the basic architecture is the same, thus this is a difference of degree, not kind.

Yes - and I described what is known about that basic architecture. The extent to which a particular brain relies on learning vs innate behaviour depends on various tradeoffs such as organism lifetime and brain size. Small brained and short-living animals have much less to gain from learning (less time to acquire data, less hardware power), so they rely more on innate circuitry, much of which is encoded in the oldbrain and the brainstem. This is all very much evidence for the ULH. The generic learning structures - the cortex and cerebellum - generally grow in size with larger organisms and longer lifespans.

This has also been tested via decortication experiments and confirms the general ULH - rabbits rely much less on their cortex for motor behavior, larger primates rely on it almost exclusively, cats and dogs are somewhere in between, etc.

This evidence shows that the cortex is general purpose, and acquires complex circuitry through learning. Recent machine learning systems provide further evidence in the form of - this is how it could work.

For all the speculation, there is still no clear evidence that the brain uses anything similar to backpropagation.

As I mentioned in the article, backprop is not really biologically plausible. Targetprop is, and there are good reasons to suspect the brain is using something like targetprop - as that theory is the latest result in a long line of work attempting to understand how the brain could be doing long range learning. Investigating and testing the targetprop theory and really confirming it could take a while - even decades. On the other hand, if targetprop or some variant is proven to work in a brain-like AGI, that is something of a working theory that could then help accelerate neuroscience confirmation.

There seems to be a trend in AI where for any technique that is currently hot there are people who say: "This is how the brain works. We don't know all the details, but studies X, Y and Z clearly point in this direction." After a few years and maybe an AI (mini)winter the brain seems to work in another way...

I did not say deep learning is "how the brain works". I said instead the brain is - roughly - a specific biological implementation of a ULH, which itself is a very general model which also will include any practical AGIs.

I said that DL helps indirectly confirm the ULH of the brain, specifically by showing how the complex task specific circuitry of the cortex could arise through a simple universal learning algorithm.

Computational modeling is key - if you can't build something, you don't understand it. To the extent that any AI model can functionally replicate specific brain circuits, it is useful to neuroscience. Period. Far more useful than psychological theorizing not grounded in circuit reality. So computational neuroscience and deep learning (which really is just the neuroscience inspired branch of machine learning) naturally have deep connections.

Some of the most successful deep learning approaches, such as modern convnets for computer vision, rely on quite un-biological features such as weight sharing and rectified linear units

Biological plausibility was one of the heavily discussed aspects of ReLUs.

From the abstract:

"While logistic sigmoid neurons are more biologically plausible than hyperbolic tangent neurons, the latter work better for training multi-layer neural networks. This paper shows that rectifying neurons are an even better model of biological neurons and yield equal or better performance than hyperbolic tangent networks in spite of . . "
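For readers who have not seen the three unit types side by side, here is a small sketch of the activation functions the abstract compares. It only uses the standard textbook definitions; the sample inputs are arbitrary, and nothing here is taken from the paper's own code. The feature usually pointed to is that the rectifier is one-sided and outputs exact zeros for negative input, whereas tanh is antisymmetric around zero and the sigmoid sits at 0.5 for zero input.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: squashes input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Hyperbolic tangent: squashes input into (-1, 1), antisymmetric."""
    return math.tanh(x)


def relu(x):
    """Rectifier: zero for negative input, identity for positive input."""
    return max(0.0, x)


if __name__ == "__main__":
    # Arbitrary sample inputs, chosen only to show both sides of zero.
    for x in [-2.0, -0.5, 0.0, 0.5, 2.0]:
        print(f"x={x:+.1f}  sigmoid={sigmoid(x):.3f}  "
              f"tanh={tanh(x):+.3f}  relu={relu(x):.3f}")
```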

Weight sharing is unbiological: true. It is also an important advantage that von Neumann (time-multiplexed) systems have over biological (non-multiplexed) ones. The neuromorphic hardware approaches largely cannot handle weight sharing. Of course convnets still work without weight sharing - it just may require more data and/or better training and regularization. It is interesting to speculate how the brain deals with that, as is comparing the details of convnet learning capability vs bio-vision. I don't have time to get into that at the moment, but I did link to at least one article comparing convnets to bio vision in the OP.
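To give a rough sense of what weight sharing buys, here is a sketch comparing the number of weights in a convolutional layer (one filter bank reused at every position) with a locally connected layer (a separate filter bank per output position). The layer sizes are arbitrary assumptions picked for illustration, biases are ignored, and the function names are hypothetical.

```python
def conv_params(kernel, c_in, c_out):
    """Weights in a conv layer: one kernel x kernel filter per (c_in, c_out) pair,
    shared across all spatial positions."""
    return kernel * kernel * c_in * c_out


def locally_connected_params(kernel, c_in, c_out, out_h, out_w):
    """Weights when every output position gets its own unshared filter bank."""
    return out_h * out_w * conv_params(kernel, c_in, c_out)


if __name__ == "__main__":
    # Illustrative layer: 3x3 kernels, 3 input channels, 64 output channels,
    # producing a 32x32 output map.
    shared = conv_params(3, 3, 64)
    unshared = locally_connected_params(3, 3, 64, 32, 32)
    print("shared weights:  ", shared)
    print("unshared weights:", unshared)
    print("ratio:           ", unshared // shared)
```

The ratio is simply the number of output positions, which is one way to see why a system without weight sharing would plausibly need more data or stronger regularization to fit the same function.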

"Deep learning" is a quite vague term anyway,

Sure - so just taboo it then. When I use the term "deep learning", it means something like "the branch of machine learning which is more related to neuroscience" (while still focused on end results rather than emulation).

Perhaps most importantly, deep learning methods generally work in supervised learning settings and they have quite weak priors: they require a dataset as big as ImageNet to yield good image recognition performances

Comparing two learning systems trained on completely different datasets with very different objective functions is complicated.

In general though, CNNs are a good model of fast feedforward vision - the first 150ms of the ventral stream. In that domain they are comparable to biovision, with the important caveat that biovision computes a larger and richer output parameter map than most CNNs. Most CNNs (there are many different types) are more narrowly focused, but also probably learn faster because of advantages like weight sharing. The amount of data required to train the CNN up to superhuman performance on narrow tasks is comparable to or less than that required to train a human visual system up to high performance (but again the cortex is doing something more like transfer learning, which is harder).

Past 150 ms or so and humans start making multiple saccades and also start to integrate information from a larger number of brain regions, including frontal and temporal cortical regions. At that point the two systems aren't even comparable, humans are using more complex 'mental programs' over multiple saccades to make visual judgements.

Of course, eventually we will have AGI systems that also integrate those capabilities.

days of continuous simulated gameplay on the ATARI 2600 emulator to obtain good scores

That's actually extremely impressive - superhuman learning speed.

Therefore I would say that deep learning methods, while certainly interesting from an engineering perspective, are probably not very much relevant to the understanding of the brain, at least given the current state of the evidence.

In that case, I would say you may want to read up more on the field. If you haven't yet, check out the original sparse coding paper (over 3000 citations), to get an idea of how crucial new computational models have been for advancing our understanding of cortex.

Comment author: nshepperd 07 September 2015 04:32:22AM 0 points [-]

The ferret rewiring experiments, the tongue based vision stuff, the visual regions learning to perform echolocation computations in the blind, this evidence together is decisive against the evolved modularity hypothesis as I've defined that hypothesis, at least for the cortex. The EMH posits that the specific cortical regions rely on complex innate circuitry specialized for specific tasks. The evidence disproves that hypothesis.

It seems a little strange to treat this as a triumphant victory for the ULH. At the most, you've shown that the "fundamentalist" evolved modularity hypothesis is false. You didn't really address how the ULH explains this same evidence.

And there are other mysteries in this model, such as the apparent universality of specific cognitive heuristics and biases, or of various behaviours like altruism, deception, sexuality that seem obviously evolved. And, as V_V mentioned, the lateral asymmetry of the brain's functionality vs the macroscopic symmetry.

Otherwise, the conclusion I would draw from this is that both theories are wrong, or that some halfway combination of them is true (say, "universal" plasticity plus a genetic set of strong priors somehow encoded in the structure).

Comment author: nshepperd 29 August 2015 05:01:19AM 12 points [-]

I donated $10. I can't afford not to.

Comment author: DanielLC 21 April 2015 09:24:48PM 7 points [-]

I've come up with an interesting thought experiment I call oracle mugging.

An oracle comes up to you and tells you that either you will give them a thousand dollars or you will die in the next week. They refuse to tell you which. They have done this many times, and everyone has either given them money or died. The oracle isn't threatening you. They just go around and find people who will either give them money or die in the near future, and tell them that.

Should you pay the oracle? Why or why not?

Comment author: nshepperd 22 July 2015 04:00:14AM *  0 points [-]

This is essentially just another version of the smoking lesion problem, in that there is no connection, causal or otherwise, between the thing you care about and the action you take. Your decision theory has no specific effect on your likelihood of dying, that being determined entirely by environmental factors that do not even attempt to predict you. All you are paying for is to determine whether or not you get a visit from the oracle.

ETA: Here's a UDT game tree (see here for an explanation of the format) of this problem, under the assumption that oracle visits everyone meeting his criteria, and uses exclusive-or:

ETA2: More explanation: the colours are states of knowledge. Blue = oracle asks for money, Orange = they leave you alone. Let's say the odds of being healthy are α. If you Pay the expected reward is α(-1000) + (1-α) DEATH; if you Don't Pay the expected reward is α 0 + (1-α) DEATH. Clearly (under UDT) paying is worse by a term of -1000α.
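The arithmetic in ETA2 can be sketched as follows. This assumes the reading stated above (the oracle visits everyone meeting the criteria and uses exclusive-or), so a policy of paying only costs money in the worlds where you are healthy. The function name, the placeholder utility for DEATH, and the sample value of α are illustrative assumptions, not part of the original game tree.

```python
def expected_reward(policy_pays, alpha, death_utility, payment=1000.0):
    """Expected reward of a policy, evaluated before knowing whether you are healthy.

    alpha         -- probability of being healthy (not dying this week)
    death_utility -- placeholder utility assigned to DEATH (assumed, any number)
    """
    healthy_payoff = -payment if policy_pays else 0.0
    return alpha * healthy_payoff + (1 - alpha) * death_utility


if __name__ == "__main__":
    alpha = 0.99          # assumed odds of being healthy
    death = -1_000_000.0  # arbitrary stand-in for DEATH

    pay = expected_reward(True, alpha, death)
    dont_pay = expected_reward(False, alpha, death)

    # The DEATH term is identical in both policies and cancels out.
    print("Pay:       ", pay)
    print("Don't Pay: ", dont_pay)
    print("difference:", pay - dont_pay)
```

The difference comes out at -1000α whatever value is plugged in for DEATH, because the policy has no influence on the (1-α) branch.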

Comment author: TheAncientGeek 18 May 2015 09:17:46AM *  1 point [-]

Loosemore, Yudkowsky, and myself are all discussing AIs that have a goal misaligned with human values that they nevertheless find motivating.

If that is supposed to be a universal or generic AI, it is a valid criticism to point out that not all AIs are like that.

If that is supposed to be a particular kind of AI, it is a valid criticism to point out that no realistic AIs are like that.

You seem to feel you are not being understood, but what is being said is not clear.

1 Whether or not "superintelligent" is a meaningful term in this context

"Superintelligence" is one of the clearer terms here, IMO. It just means more than human intelligence, and humans can notice contradictions.

This comment seems to be part of a concern about "wisdom", assumed to be some extraneous thing an AI would not necessarily have. (No one but Vaniver has brought in wisdom.) The counterargument is that compartmentalisation between goals and instrumental knowledge is an extraneous thing an AI would not necessarily have, and that its absence is all that is needed for a contradiction to be noticed and acted on.

2 Whether we should expect generic AI designs to recognize misalignments, or whether such a realization would impact the goal the AI pursues.

It's an assumption, that needs justification, that any given AI will have goals of a non trivial sort. "Goal" is a term that needs tabooing.

Neither Yudkowsky nor I think either of those are reasonable to expect--as a motivating example, we are happy to subvert the goals that we infer evolution was directing us towards in order to better satisfy "our" goals.

While we are anthropomorphising, it might be worth pointing out that humans don't show behaviour patterns of relentlessly pursuing arbitrary goals.

I suspect that Loosemore thinks that viable designs would recognize it, but agrees that in general that recognition does not have to lead to an alignment

Loosemore has put forward a simple suggestion, which MIRI appears not to have considered at all: that on encountering a contradiction, an AI could lapse into a safety mode, if so designed.

3 ...sees cleverness and wisdom as closely tied together

You are paraphrasing Loosemore to sound less technical and more handwaving than his actual comments. The ability to sustain contradictions in a system that is constantly updating itself isn't a given... it requires an architectural choice in favour of compartmentalisation.

Comment author: nshepperd 18 May 2015 09:45:52AM *  3 points [-]

All this talk of contradictions is sort of rubbing me the wrong way here. There's no "contradiction" in an AI having goals that are different to human goals. Logically, this situation is perfectly normal. Loosemore talks about an AI seeing its goals are "massively in contradiction to everything it knows about <BLAH>", but... where's the contradiction? What's logically wrong with getting strawberries off a plant by burning them?

I don't see the need for any kind of special compartmentalisation; information about "normal use of strawberries" is already inert facts with no caring attached by default.

If you're going to program in special criteria that would create caring about this information, okay, but how would such criteria work? How do you stop it from deciding that immortality is contradictory to "everything it knows about death" and refusing to help us solve aging?

Comment author: Richard_Loosemore 14 May 2015 03:20:26PM 1 point [-]

Not really.

The problem is that nshepperd is talking as if the term "intention" has some exact, agreed-upon technical definition.

It does not. (It might have in nshepperd's mind, but not elsewhere)

I have made it clear that the term is being used, by me, in a non-technical sense, whose meaning is clear from context.

So, declaring that the AI "does not have good intentions" is neither here nor there. It makes no difference if you want to describe the AI that way or not.

Comment author: nshepperd 15 May 2015 04:10:55AM *  2 points [-]

That would be fine if you and everyone else who tries to argue on this side of the debate did not then proceed to conclude from the statement that the AI has "good intentions" that it is making some sort of "error" when it fails to act on our cries that "doing X isn't good!" or "doing X isn't what we meant!".

Saying an AI has "good intentions" strongly implies that it cares about what is good, which is, y'know, completely false for a pleasure maximiser. (No-one is claiming that FAI will do evil things just because it's clever, but a pleasure maximiser is not FAI.)

You can't use words any way you like.

Comment author: Richard_Loosemore 12 May 2015 01:50:31PM 0 points [-]

This comment is both rude and incoherent (at the same level of incoherence as your other comments). And it is also pedantic (concentrating as it does on meanings of words, as if those words were being used in violation of some rules that ... you just made up).

Sorry to say this but I have to choose how to spend my time, in responding to comments, and this does not even come close to meriting the use of my time. I did that before, in response to your other comments, and it made no impact.

Comment author: nshepperd 12 May 2015 03:38:05PM 4 points [-]

Equivocation is hardly something I just made up.

Here's an exercise to try. Next time you go to write something on FAI, taboo the words "good", "benevolent", "friendly", "wrong" and all of their synonyms. Replace the symbol with the substance. Then see if your arguments still make sense.

Comment author: Richard_Loosemore 05 May 2015 12:59:04PM *  3 points [-]

You ask

What is your probability estimate that an AI would be a psychopath

and you give me a helpful hint:

(Hint: All computer systems produced until today are psychopaths by this definition.)

Well, first please note that ALL artifacts at the present time, including computer systems, cans of beans, and screwdrivers, are psychopaths because none of them are DESIGNED to possess empathy. So your hint contains zero information. :-)

What is the probability that an AI would be a psychopath if someone took the elementary step of designing it to have empathy? Probability would be close to 0, assuming the designers knew what empathy was, and knew how to design it.

But your question was probably meant to target the situation where someone built an AI and did not bother to give it empathy. I am afraid that that is outside the context we are examining here, because all of the scenarios talk about some kind of inevitable slide toward psychopathic behavior, even under the assumption that someone does their best to give the AI an empathic motivation.

But I will answer this: if someone did not even try to give it empathy, that would be like designing a bridge and not even trying to use materials that could hold up a person's weight. In both cases the hypothetical is not interesting, since designing failure into a system is something any old fool could do.

Your second remark is a classic mistake that everyone makes in the context of this kind of discussion. You mention that the phrase "benevolence toward humanity" means "benevolence" as defined by the computer code.

That is incorrect. Let's try, now, to be really clear about that, because if you don't get why it is incorrect we might waste a lot of time running around in circles. It is incorrect for two reasons. First, because I was consciously using the word to refer to the normal human usage, not the implementation inside the AI. Second, it is incorrect because the entire issue in the paper is that there is a discrepancy between the implementation inside the AI and normal usage, and that discrepancy is then examined in the rest of the paper. By simply asserting that the AI may believe, "correctly", that benevolence is the same as violence toward people, you are pre-empting the discussion.

In the remarks you make after that, you are reciting the standard line contained in all the scenarios that the paper is addressing. That standard line is analyzed in the rest of the paper, and a careful explanation is given for why it is incoherent. So when you simply repeat the standard line, you are speaking as if the paper did not actually exist.

I can address questions that refer to the arguments in the paper, but I cannot say anything if you only recite the standard line that is demolished in the course of the paper's argument. So if you could say something about the argument itself...

Comment author: nshepperd 12 May 2015 08:04:01AM *  2 points [-]

This is an absolutely blatant instance of equivocation.

Here's the sentence from the post:

[believes that benevolence toward humanity might involve forcing human beings to do something violently against their will.]

Assume that "benevolence" in that sentence refers to "benevolence as defined by the AI's code". Okay, then justification of that sentence is straightforward: The fact that the AI does things against the human's wishes provides evidence that the AI believes benevolence-as-defined-by-code to involve that.

Alternatively, assume that "benevolence" there refers to, y'know, actual human benevolence. Then how do you justify that claim? Observed actions are clearly insufficient, because actual human benevolence is not programmed into its code, benevolence-as-defined-by-code is. What makes you think the AI has any opinions about actual human benevolence at all?

You can't have both interpretations.

(As an aside, I do disapprove of Muehlhauser's use of "benevolence" to refer to mere happiness maximisation. "Apparently benevolent motivations" would be a better phrase. If you're going to use it to mean actual human benevolence then you can certainly complain that the FAQ appears to assert that a happiness maximiser can be "benevolent", even though it's clearly not.)
