I don't see why the "pattern matching" is invalid. Pattern matching is predominantly implemented using Bayesian methods in software, and humans perform comparably well or better, implying that it is internally Bayesian. In this particular case the issue is that the probability of legitimate independent reinvention of hell from different first principles seems dramatically smaller than the probability of re-rationalizing hell in a different framework, and subsequently when you see a variation on the theme of hell it is, in fact, much more likely that it was the latter. It would only become more likely to be the former if the arguments were coherent and rigid, which they are not.
The pattern matching's conclusions are wrong because the information it is matching on is misleading. The article implied that there was widespread belief that the future AI should be assisted, and this was wrong. Last I looked, it still incorrectly implied widespread support for other beliefs.
This isn't an indictment of pattern matching so much as a need for the information to be corrected.
I've had people come to me who are traumatised by basilisk considerations. From what I can tell almost all of the trauma is attributable to Eliezer's behavior. The descriptions of the experience give clear indications (ie. direct self reports that are coherent) that a significant reason that they "take the basilisk seriously" is because Eliezer considers it a sufficiently big deal that he takes such drastic and emotional action. Heck, without Eliezer's response it wouldn't even have earned that title. It'd be a trivial backwater game theory question to which there are multiple practical answers.
I get the people who've been frightened by it because EY seems to take it seriously too. (Dmytry also gets them, which is part of why he's so perpetually pissed off at LW. He does his best to help, as a decent person would.) More generally, people distressed by it feel they can't talk about it on LW, so they come to RW contributors - addressing this was why it was made a separate article. (I have no idea why Warren Ellis then Charlie Stross happened to latch onto it - I wish they hadn't, because it was totally not ready, so I had to spend the past few days desperately fixing it up, and it's still terrible.) EY not in fact thinking it's feasible or important is a point I need to address in the last section of the RW article, to calm this concern.
It would be nice if you'd also address the extent to which it misrepresents other LessWrong contributors as thinking it is feasible or important (sometimes to the point of mocking them based on its own misrepresentation). People around LessWrong engage in hypothetical what-if discussions a lot; it doesn't mean that they're seriously concerned.
Lines like "Though it must be noted that LessWrong does not believe in or advocate the basilisk ... just in almost all of the pieces that add up to it." are also pretty terrible given we know only a fairly small percentage of "LessWrong" as a whole even consider unfriendly AI to be the biggest current existential risk. Really, this kind of misrepresentation of alleged, dubiously actually held extreme views as the perspective of the entire community is the bigger problem with both the LessWrong article and this one.
I agree with what you are saying about scaling, as exemplified by sharded databases. But I am not convinced that any problem can be sharded that easily; as you yourself have said:
Figuring out a scalable equivalent to a non-parallel algorithm is hard. Scalable databases, for example, don't support the same set of queries as a simple MySQL server...
This is one reason why even Google's datastore, AFAIK, does not implement exactly this kind of architecture -- though it is still heavily sharded. This type of a datastructure does not easily lend itself to purely general computation, either, since it relies on precomputed indexes, and generally exploits some very specific property of the data that is known in advance. And, as you also mentioned, even with these drastic tradeoffs you still get O(n log(n)).
You mention Amazon (in addition to Google) as one example of a massively distributed system, but note that both Google and Amazon are already forced to build redundant data centers in separate areas of the Earth, in order to reduce network latency. This is important, because we aren't dealing with abstract tree nodes, but with physical machines, which have a certain volume (among other things). This means that, even in an absolutely ideal situation where we can ignore power, heat dissipation, and network congestion, you will still run into the speed of light as a limiting factor. In fact, high-frequency trading systems are already running up against this limit even today. This means that you'll run out of room to scale a lot faster than you run out of atoms of the Earth.
First, examining the dispute over whether scalable systems can actually implement a distributed AI...
This is one reason why even Google's datastore, AFAIK, does not implement exactly this kind of architecture -- though it is still heavily sharded. This type of a datastructure does not easily lend itself to purely general computation, either, since it relies on precomputed indexes, and generally exploits some very specific property of the data that is known in advance.
That's untrue; Google App Engine's datastore is not built on exactly this architecture, but is built on one with these scalability properties, and they do not inhibit its operation. It is built on BigTable, which builds on multiple instances of Google File System, each of which has multiple chunk servers. They describe this as intended to scale to hundreds of thousands of machines and petabytes of data. They do not define a design scaling to an arbitrary number of levels, but there is no reason an architecturally similar system like it couldn't simply add another level and add on another potential roundtrip. I also omit discussion of fault-tolerance, but this doesn't present any additional fundamental issues for the described functionality.
In actual application, its architecture is used in conjunction with a large number of interchangeable non-data-holding compute nodes which communicate only with the datastore and end users rather than each other, each running identical instances of the application software on App Engine. This layout runs all websites and services backed by Google App Engine as distributed, scalable software, assuming they don't do anything to break scalability. There is no particular reliance on "special properties" of the data being stored, merely limits on the types of search of the data which are possible. Even this is less limited than you might imagine; full text search of large texts has been implemented fairly recently. A wide range of websites, services, and applications are built on top of it.
The implication of this is that there could well be limitations on what you can build scalably, but they are not all that restrictive. They definitely don't include anything for which you can split data into independently processed chunks. Looking at GAE some more because it's a good example of a generalised scalable distributed platform, the software run on the nodes is written in standard Turing-complete languages (Python, Java, and Go) and your datastore access includes read and write by key and by equality queries on specific fields, as well as cursors. A scalable task queue and cron system mean you aren't dependent on outside requests to drive anything. It's fairly simple to build any such chunk processing on top of it.
So as long as an AI can implement its work in such chunks, it certainly can scale to huge sizes and be a scalable system.
And, as you also mentioned, even with these drastic tradeoffs you still get O(n log(n)).
And as I demonstrated, O(n log n) is big enough for a Singularity.
And now on whether scalable systems can actually grow big in general...
You mention Amazon (in addition to Google) as one example of a massively distributed system, but note that both Google and Amazon are already forced to build redundant data centers in separate areas of the Earth, in order to reduce network latency.
Speed of light as an issue is not a problem for building huge systems in general, so long as the number of roundtrips rises as O(n log n) or less, because for any system capable of at least tolerating roundtrips to the other side of the planet (few hundred milliseconds), it doesn't become more of an issue as a system gets bigger, until you start running out of space on the planet surface to run fibre between locations or build servers.
The GAE datastore already tolerates latencies sufficient to cover distances between cities, to permit data duplication over wide areas for fault tolerance. If it were to expand into all the space between those cities, the time for each roundtrip would not increase until after it had filled all the space between them with more servers.
Google and Amazon are not at all forced to build data centres in different parts of the Earth to reduce latency; this is a misunderstanding. There is no technical performance degradation caused by the size of their systems forcing them to need the latency improvements to end users or the region-scale fault tolerance that spread out datacentres permit. They can just afford it more easily. You could argue there are social/political/legal reasons they need it more, higher expectations of their systems and similar, but these aren't relevant here. This spreading out is actually largely detrimental to their systems since spreading out this way increases latency between them, but they can tolerate this.
Heat dissipation, power generation, and network cabling needs all also scale as O(n log n), since computation and communication do and those are the processes which create those needs. Looking at my previous example, the amount of heat output, power needed, and network cabling required per amount of data processed will increase by maybe an order of magnitude in scaling such a system upwards by tens of orders of magnitude, 5x for 40 orders of magnitude in the example I gave. This assumes your base amount of latency is still enough to cover the distance between the most distant nodes (for an Earth bound system, one side of the planet to the other), which is entirely reasonable latency-wise for most systems; a total of 1.5 seconds for a planet-sized system.
This means that no, these do not become an increasing problem as you make a scalable system expand, any more so than provision of the nodes themselves. You are right in that heat dissipation, power generation, and network cabling mean that you might start to hit problems before literally "running out of planet", using up all the matter of the planet; that example was intended to demonstrate the scalability of the architecture. You also might run out of specific elements or surface area.
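As a rough check on the "5x for 40 orders of magnitude" and 1.5 second figures above, here is a minimal Python sketch. It assumes a fan-out of ten billion shards per dictionary server (the figure from my sharded database example) and a worst-case roundtrip of 300 ms, which is only my reading of "few hundred milliseconds"; the function names and that 300 ms value are illustrative assumptions, not measurements.

```python
FANOUT = 10**10       # shards addressed per dictionary server (assumed, from the sharding example)
ROUNDTRIP_MS = 300    # assumed worst-case planet-wide roundtrip, in milliseconds


def levels_needed(total_shards: int, fanout: int = FANOUT) -> int:
    """Smallest number of dictionary levels able to address total_shards."""
    levels, capacity = 1, fanout
    while capacity < total_shards:
        capacity *= fanout
        levels += 1
    return levels


def worst_case_latency_ms(total_shards: int) -> int:
    """One sequential roundtrip per level, each at the assumed worst case."""
    return levels_needed(total_shards) * ROUNDTRIP_MS


small, huge = 10**10, 10**50   # 40 orders of magnitude apart
print(levels_needed(small), levels_needed(huge))   # 1 5
print(worst_case_latency_ms(huge))                 # 1500
```

The per-request cost grows with the number of levels, not with the number of shards, which is why the overhead only rises fivefold across that range.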
These practical hardware issues don't really create a problem for a Singularity, though. Clusters exist now with 560k processors, so systems at least this big can be feasibly constructed at reasonable cost. So long as the software can scale without substantial overhead, this is enough unless you think an AI would need even more processors, and that the software could scale this way is the point my planet-scale example was trying to show. You're already "post Singularity" by the time you seriously become unable to dissipate heat or run cables between any more nodes.
This means that, even in an absolutely ideal situation where we can ignore power, heat dissipation, and network congestion, you will still run into the speed of light as a limiting factor. In fact, high-frequency trading systems are already running up against this limit even today.
HFT systems desire extremely low latency; this is the sole cause of their wish to be close to the exchange and to accept various internal scalability limitations in order to improve speed of processing. These issues don't generalise to typical systems, and don't grow faster than O(n log n) for typical bigger systems.
It is conceivable that speed of light limitations might force a massive, distributed AI to have high, maybe over a second latency in actions relying on knowledge from all over the planet, if prefetching, caching, and similar measures all fail. But this doesn't seem like nearly enough to render one at all ineffective.
There really aren't any rules of distributed systems which say that it can't work, or even that it is likely not to.
I may be wrong, but don't all distributed systems suffer from diminishing returns in this way? For example, doubling the number of CPUs in a computing cluster does not allow you to solve your calculations twice as quickly. Your overhead, such as control infrastructure and plain old network latency, increases faster than linearly with every CPU you add, and eventually outgrows the useful processing power you can get out of new CPUs.
This is one of the many reasons why I'm not worried about the Singularity...
Restricting the topic to distributed computation, the short answer is "essentially no". The rule is that you get at best linear returns, not that your returns diminish greatly. There are a lot of problems which are described as "embarrassingly parallel", in that scaling them out is easy to do with quite low overhead. In general, any processing of a data set which permits it to be broken into chunks which can be processed independently would qualify, so long as you were looking to increase the amount of data processed by adding more processors rather than process the same data faster.
For scalable distributed computation, you use a system design whose total communication overhead rises as O(n log n) or lower. The upper bound here is superlinear, but gets closer to linear the more additional capacity is added, and so scales well enough that with a good implementation you can run out of planet to make the system out of before you get too slow. Such systems are quite achievable.
The DNS system would be an important example of a scalable distributed system; if adding more capacity to the DNS system had substantially diminishing returns, we would have a very different Internet today.
An example I know well enough to walk through in detail is a scalable database in which data is allocated to shards, which manage storage of that data. You need a dictionary server to locate data (DNS-style) and handle moving blocks of it between shards, but this can then be sharded in turn. The result is akin to a really big tree; number of lookups (latency) to find the data rises with the log of the data stored, and the total number of dictionary servers at all levels does not rise faster than the number of shards with Actual Data at the bottom level. Queries can be supported by precomputed indexes stored in the database themselves. This is similar to how Google App Engine's datastore operates (but much simplified).
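To make the tree shape concrete, here is a toy in-memory Python sketch. It is a simplification under stated assumptions: the class names (`DictionaryServer`, `Shard`) are hypothetical, and it routes by a fixed hash of the key rather than a stored dictionary that can move blocks between shards, so it shows only the lookup path and not rebalancing or indexes.

```python
import hashlib


class Shard:
    """Bottom-level node that actually stores data."""

    def __init__(self):
        self.rows = {}


class DictionaryServer:
    """Routing node; its children are either more dictionary servers or shards."""

    def __init__(self, fanout: int, levels: int):
        if levels == 1:
            self.children = [Shard() for _ in range(fanout)]
        else:
            self.children = [DictionaryServer(fanout, levels - 1) for _ in range(fanout)]

    def locate(self, key: str):
        """Walk down the tree; return the owning shard and the number of lookups made."""
        digest = int.from_bytes(hashlib.sha256(key.encode()).digest(), "big")
        node, lookups = self, 0
        while isinstance(node, DictionaryServer):
            # Consume part of the hash at each level to pick a child (assumed routing rule).
            digest, index = divmod(digest, len(node.children))
            node = node.children[index]
            lookups += 1
        return node, lookups

    def put(self, key: str, value):
        shard, _ = self.locate(key)
        shard.rows[key] = value

    def get(self, key: str):
        shard, lookups = self.locate(key)
        return shard.rows.get(key), lookups


root = DictionaryServer(fanout=4, levels=3)   # 4**3 = 64 bottom-level shards
root.put("alice", "profile-data")
print(root.get("alice"))   # ('profile-data', 3)
```

The number of lookups equals the number of levels, so it grows with the logarithm of the shard count, not with the shard count itself.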
With this fairly simple structure, the total cost of all reads/writes/queries theoretically rises superlinearly with the amount of storage (presuming read/write/queries and amount of data scale linearly with each other), due to the dictionary server lookups, but only as O(n log(n)). If you were trying, with current day commodity hard disks and a conceptually simple on-disk tree, a dictionary server could reasonably store information for ten billion shards (500 bytes * 10 billion = ~5 TB), two levels of sharding giving you a hundred billion billion data-storing shards, three giving a thousand billion billion billion data-storing shards. Five levels, five latency delays would give you about 10^50 bottom-level shards, on the order of the number of atoms in the Earth. This is why, while scalability will eventually limit an O(n log(n)) architecture, in this case because the cost of communicating with subshards of subshards becomes too high, you can run out of planet first.
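The arithmetic above is easy to check mechanically. This short Python sketch uses only the figures already quoted (500 bytes per entry, ten billion entries per dictionary server); the function names are mine and purely illustrative.

```python
ENTRY_BYTES = 500                # per-shard dictionary entry size, from the text
SHARDS_PER_DICTIONARY = 10**10   # ten billion entries per dictionary server


def dictionary_size_bytes() -> int:
    """Storage one dictionary server needs to index all of its children."""
    return ENTRY_BYTES * SHARDS_PER_DICTIONARY


def bottom_level_shards(levels: int) -> int:
    """Data-storing shards reachable through the given number of dictionary levels."""
    return SHARDS_PER_DICTIONARY ** levels


print(dictionary_size_bytes() // 10**12, "TB per dictionary server")
for levels in range(1, 6):
    print(levels, "level(s):", f"{bottom_level_shards(levels):.0e}", "shards")
```

Each extra level multiplies capacity by ten billion while adding only one more lookup per request.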
This can be generalised; if you imagine that each shard performs arbitrary work on the data sent to it, and when the data is read back you get the results of the processing on that data, you get a scalable system which does any processing on a dataset that can be done by processing chunks of data independently from one another. Image or voice recognition matching a single sample against a huge dataset would be an example.
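A minimal Python sketch of that "process chunks independently, then combine" pattern follows. It stands in for recognition with a deliberately trivial task (find the stored number closest to a sample), uses a local thread pool in place of real shards, and assumes every shard holds at least one item; the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def distance(item: int, sample: int) -> int:
    """Toy similarity measure; a real matcher would compare images or audio."""
    return abs(item - sample)


def best_in_shard(shard: list, sample: int) -> int:
    """Work done independently on one shard: its own best candidate."""
    return min(shard, key=lambda item: distance(item, sample))


def best_match(shards: list, sample: int) -> int:
    """Scatter the sample to every shard, then combine the per-shard winners."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda shard: best_in_shard(shard, sample), shards))
    return min(partials, key=lambda item: distance(item, sample))


shards = [[1, 9], [14, 30], [22]]
print(best_match(shards, 15))   # 14
```

No shard needs to talk to any other shard; only the small per-shard results travel back to be combined, which is what keeps the communication overhead low.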
This isn't to trivialise the issues of parallelising algorithms. Figuring out a scalable equivalent to a non-parallel algorithm is hard. Scalable databases, for example, don't support the same set of queries as a simple MySQL server because a MySQL server implements some queries by iterating all the data, and there's no known way to perform them in a scalable way. Instead, software using them finds other ways to implement the feature.
However, scalable-until-you-run-out-of-planet distributed systems are quite possible, and there are some scalable distributed systems doing pretty complex tasks. Search engines are the best example which comes to mind of systems which bring data together and do complex synthesis with it. Amazon's store would be another scalable system which coordinates a substantial amount of real world work.
The only question is whether a (U)FAI specifically can be implemented as a scalable distributed system, and considering the things we know can be divided or done scalably, as well as everything which can be done with somewhat-desynchronised subsystems which correct errors later (or even are just sometimes wrong), it seems quite likely that (assuming one can be implemented at all) it could implement its work in the form of problems which can be solved in a scalable fashion.
On a neutral note - We aren't enemies here. We all have very similar utility functions, with slightly different weights on certain terminal values (PR) - which is understandable as some of us have more or less to lose from LW's PR.
I disagree that this is the entire source of the dispute. I think that even when constrained to optimizing only for good PR, this is an instrumentally ineffective method of achieving that. Censorship is worse for PR than the problem in question, especially given that the problem in question is thus far nonexistent.
To convince Eliezer - you must show him a model of the world given the policy that causes ill effects he finds worse than the positive effects of enacting the policy.
This is trivially easy to do, since the positive effects of enacting the policy are zero, given that the one and only time this has ever been a problem, the problem resolved itself without censorship, via self-policing.
Well... the showing him the model part is trivially easy anyway. Convincing him... apparently not so much.
This model trivially shows that censoring espousing violence is a bad idea, if and only if you accept the given premise that censorship of espousing violence is a substantial PR negative. This premise is a large part of what the dispute is about, though.
Not everyone is you; a lot of people feel positively about refusing to provide a platform to certain messages. I observe a substantial amount of time expended by organisations on simply signalling opposition to things commonly accepted as negative, and avoiding association with those things. LW barring espousing violence would certainly have a positive effect through this.
Negative effects from the policy would be that people who do feel negatively about censorship, even of espousing violence, would view LW less well.
The poll in this thread indicates that a majority of people here would be for moderators being able to censor people espousing violence. This suggests that for the majority here it is not bad PR for the reason of censorship alone, since they agree with its imposition. I myself would expect people outside LW to have an even stronger preference in favour of censorship of advocacy of unthinkable dangerous ideas, suggesting a positive PR effect.
Whether people should react to it in this manner is a completely different matter, a question of the just world rather than the real one.
And this is before requiring any actual message be censored, and considering the impact of any such censorship, and before considering what the particular concerns of the people who particularly need to be attracted are.
This poll, like EY's original question, conflates two things that don't obviously belong together. (1) Advocating certain kinds of act. (2) "Asking about" the same kind of act.
I appreciate that in some cases "asking about" might just be lightly-disguised advocacy, or apparent advocacy might just be a particularly vivid way of asking a question. I'm guessing that the quotes around "asking about" are intended to indicate something like the first of these. But what, exactly?
I think in this context, "asking about" might include raising for neutral discussion without drawing moral judgements.
The connection I see between them is that if someone starts neutral discussion about a possible action, actions which would reasonably be classified as advocacy have to be permitted if the discussion is going to progress smoothly. We can't discuss whether some action is good or bad without letting people put forward arguments that it is good.
Or Really Extreme Altruism?
This is an example of why I support this kind of censorship. Lesswrong just isn't capable of thinking about such things in a sane way anyhow.
The top comment in that thread demonstrates AnnaSalamon being either completely and utterly mindkilled or blatantly lying about simple epistemic facts for the purpose of public relations. I don't want to see the (now) Executive Director of CFAR doing either of those things. And most others are similarly mindkilled, meaning that I just don't expect any useful or sane discussion to occur on sensitive subjects like this.
(ie. I consider this censorship about as intrusive as forbidding peanuts to someone with a peanut allergy.)
I think that a discussion in which only most people are mindkilled can still be a fairly productive one on these questions in the LW format. LW is actually one of the few places where you would get some people who aren't mindkilled, so I think it is actually good that it achieves this much.
They seem fairly ancillary to LW as a place for improving instrumental or epistemic rationality, though. If you think testing the extreme cases of your models of your own decision-making is likely to result in practical improvements in your thinking, or just want to test yourself on difficult questions, these things seem like they might be a bit helpful, but I'm comfortable with them being censored as a side effect of a policy with useful effects.
I thus have to disagree that an indefinite ban for ignorance of advice or an unwritten policy would be an appropriate or optimum response.
I'd end "indefinite" the moment the offending material was redacted with apologies. Stop breaking the rule, stop being excluded. Continue breaking the rule, stay excluded.
Ah, I see. That makes sense. They weren't actually asked to remove the whole of the quoting, just to remove some unrelated lines, which has been complied with, so there are no outstanding requests as far as I know.
Of course, it might just not have been asked for because having it pulled at this point could cause a worse mess than leaving it up, with more reputation damage. Some third party moderator could request it to avoid that issue, but I think at this point the horse is long gone and going to the work of closing the barn door might not be worth it.
It'd be reasonable for a hypothetical moderator taking an appropriate action to request they replace the whole thing with a summary, though; that makes sense.
The morally (and socially) appropriate thing to do at this point would be to apologize and pledge not to use that kind of language on IRC in the future.
It would be appropriate to first announce that startling has been banned from the IRC channel until further notice for violating the rather clear privacy agreements. What the folks at #lesswrong decide to do after that about altering or enforcing norms about what may be said on #lesswrong then depends on said people's preferences and any public relations concerns they may have.
My explicit use of "the folks at #lesswrong" above leads me to the more important point: Someone suitably official from here (lesswrong.com) should clearly disavow any affiliation with that IRC channel. As far as I know it is in no way official and is related to lesswrong.com only in as much as some of the users from lesswrong also have accounts there (just as some users from here also participate on RationalWiki). Those times that anything about #lesswrong has bled over to lesswrong.com the impression I've been given of the former is rather unappealing.
(It is probably unenforceable but asking the group politely to rename their channel seems like a wise move and is certainly what I would do were I a lesswrong.com authority!)
Quoting without permission was clearly a mistake, but describing it as a "rather clear privacy agreement" is not particularly apt; Freenode policy on this is written as strong advice rather than "rules" as such, and the channel itself had no clear policy. As it was, it was mostly a social convention violation. I thus have to disagree that an indefinite ban for ignorance of advice or an unwritten policy would be an appropriate or optimum response. What's happened so far (the person being corrected quite sharply here and on the channel, and a clear privacy agreement added to the IRC channel topic for next time) seems like a reasonable remedy.
More specifically, the Freenode policy item in question is entitled "If you're considering publishing channel logs, think it through.", the section on constant public logging by the channel staff says "should" throughout, and the bit at the end about quoting publicly as a user ends with "Avoid the temptation to publish or distribute logs without permission in order to portray someone in a bad light. The reputation you save will most likely be your own." rather than stating that it is actually a violation of anything in particular.
What is fairly solid Freenode policy, though, is that unofficial channels of things have to use the ##<name> format, and #<name> format is reserved for generally official project channels. I don't know if the Less Wrong site admins and #lesswrong admins overlap, but if hypothetically Less Wrong wanted to disaffiliate #lesswrong, it is actually entirely possible for Less Wrong administrators to force #lesswrong to, at the least, migrate to ##lesswrong or a different IRC network.
As a #lesswrong user since I started reading the Sequences originally, though, I don't think this is a good idea. Having a real-time discussion channel is a nice thing for those that benefit from it. The IRC channel, listed on the wiki, was the first place I gravitated towards for discussing LW stuff, preferring it to comments. It is fairly Less Wrong focused; links to and discussions of Less Wrong posts are the key focus, even with a lot of other interesting conversations, evaluations, thoughts, etc, perhaps having more actual conversation time. What you remember as having bled over is unrepresentative, I feel.
It's the belief pushed by the organization running this place. You could poll scientologists and see that majority probably haven't even heard of that xenu thing; does not matter, the xenu thing is far more informative of scientology than majority opinion is.
Assuming by "it" you refer to the decision theory work, that UFAI is a threat, Many Worlds Interpretation, things they actually have endorsed in some fashion, it would be fair enough to talk about how the administrators have posted those things and described them as conclusions of the content, but it should accurately convey that that was the extent of "pushing" them. Written from a neutral point of view with the beliefs accurately represented, informing people that the community's "leaders" have posted arguments for some unusual beliefs (which readers are entitled to judge as they wish) as part of the content would be perfectly reasonable.
It would also be reasonable to talk about the extent to which atheism is implicitly pushed in stronger fashion; theism is treated as assumed wrong in examples around the place, not constantly but to a much greater degree. I vaguely recall that the community has non-theists as a strong majority.
The problem is that this is simply not what the articles say. The articles imply strongly that the more unusual beliefs posted above are widely accepted; not that they are posted in the content, but that they are believed by Less Wrong members, part of the identity of someone who is a Less Wrong user. This is simply wrong. And the difference is significant; it is incorrectly accusing all people interested in the works of a writer of being proponents of that writer's most unusual beliefs, discussed only in a small portion of their total writings. And this should be fixed so they convey an accurate impression.
The Scientology comparison is misleading in that Scientology attempts to use cult practices to achieve homogeneity of beliefs, whereas Less Wrong does not; the poll solidly demonstrates that homogeneity of beliefs is not a thing which is happening. A better analogy would be a community of fans of the works of a philosopher who wrote a lot of stuff and came to some outlandish conclusions in parts, but whose fans largely don't believe that outlandish stuff. Yeah, their outlandish stuff is worth discussing, but presenting it as the belief of the community is wrong even if the philosopher alleges it all fits together. Having an accurate belief here matters, because it has greatly different consequences. There are major practical differences in how useful you'd expect the rest of the content to be, and how you'd perceive members of the community.
At present, much of the articles reads as "smear pieces" against Less Wrong's community. As a clear and egregious example, one alleges they are "libertarian", clearly a shot at LW given RW's readerbase, when surveys tell us that the most common political affiliation is "liberalism", and while "libertarianism" is second, "socialism" is third. It does this while citing one of the surveys in the article itself. Many of the problems here are not subtle.
If by "it" you meant the evil AI from the future thing, it most certainly is not "the belief pushed by the organization running this place"; any reasonable definition of "pushing" something would have to mean communicating it to people and attempting to convince them of it, and if anything they're credibly trying to stop people from learning about it. There are no secret "higher levels" of Less Wrong content only shown to the "prepared", no private venues conveying it to members as they become ready, so we can be fairly certain given publicly visible evidence that they aren't communicating it or endorsing it as a belief to even 'selected' members.
It doesn't obviously follow from anything posted on Less Wrong; it requires putting a whole bunch of parts together and assuming it is true.