
Comment author: Kaj_Sotala 20 June 2017 11:29:18PM *  0 points [-]

conflict seems to be the most plausible scenario (and one with a high prior on it, as we can observe that much suffering today is caused by conflict), but it seems less and less likely once you factor in superintelligence, as multi-polar scenarios seem to be either very short-lived or unlikely to happen at all.

This seems plausible but not obvious to me. Humans are superintelligent as compared to chimpanzees (let alone, say, Venus flytraps), but humans have still formed a multipolar civilization.

Comment author: tristanm 21 June 2017 12:07:52AM 0 points [-]

When thinking about whether s-risk scenarios are tied to or come about by similar means as x-risk scenarios (such as a malign intelligence explosion), the relevant issue to me seems to be whether or not such a scenario could result in a multi-polar conflict of cosmic proportions. I think the chance of that happening is quite low, since intelligence explosions seem to be most likely to result in a singleton.

Comment author: cousin_it 20 June 2017 01:46:47PM *  9 points [-]

Wow!

Many thanks for posting that link. It's clearly the most important thing I've read on LW in a long time; I'd upvote it ten times if I could.

It seems like an s-risk outcome (even one that keeps some people happy) could be more than a million times worse than an x-risk outcome, while not being a million times more improbable, so focusing on s-risks is correct. The argument wasn't as clear to me before. Does anyone have good counterarguments? Why shouldn't we all focus on s-risk from now on?

(Unsong had a plot point where Peter Singer declared that the most important task for effective altruists was to destroy Hell. Big props to Scott for seeing it before the rest of us.)

Comment author: tristanm 20 June 2017 09:40:06PM 1 point [-]

The only counterarguments I can think of would be:

  • The claim that the likelihood of s-risks is close to that of x-risks seems not well argued to me. In particular, conflict seems to be the most plausible scenario (and one with a high prior on it, as we can observe that much suffering today is caused by conflict), but it seems less and less likely once you factor in superintelligence, as multi-polar scenarios seem to be either very short-lived or unlikely to happen at all.

  • We should be wary of applying anthropomorphic traits to hypothetical artificial agents in the future. Pain in biological organisms may very well have evolved as a proxy for negative utility, and might not be necessary in "pure" agent intelligences which can calculate utility functions directly. It's not obvious to me that implementing suffering in the sense that humans understand it would be cheaper or more efficient for a superintelligence than simply creating utility-maximizers when it needs to produce a large number of sub-agents.

  • High overlap between approaches to mitigating x-risk and approaches to mitigating s-risks. If the best chance of mitigating future suffering is trying to bring about a friendly artificial intelligence explosion, then it seems that the approaches we are currently taking should still be the correct ones.

  • More speculatively: If we focus heavily on s-risks, does this open us up to issues regarding utility-monsters? Can I extort people by creating a simulation of trillions of agents and then threaten to minimize their utility? (If we simply value the sum of utility, and not necessarily the complexity of the agent having the utility, then this should be relatively cheap to implement).

Comment author: RomeoStevens 20 June 2017 02:26:41AM 1 point [-]

Agree about the creation:critique ratio. Generativity/creativity training is the rationalist community's current bottleneck IMO.

Comment author: tristanm 20 June 2017 03:28:47AM 2 points [-]

And I think we're mostly still trapped in a false implicit dogma that creativity is an innate talent possessed by some rare individuals and can't be developed in anyone who isn't already creative. What I'm hoping is true is that you can train people to come up with good ideas and, more importantly, that if we can harness this community's ability to look for errors in reasoning, even bad ideas can slowly be transformed into good ones, as long as we can come up with a decent framework for making that process robust.

Comment author: tristanm 17 June 2017 03:19:08PM *  6 points [-]

I have a few thoughts about this.

First, I believe there is always likely to be a much higher ratio of critique to content creation going on. This is not a problem in and of itself. But as has been mentioned - and as motivated my post on the Norm One Principle - heavy amounts of negative feedback are likely to discourage content creation. If the incentives to produce content are outweighed by the likelihood of punishment for bad contributions, then there will be very little productive activity, and we will be filtering out not just noise but potentially useful material as well. So I am still strongly in favor of establishing norms that regulate this kind of thing.

Secondly, it seems that the very best content creators spend some time writing and making information freely available, detailing their goals and so on, and then eventually go off to pursue those goals more concretely, and content creation on the site goes down. This is sort of what happened with the original creators of this site. It is not something to prevent, simply something we should expect to happen periodically. Ideally we would like people to keep engaging with each other even after the primary content producers leave.

It's hard to figure out what the "consensus" is on specific ideas, or whether or not they should be pursued or discussed further, or whether people even care about them still. Currently the way content is produced is more like a stream of consciousness of the community as a whole. It goes in somewhat random directions, and it's hard to predict where people will want to go with their ideas or when engagement will suddenly stop. I would like some way of knowing what the top most important issues are and who is currently thinking about them, so I know who to talk to if I have ideas.

This is related to my earlier point about content creators leaving. We only occasionally get filtered-down information about what they are working on. If I wanted to help them, I wouldn't know who to contact, or what the proper protocols are for becoming involved in those projects. I think the standard way these projects happen is that a handful of people who are really interested simply start working on them, but they stay essentially radio-silent until they are either finished or feel they can't proceed further. This seems less than ideal to me.

A lot of these problems seem difficult to me, and so far my suggestions have mostly been around discourse norms. But again this is why we need more engagement. Speak up, and even if your ideas suck, I'll try to be nice and help you improve on them.

By the way, I think it's important to mention that even asking questions is actually really helpful. I can't count the number of times someone has asked me to clarify a point I made, and in the process of clarifying I discovered new issues or important details I had previously missed, which caused me to update. So even if you don't think you can offer much insight, just asking about things can be helpful, and you shouldn't feel discouraged from doing so.

Comment author: tristanm 14 June 2017 09:45:34PM 1 point [-]

The way that I choose to evaluate my overall experience is generally through the perception of my own feelings. Therefore, I assume this simulated world will be evaluated in a similar way: I perceive the various occurrences within it and rate them according to my preferences. I assume the AI will receive this information and be able to update the simulated world accordingly. The main difference, then, appears to be that the AI will not have access to my nervous system - if my avatar is all that represents me in this world and is all the AI has access to, that would prevent it from wire-heading by simply manipulating my brain however it wants. Likewise it would not have access to its own internal hardware or be able to model it (since that would require knowledge of actual physics). It could in theory interact with buttons and knobs in the simulated world that were connected to its hardware in the real world.

I think this is basically the correct approach and it actually is being considered by AI researchers (take Paul's recent paper, for example, which uses human yes-or-no feedback on actions in a simulated environment). The main difficulty then becomes domain transfer, when the AI is "released" into the physical world - it now has access to both its own hardware and human "hardware", and I don't see how to predict its actions once it learns these additional facts. I don't think we have much theory for what happens then, but the approach is probably very suitable for narrow AI and for training robots that will eventually take actions in the real world.
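To make that concrete, here is a rough sketch (my own illustration, not the specific method from the paper mentioned above) of what learning an approval model from binary yes-or-no human feedback on actions might look like; the library choice (PyTorch), network sizes, and names are assumptions for illustration only:

    # Generic sketch: fit a model of human approval from yes/no feedback.
    # Sizes and names are illustrative placeholders, not from any paper.
    import torch
    import torch.nn as nn

    obs_dim, act_dim = 16, 4
    approval_model = nn.Sequential(
        nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))
    opt = torch.optim.Adam(approval_model.parameters(), lr=1e-3)
    bce = nn.BCEWithLogitsLoss()

    def update(obs, act, human_said_yes):
        """obs: (batch, obs_dim); act: (batch, act_dim);
        human_said_yes: (batch, 1) tensor of 0/1 labels from the overseer."""
        opt.zero_grad()
        logits = approval_model(torch.cat([obs, act], dim=1))
        loss = bce(logits, human_said_yes.float())  # fit P(approval | obs, act)
        loss.backward()
        opt.step()
        return loss.item()

    # The agent would then be optimized against the learned approval signal
    # inside the simulation, rather than against the real world directly.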

Comment author: deluks917 07 June 2017 04:12:51AM 1 point [-]

Really great post. I really enjoyed the theoretical justification for a very practical idea. Overall I found the machine learning argument caused me to update significantly in favor of "norm-one" criticism. Some comments and questions:

1) It's not that clear to me how to estimate the "norm" of one's criticism. We aren't going to do math to compute this stuff. What kind of heuristics can we use? Notably, the community requires some degree of consistency in how people estimate criticism norms.

2) If you strongly disagree with a proposition X, it might be hard to give any norm-one criticism. Maybe someone is suggesting plan X and you very strongly think they should abandon the plan. It might feel dishonest, insincere, or immoral to give advice on how to make plan X go slightly less badly.

3) Say a friend of yours asks you to critique their writing. This advice basically says you should hold back on some/much of your feedback. In theory you should try to send only the feedback that's most useful but fits inside a "norm-one" limit. This seems different from the "wall of red ink" technique that is commonly praised in writing circles. (Though I find walls of red ink demoralizing, I am not a writer.)

4) Is it ever useful for someone to say, "Ignore the norm-one limit. Just give me all the criticism you have"? Will it become "low status" not to ask for unlimited-norm criticism?

Comment author: tristanm 07 June 2017 11:08:12PM 0 points [-]

1) It's not that clear to me how to estimate the "norm" of one's criticism. We aren't going to do math to compute this stuff. What kind of heuristics can we use? Notably, the community requires some degree of consistency in how people estimate criticism norms.

I think that in any situation in which the overall quality of a contribution must be estimated, we will have the same problem. Ultimately, I believe it is going to require either some kind of averaged community sentiment, similar to how things are upvoted and downvoted right now, or heavy moderator involvement (with lots of mods). Personally I think moderators have pretty good incentives to be honest and thorough in their judgement (since they could easily lose their status by making poor calls). I think they could be encouraged to notify people which portions of their comments need to be edited or removed, and to allow time for such changes before taking any disciplinary action. Being objectively close to norm one is probably not possible, but it is much easier to determine when things are far from that norm, which I think is the important thing.

2) If you strongly disagree with a proposition X, it might be hard to give any norm-one criticism. Maybe someone is suggesting plan X and you very strongly think they should abandon the plan. It might feel dishonest, insincere, or immoral to give advice on how to make plan X go slightly less badly.

I think it is possible to decouple norm-one criticism from your overall appraisal of the plan itself. Personally, I believe it is possible to be sincere when giving advice on how to slightly improve the plan without stating any disapproval you may have. It may not be the most candid and transparent summary of your feelings, and I realize that some might find it difficult to repress the urge to express them, but if I am to be honest about what my own proposal implies, then that is what I believe has to be done.

There might yet be a place for an overall appraisal to be given in each critique, separately from the rest of the critique that follows norm one. But I still think it is good to avoid appraisal that is overly negative. The reason I'm not very worried about this particular issue is that, for most proposals or plans that require collective action, there is a level of support that must be reached before any progress can be made. Therefore, I do not think there is much risk in not making disapproval well-known; you can simply opt out of participation. I think there is room for exceptions when someone is planning to take dangerous actions on their own, in which case trying to stop them might be the correct move.

3) Say a friend of yours asks you to critique their writing. This advice basically says you should hold back on some/much of your feedback. In theory you should try to send only the feedback that's most useful but fits inside a "norm-one" limit. This seems different from the "wall of red ink" technique that is commonly praised in writing circles. (Though I find walls of red ink demoralizing, I am not a writer.)

Hm, I'm not at all familiar with the "wall of red ink" technique. I too would feel completely overwhelmed by that kind of thing. Funnily enough, just by Googling a bit I found a writing education company called "NoRedInk".

4) Is it ever useful for someone to say, "Ignore the norm-one limit. Just give me all the criticism you have"? Will it become "low status" not to ask for unlimited-norm criticism?

That's a difficult question. I think it is possible that asking for unlimited criticism could become a status-signalling kind of thing, but I also feel that it wouldn't be subtle enough to really work, especially if the norm-one limit is a visible community principle. Then it might be possible to get called out for doing that.

Comment author: lifelonglearner 06 June 2017 04:46:41AM *  1 point [-]

I found this to be useful. I had not explicitly reasoned about the hypothesis generation and subsequent iteration process like this.

For this part about updating with regards to criticism:

One of my strongest hopes is that whoever is playing the part of the "generator" is able to compile the list of critiques easily and use them to update somewhere close to the optimal direction. This would be difficult if the sum of all critiques is either directionless (many critics point in opposite or near-opposite directions) or very high-magnitude (critics simply say to get as far away from here as possible).

I'm curious exactly what that might entail. Are there any good examples you can give where someone gives a hypothesis, and then some critique in a certain direction / magnitude causes them to shift? What is the analogy when applied to, say, posts on LW about motivation?

(Maybe someone gives an equation for motivation that satisfies certain qualities, and then someone critiques it by bringing up an important quality the equation misses?)

Comment author: tristanm 06 June 2017 04:17:43PM 2 points [-]

I'm curious exactly what that might entail. Are there any good examples you can give where someone gives a hypothesis, and then some critique in a certain direction / magnitude causes them to shift?

Well, I think the recent Dragon Army post and subsequent discussion was a good example. It generated a huge volume of critique, much of it following Norm One, and some of it not. The stuff that did follow Norm One actually did point mainly in the same direction, and mostly consisted of suggestions for how to make the system more robust to failure and implement proper safeguards. This did seem to cause Duncan to update his plan in that direction, and it made the plan a lot more palatable to some (consider Scott Alexander's shift of opinion on it).

Contrast that with the more hostile criticism from that discussion, which probably caused no one to update in any direction, and if anything made it more likely for people to become entrenched in their views.

Comment author: lmn 05 June 2017 11:14:47PM 1 point [-]

Magnitude - Is the criticism too harsh, does it point to something completely unlike the original proposal, or otherwise require changes that aren't feasible for the generator to make?

I'm confused; I thought the point was to avoid getting stuck in local maxima. Discouraging criticisms that are too harsh or demand too many changes seems a weird way of doing that.

Comment author: tristanm 05 June 2017 11:28:19PM 1 point [-]

The point is that when someone is exploring / testing an idea, it might be better for them to explore the region of small updates around the original proposal, instead of easily giving up and trying something completely different. Many ideas fail because of small details that were gotten wrong. When criticism is too harsh, it prevents people from doing even this; they might instead just keep proposing something close to what's already been tried. That is how you actually end up in a local minimum.

Mode Collapse and the Norm One Principle

14 tristanm 05 June 2017 09:30PM

[Epistemic status: I assign a 70% chance that this model proves to be useful, 30% chance it describes things we are already trying to do to a large degree, and won't cause us to update much.] 

I'm going to talk about something that's a little weird, because it uses results from very recent ML theory to make a metaphor about something seemingly entirely unrelated - norms surrounding discourse.

I'm also going to reach some conclusions that surprised me when I finally obtained them, because they caused me to update on a few things I had previously been fairly confident about. This argument basically concludes that we should adopt fairly strict speech norms, and that there could be great benefit to moderating our discourse well.

I argue that discourse can be considered an optimization process and can be thought of in the same way that we think of optimizing a large function. As I will argue, thinking of it in this way will allow us to define a very specific set of norms that are easy to think about and easy to enforce. It is partly a proposal for how to deal with speech that is considered hostile, low-quality, or otherwise harmful. But most importantly, it is a proposal for how to ensure that the discussion always moves in the right direction: towards better solutions and more accurate models.

It will also help us avoid something I'm referring to as "mode collapse" (where new ideas generated are non-diverse and are typically characterized by adding more and more details to ideas that have already been tested extensively). It's also highly related to the concepts discussed in the Death Spirals and the Cult Attractor portion of the Sequences. Ideally, we'd like to be able to make sure that we're exploring as much of the hypothesis space as possible, and there's good reason to believe we're probably not doing this very well.  

The challenge: Making sure we're searching for the global optimum in model-space sometimes requires reaching out blindly into the frontiers, the not well-explored regions, which runs the risk of ending up somewhere very low-quality or dangerous. There are also sometimes large gaps between very different regions of model-space where the quality of the model is very low in-between, but very high on each side of the gap. This requires traversing through potentially dangerous territory and being able to survive the whole way through.

(I'll be using terms like "models" and "hypotheses" quite often, and I hope this isn't confusing. I am using them very broadly, to refer to both theoretical understandings of phenomena and blueprints for practical implementations of ideas.)

We want a set of principles that allows us to do this safely - to think about models of the world that are new and untested, and solutions to problems that have never been attempted in a similar way - while ensuring that, eventually, we can reach the global optimum.

Before we derive that set of principles, I am going to introduce a topic of interest from the field of Machine Learning. This topic will serve as the main analogy for the rest of this piece, and serve as a model for how the dynamics of discourse should work in the ideal case. 

I. The Analogy: Generative Adversarial Networks

For those of you who are not familiar with recent developments in deep learning, Generative Adversarial Networks (GANs) [intro pdf here] are a new class of generative model that is ideal for producing high-quality samples from very high-dimensional, complex distributions. They have caused great buzz and hype in the deep-learning community due to how impressive some of the samples they produce are, and how efficient they are at generation.

Put simply, a generator model and a critic model (sometimes called a discriminator) play a two-player game in which the critic is trained to distinguish between samples produced by the generator and the "true" samples taken from the data distribution. In turn, the generator is trained to maximize the critic's loss function. Both models are usually parametrized by deep neural networks and can be trained by taking turns running a gradient descent step on each. The Nash equilibrium of this game is when the generator's distribution matches the data distribution perfectly. This is never really borne out in practice, but sometimes it gets so close that we don't mind.
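For concreteness, the alternating updates look roughly like the following minimal sketch (PyTorch assumed; the tiny architectures, hyperparameters, and the common non-saturating generator loss are illustrative placeholders of my own, not details taken from any particular paper):

    # Minimal GAN training step (sketch). Architectures and hyperparameters
    # below are placeholders for illustration only.
    import torch
    import torch.nn as nn

    latent_dim, data_dim = 64, 784
    G = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                      nn.Linear(128, data_dim), nn.Tanh())
    D = nn.Sequential(nn.Linear(data_dim, 128), nn.LeakyReLU(0.2),
                      nn.Linear(128, 1))
    opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
    bce = nn.BCEWithLogitsLoss()

    def train_step(real):                 # real: (batch, data_dim) data samples
        batch = real.size(0)
        fake = G(torch.randn(batch, latent_dim))

        # Critic step: push real samples toward "real" and fakes toward "fake".
        opt_D.zero_grad()
        d_loss = (bce(D(real), torch.ones(batch, 1)) +
                  bce(D(fake.detach()), torch.zeros(batch, 1)))
        d_loss.backward()
        opt_D.step()

        # Generator step: update G so the critic scores its samples as "real"
        # (the usual non-saturating stand-in for maximizing the critic's loss).
        opt_G.zero_grad()
        g_loss = bce(D(fake), torch.ones(batch, 1))
        g_loss.backward()
        opt_G.step()
        return d_loss.item(), g_loss.item()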

GANs have one principal failure mode, often called "mode collapse" (a term I'm going to appropriate to refer to a much broader concept), which is usually attributed to the instability of the system. It was often believed that, if a careful balance between the generator and critic could not be maintained, one would eventually overpower the other - leading the critic to provide either useless or overly harsh information to the generator. Useless information will cause the generator to update very slowly or not at all, and overly harsh information will lead the samples to "collapse" onto a small region of the data space containing the easiest targets for the generator to hit.

This problem was essentially solved earlier this year by a series of papers that propose modifications to the loss functions that GANs use and, most crucially, add another term to the critic's loss that pushes the norm of the critic's gradient (with respect to its inputs) to stay close to one. It was recognized that we actually want an extremely powerful critic, so that the generator can make the best updates it possibly can, but the updates themselves can't go beyond what the generator is capable of handling. With these changes to the GAN formulation, it became possible to use crazy critic networks such as ultra-deep ResNets and train them as much as desired before updating the generator network.
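Concretely, the penalty term looks something like this sketch (written in the spirit of the gradient-penalty papers; the variable names and the penalty weight are illustrative, not prescriptive):

    # Gradient penalty sketch: push the norm of the critic's input-gradient
    # toward one. Details are illustrative, not a faithful reproduction of
    # any single paper.
    import torch

    def gradient_penalty(critic, real, fake, weight=10.0):
        # real, fake: (batch, features) tensors of data and generated samples.
        batch = real.size(0)
        # Evaluate the gradient at random interpolates between real and fake.
        eps = torch.rand(batch, 1)
        interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
        scores = critic(interp)
        grads = torch.autograd.grad(outputs=scores, inputs=interp,
                                    grad_outputs=torch.ones_like(scores),
                                    create_graph=True)[0]
        grad_norm = grads.view(batch, -1).norm(2, dim=1)
        # "Norm one": penalize deviation of the gradient norm from 1.
        return weight * ((grad_norm - 1.0) ** 2).mean()

    # Added to the critic's loss, this lets the critic be trained heavily
    # without its feedback to the generator blowing up or vanishing.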

The principle behind their operation is rather simple to describe, but unfortunately it is much more difficult to explain why they work so well. However, I believe that as long as we know how to make one, and know the specific implementation details that improve their stability, their principles can be applied more broadly to achieve success in a wide variety of regimes.

II. GANs as a Model of Discourse

In order to use GANs as a tool for conceptual understanding of discourse, I propose modeling the dynamics of debate as a collection of hypothesis-generators and hypothesis-critics. This could be likened to the structure of academia - researchers publish papers, the papers go through peer review, the work is iterated on and improved - and over time this process converges to more and more accurate models of reality (or so we hope). Most individuals within this process play both roles, but in theory the process would still work even if they didn't. For example, Isaac Newton was a superb hypothesis generator, but he also had some wacky ideas that most of us would consider obviously absurd. Nevertheless, calculus and Newtonian physics became a part of our accepted scientific knowledge, and alchemy didn't. The system adopted and iterated on his good ideas while throwing away the bad.

Our community should be capable of something similar, while doing it more efficiently and not requiring the massive infrastructure of academia. 

A hypothesis-generator is not something that just randomly pulls out a model from model-space. It proposes things that are close modifications of things it already holds to be likely within its model (though I expect this point to be debatable). Humans are both hypothesis-generators and hypothesis-critics. And as I will argue, that distinction is not quite as sharply defined as one would think. 

I think there has always been an underlying assumption within the theory of intelligence that creativity and recognition / distinction are fundamentally different. In other words, one can easily understand Mozart to be a great composer, but it is much more difficult to be a Mozart. Naturally this belief made its way into the field of Artificial Intelligence too, and became somewhat of a dogma. Computers might be able to play Chess, they might be able to play Go, but they aren't doing anything fundamentally intelligent. They lack the creative spark; they work on pure brute-force calculation only, with maybe some heuristics and tricks that their human creators bestowed upon them.

GANs seem to defy this principle. Trained on a dataset of photographs of human faces, a GAN generator learns to produce near-photo-realistic images that nonetheless do not fully match any of the faces the critic network saw (one of the reasons why CelebA was such a good choice to test these on), and are therefore in some sense producing things which are genuinely original. It may have once been thought that there was a fundamental distinction between creation and critique, but perhaps that's not really the case. GANs were a surprising discovery, because they showed that it was possible to make impressive "creations" by starting from random nonsense and slowly tweaking it in the direction of "good" until it eventually got there (well okay, that's basically true for the whole of optimization, but it was thought to be especially difficult for generative models).

What does this mean? Could someone become a "Mozart" by beginning a musical composition from random noise and slowly tweaking it until it became a masterpiece?

The above seems to imply "yes, perhaps." However, this is highly contingent on the quality of the "tweaking." It seems possible only as long as the directions to update in are very high quality. What if they aren't very high quality? What if they point nowhere, or in very bad directions?

I think discourse is by default characterized by a large number of these directionless, low-quality contributions, and it's likely that this is one of the main factors behind mode collapse. This is related to what has been noted before: too much intolerance for imperfect ideas (or ideas outside of established dogma) in a community prevents useful tasks from being accomplished, and progress from being made. Academia does not seem immune to this problem. Where low-quality or hostile discussion is tolerated is where this risk is greatest.

Fortunately, making sure we get good "tweaks" seems to be the easy part. Critique is in high abundance. Our community is apparently very good at it. We also don't need to worry much about the ratio of hypothesis-generators to hypothesis-critics, as long as we can establish good principles that allow us to follow GANs as closely as possible. The nice feature of the GAN formulation is that you are allowed to make the critic as powerful as you want. In fact, the critic should be more powerful than the generator (If the generator is too powerful, it just goes directly to the argmax of the critic). 

(In addition, any collection of generators is a generator, and any collection of critics is a critic. So this formulation can be applied to the community setting).

III. The Norm One Principle

So the question then becomes: how do we take an algorithm governing a game between models much simpler than a human - one whose tweaks consist of nothing more than a few very simple equations - and apply the same principles to human discourse?

What I devise here is a strategy for taking the concept of keeping the norm of the critic's gradient as close to one as possible, and using that as a heuristic for how to structure appropriate discourse.

(This is where my argument gets more speculative and I expect to update this a lot, and where I welcome the most criticism).

What I propose is that we begin modeling the concept of "criticism" based on how useful it is to the idea-generator receiving the criticism. Under this model, I think we should start breaking down criticism into two fundamental attributes:

  1. Directionality - does the criticism contain highly useful information, such that the "generator" knows how to update their model / hypothesis / proposal?
  2. Magnitude - Is the criticism too harsh, does it point to something completely unlike the original proposal, or otherwise require changes that aren't feasible for the generator to make?

My claim is that any contribution to a discussion should satisfy the "Norm One Principle." In other words, it should have a well-defined direction, and the quantity of change should be feasible to implement.

If a critique can satisfy our requirements for both directionality and magnitude, then it serves a useful purpose. The inverse claim is that if we can't follow these requirements, we risk falling into mode collapse, where the ideas commonly proposed are almost indistinguishable from the ones that preceded them, and ideas that deviate too far from the norm are harshly condemned and suppressed.

I think it's natural to question whether restricting criticism to follow certain principles is a form of speech suppression that prevents useful ideas from being considered. But the pattern I'm proposing doesn't restrict the "generation" process, the creative aspect which produces new hypotheses. It doesn't restrict the topics that can be discussed. It only restricts the criticism of those hypotheses, so that it is maximally useful to the source of the hypothesis.

One of the primary fears behind having too much criticism is that it discourages people from contributing because they want to avoid the negative feedback. But under the Norm One Principle, I think it is useful to distinguish between disagreement and criticism. I think if we're following these norms properly, we won't need to consider criticism to be a negative reward. In fact, criticism can be positive. Agreement could be considered "criticism in the same direction you are moving in." Disagreement would be the opposite. And these norms also eliminate the kind of feedback that tends to be the most discouraging. 

For example, some things which violate "Norm One":

  • Ad hominem attacks (typically directionless). 
  • Affective Death Spirals (unlimited praise or denunciation is usually directionless, and usually very high magnitude). 
  • Signs that cause aversion (things I "don't like", that trigger my System 1 alarms - these probably violate both directionality and magnitude).
  • Lengthy lists of changes to make (norm greater than 1, ideally we want to try to focus on small sets of changes that have the highest priority). 
  • Repetition of points that have already been made (norm greater than one). 

One of my strongest hopes is that whoever is playing the part of the "generator" is able to compile the list of critiques easily and use them to update somewhere close to the optimal direction. This would be difficult if the sum of all critiques is either directionless (many critics point in opposite or near-opposite directions) or very high-magnitude (critics simply say to get as far away from here as possible).

But let's suppose that each individual criticism satisfies the Norm One Principle. We will also assume that the generator weighs each critique by their respect for whoever produced it, which I think is highly likely. Then the generator should be able to move in some direction unless the sum of the directions completely cancels out. That is unlikely to happen - unless there is very strong epistemic disagreement in the community over some fundamental assumptions (in which case the conversation should probably move over to that).

In addition, it becomes less likely for the directions to cancel out as the number of inputs increases. Thus, it seems that proposals for new models should be presented to a wide audience, and we should avoid the temptation to keep our proposals hidden from all but a small set of people we trust.
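As a toy illustration of that claim (my own sketch, nothing rigorous): if each critique is modeled as a unit-norm vector pointing in a random direction of some "idea space", the summed direction almost never cancels to zero, and its typical length actually grows as more critics are added.

    # Toy simulation: sum of n random unit vectors ("critiques of norm one").
    # NumPy assumed; dimensions and counts are arbitrary illustrative choices.
    import numpy as np

    rng = np.random.default_rng(0)

    def mean_summed_norm(n_critics, dim=10, trials=2000):
        v = rng.normal(size=(trials, n_critics, dim))
        v /= np.linalg.norm(v, axis=2, keepdims=True)  # each critique has norm one
        return np.linalg.norm(v.sum(axis=1), axis=1).mean()

    for n in (1, 4, 16, 64):
        print(n, round(mean_summed_norm(n), 2))
    # The length of the summed direction grows roughly like sqrt(n); exact
    # cancellation essentially requires critics to be deliberately opposed.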

So I think that, in general, this proposed structure should tend to increase the amount of collective trust we have in the community, and that it favors transparency and diversity of viewpoints.

But what of the possible failure modes of this plan? 

This model should fail if the specific details of its implementation either remove too much discussion, or fail to deal with individuals who refuse to follow the norms and refuse to update. Any implementation should allow room for anyone to update. Someone who posts an extremely hostile, directionless comment should be allowed chances to modify their contribution. The only scenario in which the "banhammer" becomes appropriate is when this model fails to apply: The cardinal sin of rationality, the refusal to update. 

IV. Building the Ideal "Generator"

As a final point, I'll note that the above assumes that generators will be able to update their models incrementally. The easy part, as I mentioned, was obtaining the updates; the hard part is accumulating them. This seems difficult with the infrastructure we have in place. What we do have is a good system for posting proposals and receiving feedback (the blog post / comment thread set-up), but this assumes that each "generator" keeps track of their models by themselves and has to stay aware of the status of other models on their own. There is no centralized "mixture model" anywhere that contains the full set of models, weighted by how much probability the community gives them. Currently, we do not have a good solution for this problem.

However, it seems that the first conception of Arbital was centered around finding a solution to this kind of problem:

Arbital has bigger ambitions than even that. We all dream of a world that eliminates the duplication of effort in online argument - a world where, the same way that Wikipedia centralized the recording of definite facts, an argument only needs to happen once, instead of being reduplicated all over the Internet; with all the branches of the argument neatly recorded in the same place, along with some indication of who believes what. A world where 'just check Arbital' had the same status for determining the current state of debates, as 'just check Wikipedia' now has when somebody starts arguing about the population of Melbourne. There's entirely new big subproblems and solutions, not present at all in the current Arbital, that we'd need to tackle that considerably more difficult problem. But to solve 'explaining things' is something of a first step. If you have a single URL that you can point anyone to for 'explaining Bayes', and if you can dispatch people to different pages depending on how much math they know, you're starting to solve some of the key subproblems in removing the redundancy in online arguments.

If my proposed model is accurate, then it suggests that the problem Arbital aims to solve is in fact quite crucial to solve, and that the developers of Arbital should consider working through each obstacle they face without pivoting from this original goal. I feel confident enough that this goal should be high priority that I'd be willing to support its development in whatever way is deemed most helpful and is feasible for me (I am not an investor, but I am a programmer and would also be capable of making small donations, or contributing material). 

The only thing this model would require of Arbital is that it be as open as possible to contributions, with heavy moderation or filtering of contributed content (but importantly not the other way around, where contribution is closed to a small group of trusted people).

Currently, the incremental changes that would have to be made to LessWrong and related sites like SSC would simply be increased moderation of comment quality. Otherwise, any further progress on the problem would require overcoming much more serious obstacles involving significant re-design and architecture changes.

Everything I've written above is also subject to the model I've just outlined, and therefore I expect to make incremental updates as feedback to this post accrues.

My initial prediction for feedback to this post is that the ideas might be considered helpful and offer a useful perspective or a good starting point, but that there are probably many details that I have missed that would be useful to discuss, or points that were not quite well-argued or well thought-out. I will look out for these things in the comments.   

Comment author: tcheasdfjkl 01 June 2017 03:47:27AM 3 points [-]

I'm new on this site (though I've been in other rationalist spaces) and have some technical questions!

  1. How do I upvote things? I do not see an upvote button. Is something broken or am I missing something?

  2. On mobile (Android), I can type a comment but I cannot submit it (there is no submit button). Is this a known issue, or again, am I missing something?

Comment author: tristanm 04 June 2017 04:24:23PM 0 points [-]

On mobile (Android), I can type a comment but I cannot submit it (there is no submit button). Is this a known issue, or again, am I missing something?

I've gotten around it by typing a comment (long enough that it causes a slider bar on the side of the box to show up) and then clicking on the bottom right corner and sliding it around so that the box resizes. For some reason, this causes the comment button to show up.
