
Comment author: Viliam 10 February 2017 10:53:49AM 6 points [-]

The original purpose of downvoting was to allow community moderation. Here, "moderation" means two things: (1) Giving higher visibility to high-quality content. We still have this functionality: the upvotes. (2) Removing low-quality content. Comments with karma below -5, along with their whole subthreads, are collapsed by default. This is especially important when newcomers start spamming LW with a lot of low-quality comments, which happened more often in the past when LW was more popular.

And the "community" aspect means that these decisions about what to show prominently and what to hide are done by the local "hive mind", i.e. everyone, more precisely anyone above some amount of karma. This is good for several reasons: "wisdom of the crowds", preventing a few people from getting disproportional power, but most practically because moderators are busy and unable to review everything.

Why it was disabled:

The previous political debates on LW attracted one very persistent and very "mind-killed" person, known as Eugine. This guy made it his personal mission to promote neoreactionary politics on LW and to harass away everyone who disagreed (because, in his view, people who disagree with him or with neoreaction are by definition irrational and don't belong here). To achieve this, he abused the downvoting system.

The first form of abuse was punishing everyone who disagreed with him by going through their comment history and downvoting all their previous comments. That meant one day you wrote a comment he didn't like, and the next day you had lost hundreds of karma points. Afterwards, every comment you wrote immediately received one downvote.

This was against how the karma system was supposed to be used (you were supposed to vote on specific comments, not on users), and it pretty much ruined our important feedback system. Eugine was asked to stop doing this; he didn't give a fuck. So his account was banned, but he created another one, and then another. It became a game of whack-a-mole, where Eugine created hundreds of accounts and moderators tried to find and remove them. Even worse, with multiple accounts Eugine started voting multiple times: if he disliked a comment, he downvoted it from a dozen accounts, immediately moving its karma into negative numbers. He typically downvoted all comments that disagreed with neoreactionary politics, or that mentioned Eugine.

The LessWrong code is a clone of Reddit; it is not elegant code, and the database is even less elegant. A few professional web developers tried to implement a few changes; most of them left crying, and the few changes that were successfully implemented took a lot of time. Fighting Eugine was a huge drain on resources, and one of the main reasons why LW is currently "dead".

What now:

The short-term solution was to disable downvotes, thus removing Eugine's ability to censor comments he doesn't like. Yeah, it has a few negative side effects.

The long-term solution is to move the whole website to a completely different codebase that will be easier to maintain. This is a work in progress. Mindful of the planning fallacy, I will not give any estimates beyond "it will be done when it's done". On the new software, downvoting (or some other method of removing low-quality content) will presumably exist.

Comment author: VincentYu 10 February 2017 03:05:06PM 1 point [-]

Thanks for writing such a comprehensive explanation!

Comment author: VincentYu 10 February 2017 06:25:09AM 0 points [-]

Why is downvoting disabled, for how long has it been like this, and when will it be back?

Comment author: gjm 09 February 2017 11:58:53AM 2 points [-]

the whole point of this forum

It really isn't. One of the reasons for the founding of this forum, yes. But what this forum is meant to be for is advancing the art of human rationality. If compelling evidence comes along that AI safety research is useless and AI research is vanishingly unlikely to have the sort of terrible consequences feared by the likes of MIRI, then "this forum" should be very much in the business of advocating against AI safety research.

Comment author: VincentYu 09 February 2017 02:03:18PM *  1 point [-]

In support of your point, MIRI itself changed (in the opposite direction) from its former stance on AI research.

You've been around long enough to know this, but for others: The former ambition of MIRI in the early 2000s—back when it was called the SIAI—was to create artificial superintelligence, but that ambition changed to ensuring AI friendliness after considering the "terrible consequences [now] feared by the likes of MIRI".

In the words of Zack_M_Davis 6 years ago:

(Disclaimer: I don't speak for SingInst, nor am I presently affiliated with them.)

But recall that the old name was "Singularity Institute for Artificial Intelligence," chosen before the inherent dangers of AI were understood. The unambiguous "for" is no longer appropriate, and "Singularity Institute about Artificial Intelligence" might seem awkward.

I seem to remember someone saying back in 2008 that the organization should rebrand as the "Singularity Institute For or Against Artificial Intelligence Depending on Which Seems to Be a Better Idea Upon Due Consideration," but obviously that was only a joke.

I've always thought it's a shame they picked the name MIRI over SIFAAIDWSBBIUDC.

Comment author: maxjmartin 06 February 2017 12:15:23PM *  4 points [-]

(some previous discussion of predictionbook.com here)

[disclaimer: I have only been using the site seriously for around 5 months]

I was looking at the growth of predictionbook.com recently, and there has been a fairly stable addition of about 5 new public predictions per day since 2012 (counting only new predictions, not additional wagers on existing predictions). I was curious why the site does not seem to be growing, and why it is so rarely mentioned or linked to on lesswrong and related blogs.

(Sidebar: total predictions (based on the IDs of the public predictions) are growing at about double that rate, although there was huge growth around 2015 (graph) that I assume was either a script generating automated predictions or perhaps testing by the devs -- does anyone know what caused this?)
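
A minimal sketch of that ID-based estimate, assuming IDs are assigned sequentially across public and private predictions (the `records` pairs below are made-up illustrations, not real data):

    # Estimate the overall prediction-creation rate from the IDs of public
    # predictions: if IDs are assigned sequentially across public and private
    # predictions, the slope of ID vs. creation date estimates the total rate.
    from datetime import date

    # Hypothetical (id, creation date) pairs read off public prediction pages.
    records = [(41000, date(2012, 1, 5)), (52300, date(2015, 3, 2))]

    (id0, t0), (id1, t1) = min(records), max(records)
    days = (t1 - t0).days
    total_rate = (id1 - id0) / days  # new predictions per day, public + private
    print(f"~{total_rate:.1f} new predictions per day")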

Personally, I find predictionbook to be very useful for:

  • reducing hindsight bias
  • revealing the planning fallacy
  • making me more objective, reducing the effects of the narrative fallacy
  • forcing me to think through questions more thoroughly by considering base rates, what the world would need to look like now for the prediction to come to pass, noticing composite predictions and considering each part individually, etc.
  • making me more aware of other people's failures at prediction, and of when they are careful to make only hard-to-verify predictions
  • making me more wary of post-hoc rationalization of events I would not have predicted
  • fun

Gwern covers many other benefits of making and tracking predictions here.

I would expect predictionbook to be more popular, since I am not aware of any similar services and I find making predictions so useful. I was therefore wondering:

  • who on lesswrong tracks their predictions outside of predictionbook, and their thoughts on that method
  • who is not tracking their predictions at all, and why they made that decision

Comment author: VincentYu 09 February 2017 11:04:58AM 0 points [-]
  • who on lesswrong tracks their predictions outside of predictionbook, and their thoughts on that method

Just adding to the other responses: I also use Metaculus and like it a lot. In another thread, I posted a rough note about its community's calibration.

Compared to PredictionBook, the major limitation of Metaculus is that users cannot create and predict on arbitrary questions, because questions are curated. This is an inherent limitation/feature for a website like Metaculus because they want the community to focus on a set of questions of general interest. In Metaculus's case, 'general interest' translates mostly to 'science and technology'; for questions on politics, I suggest taking a look at GJ Open instead.

Comment author: VincentYu 05 February 2017 12:16:15PM 3 points [-]

Here is the full text article that was actually published by Kahneman et al. (2011) in Harvard Business Review, and here is the figure that was in HBR:

12 Questions to Ask Before You Make That Big Decision

Comment author: 9eB1 03 February 2017 07:32:40PM *  1 point [-]

Is there any information on how well-calibrated the community predictions are on Metaculus? I couldn't find anything on the site. Also, if one wanted to get into it, could you describe what your process is?

Comment author: VincentYu 04 February 2017 03:00:44PM 3 points [-]

Is there any information on how well-calibrated the community predictions are on Metaculus?

Great question! Yes. There was a post on the official Metaculus blog that addressed this, though that was back in Oct 2016. In the past, they've also sent subscribed users a few emails looking at community calibration.

I actually did my own analysis of this around two months ago, in private communication. Let me just copy two of the plots I created and what I said there. You might want to ignore the plots and details, and just skip to the "brief summary" at the end.

(Questions on Metaculus go through an 'open' phase and then a 'closed' phase; predictions can only be made and updated while the question is open. After a question closes, it gets resolved either positive or negative once the outcome is known. I based my analysis on the 71 questions that had been resolved as of 2 months ago; there are around 100 resolved questions now.)

Plot for median predictions

First, here's a plot for the 71 final median predictions. The elements of this plot:

  • Of all monotonic functions, the black line is the one that, when applied to this set of median predictions, performs best (in mean score) under every proper scoring rule, given the realized outcomes. It can be interpreted as a histogram with adaptive bin widths: for instance, the figure shows that, binned together, predictions from 14% to 45% resolved positive around 0.11 of the time. This is also the maximum-likelihood monotonic function (see the sketch after this list).

  • The confidence bands are for the null hypothesis that the 71 predictions are all perfectly calibrated and independent, so that we can sample the distribution of counterfactual outcomes simply by treating the outcome of each prediction with credence p as an independent coin flip with probability p of positive resolution. I sampled 80,000 sets of these 71 outcomes and built the confidence bands by computing the corresponding maximum-likelihood monotonic function for each set. The inner band is pointwise 1 sigma, whereas the outer is familywise 2 sigma. So the corner of the black line that exceeds the outer band around predictions of 45% is a p < 0.05 event under perfect calibration, and it looks to me like predictions around 30% to 40% are miscalibrated (underconfident).

  • The two rows of tick marks below the x-axis show the 71 predictions, with the upper green row comprising positive resolutions, and the lower red row comprising negatives.

  • The dotted blue line is a rough estimate of the proportion of questions resolving positive along the range of predictions, based on kernel density estimates of the distributions of predictions giving positive and negative resolutions.
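
In case it's useful, here's a rough sketch of the kind of computation involved (not my actual analysis code; `preds` and `outcomes` below are placeholder data):

    # A sketch of the calibration curve: the best monotonic fit under any
    # proper scoring rule is the isotonic (pool-adjacent-violators) regression
    # of outcomes on predictions.
    import numpy as np
    from sklearn.isotonic import IsotonicRegression

    rng = np.random.default_rng(0)
    preds = rng.uniform(0.01, 0.99, 71)   # placeholder for the 71 median predictions
    outcomes = rng.binomial(1, preds)     # placeholder resolutions (1 = positive)

    iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
    iso.fit(preds, outcomes)              # iso.predict(...) traces the "black line"

    # Bands under the null of perfect calibration: treat each prediction p as
    # an independent Bernoulli(p) flip, refit the curve on each sample, and
    # take quantiles of the refitted curves (I used 80,000 samples; 2,000
    # keeps this sketch fast, and this version only shows the pointwise band).
    grid = np.linspace(0.01, 0.99, 99)
    refits = np.array([
        IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
        .fit(preds, rng.binomial(1, preds)).predict(grid)
        for _ in range(2000)
    ])
    lo, hi = np.percentile(refits, [15.9, 84.1], axis=0)  # pointwise 1-sigma band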

Plot for all predictions

Now, a plot of all 3723 final predictions on the 71 questions.

  • The black line is again the monotonic function that minimizes mean proper score, but with the 1% and 99% predictions removed because—as I expected—they were especially miscalibrated (overconfident) compared to nearby predictions.

  • The two black dots indicate the proportion of questions resolving positive for 1% and 99% predictions (around 0.4 and 0.8, respectively).

  • I don't have any bands indicating dispersion here because these predictions are a correlated mess that I can't deal with. But for predictions below 20%, the deviation from the diagonal looks large enough that I think it shows miscalibration (overconfidence).

  • Along the x-axis I've plotted kernel density estimates of the predictions resolving positive (green, solid line) and negative (red, dotted line). The kernel densities were computed in log-odds space with Gaussian kernels, then converted back to probabilities in [0, 1] (see the sketch after this list).

  • The blue dotted line is again a rough estimate of the proportion resolving positive, using these two density estimates.
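
A rough sketch of that density computation (again not my actual code; the change of variables back to the probability scale is the only subtle step):

    # KDE with Gaussian kernels in log-odds space, mapped back to [0, 1].
    import numpy as np
    from scipy.stats import gaussian_kde

    def logit(p):
        return np.log(p / (1 - p))

    def density_on_probability_scale(preds, grid_p):
        """Density of predictions, estimated in log-odds space."""
        kde = gaussian_kde(logit(preds))
        # Change of variables: f_P(p) = f_L(logit(p)) * |d logit/dp|
        #                             = f_L(logit(p)) / (p * (1 - p))
        return kde(logit(grid_p)) / (grid_p * (1 - grid_p))

    # The blue dotted line is then roughly
    #   n_pos * f_pos / (n_pos * f_pos + n_neg * f_neg)
    # evaluated on a grid, where f_pos and f_neg are the densities of
    # predictions that resolved positive and negative, respectively.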

Brief summary:

  • Median predictions around 30% to 40% resolve positive less often than claimed.
  • User predictions below around 20% resolve positive more often than claimed.
  • User predictions at 1% and 99% are obviously overconfident.
  • Other than these, calibration seems okay everywhere else; at least, it isn't obviously off.
  • I'm very surprised that user predictions look fairly accurate around 90% and 95% (resolving positive around 0.85 and 0.90 of the time). I expected strong overconfidence like that shown by the predictions below 20%.

Also, if one wanted to get into it, could you describe what your process is?

Is there anything in particular that you want to hear about? Or would you rather have a general description of 1) how I'd suggest starting out on Metaculus, and/or 2) how I approach making and updating predictions on the site, and/or 3) something else?

(The FAQ is handy for questions about the site. It's linked from the 'help' button at the bottom of every page.)

Comment author: Grothor 01 February 2017 09:03:20PM *  7 points [-]

I'm a cyclist and a PhD student, and I've noticed some patterns in the way that my exercise habits affect my productivity. I get a lot of data from every ride. While I'm riding, I measure heart rate and power, and if I'm outside, I also measure distance and speed. I've found that the total amount of energy that I produce, as measured by the power meter on my bike, is a useful metric for how I should expect to feel the rest of the day and the next day. In particular, if I generate between 800 kJ and 1000 kJ, I usually feel alert, but not worn out. If I do less, I feel like I've not had enough exercise, and I either feel restless or like my body is in lazy recovery mode. If I do more, I feel physically worn out enough that it's hard to work for an extended period of time, especially on the days that I am working in the lab.

What I think is most curious about this is that it is relatively independent of my fitness or the intensity of the ride. If I go balls-out the whole time, it takes slightly fewer kJ to make it hard to focus, and if I go super easy, it takes a bit more. It's the same with fitness. The difference between the power I can sustain for an hour when I'm in form for racing vs when I've barely been riding at all is about 25-30%, but the difference in the amount of mechanical work to make me unproductive is about 10%. (You might notice this gives me an incentive to stay in shape; I can do the same amount of work for the same productivity boost in less time when I'm more fit.)

So, what's definitely true is that the amount of work I put in on the bike is a useful metric for maximizing my productivity. What's unclear is if the amount of work is in some way fundamental to the mental state that it puts me in. The most obvious possibility is that it mainly has to do with the number of calories I burn; this is consistent with the finding that I need to do more work to feel tired when I'm more fit, since training will make you more efficient. But it's not obvious to me why this would be the case. When I'm in poor shape, an 800 kJ ride will have a much more drastic effect on my blood sugar than it will when I'm fit enough to race. It would be useful to venture outside the 800-1000 kJ range on days when I need to get work done.
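
For a sense of scale, here's a back-of-the-envelope conversion from mechanical work to calories (using the standard ~20-25% gross-efficiency figure for cycling from the exercise-physiology literature, not anything I've measured):

    # Convert mechanical work at the pedals to estimated metabolic energy.
    # Assumes a gross cycling efficiency of ~20-25% (a textbook range).
    KJ_PER_KCAL = 4.184

    def metabolic_kcal(mechanical_kj, efficiency=0.22):
        """Estimate calories burned from power-meter work."""
        return mechanical_kj / efficiency / KJ_PER_KCAL

    # A 900 kJ ride at 22% efficiency costs about 900 / 0.22 / 4.184 ~= 980 kcal;
    # this is where the cyclists' rule of thumb "1 kJ of work ~= 1 kcal burned"
    # comes from.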

I don't really know enough physiology to get any further than this. Does anybody else have experience with this sort of thing? Does anyone have empirically testable hypotheses? (Non-testable or not-testable-for-me hypotheses may be interesting as well.)

Comment author: VincentYu 02 February 2017 11:12:16AM 1 point [-]

That's some neat data and observations! Could there be other substantial moderating differences between the days when you generate ~900 kJ and the days when you don't? (E.g., does your mental state before you ride affect how much energy you generate? That would suggest a different causal relationship.) If there are, maybe some of these effects can be removed if you independently randomize the energy you generate each time you ride, so that you don't get to choose how much you ride (a minimal sketch of this follows below).

To make this a single-blinded experiment, just wear a blindfold; to double blind, add a high-beam lamp to your bike; and to triple blind, equip and direct high beams both front and rear.

… okay, there will be no blinding.
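
More seriously, here is a minimal sketch of pre-committing to a randomized energy schedule (the 600-1200 kJ range and 30-ride horizon are arbitrary illustrations):

    # Pre-commit to a randomized energy target for each ride, so the day's
    # mental state can't influence how much work gets done on the bike.
    import random

    random.seed(2017)  # fixed seed: the whole schedule is decided in advance
    schedule = [round(random.uniform(600, 1200), -1) for _ in range(30)]
    print(schedule[:5])  # first few kJ targets, rounded to the nearest 10 kJ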

Comment author: VincentYu 02 February 2017 09:37:34AM *  2 points [-]

Polled.

  1. I generally do only a quick skim of post titles and open threads (edit: maybe twice a month on average; I'll try visiting more often). I used to check LW compulsively prior to 2013, but now I think both LW and I have changed a lot and diverged from each other. No hard feelings, though.

  2. I rarely click link posts on LW. I seldom find them interesting, but I don't mind them as long as other LWers like them.

  3. I mostly check LW through a desktop browser. Back in 2011–2012, I used Wei Dai's "Power Reader" script to read all comments. I also used to rely on Dbaupp's "scroll to new comments" script after they posted it in 2011, but these days I use Bakkot's "comment highlight" script. (Thanks to all three of you!)

  4. I've been on Metaculus a lot over the past year. It's a prediction website focusing on science and tech (the site's been mentioned a few times on LW, and in fact that's how I heard of it). It's sort of like a gamified and moderated PredictionBook. (Edit: It's also similar to GJ Open, but IMO, Metaculus has way better questions and scoring.) It's a more-work-less-talk kind of website, so it's definitely not a site for general discussions.

    I've been meaning to write an introductory post about Metaculus… I'll get to that sometime.

    Given that biases, heuristics, and the Bayesian interpretation of probability were among LW's past focuses, I think some of you might find it worthwhile and fun to get some real-world practice at updating subjective probabilities on evidence. Metaculus is all about that sort of stuff, so join us! (My username there is 'v'. I recognize a few of you, especially WhySpace, over there.) The site itself is under continual improvement, and I know that the admins have high ambitions for it.

Edit: By the way, this is a great post and idea. Thanks!

Comment author: VincentYu 08 December 2016 07:51:28AM 0 points [-]

I haven't been around for a while, but I expect to start fulfilling the backlog of requests after Christmas. Sorry for the long wait.

Comment author: knb 02 May 2016 10:51:30AM *  1 point [-]

BBC News is running a story claiming that the creator of Bitcoin, known as Satoshi Nakamoto, is an Australian named Craig Wright.

Comment author: VincentYu 02 May 2016 01:37:43PM 0 points [-]

Do we know which country Wright was living in during 2010?
