Eva Vivalt here. I was sent the link to this discussion and am happy to discuss, though I won't have time to really go back and forth: as you can imagine, there is a big time crunch, and replying to people on the internet who want more things to be up is not an effective way of getting things up! :) (That may not be your main concern, but it is mine.)
Ironically, I think the points characterizing the post as mistaken or misleading are, well, mistaken or misleading. Responding to the bullets in turn:
The argument isn't that making comparisons between outcome measures is not the place of a charity evaluator; it is that if you are going down that route, you had better have a good basis for your comparisons. I would agree that discussing which outcome measures are important is valuable. That's precisely my point: imho, GiveWell is not "encouraging discussion about what outcome measures are important" but rather discouraging it by providing recommendations that don't acknowledge that there is an issue here. I'm told they used to highlight this more and shifted because people wanted a simple number, but that's not obvious to the casual reader.
I'm not sure what this adds: "Also, saying this is better for 'people with choice paralysis or who don't have any idea how to evaluate different types of outcomes' seems to be missing the point. It is a significant, largely empirical challenge to determine which intermediate outcome measures most matter for the things we ultimately care about. Whether or not GiveWell does that passably, it is clearly something which needs to be done and which individual donors are not well-equipped to do." For the record, as it may not be clear, "people with choice paralysis or who don't have any idea how to evaluate different types of outcomes" aren't my words. I agree that the issue is important; I disagree that they do a good job at it, and therein lies the rub.
My impression was that they themselves would agree they are not in a great position to evaluate studies that did not use randomization.
First, thanks to paulfchristiano for the moderation. I'm also trying to be moderate, but it's sometimes hard to gauge one's own tone on the internet.
Now, apologies for replying to numerous points from different people in one post, but I would feel strange posting all over the place here, and this is probably my last post. If people have more questions, it would be helpful to send them to me directly, and I can try to address them on the blog while multitasking (that way more people can benefit from the answers, since, as good as Less Wrong is, I doubt it's the most appropriate long-term home for the main questions people have about AidGrade): http://www.aidgrade.org/blog.
Re: ygert's "I don't care about how many people are dying of malaria. I just don't. What I do care about is people dying, or suffering, of anything": We're trying to build up to this, but we're not there yet. Hang on, please. GiveWell in 2013 is also much better than GiveWell 1.0 was.
Just to quickly add: I've also separately been informed that GiveWell's rationale for simplifying was that donors themselves seemed to focus on global health, with reference to section 2 of http://blog.givewell.org/2011/02/04/givewells-annual-self-evaluation-and-plan-a-big-picture-change-in-priorities/. My gut says that if they had picked a different organization as their #1 rated organization, they would see less emphasis on global health, but I can understand wanting to focus on what their main donors supported. It's a fair point -- if QALYs are what people want, that's what people want. But do people really put no weight on education, etc.? If you think of the big philosophers, you don't think of Nussbaum or Singer or whoever else saying okay, QALYs are all that matter. I'm not saying who's right here, but I do think there's a greater diversity of opinion than is being reflected here; the popularity of QALYs might in part be due to the fact that we have a measure for them (as opposed to, e.g., something that aggregates education (EALYs?), aggregates across all fields, or is harder to measure).
Re: meta-analysis -- first, a meta-analysis tool should in principle be weakly better than (at least as good as) looking at any one study. (See: http://www.aidgrade.org/faq/how-will-you-provide-information-on-context.) An advantage of gathering all these data and coding up different characteristics of studies is that it allows easier filtering of studies later on, so that people can look at results in different settings. If results vary widely by setting, you can see that, too. Second, all the things that go into a literature review of a topic also go into a meta-analysis, which is more like a superset. So if you don't think a paper was particularly good, for whatever reason, you can flag that and exclude it from the meta-analysis. We have some quality measures, not that you can tell from what's currently online, unfortunately.
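To make the filter-then-pool idea concrete, here is a minimal sketch (my own illustration, not AidGrade's actual code, data, or methodology) of a fixed-effect, inverse-variance-weighted meta-analysis in Python, where each study record carries coded characteristics (a setting label and a quality flag) that can be used to exclude studies or subset to a particular context before pooling:

```python
import math

# Hypothetical study records: effect size, standard error, and coded characteristics.
# The numbers and labels are illustrative only.
studies = [
    {"id": "study_A", "effect": 0.12, "se": 0.05, "setting": "rural", "quality_ok": True},
    {"id": "study_B", "effect": 0.30, "se": 0.10, "setting": "urban", "quality_ok": True},
    {"id": "study_C", "effect": 0.05, "se": 0.04, "setting": "rural", "quality_ok": False},
]

def pool_fixed_effect(subset):
    """Inverse-variance-weighted (fixed-effect) pooled estimate and its standard error."""
    weights = [1.0 / s["se"] ** 2 for s in subset]
    pooled = sum(w * s["effect"] for w, s in zip(weights, subset)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# The same coded records support both uses: drop studies flagged on quality,
# or subset to one setting, then pool whatever remains.
kept = [s for s in studies if s["quality_ok"]]
rural = [s for s in kept if s["setting"] == "rural"]

print(pool_fixed_effect(kept))   # pooled estimate across quality-passing studies
print(pool_fixed_effect(rural))  # pooled estimate for the rural subset only
```

The point of the sketch is just that once studies are coded this way, a reader who disagrees with a particular inclusion decision can re-run the pooling with that study excluded, rather than being stuck with a single headline number.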
My overall impression is that since GiveWell has quite rightly been supported by pretty much everyone who cares about aid and data, it's particularly hard to say anything different. Hardly anyone has any tribal affiliation to AidGrade yet, relatively speaking; there's the unknown, etc. But while I feel the concern (and excitement) here has come from people considering AidGrade as a competitor, I would like to point out that each stands to benefit from the other as well. (Actually, now I see that paulfchristiano makes that point as well.)
And on that note, I'll try to bow out / carry on the conversation elsewhere.