Eva Vivalt here. Was sent the link to this discussion, happy to discuss, though I won't have time to really go back and forth because as you can imagine there is a big time crunch and replying to people on the internet who want more things to be up is not an effective way of getting things up! :) (That may not be your main concern but it is mine.)
Ironically, I think the points characterizing the post as mistaken or misleading are, well, mistaken or misleading. Responding to the bullets in turn:
The argument isn't that making comparisons between outcome measures is not the place of a charity evaluator, it is that if you are going down that route you had better have a good basis for your comparisons. I would agree that discussing what outcome measures are important is valuable. That's precisely my point - imho, GiveWell is not "encouraging discussion about what outcome measures are important" but rather discouraging it by providing recommendations that don't acknowledge that there is an issue here. I'm told they used to highlight this more and shifted because people wanted a simple number, but that's not obvious to the casual reader.
Not sure what this adds: "Also, saying this is better for 'people with choice paralysis or who don't have any idea how to evaluate different types of outcomes' seems to be missing the point. It is a significant, largely empirical challenge to determine which intermediate outcome measures most matter for the things we ultimately care about. Whether or not GiveWell does that passably, it is clearly something which needs to be done and which individual donors are not well-equipped to do." For the record, as it may not be clear, "people with choice paralysis or who don't have any idea how to evaluate different types of outcomes" aren't my words. Agree that the issue is important, disagree they do a good job at it, and therein lies the rub.
My impression was that they themselves would agree they are not in a great position to evaluate studies that did not use randomization.
On your 'about' page as well as in the linked article, you criticize GiveWell for vote counting. The example you cite of this is their microfinance review. I don't know how solid this review was, but there are at least plausible reasons for treating the null results as negative evidence in this case, and I would bet on GiveWell's analysis over your meta-analysis, though not confidently.
Do you stand by the claim that GiveWell's analysis is badly flawed and that donors should trust your meta-analysis of microfinance instead? If so, I'll look into the case more closely and update my views accordingly.
AidGrade is a new charity evaluator that looks to be comparable to GiveWell. Their primary difference is that they *only* focus on how charities compare along particular measured outcomes (such as school attendance, birthrate, chance of opening a business, malaria), without making any effort to compare between types of charities. (This includes interesting results like "Conditional Cash Transfers and Deworming are better at improving attendance rates than scholarships.")
GiveWell also does this, but designs their site to direct people towards their top charities. This is better for people who don't have the time to do the (fairly complex) work of comparing charities across domains, but AidGrade aims to be better for people that just want the raw data and the ability to form their own conclusions.
I haven't looked at it enough to compare the quality of the two organizations' work, but I'm glad we finally have another organization, to encourage some competition and dialog about different approaches.
This is a fun page to play around with to get a feel for what they do:
http://www.aidgrade.org/compare-programs-by-outcome
And this is a blog post outlining their differences with GiveWell:
http://www.aidgrade.org/uncategorized/some-friendly-concerns-with-givewell