All of Raelifin's Comments + Replies

Raelifin4y10

If you have multiple quality metrics then you need a way to aggregate them (barring more radical proposals). Let’s say you sum them (the specifics of how they combine are irrelevant here). What has been created is essentially a 25-star system with a more explicit breakdown. This is essentially what I was suggesting. Rate each post on 5 dimensions from 0 to 2, add the values together, and divide by two (min 0.5), and you have my proposed system. Perhaps you think the interface should clarify the distinct dimensions of quality, but I think UI simplicity is p... (read more)

Raelifin4y10

I agree that there are benefits to hiding karma, but it seems like there are two major costs. The first is in reducing transparency; I claim that people like knowing why something is selected for them, and if karma becomes invisible the information becomes hidden in a way that people won’t like. (One could argue it should be hidden despite people’s desires, but that seems less obvious.) The other major reason is one cited by Habryka: creating common knowledge. Visible Karma scores help people gain a shared understanding of what’s valued across the site. Ra... (read more)

Raelifin4y50

I suggested the 5-star interface because it's the most common way of giving things scores on a fixed scale. We could easily use a slider, or a number between 0 and 100 from my perspective. I think we want to err towards intuitive/easy interfaces even if it means porting over some bad intuitions from Amazon or whatever, but I'm not confident on this point.

I toyed with the idea of having a strong-bet option, which lets a user put down a stronger QJR bet than normal, and thus influence the community rating more than they would by default (albeit exposing them... (read more)

Raelifin4y10

I agree with the expectation that many posts/comments would be nearly indistinguishable on a five-star scale. I'm not sure there's a way around this while keeping most of the desirable properties of having a range of options, though perhaps increasing it from 10 options (half-stars) to 14 or 18 options would help.

My basic thought is that if I can see a bunch of 4.5 star posts, I don't really need the signal as to whether one is 4.3 stars vs 4.7 stars, even if 4.7 is much harder to achieve. I, as a reader, mostly just want a filter for bad/mediocre posts, a... (read more)

5habryka4y

One of my ideas for this (when thinking about voting systems in general) is to have a rating that is trivially inconvenient to access. Like, you have a ranking system from F to A, but then you can also hold the A button for 10 seconds, and then award an S rank, and then you can hold the S button for 30 seconds, and award a double S rank, and then hold it for a full minute, and then award a triple S rank. The only instance I've seen of something like this implemented is Medium's clap system, which allows you to give up to 50 claps, but you do have to click 50 times to actually give those claps.

Raelifin4y30

Ah! This looks good! I'm excited to try it out.

Raelifin4y50

Yep. I'm aware of that. Our karma system is better in that regard, and I should have mentioned that.

Raelifin4y70

Nice. Thank you. How would you feel about me writing a top-level post reconsidering alternative systems and brainstorming/discussing solutions to the problems you raised?

8habryka4y

Seems great! It's a bit on ice this week, but we've been thinking very actively about changes to the voting system, and so right now is the right time to strike the iron if you want to change the teams opinion on how we should change things, and what we should experiment with.

Raelifin4y40

I also want to note that this proposal isn't mutually exclusive with other ideas, including other karma systems. It seems fine to have there be an additional indicator of popularity that is distinct from quality. Or, more to my liking, would be a button that simply marks that you thought a post was interesting and/or express gratitude towards the writer, without making a statement about how bulletproof the reasoning was. (This might help capture the essence of Rule Thinkers In, Not Out and reward newbies for posting.)

Raelifin4y60

One answer is that if you don't like the mods, you can go somewhere else. Vote with your feet, etc.

A more turtles-all-the-way-down answer is that the stakeholders of LW (the users, and possibly influential community members/investors?) agree on an a... (read more)

Viliam4y100

I guess the question is, what is the optimal amount of consensus. Where do we want to be, on the scale from Eternal September to Echo Chamber?

Seems the me that the answer depends on how much correct we are, on average. To emphasise: how much correct we actually are, not how much correct we want to be, or imagine ourselves to be.

On a website where moderators are correct about almost everything, most disagreement is a noise. (It may provide a valuable feedback on "what other people believe", but not on how things actually are.) It is okay to punish disagreem... (read more)

Raelifin4y90

To my mind the primary features of this system that bear on Duncan's top-level post are:

High-reputation judges can confidently set the quality signal for a piece of writing, even if they're in the minority. The truth is not a popularity contest, even when it comes to quality.
The emphasis on betting means that people who "upvote" low-quality posts or "downvote" high-quality ones are punished, making "this made me feel things, and so I'm going to bandwagon" a dangerous mental move. And people who make this sort of move would be efficiently sidelined.

In conce... (read more)

habryka4y*120

I like this idea. It has a lot of nice attributes.

I wrote some in the past about what all the different things are that a voting/karma system on LW is trying to produce, with some thoughts on some proposals that feel a bit similar to this: https://www.lesswrong.com/posts/EQJfdqSaMcJyR5k73/habryka-s-shortform-feed?commentId=8meuqgifXhksp42sg

Grass Valley, CA – ACX Meetups Everywhere 2021

Raelifin4y290

First of all, thank you, Duncan, for this post. I feel like it captures important perspectives that I've had, and problems that I can see and puts them together in a pretty good way. (I also share your perspective that the post Could Be Better in several ways, but I respect you not letting the perfect be the enemy of the good.)

I find myself irritated right now (bothered, not angry) that our community's primary method of highlighting quality writing is by karma-voting. It's a similar kind of feeling to living in a democracy--yes, there are lots of systems t... (read more)

6Yoav Ravid4y

I think this is too complex for a comment system, but upvoted for an interesting and original idea.

4Raelifin4y

6Raelifin4y

One obvious flaw with this proposal is that the quality-indicator would only be a measure of expected rating by a moderator. But who says that our moderators are the best judges of quality? Like, the scheme is ripe for corruption, and simply pushing the popularity contest one level up to a small group of elites. One answer is that if you don't like the mods, you can go somewhere else. Vote with your feet, etc. A more turtles-all-the-way-down answer is that the stakeholders of LW (the users, and possibly influential community members/investors?) agree on an aggregate set of metrics for how well the moderators are collectively capturing quality. Then, for each unit of time (eg year) and each potential moderator, set up a conditional prediction market with real dollars on whether that person being a moderator causes the metrics to go up/down compared to the previous time unit. Hire the ones that people predict will be best for the site.

9Raelifin4y

To my mind the primary features of this system that bear on Duncan's top-level post are: * High-reputation judges can confidently set the quality signal for a piece of writing, even if they're in the minority. The truth is not a popularity contest, even when it comes to quality. * The emphasis on betting means that people who "upvote" low-quality posts or "downvote" high-quality ones are punished, making "this made me feel things, and so I'm going to bandwagon" a dangerous mental move. And people who make this sort of move would be efficiently sidelined. In concert, I expect that it would be much easier to bring concentrated force down on low-quality bits of writing. Which would, in turn, I think make the quality price/signal a much more meaningful piece of information, instead of the current karma score which is as others noted, is overloaded as a measure.

Raelifin4y10

Update: I decided that I like the grass south of the baseball diamond better. Let's meet there.

Grass Valley, CA – ACX Meetups Everywhere 2021

Raelifin4y10

Hey all, Max here. I was bad/busy on the weekend when I was supposed to provide a more specific location, so I've updated the what3words to a picnic table near the dog/skate park. I reserve the right to continue to adjust the meetup location in the coming weeks if I find even better places, so be sure to check on the 18th for specifics.

I'm an AI safety researcher and author of Crystal Society. I did a bunch of community leading/organizing in Ohio, including running a rationality dojo. I moved out to the bay area in 2016, and to Grass Valley in June. If you... (read more)

1Raelifin4y

Update: I decided that I like the grass south of the baseball diamond better. Let's meet there.

Feedback on LW 2.0

Raelifin8y30

Issue 2 is about to be fixed: https://github.com/Discordius/Lesswrong2/pull/188

Raelifin10y30

I picked 7 Habits because it's pretty clearly rationality in my eyes, but is distinctly not LW style Rationality. Perhaps I should have picked something worse to make my point more clear.

4Vaniver10y

I suspect the point will be clearer if stated without examples? I think you're pointing towards something like "most self-help does not materially improve the lives of most self-help readers," which seems fairly ambiguous to me. Most self-help, if measured by titles, is probably terrible simply by Sturgeon's Law. But is most self-help, as measured by sales? I haven't looked at sales figures, but I imagine it's not that unlikely that half of all self-help books actually consumed are the ones that are genuinely helpful. It also seems to me that the information content of useful self-help is about pointing to places where applying effort will improve outcomes. (Every one of the 7 Habits is effortful!) Part of scientific self-help is getting an accurate handle on how much improvement in outcomes comes from expenditure of effort for various techniques / determining narrowly specialized versions. But if someone doesn't actually expend the effort, the knowledge of how they could have doesn't lead to any improvements in outcomes. Which is why the other arm of self-help is all about motivation / the emotional content. It's not clear to me that LW-style rationality improves on the informational or emotional content of self-help for most of the populace. (I think it's better at the emotional content mostly for people in the LW-sphere.) Most of the content of LW-style rationality is philosophical, which is very indirectly related to self-help.

Raelifin10y20

Ah, perhaps I misunderstood the negative perception. It sounds like you see him as incompetent, and since he's working with a subject that you care about that registers as disgusting?

I can understand cringing at the content. Some of it registers that way to me, too. I think Gleb's admitted that he's still working to improve. I won't bother copy-pasting the argument that's been made elsewhere on the thread that the target audience has different tastes. It may be the case that InIn's content is garbage.

I guess I just wanted to step in and second jsteinhardt's comment that Gleb is a very growth-oriented and positive, regardless of whether his writing is good enough.

-3Lumifer10y

Not only that -- let me again stress the point that his texts cause the "Ewwww" reaction, not "Oh, this is dumb". The slime-and-snake-oil feeling would still be there even if he were writing in the same way about, say, the ballet in China. As to "positive", IlyaShpitser mentioned chutzpah which I think is a better description :-/

Raelifin10y20

I agree! Having good intentions does not imply the action has net benefit. I tried to communicate in my post that I see this as a situation where failure isn't likely to cause harm. Given that it isn't likely to hurt, and it might help, I think it makes sense to support in general.

(To be clear: Just because something is a net positive (in expectation) clearly doesn't imply one ought to invest resources in supporting it. Marginal utility is a thing, and I personally think there are other projects which have higher total expected-utility.)

0Lumifer10y

A failure isn't likely to cause major harm, but by similar reasoning success is not likely to lead to major benefits as well. In simpler terms, InIn isn't likely to have a large impact of any kind. Given this, I still see no reason why minor benefits are more likely than minor harm.

Raelifin10y100

Okay well it seems like I'm a bit late to the discussion party. Hopefully my opinion is worth something. Heads up: I live in Columbus Ohio and am one of the organizers of the local LW meetup. I've been friends with Gleb since before he started InIn. I volunteer with Intentional Insights in a bunch of different ways and used to be on the board of directors. I am very likely biased, and while I'm trying to be as fair as possible here you may want to adjust my opinion in light of the obvious factors.

So yeah. This has been the big question about Intentional In... (read more)

3Vaniver10y

This strikes me as a weird statement, because 7 Habits is wildly successful and seems very solid. What about it bothers you? (My impression is that "a word to the wise is sufficient," and so most clever people find it aggravating when someone expounds on simple principles for hundreds of pages, because of the implication that they didn't get it the first time around. Or they assume it's less principled than it is.)

0Lumifer10y

That does not follow at all. The road to hell is in excellent condition and has no need of maintenance. Having a good goal in no way guarantees that what you do has net benefit and should be supported.

Raelifin10y70

I just wanted to interject a comment here as someone who is friends with Gleb in meatspace (we're both organizers of the local meetup). In my experience Gleb is kinda spooky in the way he actually updates his behavior and thoughts in response to information. Like, if he is genuinely convinced that the person who is criticizing him is doing so out of a desire to help make the world a more-sane place (a desire he shares) then he'll treat them like a friend instead of a foe. If he thinks that writing at a lower-level than most rationality content is currently... (read more)

0ChristianKl10y

I'm not sure whether that's a good idea. Writing that feels weird to the author is also going to transmit that vibe to the audience. We don't want rationality to be associated with feeling weird and unpleasant.

4Gleb_Tsipursky10y

Yeah, we're not optimizing for warm-fuzzies from Less Wrongers, but for a broad impact. Thanks for the sympathetic words, my friend. This road of effective cognitive altruism is a hard one to travel, neither being really appreciated, at least at first, by the ones who we are trying to reach, nor by the ones among our peers whose ideas we are bringing to the masses. Well, if my liver gets consumed daily by vultures, this is the road I've chosen. Glad to have you by my side, and hope this doesn't rebound on you much.

-3OrphanWilde10y

Yes. That's what the status quo is, and how it works. More, there's are multiple levels to the reasons for its existence, and the implicit suggestion that sticking to the status quo would be a tragedy neglect those reasons in favor of romantic notions of fixing the world.

-1Lumifer10y

I can't speak for other people, of course, but he never looked much like a manipulator. He looks like a guy who has no clue. He doesn't understand marketing (or propaganda), the fine-tuned practice of manipulating people's minds for fun and profit. He decided he needs to go downmarket to save the souls drowning in ignorance, but all he succeeded in doing -- and it's actually quite impressive, I don't think I'm capable of it -- is learning to write texts which cause visceral disgust. Notice the terms in which people speak of his attempts. It's not "has a lot of rough edges", it's slime and spiders in human skin and "painful" and all that. Gleb's writing does reach System I, but the effect has the wrong sign.

Raelifin10y20

Impostor entries were generally more convincing than genuine responses. I chalk this up to impostors trying harder to convince judges.

But who knows? Maybe you were a vegetarian in a past life! ;)

0OrphanWilde10y

Doesn't surprise me, but I lean towards the opposite reasoning. I think the majority of people understand vegetarian/vegan arguments, so the imposters don't have any kind of disadvantage - but vegetarian/vegan people likely think the majority of people don't understand those arguments (or else why wouldn't they arrive at the same conclusions), which results in a miscalibration about how to represent their beliefs to people. ETA: Likewise the reverse.

Raelifin10y10

You're right, but I'm pretty confident that the difference isn't significant. We should probably see it as evidence that rationalists omnivores are about as capable as rationalist vegetarians.

If we look at average percent of positive predictions (predictions that earn more than 0 points):

Omnivores: 51%

Vegetarians: 46%

If we look at non-negative predictions (counting 50% predictions):

Omnivores: 52%

Vegetarians: 49%

Raelifin10y10

As Douglas_Knight points out, it's only 10/12, a probability of ~0.016. In a sample of ~50 we should see about one person at that level of accuracy or inaccuracy, which is exactly what we see. I'm no more inclined to give #14 a medal than I am to call #43 a dunce. See the histogram I stuck on to the end of the post for more intuition about why I see these extreme results as normal.

I absolutely will fess up to exaggerating in that sentence for the sake of dramatic effect. Some judges, such as yourself, were MUCH less wrong. I hope you don't mind me outing y... (read more)

8switzer10y

I'm #43 and I'll accept my dunce cap. I responded just after I began lurking here. I remember having little confidence in my responses and yet I apparently answered as if I did. I really have no insight into why I answered this way. My cringeworthy results reinforce to me the importance of sticking around and improving my thinking.

3gjm10y

Or, of course, just lucky. If you aren't giving #14 a medal, you shouldn't be giving me one either. (Though, as it happens, I have some reason to think my calibration is pretty good.) And yes, I was still slightly overconfident, and my intention in what I wrote above was to make it clear that I recognize that.

Raelifin10y60

In retrospect I ought to have included options closer to 50%. I didn't expect that they'd be so necessary! You are absolutely right, though.

A big part of LessWrong, I think, is learning to overcome our mental failings. Perhaps we can use this as a lesson that the best judge writes down their credence before seeing the options, then picks the option that is the best match to what they wrote. I know that I, personally, try (and often fail) to use this technique when doing multiple-choice tests.

9gjm10y

Yup. I do a similar thing when contemplating buying things like books: I don't look at the price, then I ask myself "How much would I pay to have this?", then I check the price. (And then, if it's a book, I too often buy the damn thing anyway even though the price is higher than the one I decided on. Perfectly rational action is, alas, beyond me.)

Raelifin10y30

Every judge being close to 50% would be bizarre. If I flip 13 coins 53 times I would expect that many of those sets of 13 will stray from the 6.5/13 expected ratio. The big question is whether anyone scored high enough or low enough that we can say "this wasn't just pure chance".

0jsteinhardt10y

Yes, I agree, I meant the (unobserved) probability that each judge gets a given question correct (which will of course differ from the observed fraction of the time each judge is correct. But it appears that at least one judge may have done quite well (as gjm points out). I don't think that the analysis done so far provides much evidence about how many judges are doing better than chance. It's possible that there just isn't enough data to make such an inference, but one possible thing you could do is to plot the p-values in ascending order and see how close they come to a straight line.