Raemon

LessWrong team member / moderator. I've been a LessWrong organizer since 2011, with roughly equal focus on the cultural, practical and intellectual aspects of the community. My first project was creating the Secular Solstice and helping groups across the world run their own version of it. More recently I've been interested in improving my own epistemic standards and helping others to do so as well.

Sequences

Feedbackloop-First Rationality
The Coordination Frontier
Privacy Practices
Keep your beliefs cruxy and your frames explicit
LW Open Source Guide
Tensions in Truthseeking
Project Hufflepuff
Rational Ritual
Drawing Less Wrong

Comments

Raemon · 20

An individual Social Psychology lab (or loose collection of labs) can choose who to let in.

Frontier Lab AI companies can decide who to hire, and what sort of standards they want internally (and maybe, in a loose alliance with other Frontier Lab companies).

The Immoral Mazes sequence outlines some reasons you might think large institutions are dramatically worse than smaller ones (see: Recursive Middle Manager Hell for a shorter intro, although there I don't spell out the part of the argument about how mazes are sort of "contagious" between large institutions).

But the simpler argument is "the fewer people you have, the easier it is for a few leaders to basically make personal choices based on their goals and values," rather than selection effects resulting in the largest institutions being better modeled as "following incentives" than as "pursuing goals on purpose." (If an organization didn't follow the incentives, it'd be outcompeted by one that does.)

Raemon · 64

This claim looks like it's implying that research communities can build better-than-median selection pressures but, can they? And if so why have we hypothesized that scientific fields don't?

I'm a bit surprised this is the crux for you. Smaller communities have a lot more control over their gatekeeping because, like, they control it themselves, whereas the larger field's gatekeeping is determined via open-ended incentives in the broader world that thousands (maybe millions?) of people have influence over. (There are also things you could do in addition to gatekeeping. See Selective, Corrective, Structural: Three Ways of Making Social Systems Work.)

(This doesn't mean smaller research communities automatically have good gatekeeping or other mechanisms, but figuring out how to do better doesn't feel like a very confusing or mysterious problem.)

Raemon · 50

Curated. This was a practically useful post. A lot of the advice here resonated with stuff I've tried and found valuable, so insofar as you were like "well I'm glad this worked for Shoshannah but I dunno if it'd work for me", well, I personally also have found it useful to:

  • have a direction more than a goal
  • do what I love but always tie it back
  • try random things and see what affordances they give me

Raemon · 84

Yeah, I didn't read this post and come away with "and this is why LessWrong works great"; I came away with a crisper model of "here are some reasons LW performs well sometimes", but more importantly "here is an important gear for what LW needs to work great."

Raemon · 50

Nod. 

One of the things we've had a bunch of internal debate about is "how noticeable should this be at all, by default?" (with opinions ranging from "it should be about as visible as the current green links are" to "it'd be basically fine if jargon-terms weren't noticeable at all by default").

Another problem is just variety in monitors and/or "your biological eyes." When I do this:

Turn your screen brightness up a bunch and the article looks a bit like Swiss cheese (because the contrast between the white background and the black text increases, the relative contrast between the white background and the gray text decreases).

What happens to me when I turn my MacBook brightness to the max is that I stop being able to distinguish the grey and the black (rather than the contrast between white and grey seeming to decrease). I... am a bit surprised you had the opposite experience. (I'm on a ~modern M3 MacBook. What are you using?)

I will mock up a few options soon and post them here.

For now, here are a couple random options that I'm not currently thrilled with:

1. The words are just black, not particularly noticeable, but use the same little ° that we use for links.

2. Same, but the circle is green.

Raemon · 20

This feels like you have some way of thinking about responsibility that I'm not sure I'm tracking all the pieces of.

  1. Who literally wrote the words? No one (or, some random alien mind).
  2. Who should take action if someone flags that an unapproved term is wrong? The author, if they want to be involved, and site admins (or me-in-particular) if the author does not want to be involved.
  3. Who should be complained to if this overall system is having bad consequences? Site admins, me-in-particular or habryka-in-particular (Habryka has more final authority, I have more context on this feature; you can start with me and then escalate, or tag both of us, or whatever).
  4. Who should have Some Kind of Social Pressure Leveraged At them if reasonable complaints seem to be falling on deaf ears and there are multiple people worried? Also the site admins, and habryka-and-me-in-particular. 

It seems like you want #1 to have a better answer, but I don't really know why.

Raemon · 42

Part of the uncertainty we're aiming to reduce here is "can we make thinking tools or writing tools that are actually good, instead of bad?", and our experiments so far suggest "maybe". We're also designing with "six months from now" in mind – the current level of capabilities and quality won't be static.

Our theory of the "secret sauce" is: "most of the corporate Tech World in fact has bad taste in writing, and the LLM fine-tuning and RLHF data are generated by people with bad taste. Getting good output requires both good taste and prompting skill, and you're mostly just not seeing people try."

We've experimented with jailbroken Base Claude, which does a decent job of actually having different styles. It's harder to get to work reliably, but not so much harder that it feels intractable.

The JargonHovers currently use regular Claude, not jailbroken Claude. I have guesses about how to eventually get them to write in something like the author's original style, although that's a harder problem, so we haven't tried that hard yet.
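
(For concreteness, here's a rough sketch of the general shape that kind of style-conditioned prompt could take, written against the public Anthropic Python SDK. The function, prompts, and model alias below are illustrative placeholders, not the actual site code.)

```python
# Hypothetical sketch, not the actual LessWrong pipeline: one way to ask Claude
# for a glossary entry nudged toward an author's voice, using the public
# Anthropic Python SDK. Model alias, prompts, and function name are placeholders.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def draft_jargon_hover(term: str, post_excerpt: str, style_sample: str) -> str:
    """Draft a short definition of `term`, imitating the author's style sample."""
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder model alias
        max_tokens=300,
        system=(
            "You write short glossary definitions for technical blog posts. "
            "Match the author's voice as shown in the style sample, and avoid "
            "generic assistant phrasing."
        ),
        messages=[{
            "role": "user",
            "content": (
                f"Style sample from the author:\n{style_sample}\n\n"
                f"Excerpt from the post:\n{post_excerpt}\n\n"
                f"Define the term '{term}' in two or three sentences for a "
                "reader unfamiliar with it."
            ),
        }],
    )
    return response.content[0].text
```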

Raemon · 30

it becomes just another purveyor of AI “extruded writing product”.

If it happened here the way it happened on the rest of the internet (in terms of what the written content was like), I'd agree it'd be straightforwardly bad.

For things like jargon-hoverovers, the questions IMO are:

  • is the explanation accurate?
  • is the explanation helpful for explaining complex posts, esp. with many technical terms?
  • does the explanation feel like soulless slop that makes you feel ughy the way a lot of the internet is making you feel ughy these days?

If the answer to the first two is "yep", and the third one is "alas, also yep", then I think an ideal state is for the terms to be hidden-by-default but easily accessible for people who are trying to learn effectively, and are willing to put up with somewhat AI-slop-sounding but clear/accurate explanations.

If the answer to the first two is "yep", and the third one is "no, actually it just reads pretty well (maybe even in the author's own style, if they want that)", then IMO there's not really a problem.

I am interested in your actual honest opinion of, say, the glossary I just generated for Unifying Bargaining Notions (1/2) (you'll have to press option-shift-G to enable the glossary on lesswrong.com). That seems like a post where you will probably know most of the terms and can judge them on accuracy, while it's still technical enough that you can imagine being a person unfamiliar with game theory trying to understand the post, and get a sense of both how useful the entries would be and how they feel aesthetically.

My personal take is that they aren't quite as clear as I'd like and not quite as alive-feeling as I'd like, but over the threshold on both such that I'd much rather have them than not have them, esp. if I knew less game theory than I currently do.

Raemon · 70

The most important thing is "There is a small number of individuals who are paying attention, who you can argue with, and if you don't like what they're doing, I encourage you to write blogposts or comments complaining about it. And if your arguments make sense to me/us, we might change our mind. If they don't make sense, but there seems to be some consensus that the arguments are true, we might lose the Mandate of Heaven or something."

I will personally be using my best judgment to guide my decisionmaking. Habryka is the one actually making final calls about what gets shipped to the site; insofar as I update that we're doing a wrong thing, I'll argue about it.

It happening at all already constitutes “going wrong”.

This particular sort of comment doesn't particularly move me. I'm more likely to be moved by "I predict that if AI is used in such-and-such a way it'll have such-and-such effects, and those effects are bad." Which I won't necessarily automatically believe, but I might update on it if it's argued well or seems intuitively obvious once it's pointed out.

I'll be generally tracking a lot of potential negative effects, and if it seems like it's turning out that "the effects were more likely than I thought" or "the effects were worse than I thought", I'll try to update swiftly.

Raemon · 20

Whoops, should be fixed now.
