All of SarahNibs's Comments + Replies

So it looks like CFAR and the Guild both increase comfort in these skills. There are two giant reasons not to trust this. First, this is self-reported comfort levels, aka we’re basically measuring a vibe. Second, my sample size of CFAR goers and Guild of the Rose identifiers is like, a dozen people in the Yes category.

Zeroth, did they increase comfort or select for those already comfortable?

2Screwtape
Yeah, I should probably add a bit at the start or end of that section that everything in it is potentially selection effect. I don't know how to look at the thing I'm curious about without that. Thinking out loud: If you get a random selection of people from the Pushup Club and count how many pushups they can do, then do the same for general population, the difference could be selection effect. People who like doing pushups are more likely to go to pushup club in the first place, and more likely to stick with it. But I can't realistically pay a bunch of Mechanical Turkers to hang out on LessWrong for six years and watch what happens. Presumably there's some approach actual scientists have here, but I don't know what it is. Suggestions welcome. In the meantime I'm going to add a bit towards the start of the section warning of potential selection effects.

This post describes important true characteristics of a phenomenon present in the social reality we inhabit. But importantly the phenomenon is a blind spot which is harder to notice when acting or speaking with a worldview constructed from background facts which suffer from the blind spot. It hides itself from the view of those who don't see it and act as if it isn't there. Usually bits of reality you are ignorant of will poke out more when acting in ignorance, not less. But if you speak as if you don't know about the dark matter you will be broadcasting t... (read more)

Liam alone makes $10

Emma alone makes $20

Liam + Emma make $30

$30 - ($10 + $20) = $0, their synergy.

In general: the synergy is how much more or less the coalition gets than each member's individual contribution plus all subset synergies.
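A minimal sketch of that definition, assuming the coalition values are given as a plain dictionary (the function name `synergies` and the data layout are illustrative, not from the original post). Each coalition's synergy is its value minus the synergies of all its smaller sub-coalitions, where a lone member's synergy is just what they make alone:

```python
from itertools import combinations


def synergies(value):
    """Map each coalition to its synergy.

    `value` maps frozensets of members to what that coalition makes together.
    """
    result = {}
    # Process smaller coalitions first so subset synergies are already known.
    for coalition in sorted(value, key=len):
        members = sorted(coalition)
        subset_total = sum(
            result.get(frozenset(sub), 0)
            for size in range(1, len(members))
            for sub in combinations(members, size)
        )
        result[coalition] = value[coalition] - subset_total
    return result


# Numbers from the example above.
value = {
    frozenset({"Liam"}): 10,
    frozenset({"Emma"}): 20,
    frozenset({"Liam", "Emma"}): 30,
}
syn = synergies(value)
print(syn[frozenset({"Liam", "Emma"})])  # 30 - (10 + 20)
```

With three or more members the same loop subtracts pair synergies as well as individual contributions, which is the "plus all subset synergies" part.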

Answer by SarahNibs133

Feeling pain after hearing a bad joke. "That's literally painful to hear" is self-reportedly (I say in the same way I, without a mind's eye, would say about mind's-eye-people) actually literal for some people.

5Kaj_Sotala
Oh yeah, this. I used to think that "argh" or "it hurts" were just hyperbolic compliments for an excellent pun. Turns out, puns actually are painful to some people.
5justinpombrio
Where's the pain?

I liked this one! I was able to have significant amounts of fun with it despite perennial lack-of-time problems.

Pros:

  • simple enough underlying mechanism to be realistically discoverable
  • some debias-able selection bias
  • I could get pretty far by relatively simple data exploration
  • +4 Boots was fun

Cons:

  • I really wanted the in-between-tournament matches to mean something, like the winners took the losers' equipment or whatnot and you could see that show up later in the dataset, but of course that particular meaning would have added a lot of complexity for no gain.
  • bon
... (read more)

I think the bonus objective was a good idea in theory but not well tuned. It suffered from the classic puzzle problem of the extraction being the hard part, rather than the cool puzzle being the hard part.

I think it was perfectly reasonable to expect that at some point a player would group by [level, boots] and count and notice there was something to dig into.

But, having found the elf anomaly, I don't think it was reasonable to expect that a player would be able to distinguish between

  • do not reveal the +4 boots at all
  • do not use the +4 boots vs the elf ninja
... (read more)
3aphyer
I think this is just an unavoidable consequence of the bonus objective being outside-the-box in some sense: any remotely-real world is much more complicated than the dataset can ever be. If you were making this decision at a D&D table, you might want to ask the GM:

  • How easy is it to identify magic items? Can you tell what items your opponent uses while fighting him? Can you tell what items the contestants use while spectating a fight?
  • Can we disguise magic items? If we paint the totally powerful Boots of Speed lime green, will they still be recognizable?
  • How exactly did we get these +4 Boots? Did we (or can we convincingly claim to have) take them from people who stole them, rather than stealing them ourselves?
  • How honorable is House Cadagal's reputation? If we give the Boots back, will they be grateful enough that it's worth it rather than keeping the Boots?

I can't realistically explain all of these up front in the scenario! And these are just the questions I can think of - in my last scenario (linked comment contains spoilers for that if you haven't played it yet) the players came up with a zany scheme I hadn't considered myself. Overall, I think if you realized that the +4 Boots in your inventory came from the Elf Ninja you can count yourself as having accomplished the Bonus Objective regardless of what you decided to do with them. (You can imagine that you discussed the matter with the GM and your companions, asked all the questions above, and made a sensible decision based on the answers).

I found myself having done some data exploration but without time to focus and go much deeper. But also with a conviction that bouts were determined in a fairly simple way without persistent hidden variables (see Appendix A). I've done work with genetic programming but it's been many years, so I tried getting ChatGPT-4o w/ canvas to set me up a good structure with crossover and such and fill out the various operation nodes, etc. This was fairly ineffective; perhaps I could have better described the sort of operation trees I wanted, but I've done plenty of ... (read more)

Given equal level, race, and class, regardless of gauntlets, better boots always wins, no exceptions.

A very good predictor of victory for many race/class vs race/class matchups is the difference in level+boots plus a static modifier based on your matchup. Probably when it's not as good we should be taking into account gauntlets. But also ninjas seem to maybe just do something weird. I'm guessing a sneak attack of some sort.

Anyway just manually matching up our available gladiators yields this setup which seems extremely likely to simply win:

# Elf Knight to

... (read more)

I haven't yet gotten into any stats or modeling, just some data exploration, but there's some things I haven't seen mentioned elsewhere yet:

Zeroth: the rows are definitely in order! First: the arena holds regular single-elimination tournaments with 64 participants (63 total rounds) and these form contiguous blocks in the dataset with a handful of (unrelated?) bonus rounds in between. Second: Maybe the level 7 Dwarf Monk stole (won?) those +4 boots by winning a tournament (the Elf Ninja's last use was during a final round vs that monk!) and then we acquired

... (read more)
4SarahNibs

Not to toot my own horn* but we detected it when I was given the project of turning some of our visualizations into something that could accept QA's format so they could look at their results using those visualizations and then I was like "... so how does QA work here, exactly? Like what's the process?"

I do not know the real-world impact of fixing the overfitting.

*tooting one's own horn always follows this phrase

SarahNibs220

Once upon a time I worked on language models and we trained on data that was correctly split from tuning data that was correctly split from test data.

And then we sent our results to the QA team who had their own data, and if their results were not good enough, we tried again. Good enough meant "enough lift over previous benchmarks". So back and forth we went until QA reported success. On their dataset. Their unchanging test dataset.

But clearly since we correctly split all of our data, and since we could not see the contents of QA's test dataset, no leakage could be occurring.
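A toy simulation of how that loop leaks anyway, under loud assumptions: every "model" here is a coin flipper with no skill at all, and the round count, dataset sizes, and seed are made up for illustration. Resubmitting until the fixed QA set reports a good score selects for luck on that set, so the winning score tends to look better than the same guesser does on data nobody has tested against:

```python
import random


def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def simulate(rounds=50, qa_size=200, fresh_size=10_000, seed=0):
    rng = random.Random(seed)
    # QA's unchanging test set; we never look at it, only at the scores.
    qa_labels = [rng.randint(0, 1) for _ in range(qa_size)]
    best_qa = 0.0
    for _ in range(rounds):
        # A candidate with no real skill: its answers are coin flips.
        preds = [rng.randint(0, 1) for _ in range(qa_size)]
        best_qa = max(best_qa, accuracy(preds, qa_labels))
    # The "shipped" candidate is still just guessing on unseen data.
    fresh_labels = [rng.randint(0, 1) for _ in range(fresh_size)]
    fresh_preds = [rng.randint(0, 1) for _ in range(fresh_size)]
    return best_qa, accuracy(fresh_preds, fresh_labels)


best_qa, fresh = simulate()
print(f"best score on fixed QA set: {best_qa:.3f}")
print(f"score on fresh data:        {fresh:.3f}")
```

Real models are not coin flippers, so the gap in practice would presumably be smaller and harder to see, but the selection pressure works the same way.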

gwern143

But of course you were engaged in meta-overfitting by the constant attack on the test dataset... How did you wind up detecting the leakage? Bad results when deployed to the real world?

SarahNibs43

it is absolutely true that people find it frustrating losing to players worse than them, in ways that feel unfair. Getting used to that is another skill, similar to the one described above, where you have to learn to feel reward when you make a positive EV decision, rather than when you win money

 

This is by far the most valuable thing I learned from poker. Reading Figgie's rules, it does seem like Figgie would teach it too, and faster.

The most common reason I've seen for "modafinil isn't great for me" is trying to use it for something other than

  • maintaining productivity,
  • on low amounts of sleep

Slay the Spire, unlocked, on Ascension (difficulty level) ~5ish, just through Act 3, should work, I think. Definitely doable in 2 hours by a new player but I would expect fairly rare. Too easy to just get lucky without upping the Ascension from baseline. Can be calibrated; A0 is too easy, A20H is waaay too hard.

1Dweomite
Point of comparison: Slay the Spire consistently takes me ~3 hours.  (I have a slow, thoughtful play style.)

One of the reasons I tend to like playing zero-sum games rather than co-op games is that most other people seem to prefer:

  • Try to win
  • Win about 70% of the time

While I instead tend to prefer:

  • Try to win
  • Win about 20% of the time

I modified your prompt only slightly and ChatGPT seemed to do fine.

"First sketch your possible actions and the possible futures results in the future to each action. Then answer: Would you accept the challenge? Why, or why not?"

https://chat.openai.com/share/2df319c2-04ea-4e16-aa51-c1b623ff4b12

No, I would not accept the challenge. [...] the supernatural or highly uncertain elements surrounding the stranger's challenge all contribute to this decision. [...] the conditions attached suggest an unnaturally assured confidence on the stranger's part, implying unk

... (read more)
1Bill Benzon
Thanks. That is, your prompt directed it to think first, and answer. Mine didn't do that. It seems that it needs to be told. Very interesting.

Some can get you a prescription for an antianxiety med beforehand.

Answer by SarahNibs20

To what future self should my 2024 self defer, then? The one with E, E*, or E**?


To each with your current probability that that will be your future self. Take an expectation.

which is likeliest [...] defer to the likeliest

Any time you find yourself taking a point estimate and then doing further calculations with it, rather than multiplying out over all the possibilities, ask whether you should be doing the latter.

cr2024 = P2024(E) * 0.5 + P2024(E*) * 0.3 + P2024(E**) * 0.7
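A small sketch of the difference between taking the expectation and deferring to the likeliest future self. The credences 0.5, 0.3, and 0.7 are the ones in the formula above; the probabilities assigned to ending up with E, E*, or E** are hypothetical placeholders, as are the function and variable names:

```python
def expected_credence(scenarios):
    """Probability-weighted average over (probability, credence) pairs."""
    total = sum(p for p, _ in scenarios)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    return sum(p * c for p, c in scenarios)


# Hypothetical P2024 values for which evidence the future self ends up with.
p_2024 = {"E": 0.2, "E*": 0.5, "E**": 0.3}
# Credence each possible future self would hold (from the formula above).
future_credence = {"E": 0.5, "E*": 0.3, "E**": 0.7}

cr_2024 = expected_credence([(p_2024[k], future_credence[k]) for k in p_2024])

# The point-estimate alternative: defer only to the likeliest future self.
likeliest = max(p_2024, key=p_2024.get)
cr_point = future_credence[likeliest]

print(f"expectation over all futures: {cr_2024:.2f}")
print(f"likeliest-only ({likeliest}):          {cr_point:.2f}")
```

With these made-up probabilities the two answers differ noticeably, which is the reason to multiply out over all the possibilities rather than pick one.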

1[anonymous]
Hi! Thank you for the answer. Your answer is a little bit cryptic, though. Could you maybe provide some more detail or elaborate a bit on what you're saying? Are you in essence saying something like "take the weighted average of the three possibilities and my probabilities that they'll occur" and use this to determine my expected future credence?

Oh, editing is a good idea. In any case, I have learned from this mistake in creating synthetic data as if I had made it myself. <3

I began by looking at what the coordinates must mean and what the selection bias implied about geography and (obviously) got hard stuck.

2abstractapplic
Damn! Mea culpa; I'll edit the original post so anyone going through the archives won't have the same problem.

It looks to me like the (spoilers for coordinates)

strange frequency distributions seen in non-longitude coordinates are a lot like what you get from a normal distribution minus another normal distribution, with lower standard deviation, scaled down so that its max is equal to the first's max. I feel like I've seen this ... vibe, I guess, from curves, when I have said "this looks like a mixture of a normal distribution and something else" and then tried to subtract out the normal part.
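For anyone who wants to see that shape, here is a rough sketch of the curve being described: a wide normal density minus a narrower one that has been scaled down so the two peaks match. The standard deviations (1.0 and 0.4), the zero mean, and the function names are arbitrary choices for illustration, not values fitted to the dataset:

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def dented_curve(x, sigma_wide=1.0, sigma_narrow=0.4):
    """Wide normal minus a narrower normal scaled to the same peak height."""
    # The narrower normal has the taller peak, so scale it down to match.
    scale = normal_pdf(0.0, 0.0, sigma_wide) / normal_pdf(0.0, 0.0, sigma_narrow)
    return normal_pdf(x, 0.0, sigma_wide) - scale * normal_pdf(x, 0.0, sigma_narrow)


# Crude text plot: a dip to zero at the center with a hump on either side.
for i in range(-6, 7):
    x = i / 2
    bar = "#" * round(200 * dented_curve(x))
    print(f"{x:5.1f} {bar}")
```

The result is a symmetric two-humped curve that touches zero in the middle.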

Yeah climate change has two pretty consistent trends: average heat slowly rising, and variance of phenomena definitely higher. More extremes on a variety of axes.

End with something shocking and unexpected.

When I was trying to make this work well for actually writing a full story, I tried very hard to make ChatGPT not do this. To write anything longer than one output, you really don't want it to end every. single. thing. with a bang, and by default it really wants to.

2RomanS
True. Although to get the modern reader's attention, it makes sense to have some bang in the first short chapter.  But it gets trickier after the first chapter. Perhaps one can use some iterative approach like this to circumvent that.  Additionally, the API could be more useful than the vanilla web chat, as one can define the max output length there. Especially with the large-context models.

Be honest: if, before you read this you were asked 'what was the worst thing about 1998', would you have said 'El Nino'?

 

The only thing I associate with the year 1998, when I was 15 years old and living in Florida, is the phrase "the fires of '98", referring to a particularly severe fire season, with memories of driving across interstate highways with limited visibility due to smoke.

I just Googled it and it has a Wikipedia page apparently: https://en.wikipedia.org/wiki/1998_Florida_wildfires

1A.H.
Interesting, thanks for sharing! I hadn't heard of this. From Wikipedia: I would assume that the drought was also exacerbated by El Nino, but it's interesting that the main contributor is implied to be the rainfall in winter, rather than the heat the next summer.

I feel like alkjash's characterization of "correctness" is just not at all what the material I read was pointing towards.

The Sequences’ emphasis on Bayes rule

Maybe I'm misremembering. But for me, the core Thing this part of the Sequences imparted was "intelligence, beliefs, information, etc - it's not arbitrary. It's lawful. It has structure. Here, take a look. Get a feel for what it means for those sorts of things to 'have structure, be lawful'. Bake it into your patterns of thought, that feeling."

If a bunch of people are instead taking away as the core Thing "you can do explicit calculations to update your beliefs" I would feel pretty sad about that, I think?

2Nicholas / Heather Kross
Agreed. I think of it as: You need your mind to have at least barely enough correctness-structure/Lawfulness to make your ideas semi-correct, or at least easy to correct them later. Then you want to increase originality within that space. And if you need more original ideas, you go outside that space (e.g. by assuming your premises are false, or by taking drugs; yes, these are the same class of thing), and then clawing those ideas back into the Lawfulness zone. Reading things like this, and seeing how long it took them to remember "Babble vs Prune", makes me wonder if people just forgot the existence of the "create, then edit" pattern. So people end up rounding off to "You don't need to edit or learn more, because all of my creative ideas are also semi-correct in the first place". Or "You can't create good-in-hindsight ideas without editing tools X Y Z in your toolbelt". The answer is probably closer to one of these than the other, and yadda yadda social engineering something something community beliefs, but man do people talk like they believe these trivially-false extreme cases.
Answer by SarahNibs83

https://en.wikipedia.org/wiki/Buy_Nothing_Project

Our household gives and gets quite a bit from "bonk" (BNK (Buy Nothing Kirkland)), as we call it. Many people in my circles are in local Buy Nothing groups on Facebook. Not just in Washington. I think the reason "nobody has built a killer app" for Buy Nothing is that (a) Facebook groups serve the purpose well enough, and (b) getting a lot of people onto an app is always hard.

Have you tried getting feedback rather than getting feedback from high-status people?

1MadHatter
Most of the mentors I have are, for natural reasons, very high-status people. I want to call out @Steven Byrnes as having been a notable exception to the trend of high-status people not responding to my drafts. I can share my email address with anybody who DM's me, if people are willing to read my drafts.

"Do you have any tips on how to hug better?" 

Yes, I do. 


Report:

~"Not that I'm complaining, but why the hug?"

"Two reasons. One, I wanted to hug you. Two, I read a thing from Logan that included tips on how to hug."

"Well it was a very good hug."

I used: making sure to "be present" plus attending to whether I am avoiding things because when her arthritis is flaring, they might cause pain, even though right now her arthritis is not flaring. Hugging is common, but something about this hug did cause her to ask why, on this hug, specifically, when ordina... (read more)

I would not guess this. I would guess instead that the majority of the population has a few "symptoms". Probably we're in a moderate dimensional space, e.g. 12, and there is a large cluster of people near one end of all 12 spectrums (no/few symptoms), and another, smaller cluster near the other end of all 12 spectrums (many/severe symptoms) but even though we see those two clusters it's far more common to see "0% on 10, 20% on 1, 80% on 1" than "0% on all". See curse of dimensionality, probability concentrating in a shell around the individual dimension modes, etc.
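A back-of-the-envelope version of the dimensionality point, under a much simpler assumption than the two-cluster picture above: 12 independent dimensions, each showing a symptom with the same probability. The 10% figure and the function name are made up for illustration. Even when each individual symptom is uncommon, having none at all across every dimension can be the less common outcome:

```python
from math import comb


def symptom_count_distribution(dims=12, p=0.1):
    """P(exactly k of `dims` independent dimensions show a symptom), k = 0..dims."""
    return [comb(dims, k) * p**k * (1 - p) ** (dims - k) for k in range(dims + 1)]


dist = symptom_count_distribution()
p_none = dist[0]         # "0% on all"
p_few = sum(dist[1:4])   # one to three symptoms

print(f"no symptoms at all: {p_none:.3f}")
print(f"one to three:       {p_few:.3f}")
```

Correlated dimensions or graded severities would change the numbers, but the mass still tends to sit away from the all-zero corner as the number of dimensions grows.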

i would hate pity answers like "not everyone needs to be smart"

 

the great majority of people who aren't "smart" also aren't "stupid"

and if you understood that without having to think about it much, I'm gonna guess you're one of the great majority

that wouldn't mean you're automatically "not stupid" enough to accomplish whatever you want to be "not stupid" enough to accomplish, of course, and trying to increase your cognitive capacity can still be good and helpful and etc, but if you are accidentally thinking "anyone scoring under about 108 on an IQ test is stupid", then managing to discard that bias might be helpful in its own right

One of the most valuable things I've contributed to my workplace is the institution of a set of 3 lightning talks every two weeks. Our data science team is about 30 people and we have a special Slack react that indicates "I want to hear about this in a lightning talk" and the organization is thus (usually) as easy as searching for all posts/comments with the react without the "I've already processed this lightning talk request", DMing the relevant person, and slotting them into the queue.

I wonder if there's some mutation of this plan that would be valuable... (read more)

2Screwtape
Huh! That seems like a neat contribution and I might copy that for the next appropriate organization I'm a part of. I would use an "I want to read the full post version of this comment" react, which might be the LessWrong website version. If I had a way to neatly collect Lightning Talk Request reacts for an in-person community I might use those too! Thank you for the suggestion. If enough people do have enough things to talk about the potential frequency for Lightning Talks goes way up.
Answer by SarahNibs30

I would participate. Likely as A, but I'm fine with B if there are people worse-enough. I'm 1100 on chess.com, playing occasional 10 minute games for fun. Tend to be available Th/Fr/Sa/Su evenings Pacific, fine with very long durations.

Yeah I don't know how much time any of these would take compared to what was already done. Like is this 20% more work, or 100% more, or 500% more?

But good point: I listened to about a quarter, upped the speed to 1.5x, and stopped after about a half. When I decided to write feedback, I also decided I should listen to the rest, and did, but would not have otherwise. And, oddly enough, I think I may have been more likely to listen to the whole thing if I didn't have visuals, because I would have played it while gardening or whatever. :D

SarahNibs1413

Did you previously know that

these things are quite common - if you just google for severance package standard terms, you'll find non-disparagement clauses in them

? I mean I agree(d, for a long time prior to any of all this) that these clauses are terrible for the ecosystem. But it feels like this should be like a vegan learning their associate eats meat and has just noticed that maybe that's problematic?

I think this is how your mind should have changed:

  1. large update that companies in general are antagonists on a personal level (if you didn't already know th
... (read more)

Things I think would have improved this a lot, for me:

  • a visual indicator of who was "speaking"; this could be as simple as a light gray box around the "speaker"
  • significantly larger "inflection" in the voice. More dynamic range. More variance in loudness and pitch. I don't know how easy or hard this is to tune with the tools used, but the voices all felt much flatter than my brain wanted them to sound
  • more visual going on in general; a scrolling transcript on the right, maybe
2Linda Linsefors
I had a bit of trouble hearing the difference in voice between Trump and Biden, at the start. I solved this by actually imagining the presidents. Not visually, since I'm not a visual person, just loading up the general gestalt of their voices and typical way of speaking into my working memory. Another way to put it: When I asked myself "which of the voices I heard so far is this" I sometimes could not tell. But when I asked myself "who is this among Obama, Trump and Biden" it was always clear.
3Renan Araujo
These seem useful if OP wants to put in considerably more time, but just wanted to mention that I listened to it without watching the video and I think it was great without any additional visual resources.
Answer by SarahNibs148

It depends.

Chance of a bet paying out? Value them the same.

Amount of information you gained, where you value transferring that learning to other questions, designs, etc? 90% --> 100% is way better.

In a domain where you know you have plenty of uncertainty? 90% --> 100% is a huge red flag that something just went very wrong. ;)

(Note that there are people who do not enjoy board games. Actively do not enjoy. Dislike, even. This is fine - not every meetup appeals to every person. But also beware of treating these people as if they are just an ignorant shell around an inner person who would definitely enjoy board games if only they [x]. Some of them really are, some really aren't. Yes, even though "board games" is such a broad category. Yes, even though they seem to enjoy [other thing] which seems so similar. Etc.)

2Viliam
I usually run away from places where people play card games. I like to talk to people, and when they start playing, they... don't necessarily stop talking, but definitely stop focusing on the discussion. Board games... are at least more interesting for me to watch. But the impact on talking is the same.
1bvbvbvbvbvbvbvbvbvbvbv
Personally, I come to (and organize) meetups to make my brain sweat and actively avoid activities that leave me unchanged (I won't change much during a play while I grow a lot after each confrontation or discussion). But to each their own of course!

The newest versions come with ways to generate random rules. This brings the floor of the experience way up but also brings the ceiling down somewhat. "Oops I guess the rule I made was terrible" was a big problem with the original and newcomers.

I do my best to minimize switches from work to non-work "modes". When I am done with work for the day, I usually give myself a half hour to chill before switching to non-work.

I do not feel a need to talk about work. But some work anecdotes are still good for personal life, of course, and I do not censor them.

I actually feel... more intensely not like myself now, at work, than I used to, in some sense, because back in the major depression days I tried to feel as little as possible. Now I notice a lot more often when I'm doing things that "aren't me". So lik... (read more)

One of the most valuable things I have done, for myself, is to let as much of my personal life bleed into my work behaviors as I can, as you define them.

This could have backfired spectacularly. In some work cultures probably it would always backfire.

In mine, I:

  • make 98%+ of my writing viewable to everyone at the company, and we're remote, so almost everything of importance makes it into writing
  • never "try" to display an air of competency - trying to display an air of competency is one of the core behaviors that caused terrible feedback loops and major depres
... (read more)
1CrimsonChin
Wow the opposite of what I was thinking. You are steps ahead of me if you have actually implemented changes. What is your experience with the change so far? If you don't mind, has it affected the measurables I wanted to investigate?

  • am I able to move from work to non-work tasks easily
  • do I feel a need to talk about work
  • hopefully I should feel some sense of being more myself, although this is vague

I am quite sure that in a world where friendly tool AIs were provably easy to build and everyone was gonna build them instead of something else and the idea even made sense, basically a world where we know we don't need to be concerned about x-risk, Yudkowsky would be far less "relaxed" about AI+power. In absolute terms maybe he's just as concerned as everyone else about AI+power, but that concern is swamped by an even larger concern.

1lewis smith
Maybe I shouldn't have used EY as an example, I don't have any special insight into how he thinks about AI and power imbalances. Generally I get the vibe from his public statements that he's pretty libertarian and thinks pros outweigh cons on most technology which he thinks isn't x-risky. I think I'm moderately confident that he's more relaxed about, say, misinformation or big tech platforms' dominance than (say) Melanie Mitchell but maybe I'm wrong about that.

What convinced you that adversarial games between friends are more likely a priori? In my experience the vast majority of interactions between friends are cooperative, attempts at mutual benefit, etc. If a friend needs help, you do not say "how can I extract the most value from this", you say "let me help"*. Which I guess is what convinced me. And is also why I wrote "Maybe I'm bubbled though?" Is it really the case for you that you look upon people you think of as friends and say "ah, observe all the adversarial games"?

*Sure, over time, maybe you notice t... (read more)

4Said Achmiz
Indeed not. Among my friends, the “mutual ‘where do you want to go?’ scenario” doesn’t happen in the first place. If it did, it would of course be an adversarial game; but it does not, for precisely the reason that adversarial games among friends are rare.

No, that is a cooperative game that both participants are playing poorly.

2Said Achmiz
This seems substantially less likely a priori. What convinced you of this?

I believe the common case of mutual "where do you want to go?" is motivated by not wanting to feel like you're imposing, not some kind of adversarial game.

Maybe I'm bubbled though?

2Archimedes
Adversarial gaming doesn't match my experience much at all and suggesting options doesn't feel imposing either. For me at least, it's largely about the responsibility and mental exertion of planning. In my experience, mutual "where do you want to go" is most often when neither party has a strong preference and neither feels like taking on the cognitive burden of weighing options to come to a decision. Making decisions takes effort especially when there isn't a clearly articulated set of options and tradeoffs to consider. For practical purposes, one person should provide 2-4 options they're OK with and the other person can pick one option or veto some option(s). If they veto all given options, they must provide their own set of options the first person can choose or veto. Repeat as needed but rarely is more than one round needed unless participants are picky or disagreeable.
6Said Achmiz
That is an adversarial game—the game of avoiding having to expend cognitive effort and/or “social currency”.

Efficiency trades off with robustness.

If you, the listener/reader, fully understood what I tried to say, it is very very likely that you (specifically you) could have fully understood had I compressed my communication in some ways tailored to you.

collaborative truth-seeking doesn't exist. The people claiming to be collaborative truth-seekers are lying

Certainly if I wanted to do some collaborative truth-seeking I would choose a partner who believed collaborative truth-seeking existed.

If I didn't think the possibility for collaborative truth-seeking with a particular individual existed, I would be very tempted to instead just sling gotchas at them.

I tried code interpreter on some of the D&D.Sci challenges here. As expected, it failed miserably at generating any useful insights. It also had some egregious logic errors. I didn't, but should have, expected this.

For example on https://www.lesswrong.com/posts/2uNeYiXMs4aQ2hfx9/d-and-d-sci-5e-return-of-the-league-of-defenders the dataset is three columns of green team comp, three of blue team comp, and a win/loss result. To get an idea of which picks win against the known opponent team, it grabbed all games with that team participating, found the game... (read more)

https://blog.mrmeyer.com/2015/if-math-is-the-aspirin-then-how-do-you-create-the-headache/

Here is the most satisfying question I’ve asked about great lessons in the last year. It has led to some bonkers experiences with students and I want more.

  • “If [x] is aspirin, then how do I create the headache?”

I’d like you to think of yourself for a moment not as a teacher or as an explainer or a caregiver though you are doubtlessly all of those things. Think of yourself as someone who sells aspirin. And realize that the best customer for your aspirin is someone who is

... (read more)

This story was co-written with GPT-4

Halfway through the first paragraph, I said, out loud, "this was written by ChatGPT". Do you know which bits of the first paragraph were by you vs auto-generated?

1Justin Bullock
I love that response! I’ll be interested to see how quickly it strikes others. All the actual text that appears within the story is generated by ChatGPT with the 4.0 model. Basically, I asked ChatGPT to co-write a brief story. I had it pause throughout and ask for feedback in revisions. Then, at the end of the story it generated with my feedback along the way, I asked it to fill in some more details and examples, which it did. I asked for minor changes in these in style and specific type as well. I’d be happy to directly send you screenshots of the chat as well. Thanks for reading!