All of Chipmonk's Comments + Replies

tagged this post as self-fulfilling prophecies!

yes!! “focusing on what you want!” (i talk a little more about this and self-fulfilling prophecies here)

**aiming** at what you want. vector. (teleology, not etiology)

Now we just need to ask Sonnet to formalize VDT

what are the examples of the curriculum/activities you're considering/planning?

3Mateusz Bagiński
Emergence of utility-function-consistent stated preferences in LLMs might be an example (0.1<p<0.6) though going from reading stuff on utility functions to the kind of behavior revealed there requires more inferential steps than going from reading stuff on reward hacking to reward hacking.
Answer by Chipmonk40

Situational Awareness and race dynamics? h/t Jan Kulveit @Jan_Kulveit 

3Xavi CF
Situational Awareness probably caused Project Stargate to some extent. Getting the Republican party to take AI seriously enough to let them launch in the White House is no joke and less likely without the essay. It also started the website-essay meta which is part of why AI 2027, The Compendium, and Gradual Disempowerment all launched the way they did, so there are knock-on effects too.

I don't feel like I learned anything new from the post.

This surprises me! Wait so-

  • The "How does one-shotting happen?" section didn't have anything interesting for you? (Have you seen stuff like this elsewhere?)
  • Did you already know one-shotting was possible?
2niplav
"One-shotting is possible" is a live hypothesis that I got from various reports from meditation traditions. I do retract "I learned nothing from this post", the "How does one-shotting happen" section is interesting, and I'd like it to be more prominent. Thanks for poking, I hope I'll find the time to respond to your other comment too.

since your bullet-point list in the beginning isn't detailed enough for anyone to try to replicate the method.

Wait I'm confused- this is not the purpose of the post

Also notable is that you only have positive examples for your method

The purpose of this post is not advertisement. It's to discuss one-shots

Especially, how would you be able to distinguish between your approach convincing your customers they were helped, instead of actually changing their behavior?

See above

Chipmonk*40

Would anyone like to help me edit a better version of this?

Oh I like "patients" ("clients"). I'll think about the rest, thanks. I'm just not sure how to write anything useful and legible without talking about my own experience and what I have the most data for?

Also I see the point of your last bullet where "my business" is the subject hm

any suggestions for how to talk about this stuff without having it read like an advertisement? i'm genuinely interested in the idea of one-shotting and legibilizing evidence that quick growth is possible

0niplav
I gave your post to Claude and gave it the prompt "Dearest Claude, here's the text for a blogpost I've written for LessWrong. I've been told that 'it sounds a lot like an advertisement'. Can you give me feedback/suggestions for how to improve it for that particular audience? I don't want to do too much more research, but a bit of editing/stylistic choices." (All of the following is my rephrasing/rethinking of Claude output plus some personal suggestions.) Useful things that came out of the answer were explaining more about the method you've used to achieve this, since your bullet-point list in the beginning isn't detailed enough for anyone to try to replicate the method. Also notable is that you only have positive examples for your method, which activates my filtered evidence detectors. Either make clear that you indeed did only have positive results, or name how many people you coached, for how long, and that they were all happy with what you provided. Finally, some direct words from Claude that I just directly endorse: Especially, how would you be able to distinguish between your approach convincing your customers they were helped, instead of actually changing their behavior? That feels like the failure mode of most self-help techniques—they're "self-recommending".
5ROM
Hey Chris! I have a few thoughts on this, though I have strong anti-advertising sentiments and might be overly sensitive to these things, so take it with a grain of salt.

The title sounds a little clickbaity. It's directed at the reader. The title "Do patients need years of therapy, or can one conversation resolve their issue?" is functionally identical, but feels less like an advert. The opening reads somewhat like a common advert tactic: "I hated how business did [thing x] since it was bad for the customer, so I started my practice by doing [thing y] which is both more appealing to a potential customer and delivers better results!"

I think the advertising vibe might also come from the continued references to your personal practice / mentions of its successes:

  • "So when I started my business, I made payment contingent on results:"
  • "Our clients are often surprised at how we do things because it’s so different than the therapy or other coaching they’ve done before:"
  • "Several of my clients have resolved lifelong issues like anxiety in one shot"
  • "My business is expanding to help more people in deeper and more efficient ways."

Finally, it concludes with a link to where people can schedule a call with you.

any updates on how this is going btw? (doing retroactive funding research)

what came of this? (doing research on bounties, prizes, and retroactive funding rn)

fwiw, FABRIC was able to get funding in November 2024 (who knows if this date is correct though)

nvm this was an "exit grant" lmao

Now that this is "over", I'd be fascinated to see a post about what the fundraising process was like for you and what can be learned. Seems like a big L for retroactive funding for example

https://x.com/ohabryka/status/1882579367110586459 

3DaneelO
And related to that thread: how does one find out about how to donate when there is no fundraiser? I cannot find any info on the About page or FAQ page. If someone wants to donate in a couple of months when this post is not as visible, how will they find the donation link? I don’t know if adding a donation link to FAQ and About will make much of a difference in practice. I suspect it won’t since that depends on people more spontaneously realizing they want to donate. But it seems pretty relevant to the complaint raised in the tweet thread that people only donate when you do large dramatic calls for funding. I think it wouldn’t hurt to lessen the friction and make it easier to find out how to donate. 

Aside: I'm surprised you're suggesting people get validation --> people feel secure? This does not at all seem like the causality to me (though I'm aware most people probably think like this).

Prediction: In the absence of radically improved psychotechnology, a significant fraction of people will always find a way to feel insecure.

2Kaj_Sotala
Patterns of emotional security/insecurity are constantly updating (in both directions) through life, though some people's patterns are more resistant (in either direction) than those of others. (This is both my own personal experience and the empirical finding in the literature.) In the insecure -> secure direction, positive experiences as an adult can help reconsolidate negative expectations and provide the kinds of experiences that naturally securely attached people already got earlier: (How can I become more secure?: A grounded theory of earning secure attachment; Olufowote, Fife & Whiting 2019) That said, it's true that the stronger someone's insecure attachment is, the more resistant it is to updating through positive experiences: (Attachment Disturbances in Adults, p. 99-100) Of course, one consideration is also that people with insecure attachment tend to bring various patterns into their relationships that make the other person more likely to respond negatively, making it harder to get the positive experiences that would update the attachment patterns. An AI with infinite patience and understanding that never got triggered would be different in this regard, so might be able to provide corrective experiences for even some of the people who wouldn't normally be capable of changing when dealing with just humans. I would guess/hope that most people's degree of emotional insecurity would be such that they would be able to find security with AIs (especially if the AIs also doubled as expert therapists). With only the most extremely insecure people (e.g. some of the ones who would qualify for a diagnosis of a personality disorder) needing novel psychotech - but of course I can only speculate at this point.

hmm i suspect releasing these metrics could make my customers significantly more annoying. like, early adopters are fun and experimental. but if i make it seem not risky then i get risk-averse people who tend to be prickly

so maybe i will compile and release this data but i would need to figure out how to do it in a way that doesn't change the funnel

Any updates on this?

I wonder if you could set up a conditional donation? “I donate $X, minus if total donations exceed $3M”

i like this thanks. might take a bit of time to put together but interested

made some light edits because of this comment, thanks

oh ok i might start doing that. knowing my calibration on that would be nice

oh ok hm. i also don't want to be incentivized to not give easy-for-me help to people with low odds of success though

4pandamonium
Disclaimer: I would not pay, or want to pay, that much money anyway, so I am not your intended audience.

I'd trust you more (and I would think members of the rationalist community would too) if you gave several metrics, even if some of them are not so good, with explanations. Right now, it seems you chose a metric so that it looks good. More metrics would take more time, but not much if you have the data easily available. This would be my suggestion: you can provide three percentages (like when one provides three quantiles instead of just the mean of data values):

  • the percentage of success among people you talked with for at least an hour
  • the percentage among the people with reasonable chances of success (motivated + didn't bail + your expertise + spent at least X hours)
  • the percentage among people with great chances of success.

These percentages, with precise information on what determines which category clients fall into and the percentage of people treated who fall into each category, would give a first sound idea of the success rate. Taking on low success rate people would not be a problem because their data is treated separately. It's only a problem if 90% of your clients are unlikely to be helped, but that would not be a good thing anyway.
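
For concreteness, the three-percentage breakdown suggested above could be computed along these lines. This is only a minimal sketch: the `Client` record, the tier labels (`"low"`, `"reasonable"`, `"great"`), and the one-hour cutoff are all hypothetical choices made for illustration, not anything taken from the post or from real client data.

```python
from dataclasses import dataclass


@dataclass
class Client:
    hours: float      # total time spent with this client
    tier: str         # hypothetical label: "low", "reasonable", or "great"
    succeeded: bool   # whether the client's issue was resolved


def success_rates(clients, min_hours=1.0):
    """Return per-group counts, share of clients, and success rate.

    Assumption: only clients with at least `min_hours` of contact are counted.
    """
    eligible = [c for c in clients if c.hours >= min_hours]
    groups = {
        "all_at_least_min_hours": eligible,
        "reasonable_chance": [c for c in eligible if c.tier == "reasonable"],
        "great_chance": [c for c in eligible if c.tier == "great"],
    }
    report = {}
    for name, members in groups.items():
        n = len(members)
        report[name] = {
            "n": n,
            # fraction of eligible clients that fall into this group
            "share_of_clients": n / len(eligible) if eligible else None,
            # fraction of the group that succeeded (None if the group is empty)
            "success_rate": sum(c.succeeded for c in members) / n if n else None,
        }
    return report
```

Reporting the group sizes alongside the rates is what lets low-odds clients be taken on without dragging down the headline number, since their outcomes appear only in the overall row.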

could you give a few examples? 

also seems time-intensive hmmmm

also, i thought about it more and i really like the metric of "results generated per hour"

1gw
I think you've already given several examples: It would already be informative if you put numbers on each of these questions (i.e. "how often does talking for 15 minutes accomplish something", "how many bounties have you taken on in/outside of your specialty", "what percent of your clients are 'unagentic and slow' (and what does this actually mean)"). Probably one could do much better by generating several metrics that one would expect to be most useful (or top N%tile useful) and share each of them.

wow this is controversial (my own vote is +6)

wonder why

I upvoted for the novelty of a rationalist trying a bounty based career. But also this halfway reads as an advertisement for your life coaching service. I wouldn’t want to see much more in that direction.

boundaries / membranes

  • One-sentence summary: Formalise one piece of morality: the causal separation between agents and their environment. See also Open Agency Architecture.
  • Theory of change: Formalise (part of) morality/safety, solve outer alignment.

Chris Lakin here - this is a very old post and What does davidad want from «boundaries»? should be the canonical link

Update: Bob has recorded a 6-month follow-up here.

Why was this post tagged as boundaries/membranes? I'm inclined to remove the tag.

1Matthew McRedmond
I only skimmed that category but if I'm not mistaken the kind of systems I describe in the piece are special cases of times when the boundary between defining agents and one agent and another is unclear/pivotal/insightful etc.

another weird bug is if i click the link i was just sent in my email, it brings me to a 403 Forbidden page (even though the URLs of this functional page and that 403 page look identical)

4habryka
Should now be fixed. We've blocked traffic to basically all pages and been restoring them incrementally to make sure we don't go down again immediately. I just lifted the last of those blocks.
Chipmonk258

I've run two workshops at LightHaven and it's pretty unthinkable to run a workshop anywhere else in the Bay Area. Lightcone has really made it easy to run overnight events without setup

Yeah i'm confused about what to name it. we can always change it later i guess.

also let me know if you have any posts you want me to definitely tag for it that you think i might miss otherwise

3Mateusz Bagiński
Compositional agency?

Do we have a LessWrong tag for "hierarchical agency" or "multi-scale alignment" or something? Should I make one?

2Jan_Kulveit
I guess make one? Unclear if hierarchical agency is the true name

I just made a twitter list with accounts interested in hierarchical agency (or what i call "multi-scale alignment"). Lmk who should be added

Random but you might like this graphic I made representing hierarchical agency from my post today on a very similar idea. What would you change about it?

Chipmonk124

This was an impressive demonstration of Claude for interviews. Was this one take?

(Also what prompt did you use? I like how your Claude speaks.)

There was some selection of branches, and one pass of post-processing.

It was after ~30 pages of a different conversation about AI and LLM introspection, so I don't expect the prompt alone will elicit the "same Claude". Start of this conversation was

Thanks! Now, I would like to switch to a slightly different topic: my AI safety oriented research on hierarchical agency. I would like you to role-play an inquisitive, curious interview partner, who aims to understand what I mean, and often tries to check understanding using paraphrasing, giving examples, and si...

I'm glad you wrote this! I've been wanting to tell others about ACS's research and finally have a good link

Great question, thanks!

I think you're correct in pointing towards the existence of basically-all-downside genetic conditions, but I still think these are in the minority. Moreover, even most of those don't create a big issue on the object level— compared to how people might feel about the issue as a result.

This argument doesn't extend to conditions like Huntington's, but if a person is missing a pinky finger, most of the issues the person is going to face are related to social factors and their own emotions, not the physical aspect.

I also just say this from experience helping others

I did not say that depression is always a strategy for everyone.

4Archimedes
I didn't mean to suggest that you did. My point is that there is a difference between "depression can be the result of a locally optimal strategy" and "depression is a locally optimal strategy". The latter doesn't even make sense to me semantically whereas the former seems more like what you are trying to communicate.
Answer by Chipmonk80

I wrote about my own experience discovering “feelings in the body” here
