Doing some experiments at METR Sep-Oct 2024
Here is my understanding. Is this right?
Incredible!! I am going to try this myself. I will let you know how it goes.
honesty vector tuning showed a real advantage over honesty token tuning, comparable to honesty vector steering at the best layer and multiplier:
Is this backwards? I'm having a bit of trouble following your terms. Seems like this post is terribly underrated -- maybe others also got confused? Basically, you only need 4 terms, yes?
* base model
* steered model
* activation-tuned model
* token cross-entropy trained model
I think I was reading half the plots backwards or something. Anyway I bet if you reposted with clearer terms/plots then you'd get some good followup work and a lot of general engagement.
Hey!!! Thanks for replying. But did you or anyone you know consider chemical cisgenderization? Or any mention of such in the forums? I would expect it to be a much stronger effect than eg joining the military. Although I hear it is common for men in the military to take steroids, so maybe there would be some samples there.... I imagine taking cis hormones is not an attractive idea, because if you dislike the result then you're worse off than you started.
(Oh and we were still together then. LK has child now, not sure how that affects the equation.)
Thank you! Seems like this bot works quite well for this task
I have used a number of Discourse forums and they just feel bad/wrong but I cannot explain why. I would also vote for more of an old-fashioned phpBB with a nice theme. Those are always great, even though all my intuitions tell me they should suck. Shows how little I know.
Eg https://github.com/phpbb/phpbb
Also has styles: https://www.phpbb.com/customise/db/styles/board_styles-12?sid=6245508b90fd3410be19888406fae215
Basically I'm repeating what Said said
If you have a clear metric to judge candidates on (eg engagement on a LinkedIn ad) then you might be able to do a super effective and quick performance-based hiring method. Shameless plug: https://www.lesswrong.com/posts/3AZkXwcCJZc5CAFQN/how-to-hire-somebody-better-than-yourself
Good luck!
Thanks for the cached explanation, this is similar to what I thought until a few days ago. But now I'm thinking that an older-but-still-youthful mouse would be better at avoiding predators and could be just as fertile, if mice were long lived. So the food & shelter might be "better spent" on them, in terms of total expected descendants. This would only leave the disease explanation, yes?
Hey thanks much for sharing new info with me. What a nice comment to read. I was sure someone would come by and be pissed and mean as hell, but folks have been engaging in quite good faith.
but I'm more reserved
I think this might point at the central problem with my evidence. People vary in how publicly they live their lives by orders of magnitude. It could be that only 1% of math geniuses are trans women but they post / get views on Twitter 100x more. Or a similar thing in high school and the workplace. Math professors tend to live quiet lives...
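To put rough numbers on that, here is a quick sketch of the arithmetic. The function name and the simple weighting model (every person in the subgroup is equally visible, everyone else has visibility 1) are my own assumptions for illustration, and the 1% and 100x figures are just the hypotheticals from above, not data.

```python
def visible_share(population_share: float, visibility_ratio: float) -> float:
    """Share of total public visibility held by a subgroup.

    population_share: fraction of the population in the subgroup (0..1)
    visibility_ratio: how many times more visible each subgroup member is
                      than a member of the rest of the population
    (Hypothetical helper; assumes uniform visibility within each group.)
    """
    subgroup_weight = population_share * visibility_ratio
    rest_weight = (1.0 - population_share) * 1.0
    return subgroup_weight / (subgroup_weight + rest_weight)


# 1% of the population, each 100x more visible than everyone else
share = visible_share(0.01, 100.0)
print(f"{share:.1%}")  # prints 50.3%
```

So under those made-up numbers, a group that is 1% of the population would account for about half of what an outside observer actually sees.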
Anyway, unfortunately I think this post might be kinda too toxic/hurtful for the average reader to be worthwhile overall (although nobody has mentioned that to me) and I'll probably move it to a pastebin or something.
I think the basic question (whether hormones are fucking or helping your brain long-term) is quite important and deserves a better treatment. I might try to do that eventually.
Wrong link? Looks like this is it https://arxiv.org/abs/2409.06927