
My website is here.


As a general rule, I try to minimise my phone screen time and maximise my laptop screen time. I can do every "productive" task faster on a laptop than on my phone.

Here are some object-level things I do that I find helpful and haven't yet seen discussed.

  • Use a very minimalist app launcher on my phone that makes searching for apps a conscious decision.
  • Use a greyscale filter on my phone (which is hard to turn off), as this makes doing most things on my phone harder.
  • Every time I get a notification I didn't need to get, I instantly disable it. This also generalises to unsubscribing from emails I don't need to receive.

Yep, this sounds interesting! My suggestion for anyone wanting to run this experiment would be to start with SAD-mini, a subset of SAD with the five most intuitive and simple tasks. It should be fairly easy to adapt our codebase to call the Goodfire API. Feel free to reach out to me or @L Rudolf L if you want assistance or guidance.

How do you know what "ideal behaviour" is after you steer or project out your feature? How would you differentiate a feature with sufficiently high cosine similarity to a "true model feature" from the "true model feature" itself? I agree you can get some signal on whether a feature is causal, but would argue this is not ambitious enough.

Yes, that's right -- see footnote 10. We think that Transcoders and Crosscoders are directionally correct, in the sense that they leverage more of the model's functional structure via activations from several sites, but agree that their vanilla versions suffer similar problems to regular SAEs.

Also related to the idea that the best linear SAE encoder is not the transpose of the decoder.

For another perspective on leveraging startups for improving the world, see this blog post by @benkuhn.

A LW feature that I would find helpful is an easy-to-access list of all links cited by a given post.

Agreed that this post presents the altruistic case.

I discuss both the money and status points in the "career capital" paragraph (though perhaps should have factored them out).

your image of a man with a huge monitor doesn't quite scream "government policymaker" to me
