Joseph Miller - LessWrong

Advice for Activists from the History of Environmentalism

Thanks, this is really useful.

I am of the opinion that you should use good epistemics when talking to the public or policy makers, rather than using bad epistemics to try to be more persuasive.

Do you have any particular examples as evidence of this? This is something I've been thinking a lot about for AI and I'm quite uncertain. It seems that ~0% of advocacy campaigns have good epistemics, so it's hard to have evidence about this. Emotional appeals are important and often hard to reconcile with intellectual honesty.

Of course there are different standards for good epistemics and it's probably bad to outright lie, or be highly misleading. But by EA standards of "good epistemics" it seems less clear if the benefits are worth the costs.

As one example, the AI Safety movement may want to partner with advocacy groups who care about AI using copyrighted data or unions concerned about jobs. But these groups basically always have terrible epistemics and partnering usually requires some level of endorsement of their positions.

As an even more extreme example, as far as I can tell about 99.9% of people have terrible epistemics by LessWrong standards, so to expand to even a decently sized movement you will have to fill the ranks with people who constantly say and think things that you believe are wrong.

How To Do Patching Fast

Joseph Miller3d10

I'm not sure if this is intentional but this explanation implies that edge patching can only be done between nodes in adjacent layers, which is not the case.

How To Do Patching Fast

Joseph Miller3d10

Yes you're correct that it does not work with LayerNorm between layers. I'm not aware of any models that do this. Are you?

How To Do Patching Fast

Joseph Miller3d10

Did you try how this works in practice? I could imagine an SGD-based circuit finder could be pretty efficient (compared to brute-force algorithms like ACDC), I'd love to see that comparison some day!

Yes it does work well! I did a kind of write up here but decided not to publish for various reasons.

Do you have a link to a writeup of Li et al. (2023) beyond the git repo?

https://arxiv.org/abs/2309.05973

Rejecting Television

Joseph Miller11d30

I quit YouTube a few years ago and it was probably the single best decision I've ever made.

However, I also found that I naturally substitute it with something else. For example, I subsequently became addicted to Reddit. I quit Reddit and replaced it with Hacker News and LessWrong. When I quit those, I replaced them with checking Slack, email and Discord.

Thankfully being addicted to Slack does seem to be substantially less harmful than YouTube.

I've found the app OneSec very useful for reducing addictions. It's an app blocker that doesn't actually block, it just delays you opening the page, so you're much less likely to delete it in a moment of weakness.

Why I'm doing PauseAI

Joseph Miller12d10

Or is that sentence meant to indicate that an instance running after training might figure out how to hack the computer running it so it can actually change its own weights?

I was thinking of a scenario where OpenAI deliberately gives it access to its own weights to see if it can self improve.

I agree that it would be more likely to just speed up normal ML research.

Why I'm doing PauseAI

Joseph Miller13d65

While I want people to support PauseAI, "the small movement that PauseAI builds now will be the foundation which bootstraps this larger movement in the future" is one of the main points of my post. If you support PauseAI today you may unleash a force which you cannot control tomorrow.

Thoughts on seed oil

Joseph Miller23d40

If you want to be healthier, we know ways you can change your diet that will help: Increase your overall diet “quality”. Eat lots of fruits and vegetables. Avoid processed food. Especially avoid processed meats. Eat food with low caloric density. Avoid added sugar. Avoid alcohol. Avoid processed food.

I'm confused - why are you so confident that we should avoid processed food? Isn't the whole point of your post that we don't know whether processed oil is bad for you? Where's the overwhelming evidence that processed food in general is bad?

Normalizing Sparse Autoencoders

Joseph Miller1mo10

Reconstruction loss is the CE loss of the patched model

If this is accurate then I agree that this is not the same as "the KL Divergence between the normal model and the model when you patch in the reconstructed activations". But Fengyuan described reconstruction score as:

measures how replacing activations changes the total loss of the model

which I still claim is equivalent.
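To make the two quantities in this exchange concrete, here is a minimal sketch in plain Python. It does not settle the disagreement; it only shows one relationship that holds under a stated assumption: if the label is drawn from the clean model's own predictive distribution, the expected change in CE loss from patching equals the KL divergence between the clean and patched predictions. For any single fixed label the two generally differ. The logits and all function names are hypothetical, chosen purely for illustration.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """CE loss of a predicted distribution against one correct label index."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical next-token logits: the normal model, and the model with
# reconstructed activations patched in. These values are made up.
clean_logits = [2.0, 0.5, -1.0]
patched_logits = [1.5, 1.0, -0.5]

clean = softmax(clean_logits)
patched = softmax(patched_logits)

kl = kl_divergence(clean, patched)

# Change in CE loss caused by patching, for each possible correct label.
loss_change = [
    cross_entropy(patched, y) - cross_entropy(clean, y)
    for y in range(len(clean))
]

# Assumption: labels are distributed according to the clean model.
# Under that assumption the expected loss change equals the KL divergence.
expected_loss_change = sum(p * d for p, d in zip(clean, loss_change))

print("KL(clean || patched):", kl)
print("expected CE change under clean labels:", expected_loss_change)
print("CE change per fixed label:", loss_change)
```

With real data the labels come from the dataset rather than from the clean model, so the measured loss change and the KL divergence need not coincide.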

Normalizing Sparse Autoencoders

Joseph Miller1mo10

I think just showing would be better than the reconstruction score metric because $L_{0}$ is very noisy.
