Joseph Miller - LessWrong

It might be nice to move all AI content to the Alignment Forum. I'm not sure the effect you're discussing is real, but if it is, it might be because LW has become a de facto academic journal for AI safety research, so many people are posting without significant engagement with the LW canon or any interested in rationality.

The current rules around who can post on the Alignment Forum seem a bit antiquated. I've been working on alignment research for over 2 years and I don't know off the top of my head how to get permission to post there. And I expect the relevant people to see stuff if it's on LW anyway.

β-redex's Shortform

Joseph Miller5d20

The thread under this comment been yourself and Said seems to conflate two different questions, resulting in you talking past each other:
1. Can Aella predict that people will be offended by things.
2. Can Aella empathize with the offended people / understand why they are offended.

My guess would be that Aella can generally predict people's reactions, but she cannot empathize with their point of view.

The nearby pattern of internet bait that I dislike is when someone says "I don't understand how someone can say/think/do X" where this is implicitly a criticism of the behavior.

I think the reason I find these tweets of Aella's highly engaging and rage-baity is that they generally read as criticisms to me. Perhaps this is uncharitable, but I expect this is also how most others read them.

β-redex's Shortform

Joseph Miller9d*136

I hope I don't have to explain why some people would rather not go near X/Twitter with a ten foot pole.

Right. So for me the even bigger question is "why is someone like Eliezer on twitter at all?" If politics is the mind killer, twitter is the mind mass-murderer.

It delivers you the tribal fights most tailored to trigger you based on your particular interests. Are you a grey tribe who can rise above the usual culture war battles? Well then here's an article about NIMBYs or the state of American education that's guaranteed to make you seethe with anger at people's stupidity.

When I read Eliezer's sequences I feel like I've learned some interesting ideas and spent my time well. When I read Eliezer's twitter I feel frustrated by the world and enraged by various out-group comments.

Or Aella, whose blog is exciting and interesting. But who posts comments on twitter like "I'm shocked that people are offended by [highly decoupled statement] / [extremely norm-breaking behavior]" when it is completely obvious that most highly-coupling / norm-following people would be triggered by that.

Isopropylpod's Shortform

Joseph Miller12d53

This discussion has been had many times before on LessWrong. I suggested taking Why it's so hard to talk about Consciousness as a starting point.

Joseph Miller's Shortform

Joseph Miller18d8946

Anthropic is reportedly lobbying against the federal bill that would ban states from regulating AI. Nice!

The best approaches for mitigating "the intelligence curse" (or gradual disempowerment); my quick guesses at the best object-level interventions

Joseph Miller18d144

Implicit in my views is that the problem would be mostly resolved if people had aligned AI representatives which helped them wield their (current) power effectively.

Can you make the case for this a bit more? How are AI representatives going to help people prevent themselves becoming disempowered / economically redundant? (Especially given that you explicitly state you are skeptical of "generally make humans+AI (rather than just AI) more competitive").

Mandatory interoperability for alignment and fine-tuning

Furthermore, I don't really see how fine-tuning access helps create AI representatives. Models are already trained to be helpful and most people don't have very useful personal data that would make their AI work much better for them (that can't be put in context of any model).

The hope here would be to get the reductions in concentration of power that come from open source

The concentration of power from closed source AI comes from (1) the AI companies' profits and (2) the AI companies having access to more advanced AI than the public. Open source solves (1), but fine-tuning access solves neither. (Obviously your "Deploying models more frequently" proposal does help with (2)).

LessWrong Feed [new, now in beta]

Joseph Miller20d20

Yeah that would be great!

LessWrong Feed [new, now in beta]

Joseph Miller20d20

I appreciate the effort to avoid the usual downsides of feeds. But your arguments that this feed will be good actually don't persuade me. I already spend more time than I endorse on LessWrong, especially since the introduction of quick takes.

Zac Hatfield Dodds's Shortform

Joseph Miller21d90

Does anyone understand the real motivation here? Who at Anthropic makes the call to appoint a random CEO who (presumably) doesn't care about x-risk, and what do they get out of it?

Can We Naturalize Moral Epistemology?

Joseph Miller25d*20

This is the best proposal I've read for making progress on the question of "What should we point the ASI at?"

Are humans such that they can be brought through a series of arguments and endorse all of the premises in the arguments and end up with radically different moral views

I don't feel quite satisfied with this operationalization. It could Goodharted (especially by an ASI) by finding adversarial inputs to humans.

I also think someone could decide that their meta-preference is to endorse their unconsidered preferences and that the act of thinking about moral arguments is actually against their morality. This may sound absurd but probably accurately describes many people's responses to drowning child thought experiments. Most people don't act as if they have utility functions and I don't see why people should be required to adopt one. To the extent that it matters a lot what people think about their preferences upon reflection, they will probably notice the inconsistencies in their preference once they start Dutch-booking themselves by manifesting inconsistent preferences with the help of ASI.

kinds of questions we know how to solve, and could solve with the help of AI

I predict that once people can talk to ASIs, many will quickly see the bizareness of the realist view and a lot more people will start saying "I don't know what you're talking about, can you please make this clearer?".

LESSWRONG
LW

Posts

Wikitag Contributions

Comments