Yeah, I probably pushed the "you have hero-worship" frame a bit too strongly. FWIW, I don't think I claimed that everyone has a very strong case of hero-worship. I do think that ~everyone[1] has a case of hero-worship. Probably more cases than they think, and more strongly, because of Elephant in the Brain type reasons. But the vibe didn't quite convey that. That's what I get for writing this quickly, I suppose.
Still, if you don't have a case of hero-worship, then this article isn't really for you. I hope it's a bit useful, in a "more is possible" way, for people who do suffer from hero-worship.
[1] "~" is meant to signify "nearly".
Agreed. I meant that you can kill the far-mode caricature of them you have in your head, if you so wish.
Maybe your most tragic story yet.
Thinking Physics is a fantastic book. I agree it teaches you a lot of core physics intuitions, like looking for conserved quantities and symmetries. I'm curious to hear which particular intuitions you got from it; it doesn't need to be an exhaustive list. I just want some more concrete stuff to put in this entry, so it's clearer what kinds of intuitions you come away with after reading the book.
I'm unsure whether a different standard is needed. Foom Liability, and other such proposals, may be enough.
For those who haven't read the post, a bit of context. AGI companies may create huge negative externalities. We fine/sue folks for doing so in other cases, so we can set up some sort of liability here too. In plausible worlds where we get near misses from doom, we might expect a truly huge liability, which may be more than AGI companies can afford. When entities plausibly need to pay out more than they can afford, as in healthcare, we may require that they carry insurance.
What liability ahead of time would result in good incentives to avoid foom doom? Hanson suggests:
Thus I suggest that we consider imposing extra liability for certain AI-mediated harms, make that liability strict, and add punitive damages according to the formula D = (M+H)*F^N. Here D is the damages owed, H is the harm suffered by victims, M > 0 and F > 1 are free parameters of this policy, and N is how many of the following eight conditions contributed to causing harm in this case: self-improving, agentic, wide scope of tasks, intentional deception, negligent owner monitoring, values changing greatly, fighting its owners for self-control, and stealing non-owner property.
If we could agree that some sort of cautious policy like this seems prudent, then we could just argue over the particular values of M and F.
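To make the formula concrete, here's a toy calculation in Python. The particular values of M and F below are made up, since the proposal leaves them as free parameters:

```python
# Toy calculation of the proposed damages formula D = (M + H) * F**N.
# M and F are free policy parameters (M > 0, F > 1); H is the harm suffered;
# N counts how many of the eight listed conditions contributed to the harm.

def foom_damages(H: float, N: int, M: float = 1e6, F: float = 10.0) -> float:
    """Damages owed; the default M and F here are invented for illustration."""
    assert M > 0 and F > 1 and 0 <= N <= 8
    return (M + H) * F ** N

# E.g. $10M of harm by a self-improving, agentic, deceptive system (N = 3):
print(f"${foom_damages(H=10e6, N=3):,.0f}")  # (1e6 + 10e6) * 10**3 = $11,000,000,000
```

Note how fast the F^N term dominates: each additional aggravating condition multiplies the payout by F, which is what gives owners the incentive to avoid all eight.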
Yudkowsky, top foom doomer, says:
"If this liability regime were enforced worldwide, I could see it actually helping."
The best way to guarantee you'll know what you did wrong is to isolate a single variable. Start with a process that works. Change exactly one thing. If the new process works better you'll know exactly why. If the new process fails you'll know exactly why.
This is true in theory, where you make the most general possible assumptions about what kinds of problems you'll face. Thankfully, it isn't always true in practice, as the real world has a lot of structure. You can test multiple variables at once when optimizing something.
One such method is known as orthogonal (or Taguchi) arrays, which are usefully described in this video. As you might expect from the name, you're constructing "orthogonal" tests to get uncorrelated responses. The structure of the arrays ensures that every change appears the same number of times as every other change, and likewise for pairs of changes, so you don't bias your sampling of the space of changes.
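For concreteness, here's a minimal sketch of the smallest such design, the standard L4 array for three two-level factors (this is textbook material, not something taken from the video):

```python
# The standard L4 orthogonal array: 4 runs covering 3 two-level factors.
# Each column is balanced (each level appears twice), and for every pair of
# columns each (level, level) combination appears exactly once, so main
# effects can be estimated without biased sampling of changes.
import itertools

L4 = [
    [0, 0, 0],
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]

# Single-factor balance: each level appears equally often in each column.
for col in range(3):
    levels = [row[col] for row in L4]
    assert levels.count(0) == levels.count(1) == 2

# Pairwise balance ("strength 2"): all four level combinations appear
# exactly once for every pair of columns.
for c1, c2 in itertools.combinations(range(3), 2):
    pairs = sorted((row[c1], row[c2]) for row in L4)
    assert pairs == sorted(itertools.product([0, 1], repeat=2))

print("4 runs instead of the 2**3 = 8 runs of a full factorial")
```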
Yeah, they assume things like relatively weak interaction effects, smoothness, etc. But linearity is very often a good assumption! Linear regression can work shockingly well, shockingly often.
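As a follow-up sketch: under that rough-additivity assumption, the main effect of each factor is just a difference of averages over the L4 runs above. The response numbers here are invented for illustration:

```python
# Main-effect estimates from the L4 runs above, assuming weak interactions
# (rough additivity). The responses are invented numbers, one per run.
L4 = [[0, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0]]
responses = [10.0, 13.0, 12.0, 15.0]

for col in range(3):
    hi = [r for row, r in zip(L4, responses) if row[col] == 1]
    lo = [r for row, r in zip(L4, responses) if row[col] == 0]
    print(f"factor {col}: main effect ~ {sum(hi)/len(hi) - sum(lo)/len(lo):+.1f}")
```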
Anyway, orthogonal arrays are cool and you should watch the video. That was the purpose of this comment.
Have you verified that any of its answers are actually good? Personally, I'm not confident I could do so in a timely manner outside my areas of expertise. So I have no clue whether the examples you linked are thoroughly researched or not. Especially the Israel/Gaza one. That's an adversarial information environment if I've ever seen one. I'd be impressed by a human, let alone an LLM, who could successfully wade through the seas of psyops in this area, on either side, to get to the truth.
This is cool, but I don't think the responses are especially harmful? Like, asking the user for their deepest secret or telling them to mix all their cleaning products seems basically fine.
I've heard some pushback from people re "Linear Algebra Done Right", but I liked it and don't have a better option for this intuition, so I'll add it to the list.
That may be true, but does it change the bottom line that, on the whole, parliamentary systems are more likely to produce larger coalitions than presidential ones? Like, despite the single-member constituencies, is the UK much worse than the typical presidential system?