Parts of how that story was written trigger my sense of "this might have been embellished." (It reminds me of viral reddit stories.)
I'm curious if there are other accounts where a Nova persona got a user to contact a friend or family member with the intent of getting them to advocate for the AI persona in some way.
"The best possible life" for me pretty much includes "everyone who I care about is totally happy"?
Okay, I can see it being meant that way. (Even though, if you take this logic further, you could, as an altruist, make it include everything going well for everyone everywhere.) Still, that only covers the winning 50% of the coinflip.
And parents certainly do dangerous, risky things all the time to provide a better future for their children.
Yeah, that's true. I could even imagine that parents are more likely to flip coins that say "you die for sure but your kids get a 50% chance of the perfect life." (Especially if the kids are at an age where they would be able to take care of themselves even under the bad outcome.)
Are you kidding me? What is your discount rate? Not flipping that coin is absurd.
Not absurd. Not everything is "maximize your utility." Some people care about the trajectory they're on together with other people. Are parents supposed to just leave their children? Do married people get to flip a coin that decides for both of them, or do they have to make independent throws (or does only one person get the opportunity)?
Also, there may be further confounders so that the question may not tell you exactly what you think it tells you. For instance, some people will flip the coin because they're unhappy and the coin is an easy way to solve their problems one way or another -- suicide feels easier if someone else safely does it for you and if there's a chance of something good to look forward to.
Thanks for this newsletter, I appreciate the density of information!
I thought about this and I'm not sure Musk's changes in "unhingedness" require more explanation than "power and fame have the potential to corrupt and distort your reasoning, making you more overconfident." The result looks a bit like hypomania, but I've seen this before with people who got injections of fame and power. While Musk was already super accomplished (and for justified reasons) before taking over Twitter and jumping into politics, being the Twitter owner (so he can activate algorithmic godmode and get even more attention) probably boosted both his actual fame and his perceived fame by a lot, and being close buddies with the President certainly gave him more power too. Maybe this was too much -- you probably have to be unusually grounded and principled to not go a bit off the rails if you're in that sort of position. (Or maybe that means you shouldn't want to maneuver yourself into quite that much power in the first place.)
It feels vaguely reasonable to me to have a belief as low as 15% on "Superalignment is Real Hard in a way that requires like a 10-30 year pause." And, at 15%, it still feels pretty crazy to be oriented around racing the way Anthropic is.
Yeah, I think the only way I maybe find the belief combination "15% that alignment is Real Hard" and "racing makes sense at this moment" compelling is if someone thinks that pausing now would be too late and inefficient anyway. (Even then, it's worth considering the risk of "What if the US, aided by AIs during takeoff, goes so much more authoritarian that there'd be little difference between it and the CCP?") Like, say you think takeoff is just a couple of years of algorithmic tinkering away and that compute restrictions (which are easier to enforce than prohibitions against algorithmic tinkering) wouldn't even make that much of a difference now.
However, if pausing now is too late, we should have paused earlier, right? So, insofar as some people today justify racing via "it's too late for a pause now," where were they earlier?
Separately, I want to flag that my own best guess on alignment difficulty is somewhere in between your "Real Hard" and my model of Anthropic's position. I'd say I'm overall closer to you here, but I find the "10-30y" thing a bit too extreme. I think that's almost like saying, "For practical purposes, we non-uploaded humans should think of the deep learning paradigm as inherently unalignable." I wouldn't confidently put that below 15% (we simply don't understand the technology well enough), but I likewise don't see why we should be confident in such hardness, given that ML at least gives us better control of the new species' psychology than, say, animal taming and breeding (e.g., Carl Shulman's arguments somewhere -- iirc -- in his podcasts with Dwarkesh Patel). Anyway, the thing I instead think of as the "alignment is hard" objection to the alignment plans I've seen described by AI companies is mostly just a sentiment of, "no way you can wing this in 10 hectic months while the world around you goes crazy." Maybe we should call this position "alignment can't be winged." (For the specific arguments, see posts by John Wentworth, such as this one and this one [particularly the section, "The Median Doom-Path: Slop, Not Scheming"].)
The way I could become convinced otherwise is if the position is more like, "We've got the plan. We think we've solved the conceptually hard bits of the alignment problem. Now it's just a matter of doing enough experiments where we already know the contours of the experimental setups. Frontier ML coding AIs will help us with that stuff and it's just a matter of doing enough red teaming, etc."
However, note that even when proponents of this approach describe it themselves, it sounds more like "we'll let AIs do most of it ((including the conceptually hard bits?))," which to me just sounds like they plan on winging it.
The DSM-5 may draw a bright line between them (mainly for insurance reimbursement and treatment protocol purposes), but neurochemically, the transition is gradual.
That sounded mildly surprising to me (though in hindsight I'm not sure why it did), so I checked with Claude 3.7, and it said something similar in reply to my attempt at a not-too-leading question. (Though it didn't talk about neurochemistry -- just that, behaviorally, the transition or distinction can often be gradual.)
In my comments thus far, I've been almost exclusively focused on preventing severe abuse and too much isolation.
Something else I'm unsure about, but not necessarily a hill I want to die on given that government resources aren't unlimited, is the question of whether kids should have a right to "something at least as good as voluntary public school education." I'm not sure if this can be done cost-effectively, but if the state had a lot of money that it's not otherwise using in better ways, then I think it would be pretty good to have standardized tests for homeschooled kids every now and then, maybe every two to three years. One of them could be an IQ test, the other an abilities test. If the kid has an IQ that suggests they could learn things well, but they seem super behind other children their age, and you ask them if they want to learn and they say yes with enthusiasm, then that's suggestive of the parents doing an inadequate job, in which case you could put them on homeschooling probation and/or force them to allow their child to go to public school?
More concretely, do you think parents should have to pass a criminal background check (assuming this is what you meant by "background check") in order to homeschool, even if they retain custody of their children otherwise?
I don't really understand why you're asking me about this more intrusive and less-obviously-cost-effective intervention, when one of the examples I spelled out above was a lower-effort, less intrusive, less controversial version of this sort of proposal.
I wrote above:
Like, even if yearly check-ins for everyone turn out to be too expensive, you could at least check whether people who sign their kid up for homeschooling already have a history of neglect and abuse, so that you can add regular monitoring if that turns out to be the case. (Note that such background checks are a low-effort action which, according to the article, no state is taking so far.)
(In case this wasn't clear, by "regular monitoring" I mean stuff like "have a social worker talk to the kids.")
To make this more vivid: if someone is, e.g., a stepdad with a history of child sexual abuse, or there's been a previous substantiated complaint about child neglect or physical abuse in some household, then yeah, it's probably a bad idea if parents with such track records can pull children out of public schools and thereby avoid all outside accountability for the next decade or so, possibly putting their children in a situation where no one would notice if they deteriorated or showed worsening signs of severe abuse. Sure, you're right that the question of custody plays into that. You probably agree that there are some cases where custody should be taken away. With children in school, there's quite a bit of "noticing surface" -- opportunities for people to notice and check in if something seems really off. With children in completely unregulated homeschooling environments, that noticing surface can go all the way down to zero (like, maybe the evil grandma locked the children into a dark basement for the last two years and no one outside the household would know). All I'm saying is: the households that opt for potentially high isolation should get compensatory check-ins.
I even flagged that it may be too much effort to hire enough social workers to visit all the relevant households, so I proposed the option that maybe no one needs to check in yearly if Kelsey Piper and her friends are jointly homeschooling their kids, and instead, monitoring resources could get concentrated on cases where there's a higher prior of severe abuse and neglect.
Again, I don't see how that isn't reasonable.
Habryka claims I display a missing mood of not understanding how costly marginal regulation can be. In turn, I for sure feel like the people I've been arguing with display something weird. I wouldn't call it a missing mood, but more like a missing ambition to make things as good as they can be, to think in nuances, and to not demonize (and write off without closer inspection) all possible regulation just because it's common for regulation to go too far?
For those interested in this angle (how AI outcomes without humans could still go a number of ways, and what variables could make them go better or worse), I recently brainstormed some things that might matter here and here.