It feels vaguely reasonable to me to have a belief as low as 15% on "Superalignment is Real Hard in a way that requires like a 10-30 year pause." And, at 15%, it still feels pretty crazy to be oriented around racing the way Anthropic is.
Yeah, I think the only way I maybe find the belief combination "15% that alignment is Real Hard" and "racing makes sense at this moment" compelling is if someone thinks that pausing now would be too late and inefficient anyway. (Even then, it's worth considering the risks of "What if the US aided by AIs during takeoff...
The DSM-5 may draw a bright line between them (mainly for insurance reimbursement and treatment protocol purposes), but neurochemically, the transition is gradual.
That sounded mildly surprising to me (though in hindsight I'm not sure why it did) so I checked with Claude 3.7, and it said something similar in reply to me trying to ask a not-too-leading question. (Though it didn't talk about neurochemistry -- just that behaviorally the transition or distinction can often be gradual.)
In my comments thus far, I've been almost exclusively focused on preventing severe abuse and too much isolation.
Something else I'm unsure about, but not necessarily a hill I want to die on given that government resources aren't unlimited, is the question of whether kids should have a right to "something at least as good as a voluntary public school education." I'm not sure if this can be done cost-effectively, but if the state had a lot of money that it's not otherwise using in better ways, then I think it would be pretty good to have standardized tes...
More concretely, do you think parents should have to pass a criminal background check (assuming this is what you meant by "background check") in order to homeschool, even if they retain custody of their children otherwise?
I don't really understand why you're asking me about this more intrusive and less-obviously-cost-effective intervention, when one of the examples I spelled out above was a lower-effort, less intrusive, less controversial version of this sort of proposal.
I wrote above:
...Like, even if yearly check-ins for everyone turn out to be too exp
Thanks for elaborating, that's helpful.
If we were under a different education regime
Something like what you describe would maybe even be my ideal too (I'm hedging because I don't have super informed views on this). But I don't understand how my position of "let's make sure we don't miss out on low-cost, low-downside ways of safeguarding children (who btw are people too and didn't consent to be born, especially not in cases where their parents lack empathy or treat children as not people) from severe abuse" is committed to having to answer this hypote...
Seriously, -5/-11?
I went through my post line by line and I don't get what people are allergic to.
I'm not taking sides. I flagged that some of the criticisms of homeschooling appear reasonable and important to me. I'm pretty sure I'm right about this, but somehow people want me to say less of this sort of thing, because what? Because public schools are "hell"? How is that different from people who consider the other political party so bad that you cannot say one nuanced thing about them -- isn't that looked down upon on this site?
Also, speaking of "h...
I strong-disagreed since I don't think any of your listed criticisms are reasonable. The implied premise is that homeschooling is deviant in a way that justifies a lot of government scrutiny, when really parents have a natural right to educate their children the way they want (with government intervention being reasonable in extreme cases that pass a high bar of scrutiny).
In particular, I think that outside of an existing norm where most students go to public school, the things you listed would be obviously unjust. Do you think that parents who fail a crim...
I got the impression that using only an external memory like in the movie Memento (and otherwise immediately forgetting everything that wasn't explicitly written down) was the biggest hurdle to faster progress. I think it does kind of okay considering that huge limitation. Visually, it would also benefit from learning the difference between what is or isn't a gate/door, though.
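To make the "external memory" limitation concrete, here's a minimal sketch of the kind of note-taking loop such a scaffold relies on (purely hypothetical: I don't know the details of the actual Claude Plays Pokemon scaffold, and the function names are made up):

```python
# Minimal sketch of a Memento-style agent loop: the model is stateless across
# steps, so anything not written into `notes` is effectively forgotten.

def call_model(prompt: str) -> str:
    # Stand-in for the real LLM call; returns a canned reply for illustration.
    return "Walk north through the gap in the trees. NOTE: The forest exit is to the north."

def agent_step(notes: list[str], screen: str) -> tuple[str, list[str]]:
    prompt = (
        "Notes so far:\n" + "\n".join(notes[-50:])   # only a bounded window survives
        + "\n\nCurrent screen:\n" + screen
        + "\n\nDecide the next action, then add one new note after 'NOTE:'."
    )
    reply = call_model(prompt)
    action, _, new_note = reply.partition("NOTE:")
    if new_note.strip():
        notes.append(new_note.strip())               # anything not noted here is lost
    return action.strip(), notes

notes: list[str] = []
action, notes = agent_step(notes, "A clearing with a gap in the trees to the north.")
```

Everything outside the (bounded) notes list disappears between steps, which is exactly the Memento dynamic.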
It depends on the efficiency of the interventions you'd come up with (some may not be much of a "burden" at all) and on the elasticity with which parents who intend to homeschool are deterred by "burdens". You make a good point, but what you say is not generally true -- it totally depends on the specifics of the situation. (Besides, didn't the cited study say that both rates of abuse were roughly equal? I don't think anyone suggested that public-schooled kids have [edit: drastically] higher abuse rates than homeschooled ones? Was it 37% vs 36%?)
I feel like it's worth pointing out the ways homeschooling can go badly wrong. Whether or not there's a correlation between homeschooling and abuse, it's obvious that homeschooling can cover up particularly bad instances of abuse (even if it's not the only way to do that). So, a position of "homeschooling has the potential to go very wrong; we should probably have good monitoring to prevent that; are we sure we're doing that? Can we check?" seems sensible to me.
The article you call a "hit piece" makes some pretty sensible points. The title isn't s...
If there's less abuse happening in homeschooling than in regular schooling, a policy of "let's impose burdens on homeschooling to crack down on abuse in homeschooling" without a similar crackdown on abuse in non-home-schooling does not decrease abuse.
You can see something similar with self-driving cars. It is bad if a self-driving car crashes. It would be good to do things that reduce that. But if you get to a point where self-driving cars are safer than regular driving, and you continue to crack down on self-driving cars but not on regular driving, this is not good for safety overall.
Andrej Karpathy recently made a video on which model to use under what circumstances. A lot of it probably won't be new to people who read these AI overviews here regularly, but I learned things from it and it's something I'm planning to send to people who are new to working with LLMs.
...I want to flesh out one particular rushed unreasonable developer scenario that I’ve been thinking about lately: there’s ten people inside the AI company who are really concerned about catastrophic risk from misalignment. The AI company as a whole pays lip service to AI risk broadly construed and talks occasionally about risk from AGI, but they don’t take misalignment risk in particular (perhaps especially risk from schemers) very seriously.
[...]
What should these people try to do? The possibilities are basically the same as the possibilities for what
Great reply!
On episodic memory:
I've been watching Claude play Pokemon recently and I got the impression of, "Claude is overqualified but suffering from the Memento-like memory limitations. Probably the agent scaffold also has some easy room for improvements (though it's better than post-it notes and tattooing sentences on your body)."
I don't know much about neuroscience or ML, but how hard can it be to make the AI remember what it did a few minutes ago? Sure, that's not all that's between Claude and TAI, but given that Claude is now within the human expert ...
I liked most thoughts in this post even though I have quite opposite intuitions about timelines.
I agree timeline intuitions feel largely vibes based, so I struggle to form plausible guesses about the exact source behind our different intuitions.
I thought this passage was interesting in that respect:
Or, maybe there will be a sudden jump. Maybe learning sequential reasoning is a single trick, and now we can get from 4 to 1000 in two more years.
What you write as an afterthought is what I would have thought immediately. Sure, good chance I'm wrong. ...
[Edit: I wrote my whole reply thinking that you were talking about "organizational politics." Skimming the OP again, I realize you probably meant politics politics. :) Anyway, I guess I'm leaving this up because it also touches on the track record question.]
I thought Eliezer was quite prescient on some of this stuff. For instance, I remember this 2017 dialogue (so less than 2y after OpenAI was founded), which on the surface talks about drones, but if you read the whole post, it's clear that it's meant as an analogy to building AGI:
...AMBER: T
If so, why were US electricity stocks down 20-28% (wouldn't we expect them to go up if the US wants to strengthen its domestic AI-related infrastructure) and why did TSMC lose less, percentage-wise, than many other AI-related stocks (wouldn't we expect it to get hit hardest)?
In order to submit a question to the benchmark, people had to run it against the listed LLMs; the question would only advance to the next stage once the LLMs used for this testing got it wrong.
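As a rough illustration of that filtering step (the model list and function names below are made up; the actual submission pipeline presumably differs in its details):

```python
# Sketch of the adversarial filter described above: a submitted question only
# advances to the next stage if every model in the test panel answers it wrong.
# `ask_model` and MODELS are hypothetical stand-ins, not the real pipeline.

MODELS = ["model-a", "model-b", "model-c"]

def ask_model(model: str, question: str) -> str:
    # Stand-in for the API call made when a question is submitted for testing.
    return "some answer"

def advances_to_next_stage(question: str, reference_answer: str) -> bool:
    answers = [ask_model(m, question) for m in MODELS]
    # Exact string matching is a simplification; real grading would be more lenient.
    return all(a.strip() != reference_answer.strip() for a in answers)
```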
So I think the more rational and cognitively capable a human is, the more likely they'll optimize more strictly and accurately for future reward.
If this is true at all, it's not going to be a very strong effect, meaning you can find very rational and cognitively capable people who do the opposite of this in decision situations that directly pit reward against the things they hold most dearly. (And it may not be true because a lot of personal hedonists tend to "lack sophistication," in the sense that they don't understand that their own feelings of valuing ...
I like all the considerations you point out, but based on that reasoning alone, you could also argue that a con man who ran a lying scheme for 1 year and stole only like $20,000 should get life in prison -- after all, con men are pathological liars and that phenotype rarely changes all the way. And that seems too harsh?
I'm in two minds about it: On the one hand, I totally see the utilitarian argument of just locking up people who "lack a conscience" forever the first time they get caught for any serious crime. On the other hand, they didn't choose how they...
Suppose that a researcher's conception of current missing pieces is a mental object M, their timeline estimate is a probability function P, and their forecasting expertise F is a function that maps M to P. In this model, F can be pretty crazy, creating vast differences in P depending how you ask, while M is still solid.
Good point. This would be reasonable if you think someone can be super bad at F and still great at M.
Still, I think estimating "how big is this gap?" and "how long will it take to cross it?" might be quite related, so I expect the skills to be correlated or even strongly correlated.
It surveyed 2,778 AI researchers who had published peer-reviewed research in the prior year in six top AI venues (NeurIPS, ICML, ICLR, AAAI, IJCAI, JMLR); the median estimate for a 50% chance of AGI was either 23 or 92 years away, depending on how the question was phrased.
Doesn't that discrepancy (how much answers vary between different ways of asking the question) tell you that the median AI researcher who published at these conferences hasn't thought about this question sufficiently and/or sanely?
It seems irresponsible to me to update even just a small bit to ...
Doesn't that discrepancy (how much answers vary between different ways of asking the question) tell you that the median AI researcher who published at these conferences hasn't thought about this question sufficiently and/or sanely?
We know that AI expertise and AI forecasting are separate skills and that we shouldn't expect AI researchers to be skilled at the latter. So even if researchers have thought sufficiently and sanely about the question of "what kinds of capabilities are we still missing that would be required for AGI", they would still be lacking t...
Well, the update for me would go both ways.
On one side, as you point out, it would mean that the model's single pass reasoning did not improve much (or at all).
On the other side, it would also mean that you can get large performance and reliability gains (on specific benchmarks) by just adding simple stuff. This is significant because you can do this much more quickly than the time it takes to train a new base model, and there's probably more to be gained in that direction – similar tricks we can add by hardcoding various "system-2 loops" into ...
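To give a toy example of the kind of hardcoded "system-2 loop" I have in mind, here's a minimal self-consistency-style sketch (hypothetical helper names; real scaffolds are more involved):

```python
# Toy "system-2 loop": sample the model several times and take a majority vote,
# which can buy benchmark reliability without training a new base model.
from collections import Counter

def sample_model(prompt: str) -> str:
    # Hypothetical stand-in for the underlying (stochastic) model call.
    return "42"

def majority_vote(prompt: str, n_samples: int = 5) -> str:
    answers = [sample_model(prompt) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

print(majority_vote("What is 6 * 7?"))
```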
When the issue is climate change, a prevalent rationalist take goes something like this:
"Climate change would be a top priority if it weren't for technological progress. However, because technological advances will likely help us to either mitigate the harms from climate change or will create much bigger problems on their own, we probably shouldn't prioritize climate change too much."
We could say the same thing about these trends of demographic aging that you highlight. So, I'm curious why you're drawn to this topic and where the normative motivation...
"Climate change would be a top priority if it weren't for technological progress. However, because technological advances will likely help us to either mitigate the harms from climate change or will create much bigger problems on their own, we probably shouldn't prioritize climate change too much."
This attitude deserves a name: technocrastinating.
Technological progress has been happening for a while. At some point, this argument will stop making sense and we must admit that no, this (climate change, fertility, whatever) is not fine, stop technocrastinat...
The tabletop game sounds really cool!
Interesting takeaways.
The first was exactly the above point, and that at some point, ‘I or we decide to trust the AIs and accept that if they are misaligned everyone is utterly f***ed’ is an even stronger attractor than I realized.
Yeah, when you say it like that... I feel like this is gonna be super hard to avoid!
...The second was that depending on what assumptions you make about how many worlds are wins if you don’t actively lose, ‘avoid turning wins into losses’ has to be a priority alongside ‘turn your losses into not l
I agree that it sounds somewhat premature to write off Larry Page based on attitudes he had a long time ago, when AGI seemed more abstract and far away, and then not seek/try communication with him again later on. If that were Musk's true and only reason for founding OpenAI, then I agree that this was a communication fuckup.
However, my best guess is that this story about Page was interchangeable with a number of alternative plausible criticisms of his competition on building AGI that Musk would likely have come up with in nearby worlds. People like Musk (a...
I totally agree. And I also think that all involved are quite serious when they say they care about the outcomes for all of humanity. So I think in this case history turned on a knife's edge; Musk would at least not have done this much harm had he and Page thought and communicated more clearly, even if only by a little bit.
But I do agree that there's some motivated reasoning happening there, too. In support of your point that Musk might find an excuse to do what he emotionally wanted to anyway (become humanity's savior and perhaps emperor for eternit...
I thought the part you quoted was quite concerning, also in the context of what comes afterwards:
Hiatus: Sam told Greg and Ilya he needs to step away for 10 days to think. Needs to figure out how much he can trust them and how much he wants to work with them. Said he will come back after that and figure out how much time he wants to spend.
Sure, the email by Sutskever and Brockman gave some nonviolent communication vibes and maybe it isn't "the professional thing" to air one's feelings and perceived mistakes like that, but they seemed genuine in what ...
Some of the points you make don't apply to online poker. But I imagine that the most interesting rationality lessons from poker come from studying other players and exploiting them, rather than memorizing and developing an intuition for the pure game theory of the game.
It seems important to establish whether we are in fact going to be in a race and whether one side isn't already far ahead.
With racing, there's a difference between optimizing the chance of winning vs optimizing the extent to which you beat the other party when you do win. If it's true that China is currently pretty far behind, and if TAI timelines are fairly short so that a lead now is pretty significant, then the best version of "racing" shouldn't be "get to the finish line as fast as possible." Instead, it should be "use your lead to your advantage." So,...
Even if attaining a total and forevermore cessation of suffering is substantially more difficult/attainable by substantially fewer people in one lifetime, I don't think it's unreasonable to think that most people could suffer at least 50 percent less with dedicated mindfulness practice. I'm curious as to what might feed an opposing intuition for you! I'd be quite excited about empirical research that investigates the tractability and scalability of meditation for reducing suffering, in either case.
My sense is that existing mindfulness studies don't show th...
[...] I am certainly interested to know if anyone is aware of sources that make a careful distinction between suffering and pain in arguing that suffering and its reduction is what we (should) care about.
I did so in my article on Tranquilism, so I broadly share your perspective!
I wouldn't go as far as what you're saying in endnote 9, though. I mean, I see some chance that you're right in the impractical sense of, "If someone gave up literally all they cared about in order to pursue ideal meditation training under ideal circumstances (and during the trainin...
This would be a valid rebuttal if instruction-tuned LLMs were only pretending to be benevolent as part of a long-term strategy to eventually take over the world, and execute a treacherous turn. Do you think present-day LLMs are doing that? (I don't)
Or that they have a sycophancy drive. Or that, next to "wanting to be helpful," they also have a bunch of other drives that will likely win over the "wanting to be helpful" part once the system becomes better at long-term planning and orienting its shards towards consequentialist goals.
On that latter model...
I thought the first paragraph and the bolded bit of your comment seemed insightful. I don't see why what you're saying is wrong – it seems right to me (but I'm not sure).
I am not convinced MIRI has given enough evidence to support the idea that unregulated AI will kill everyone and their children.
The way you're expressing this feels like an unnecessarily strong bar.
Advocacy for an AI pause already seems pretty sensible to me if we accept the following premises:
Would most existing people accept a gamble with a 20% chance of death in the next 5 years and an 80% chance of life extension and radically better technology? I concede that many would, but I think it's far from universal, and I wouldn't be too surprised if half of people or more think this isn't for them.
I personally wouldn't want to take that gamble (strangely enough I've been quite happy lately and my life has been feeling meaningful, so the idea of dying in the next 5 years sucks).
(Also, I want to flag that I strongly disagree with your optimism.)
we have found Mr Altman highly forthcoming
That's exactly the line that made my heart sink.
I find it a weird thing to choose to say/emphasize.
The issue under discussion isn't whether Altman hid things from the new board; it's whether he hid things from the old board a long while ago.
Of course he's going to seem forthcoming towards the new board at first. So, the new board having the impression that he was forthcoming towards them? This isn't information that helps us much in assessing whether to side with Altman vs the old board. That makes me think: why repo...
Followed immediately by:
I too also have very strong concerns that we are putting a person whose highest stats are political maneuvering and deception, who is very high in power seeking, into this position. By all reports, you cannot trust what this man tells you.
For me, the key question in situations when leaders made a decision with really bad consequences is, "How did they engage with criticism and opposing views?"
If they did well on this front, then I don't think it's at all mandatory to push for leadership changes (though certainly, the worse someone's track record gets, the more that speaks against them).
By contrast, if leaders tried to make the opposition look stupid or if they otherwise used their influence to dampen the reach of opposing views, then being wrong later is unacceptable.
Basically, I want to all...
I agree with what you say in the first paragraph. If you're talking about Ilya, which I think you are, I can see what you mean in the second paragraph, but I'd flag that even if he had some sort of plan here, it seems pretty costly, and also just bad norms, for someone with his credibility to say something indicating that he thinks OpenAI is on track to do well at handling its great responsibility, if he didn't actually believe that. It's one thing to not say negative things explicitly; it's a different thing to say something positive that r...
It seems likely (though not certain) that they signed non-disparagement agreements, so we may not see more damning statements from them even if that's how they feel. Also, Ilya at least said some positive things in his leaving announcement, so that indicates either that he caved to pressure (or to overly high agreeableness towards former co-workers) or that he's genuinely not particularly worried about the direction of the company and that he left more because of reasons related to his new project.
Someone serious about alignment who sees dangers had better do what is safe and not be influenced by a non-disparagement agreement. It might cost them some job prospects, money, and possible lawsuit expenses, but if history on Earth is on the line? Especially since such a well-known AI genius would find plenty of support from people who backed such an open move.
So I hope he has concluded that talking right NOW is not strategically worth it. E.g., he might want to increase his chance of being hired by a semi-safety-serious company (more serious than OpenAI, but not serious enough to hire a proven whistleblower), where he can use his position better.
I agree: appealing to libertarianism shouldn't automatically win someone the argument on whether it's okay to still have factory farms.
The fact that Zvi thought he provided enough of a pointer to an argument there feels weird, in my opinion.
That said, maybe he was mostly focused on wanting to highlight that a large subset of people who are strongly against this ban (and may use libertarian arguments to argue for their position) are only against bans when it suits their agenda. So, maybe the point was in a way more about specific people's hypocrisy in how t...
I think one issue is that someone can be aware of a specific worldview's existence and even consider it a plausible worldview, but still be quite bad at understanding what it would imply/look like in practice if it were true.
For me personally, it's not that I explicitly singled out the scenario that happened and assigned it some very low probability. Instead, I think I mostly just thought about scenarios that all start from different assumptions, and that was that.
For instance, when reading Paul's "What failure looks like" (which I had done multip...
I lean towards agreeing with the takeaway; I made a similar argument here and would still bet on the slope being very steep inside the human intelligence level.
In some of his books on evolution, Dawkins also said very similar things when commenting on Darwin vs Wallace, basically saying that there's no comparison, Darwin had a better grasp of things, justified it better and more extensively, didn't have muddled thinking about mechanisms, etc.
Very cool! I used to think Hume was the most ahead of his time, but this seems like the same feat if not better.
Yeah, you need an enormous bankroll to play $10,000 tournaments. What a lot of pros do is sell action. Let's say you're highly skilled and have, say, a 125% expected return on investment. If you find someone with a big bankroll and they're convinced of your skills, you can sell them your action at a markup somewhere between 1 and 1.2 to incentivize them to make a profit. I'd say something like 1.1 markup is fairest, so you're paying them a good price to weather the variance for you. At 1.1 markup, they pay 1.1x whatever it costs you to buy into t...
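To make the arithmetic concrete, here's a small sketch under those numbers (125% ROI, 1.1 markup, and an illustrative 50% of action sold; all figures are made up):

```python
# Illustrative expected-value split when selling tournament action at a markup,
# assuming a 125% expected return on investment as in the example above.

buy_in = 10_000          # cost of the tournament seat
roi = 1.25               # player's expected return per dollar of buy-in
markup = 1.10            # price the backer pays per dollar of action
share_sold = 0.50        # fraction of the action sold to the backer

backer_pays = share_sold * buy_in * markup                      # 5,500
backer_expected_return = share_sold * buy_in * roi              # 6,250
backer_expected_profit = backer_expected_return - backer_pays   # +750

player_upfront_premium = share_sold * buy_in * (markup - 1)     # +500 locked in
player_expected_total = (1 - share_sold) * buy_in * (roi - 1) + player_upfront_premium
print(backer_expected_profit, player_expected_total)            # 750.0 1750.0
```

Both sides come out ahead in expectation, and the player has shed half the variance.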
You also quote this part of the article:
Theo Boer, a healthcare ethics professor at Protestant Theological University in Groningen, served for a decade on a euthanasia review board in the Netherlands. “I entered the review committee in 2005, and I was there until 2014,” Boer told me. “In those years, I saw the Dutch euthanasia practice evolve from death being a last resort to death being a default option.” He ultimately resigned.
I found a submission by this Theo Boer for the UK parliament, where he explains his reasons for now opposing euthanasia in ...
Assisted Suicide Watch
A psychiatrist overstepping their qualifications by saying “It’s never gonna get any better” (particularly when the source of the suffering is at least partly BPD, for which it's commonly known that symptoms can get better in someone's 40s) clearly should never happen.
However, I'd imagine that most mental health professionals would be extremely careful when making statements about whether there's hope for things to get better. In fact, there are probably guidelines around that.
Maybe it didn't happen this way at all: I notice I'm con...
If you know you have a winning hand, you do not want your opponent to fold, you want them to match your bet. So you kinda have to balance optimizing for the maximum pool at showdown with limiting the information you are leaking so there is a showdown. Or at least it would seem like that to me, I barely know the rules.
This is pretty accurate.
For simplicity, let's assume you have a hand that has a very high likelihood of winning at showdown on pretty much any runout. E.g., you have KK on a flop that is AK4, and your opponent didn't raise you before the...
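As a toy way of quantifying that tradeoff (illustrative numbers only; it ignores future streets and assumes you almost always win when called):

```python
# Toy model of value-bet sizing with a near-lock hand (e.g. KK on an AK4 flop):
# bigger bets win more when called but get called less often. Numbers are made up.

def marginal_bet_ev(bet: float, call_prob: float, equity_when_called: float = 0.95) -> float:
    # EV of the bet itself, on top of the pot you expect to win anyway.
    return call_prob * bet * (2 * equity_when_called - 1)

pot = 100
assumed_call_prob = {0.5 * pot: 0.70, 1.0 * pot: 0.45, 2.0 * pot: 0.20}
for bet, p_call in assumed_call_prob.items():
    print(f"bet {bet:>5.0f}: marginal EV ~ {marginal_bet_ev(bet, p_call):.1f}")
```

With these made-up call probabilities, the pot-sized bet does best; the point is just that "bet as much as possible" stops being optimal once larger bets fold out too much of the opponent's calling range.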
I really liked this post! I will probably link to it in the future.
Edit: Just came to my mind that these are things I tend to think of under the heading "considerateness" rather than kindness, but it's something I really appreciate in people either way (and the concepts are definitely linked).
I thought about this and I'm not sure Musk's changes in "unhingedness" require more explanation than "power and fame have the potential to corrupt and distort your reasoning, making you more overconfident." The result looks a bit like hypomania, but I've seen this before with people who got fame and power injections. While Musk was already super accomplished (for justified reasons nonetheless) before taking over Twitter and jumping into politics, being the Twitter owner (so he can activate algorithmic godmode and get even more attention) probably boosted b...