I like all the considerations you point out, but based on that reasoning alone, you could also argue that a con man who ran a lying scheme for 1 year and stole only like $20,000 should get life in prison -- after all, con men are pathological liars and that phenotype rarely changes all the way. And that seems too harsh?
I'm in two minds about it: On the one hand, I totally see the utilitarian argument of just locking up people who "lack a conscience" forever the first time they get caught for any serious crime. On the other hand, they didn't choose how they...
Suppose that a researcher's conception of current missing pieces is a mental object M, their timeline estimate is a probability function P, and their forecasting expertise F is a function that maps M to P. In this model, F can be pretty crazy, creating vast differences in P depending on how you ask, while M is still solid.
Good point. This would be reasonable if you think someone can be super bad at F and still great at M.
Still, I think estimating "how big is this gap?" and "how long will it take to cross it?" might be quite related, so I expect the skills to be correlated or even strongly correlated.
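To put that model in one line (my notation, not the commenter's): write $q$ for the framing of the question, so that

$$P = F(M, q), \qquad F(M, q_1) \neq F(M, q_2) \;\text{ even though } M \text{ is held fixed.}$$

The correlation question is then whether researchers who form a solid $M$ also tend to have an $F$ that is stable across framings $q$.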
It surveyed 2,778 AI researchers who had published peer-reviewed research in the prior year in six top AI venues (NeurIPS, ICML, ICLR, AAAI, IJCAI, JMLR); the median estimate for a 50% chance of AGI was either 23 or 92 years away, depending on how the question was phrased.
Doesn't that discrepancy (how much answers vary between different ways of asking the question) tell you that the median AI researcher who published at these conferences hasn't thought about this question sufficiently and/or sanely?
It seems irresponsible to me to update even just a small bit to ...
Doesn't that discrepancy (how much answers vary between different ways of asking the question) tell you that the median AI researcher who published at these conferences hasn't thought about this question sufficiently and/or sanely?
We know that AI expertise and AI forecasting are separate skills and that we shouldn't expect AI researchers to be skilled at the latter. So even if researchers have thought sufficiently and sanely about the question of "what kinds of capabilities are we still missing that would be required for AGI", they would still be lacking t...
Well, the update for me would go both ways.
On one side, as you point out, it would mean that the model's single-pass reasoning did not improve much (or at all).
On the other side, it would also mean that you can get large performance and reliability gains (on specific benchmarks) by just adding simple stuff. This is significant because you can do this much more quickly than the time it takes to train a new base model, and there's probably more to be gained in that direction – similar tricks we can add by hardcoding various "system-2 loops" into ...
When the issue is climate change, a prevalent rationalist take goes something like this:
"Climate change would be a top priority if it weren't for technological progress. However, because technological advances will likely help us to either mitigate the harms from climate change or will create much bigger problems on their own, we probably shouldn't prioritize climate change too much."
We could say the same thing about these trends of demographic aging that you highlight. So, I'm curious why you're drawn to this topic and where the normative motivation...
"Climate change would be a top priority if it weren't for technological progress. However, because technological advances will likely help us to either mitigate the harms from climate change or will create much bigger problems on their own, we probably shouldn't prioritize climate change too much."
This attitude deserves a name: technocrastinating.
Technological progress has been happening for a while. At some point, this argument will stop making sense and we must admit that no, this (climate change, fertility, whatever) is not fine, stop technocrastinat...
The tabletop game sounds really cool!
Interesting takeaways.
The first was exactly the above point, and that at some point, ‘I or we decide to trust the AIs and accept that if they are misaligned everyone is utterly f***ed’ is an even stronger attractor than I realized.
Yeah, when you say it like that... I feel like this is gonna be super hard to avoid!
...The second was that depending on what assumptions you make about how many worlds are wins if you don’t actively lose, ‘avoid turning wins into losses’ has to be a priority alongside ‘turn your losses into not l
I agree that it sounds somewhat premature to write off Larry Page based on attitudes he had a long time ago, when AGI seemed more abstract and far away, and to then never seek out communication with him again later on. If that were Musk's true and only reason for founding OpenAI, then I agree that this was a communication fuckup.
However, my best guess is that this story about Page was interchangeable with any of a number of alternative plausible criticisms of his competitors in building AGI that Musk would likely have come up with in nearby worlds. People like Musk (a...
I totally agree. And I also think that all involved are quite serious when they say they care about the outcomes for all of humanity. So I think in this case history turned on a knife's edge; Musk would at least not have done this much harm had he and Page thought and communicated more clearly, possibly even just a little more clearly.
But I do agree that there's some motivated reasoning happening there, too. In support of your point that Musk might find an excuse to do what he emotionally wanted to anyway (become humanity's savior and perhaps emperor for eternit...
I thought the part you quoted was quite concerning, also in the context of what comes afterwards:
Hiatus: Sam told Greg and Ilya he needs to step away for 10 days to think. Needs to figure out how much he can trust them and how much he wants to work with them. Said he will come back after that and figure out how much time he wants to spend.
Sure, the email by Sutskever and Brockman gave some nonviolent communication vibes and maybe it isn't "the professional thing" to air one's feelings and perceived mistakes like that, but they seemed genuine in what ...
Some of the points you make don't apply to online poker. But I imagine that the most interesting rationality lessons from poker come from studying other players and exploiting them, rather than memorizing and developing an intuition for the pure game theory of the game.
It seems important to establish whether we are in fact going to be in a race and whether one side isn't already far ahead.
With racing, there's a difference between optimizing the chance of winning vs optimizing the extent to which you beat the other party when you do win. If it's true that China is currently pretty far behind, and if TAI timelines are fairly short so that a lead now is pretty significant, then the best version of "racing" shouldn't be "get to the finish line as fast as possible." Instead, it should be "use your lead to your advantage." So,...
Even if attaining a total and forevermore cessation of suffering is substantially more difficult/attainable by substantially fewer people in one lifetime, I don't think it's unreasonable to think that most people could suffer at least 50 percent less with dedicated mindfulness practice. I'm curious as to what might feed an opposing intuition for you! I'd be quite excited about empirical research that investigates the tractability and scalability of meditation for reducing suffering, in either case.
My sense is that existing mindfulness studies don't show th...
[...] I am certainly interested to know if anyone is aware of sources that make a careful distinction between suffering and pain in arguing that suffering and its reduction is what we (should) care about.
I did so in my article on Tranquilism, so I broadly share your perspective!
I wouldn't go as far as what you're saying in endnote 9, though. I mean, I see some chance that you're right in the impractical sense of, "If someone gave up literally all they cared about in order to pursue ideal meditation training under ideal circumstances (and during the trainin...
This would be a valid rebuttal if instruction-tuned LLMs were only pretending to be benevolent as part of a long-term strategy to eventually take over the world, and execute a treacherous turn. Do you think present-day LLMs are doing that? (I don't)
Or that they have a sycophancy drive. Or that, next to "wanting to be helpful," they also have a bunch of other drives that will likely win out over the "wanting to be helpful" part once the system becomes better at long-term planning and orienting its shards towards consequentialist goals.
On that latter model...
I am not convinced MIRI has given enough evidence to support the idea that unregulated AI will kill everyone and their children.
The way you're expressing this feels like an unnecessarily strong bar.
Advocacy for an AI pause already seems pretty sensible to me if we accept the following premises:
Would most existing people accept a gamble with a 20% chance of death in the next 5 years and an 80% chance of life extension and radically better technology? I concede that many would, but I think it's far from universal, and I wouldn't be too surprised if half of people or more think this isn't for them.
I personally wouldn't want to take that gamble (strangely enough I've been quite happy lately and my life has been feeling meaningful, so the idea of dying in the next 5 years sucks).
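As a minimal way to make that trade-off explicit (my own formalization, not something from the thread): a person declines the gamble whenever

$$0.2\,u(\text{death within 5 years}) + 0.8\,u(\text{life extension and better tech}) < u(\text{status quo}),$$

which is easy to satisfy for someone whose current life already feels good and meaningful, since the status-quo term is then high relative to how much the upside adds.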
(Also, I want to flag that I strongly disagree with your optimism.)
we have found Mr Altman highly forthcoming
That's exactly the line that made my heart sink.
I find it a weird thing to choose to say/emphasize.
The issue under discussion isn't whether Altman hid things from the new board; it's whether he hid things from the old board a long while ago.
Of course he's going to seem forthcoming towards the new board at first. So, the new board having the impression that he was forthcoming towards them? This isn't information that helps us much in assessing whether to side with Altman vs the old board. That makes me think: why repo...
For me, the key question in situations when leaders made a decision with really bad consequences is, "How did they engage with criticism and opposing views?"
If they did well on this front, then I don't think it's at all mandatory to push for leadership changes (though certainly, the worse someone's track record gets, the more that speaks against them).
By contrast, if leaders tried to make the opposition look stupid or if they otherwise used their influence to dampen the reach of opposing views, then being wrong later is unacceptable.
Basically, I want to all...
I agree with what you say in the first paragraph. If you're talking about Ilya, which I think you are, I can see what you mean in the second paragraph, but I'd flag that even if he had some sort of plan here, it seems pretty costly and also just bad norms for someone with his credibility to say something that indicates he thinks OpenAI is on track to handle its great responsibility well, if he didn't actually believe that. It's one thing to not say negative things explicitly; it's a different thing to say something positive that r...
It seems likely (though not certain) that they signed non-disparagement agreements, so we may not see more damning statements from them even if that's how they feel. Also, Ilya at least said some positive things in his leaving announcement, so that indicates either that he caved to pressure (or to overly high agreeableness towards former co-workers), or that he's genuinely not particularly worried about the direction of the company and that he left more because of reasons related to his new project.
Someone serious about alignment who sees dangers had better do what is safe and not be influenced by a non-disparagement agreement. It might lose them some job prospects and cost them money and possibly a lawsuit, but if history on Earth is on the line? Especially since such a well-known AI genius would find plenty of support from people who would back such an open move.
So I hope the reason is that he considers talking right NOW not strategically worth it. E.g., he might want to increase his chances of being hired by a semi-safety-serious company (more serious than OpenAI, but not serious enough to hire a proven whistleblower), where he can put his position to better use.
I agree: appealing to libertarianism shouldn't automatically win someone the argument on whether it's okay to still have factory farms.
The fact that Zvi thought he provided enough of a pointer to an argument there feels weird, in my opinion.
That said, maybe he was mostly focused on wanting to highlight that a large subset of people who are strongly against this ban (and may use libertarian arguments to argue for their position) are only against bans when it suits their agenda. So, maybe the point was in a way more about specific people's hypocrisy in how t...
I think one issue is that someone can be aware of a specific worldview's existence and even consider it a plausible worldview, but still be quite bad at understanding what it would imply/look like in practice if it were true.
For me personally, it's not that I explicitly singled out the scenario that happened and assigned it some very low probability. Instead, I think I mostly just thought about scenarios that all start from different assumptions, and that was that.
For instance, when reading Paul's "What failure looks like" (which I had done multip...
I lean towards agreeing with the takeaway; I made a similar argument here and would still bet on the slope being very steep within the human intelligence range.
Yeah, you need an enormous bankroll to play $10,000 tournaments. What a lot of pros do is sell action. Let's say you're highly skilled and have, say, a 125% expected return on investment. If you find someone with a big bankroll and they're convinced of your skills, you can sell them your action at a markup somewhere between 1 and 1.2 to incentivize them to make a profit. I'd say something like a 1.1 markup is fairest, so you're paying them a good price to weather the variance for you. At 1.1 markup, they pay 1.1x whatever it costs you to buy into t...
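To make the staking arithmetic concrete, here's a minimal sketch with illustrative numbers (the $10,000 buy-in, 1.25x expected return, and 1.1 markup from above; the 50% of action sold is my own assumption, and I'm reading "125% expected return on investment" as getting back 1.25x the buy-in on average):

```python
# Illustrative staking arithmetic: selling tournament action at a markup.
buy_in = 10_000          # tournament buy-in ($)
expected_return = 1.25   # player expects to get back 1.25x the buy-in on average
markup = 1.1             # backer pays 1.1x face value for their share
share_sold = 0.5         # assumed: player sells half of their action

backer_pays = share_sold * buy_in * markup               # 5,500
backer_ev = share_sold * buy_in * expected_return        # 6,250
player_out_of_pocket = buy_in - backer_pays              # 4,500
player_ev = (1 - share_sold) * buy_in * expected_return  # 6,250

print(f"Backer: pays {backer_pays:,.0f}, expects {backer_ev:,.0f} "
      f"({backer_ev / backer_pays - 1:.1%} expected edge)")
print(f"Player: pays {player_out_of_pocket:,.0f}, expects {player_ev:,.0f} "
      f"(vs. {buy_in * (expected_return - 1):,.0f} expected profit if fully self-funded)")
```

In this toy example the player gives up $750 of expectation (2,500 self-funded vs. 1,750 here) in exchange for the backer absorbing half the variance, while the backer earns roughly a 13.6% expected edge on the money they put up.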
You also quote this part of the article:
Theo Boer, a healthcare ethics professor at Protestant Theological University in Groningen, served for a decade on a euthanasia review board in the Netherlands. “I entered the review committee in 2005, and I was there until 2014,” Boer told me. “In those years, I saw the Dutch euthanasia practice evolve from death being a last resort to death being a default option.” He ultimately resigned.
I found a submission by this Theo Boer to the UK Parliament, where he explains his reasons for now opposing euthanasia in ...
Assisted Suicide Watch
A psychiatrist overstepping their qualifications by saying “It’s never gonna get any better” (particularly when the source of the suffering is at least partly BPD, for which it's commonly known that symptoms can get better in someone's 40s) clearly should never happen.
However, I'd imagine that most mental health professionals would be extremely careful when making statements about whether there's hope for things to get better. In fact, there are probably guidelines around that.
Maybe it didn't happen this way at all: I notice I'm con...
If you know you have a winning hand, you do not want your opponent to fold; you want them to match your bet. So you kinda have to balance optimizing for the maximum pool at showdown with limiting the information you are leaking so there is a showdown. Or at least it would seem like that to me, I barely know the rules.
This is pretty accurate.
For simplicity, let's assume you have a hand that has a very high likelihood of winning at showdown on pretty much any runout. E.g., you have KK on a flop that is AK4, and your opponent didn't raise you before the...
FWIW, one thing I really didn't like about how he came across in the interview is that he seemed to be framing the narrative one-sidedly in an underhanded way, sneakily rather than out in the open. (Everyone tries to frame the narrative in some way, but it becomes problematic when people don't point out the places where their interpretation differs from others, because then listeners won't easily realize that there are claims that they still need to evaluate and think about rather than just take for granted as something that everyone else alrea...
There are realistic beliefs Altman could have about what's good or bad for AI safety that would not allow Zvi to draw that conclusion. For instance:
Small edges are why there's so much money gambled in poker.
It's hard to reach a skill level where you make money on 50% of nights, but it's not that hard to reach a point where you're "only" losing 60% of the time. (That's still significantly worse than playing roulette, but compared to chess competitions where hobbyists never win any sort of prize, you've at least got chances.)
You criticize Altman for pushing ahead with dangerous AI tech, but then most of what you'd spend the money on is pushing ahead with tech that isn't directly dangerous. Sure, that's better. But it doesn't solve the issue that we're headed into an out-of-control future. Where's the part where we use money to improve the degree to which thoughtful high-integrity people (or prosocial AI successor agents with those traits) are able to steer where this is all going?
(Not saying there are easy answers.)
I mean, personality disorders are all about problems in close interpersonal relationships (or lack of interest in such relationships, in schizoid personality disorder), and trust is always really relevant in such relationships, so I think this could be a helpful lens of looking at things. At the same time, I'd be very surprised if you could derive new helpful treatment approaches from this sort of armchair reasoning (even just at the level of hypothesis generation to be subjected to further testing).
Also, some of these seem a bit strained:
...Dilemma:
- If the Thought Assessors converge to 100% accuracy in predicting the reward that will result from a plan, then a plan to wirehead (hack into the Steering Subsystem and set reward to infinity) would seem very appealing, and the agent would do it.
- If the Thought Assessors don’t converge to 100% accuracy in predicting the reward that will result from a plan, then that’s the very definition of inner misalignment!
[...]
The thought “I will secretly hack into my own Steering Subsystem” is almost certainly not aligned with the designer’s intention. So a
Conditioned Taste Aversion (CTA) is a phenomenon where, if I get nauseous right now, it causes an aversion to whatever tastes I was exposed to a few hours earlier—not a few seconds earlier, not a few days earlier, just a few hours earlier. (I alluded to CTA above, but not its timing aspect.) The evolutionary reason for this is straightforward: a few hours is presumably how long it typically takes for a toxic food to induce nausea.
That explains why my brother no longer likes mushrooms. When we were little, he liked them and we ate mushrooms at a restaurant,...
Is that sort of configuration even biologically possible (or realistic)? I have no deep immunology understanding, but I think bad reactions to vaccines have little to nothing to do with whether you're up-to-date on previous vaccines. So far, I'm not sure we're good at predicting who reacts with more severe side effects than average (and even if we could, it's not like it's easy to tweak the vaccine, except for tradeoff-y things like lowering the vaccination dose).
My point is that I have no evidence that he ended up reading most of the relevant posts in their entirety. I don't think people who read all the posts in their entirety should just go ahead and unilaterally dox discussion participants, but I feel like people who have only read parts of it (or only secondhand sources) should do it even less.
Also, at the time, I interpreted Roko's "request for a summary" more as a way for him to sneer at people. His "summary" had a lot of loaded terms and subjective judgments in it. Maybe this is a style thing, but I f...
My point is that I have no evidence that he ended up reading most of the relevant posts in their entirety.
Indeed, because they were very long. That was Roko’s complaint!
I don’t think people who read all the posts in their entirety should just go ahead and unilaterally dox discussion participants, but I feel like people who have only read parts of it (or only secondhand sources) should do it even less.
I don’t think “how much of a post has someone read” has any bearing whatever on whether it’s proper to dox anyone.
...Also, at the time, I interpreted Ro
See my comment here.
Kat and Emerson were well-known in the community and they were accused of something that would cause future harm to EA community members as well. By contrast, Chloe isn't particularly likely to make future false allegations even based on Nonlinear's portrayal (I would say). It's different for Alice, since Nonlinear claim she has a pattern. (But with Alice, we'd at least want someone to talk to Nonlinear in private and verify how reliable they seem about negative info they have about Alice, before simply taking their word for it ba...
By contrast, Roko posted a 100-word summary of the Nonlinear incident that got some large number of net downvotes, so he seems to be particularly poorly informed about what even happened.
Roko posted a request for a summary—he offered his own current and admittedly poorly-informed understanding of the situation, by way of asking for a better version of same. (And he was right about the post he was commenting on being very long.) This is virtuous behavior, and the downvotes were entirely unwarranted.
Some conditions for when I think it's appropriate for an anonymous source to make a critical post about a named someone on the forum:
* I think there should be a role of "investigative reporter": someone...
Very thoughtful post. I liked that you delved into this out of interest even though you aren't particularly involved in this community, but then instead of just treating it as fun but unproductive gossip, you used your interest to make a high-value contribution!
It changed my mind in some places (I had a favorable reaction to the initial post by Ben; also, I still appreciate what Ben tried to do).
I will comment on two points that I didn't like, but I'm not sure to what degree this changes your recommended takeaways (more on this below).
...They [Kat
I appreciate the detailed response!
I don't like that this sounds like this is only (or mostly) about tone.
The core of it, for me, is that Nonlinear was in a brutally difficult position. I've been on the receiving end of dogpiles from my own community before, and I know what it feels like. It's excruciating, it's terrifying, and you all-but see your life flashing before your eyes. Crisis communication is very, very, very difficult, particularly when people are already skeptical of you. Nonlinear's response to Ben was as he was on the verge of fundamen...
An organization gets applications from all kinds of people at once, whereas an individual can only ever work at one org. It's easier to discreetly contact most of the most relevant parties about some individual than it is to do the same with an organization.
I also think it's fair to hold orgs that recruit within the EA or rationalist communities to slightly higher standards because they benefit directly from association with these communities.
That said, I agree with habryka (and others) that
...I think if the accusations are very thoroughly falsified and
I agree in general, but think the force of this is weaker in this specific instance because NonLinear seems like a really small org. Most of the issues raised seem to be associated with in-person work and I would be surprised if NonLinear ever went above 10 in-person employees. So at most this seems like one order of magnitude in difference. Clearly the case is different for major corporations or orgs that directly interact with many more people.
a) A lot of your points are specifically about Altman and the board, whereas many of my points started that way but then went into the abstract/hypothetical/philosophical. At least, that's how I meant it – I should have made this more clear. I was assuming, for the sake of the argument, that we're speaking of a situation where the person in the board's position found out that someone else is deceptive to their very core, with no redeeming principles they adhere to. So, basically what you're describing in your point "I" with the lizardpeople. I focused on t...
When I make an agreement to work closely with you on a crucial project,
I agree that there are versions of "agreeing to work closely together on the crucial project" where I see this as "speak up now or otherwise allow this person into your circle of trust." Once someone is in that circle, you cannot kick them out without notice just because you think you observed stuff that made you change your mind – if you could do that, it wouldn't work as a circle of trust.
So, there are circumstances where I'd agree with you. Whether the relationship between a board me...
Maybe, yeah. Definitely strongly agree that not telling the staff a more complete story seems bad for both intrinsic and instrumental reasons.
I'm a bit unsure how wise it would be to tip Altman off in advance given what we've seen he can mobilize in support of himself.
And I think only EAs would think up the idea that it's valuable to be cooperative towards people who you're convinced are deceptive/lack integrity. [Edit: You totally misunderstood what I meant here; I was criticizing them for doing this too naively. I was not prais...
Hm, to add a bit more nuance, I think it's okay at a normal startup for a board to be comprised of people who are likely to almost always side with the CEO, as long as they are independent thinkers who could vote against the CEO if the CEO goes off the rails. So, it's understandable (or even good/necessary) for CEOs to care a lot about having "aligned" people on the board, as long as they don't just add people who never think for themselves.
It gets more complex in OpenAI's situation where there's more potential for tensions between CEO and the board. I mea...
If this is true at all, it's not going to be a very strong effect, meaning you can find very rational and cognitively capable people who do the opposite of this in decision situations that directly pit reward against the things they hold most dearly. (And it may not be true because a lot of personal hedonists tend to "lack sophistication," in the sense that they don't understand that their own feelings of valuing ...