All of Lukas_Gloor's Comments + Replies

So I think the more rational and cognitively capable a human is, the more likely they'll optimize more strictly and accurately for future reward.

If this is true at all, it's not going to be a very strong effect, meaning you can find very rational and cognitively capable people who do the opposite of this in decision situations that directly pit reward against the things they hold most dearly. (And it may not be true because a lot of personal hedonists tend to "lack sophistication," in the sense that they don't understand that their own feelings of valuing ... (read more)

I like all the considerations you point out, but based on that reasoning alone, you could also argue that a con man who ran a lying scheme for 1 year and stole only like $20,000 should get life in prison -- after all, con men are pathological liars and that phenotype rarely changes all the way. And that seems too harsh?

I'm in two minds about it: On the one hand, I totally see the utilitarian argument of just locking up people who "lack a conscience" forever the first time they get caught for any serious crime. On the other hand, they didn't choose how they... (read more)

7Ben Pace
Good point. I can imagine things like "permanent parole" (note that probation and parole are meaningfully different) or being under house arrest or having constraints on your professional responsibilities or finances or something, being far better than literal incarceration.
4Ben Pace
I agree there are people who do small amounts of damage to society, are caught, and do not reoffend. Then there are other people whose criminal activities will be most of their effect on society, will reliably reoffend, and for whom the incapacitation strongly works out positive in consequentialist terms. My aim would be to have some way of distinguishing between them. The amount of evidence we have about Bankman-Fried's character is quite different than that of most con men, including from childhood and from his personal diary, so I hope we can have more confidence based on that. But a different solution is to not do any psychologizing, and just judge based on reoffending. See this section from the ACX post: I should add that Scott has lots of concerns about doing this in the US, and argues that properly doing this in the US would massively increase the incarcerated population. I didn't quite follow his concerns, but I was not convinced that something like this would be a bad idea on consequentialist grounds, even if the incarcerated population were to massively increase. (Note that I would support improving the quality of prisons to being broadly as nice as outside of prisons.)
7Noosphere89
One of the missing considerations is that crime is committed mostly by young people, and the rate of offending goes down the older you get. A lot of this, IMO, is that the impulsiveness and risk-taking behind crime decrease a lot with age. But the empirical fact of crime declining with age, especially reoffending, is a big reason why locking people up for life is less good than Ben Pace suggests.

Suppose that a researcher's conception of current missing pieces is a mental object M, their timeline estimate is a probability function P, and their forecasting expertise F is a function that maps M to P. In this model, F can be pretty crazy, creating vast differences in P depending on how you ask, while M is still solid.

Good point. This would be reasonable if you think someone can be super bad at F and still great at M.

Still, I think estimating "how big is this gap?" and "how long will it take to cross it?" might be quite related, so I expect the skills to be correlated or even strongly correlated.
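As a toy illustration of the M/P/F model under discussion (everything here is made up for the sketch — the `forecast_years` function, the piece counts, and the framings are assumptions, not anything from the survey): the same solid M can yield wildly different P if F is sensitive to how the question is framed.

```python
import random

# Toy sketch: the researcher's gap model M is held fixed, while the
# framing-sensitive forecasting step F converts it into a timeline P.
missing_pieces = 4  # M: the researcher's (solid) count of remaining breakthroughs

def forecast_years(m: int, framing: str) -> float:
    """F: a deliberately unreliable conversion from M to a timeline estimate."""
    years_per_piece = {"fixed-year framing": 5, "fixed-probability framing": 20}[framing]
    return m * years_per_piece * random.uniform(0.8, 1.2)  # noise on top of framing bias

for framing in ("fixed-year framing", "fixed-probability framing"):
    estimates = sorted(forecast_years(missing_pieces, framing) for _ in range(1_000))
    print(f"{framing}: median ~ {estimates[len(estimates) // 2]:.0f} years")
```

Under these invented numbers the two medians land roughly 20 vs. 80 years apart even though M never changed, which is the sense in which a "crazy" F can coexist with a solid M.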

6Kaj_Sotala
I think their relationship depends on whether crossing the gap requires grind or insight. If it's mostly about grind then a good expert will be able to estimate it, but insight tends to be unpredictable by nature. Another way of looking at my comment above would be that timelines of less than 5 years would imply the remaining steps mostly requiring grind, and timelines of 20+ years would imply that some amount of insight is needed.

It surveyed 2,778 AI researchers who had published peer-reviewed research in the prior year in six top AI venues (NeurIPS, ICML, ICLR, AAAI, IJCAI, JMLR); the median estimate for a 50% chance of AGI was either 23 or 92 years away, depending on how the question was phrased.

Doesn't that discrepancy (how much answers vary between different ways of asking the question) tell you that the median AI researcher who published at these conferences hasn't thought about this question sufficiently and/or sanely?

It seems irresponsible to me to update even just a small bit to ... (read more)

6Kaj_Sotala
That would imply that most professions would be getting automated or having their productivity very significantly increased. My impression from following the news and seeing some studies is that this is happening within copywriting, translation, programming, and illustration. [EDIT: and transcription] Also people are turning to chatbots for some types of therapy, though many people will still intrinsically prefer a human for that and it's not affecting the employment of human therapists yet. With o3, math (and maybe physics) research is starting to be affected, though it mostly hasn't been yet. I might be forgetting some, but the amount of professions left out of that list suggests that there are quite a few skill types that are still untouched. (There are of course a lot of other professions for which there have been moderate productivity boosts, but AFAIK mostly not to the point that it would affect employment.)

Doesn't that discrepancy (how much answers vary between different ways of asking the question) tell you that the median AI researcher who published at these conferences hasn't thought about this question sufficiently and/or sanely?

We know that AI expertise and AI forecasting are separate skills and that we shouldn't expect AI researchers to be skilled at the latter. So even if researchers have thought sufficiently and sanely about the question of "what kinds of capabilities are we still missing that would be required for AGI", they would still be lacking t... (read more)

Well, the update for me would go both ways. 

On one side, as you point out, it would mean that the model's single pass reasoning did not improve much (or at all). 

On the other side, it would also mean that you can get large performance and reliability gains (on specific benchmarks) by just adding simple stuff. This is significant because you can do this much more quickly than the time it takes to train a new base model, and there's probably more to be gained in that direction – similar tricks we can add by hardcoding various "system-2 loops" into ... (read more)

3yo-cuddles
I sense that my quality of communication diminishes past this point; I should get my thoughts together before speaking too confidently. I believe you're right that we do something similar to the LLMs (loosely, analogously); see https://www.lesswrong.com/posts/i42Dfoh4HtsCAfXxL/babble (I need to learn markdown). My intuition is still LLM-pessimistic, but I'd be excited to see good practical uses; this seems like tool AI, and that makes my existential dread easier to manage!

When the issue is climate change, a prevalent rationalist take goes something like this:

"Climate change would be a top priority if it weren't for technological progress. However, because technological advances will likely help us to either mitigate the harms from climate change or will create much bigger problems on their own, we probably shouldn't prioritize climate change too much." 

We could say the same thing about these trends of demographic aging that you highlight. So, I'm curious why you're drawn to this topic and where the normative motivation... (read more)

"Climate change would be a top priority if it weren't for technological progress. However, because technological advances will likely help us to either mitigate the harms from climate change or will create much bigger problems on their own, we probably shouldn't prioritize climate change too much."

This attitude deserves a name: technocrastinating.

Technological progress has been happening for a while. At some point, this argument will stop making sense and we must admit that no, this (climate change, fertility, whatever) is not fine, stop technocrastinat... (read more)

The tabletop game sounds really cool!

Interesting takeaways.

The first was exactly the above point, and that at some point, ‘I or we decide to trust the AIs and accept that if they are misaligned everyone is utterly f***ed’ is an even stronger attractor than I realized.

Yeah, when you say it like that... I feel like this is gonna be super hard to avoid!

The second was that depending on what assumptions you make about how many worlds are wins if you don’t actively lose, ‘avoid turning wins into losses’ has to be a priority alongside ‘turn your losses into not l

... (read more)

I agree that it sounds somewhat premature to write off Larry Page based on attitudes he had a long time ago, when AGI seemed more abstract and far away, and then not seek/try communication with him again later on. If that were Musk's true and only reason for founding OpenAI, then I agree that this was a communication fuckup.

However, my best guess is that this story about Page was interchangeable with a number of alternative plausible criticisms of his competition on building AGI that Musk would likely have come up with in nearby worlds. People like Musk (a... (read more)

I totally agree. And I also think that all involved are quite serious when they say they care about the outcomes for all of humanity. So I think in this case history turned on a knife edge; Musk would've at least not done this much harm had he and Page had clearer thinking and clearer communication, possibly just by a little bit.

But I do agree that there's some motivated reasoning happening there, too. In support of your point that Musk might find an excuse to do what he emotionally wanted to anyway (become humanity's savior and perhaps emperor for eternit... (read more)

I thought the part you quoted was quite concerning, also in the context of what comes afterwards: 

Hiatus: Sam told Greg and Ilya he needs to step away for 10 days to think. Needs to figure out how much he can trust them and how much he wants to work with them. Said he will come back after that and figure out how much time he wants to spend.

Sure, the email by Sutskever and Brockman gave some nonviolent communication vibes and maybe it isn't "the professional thing" to air one's feelings and perceived mistakes like that, but they seemed genuine in what ... (read more)

Some of the points you make don't apply to online poker. But I imagine that the most interesting rationality lessons from poker come from studying other players and exploiting them, rather than memorizing and developing an intuition for the pure game theory of the game. 

  • If you did want to focus on the latter goal, you can play online poker (many players can play >12 tables at once) and after every session, run your hand histories through a program (e.g., "GTO Wizard") that will tell you where you made mistakes compared to optimal strategy, and how much
... (read more)
1rossry
Strongly agree. I didn't realize this when I wrote the original post, but I'm now convinced. It has been the most interesting / useful thing that I've learned in the working-out of Cunningham's Law with respect to this post. And so, there's a reason that the curriculum for my and Max's course shifts away from Nash equilibrium as the solution concept to optimizing winnings against an empirical (and non-Nash) field just as soon as we can manage it. For example, Practicum #3 (of 6) is "write a rock-paper-scissors bot that takes advantage of our not-exactly-random players as much as you can" without much further specification.

It seems important to establish whether we are in fact going to be in a race and whether one side isn't already far ahead.

With racing, there's a difference between optimizing the chance of winning vs optimizing the extent to which you beat the other party when you do win. If it's true that China is currently pretty far behind, and if TAI timelines are fairly short so that a lead now is pretty significant, then the best version of "racing" shouldn't be "get to the finish line as fast as possible." Instead, it should be "use your lead to your advantage." So,... (read more)

2niplav
Yep, makes sense. I think if I modify the model to shorten timelines & widen the gap between PRC and US the answer could flip.

Even if attaining a total and forevermore cessation of suffering is substantially more difficult/attainable by substantially fewer people in one lifetime, I don't think it's unreasonable to think that most people could suffer at least 50 percent less with dedicated mindfulness practice. I'm curious as to what might feed an opposing intuition for you! I'd be quite excited about empirical research that investigates the tractability and scalability of meditation for reducing suffering, in either case.

My sense is that existing mindfulness studies don't show th... (read more)

1jbkjr
If you have any specific studies in mind which show this, I would be interested to see! I have a sense that mindfulness tends to be studied in the context of “increasing well-being” in a general sense and not specifically to “decrease or eliminate suffering.” I would be quite interested in a study which studies meditation’s effects when directly targeting suffering. I really appreciate you raising this point in detail; I think it’s something I haven’t included enough in my own considerations. Having enough free time and energy for meditation practice is indeed a kind of privilege. I’m going to chew on this some more, but one initial thought I’ve had is that the general quality of life needed as a prerequisite to devoting enough time and energy to meditation practice may be lower than one may expect, at least by Western standards. For example, in a place like India, there seems to be a good amount of people in difficult circumstances that nonetheless make time for meditation and spiritual pursuits. However, I agree that in the limit, if all of your waking moments are focused on simply acquiring enough food today, it seems much less reasonable to prescribe meditation as the solution for their suffering.

[...] I am certainly interested to know if anyone is aware of sources that make a careful distinction between suffering and pain in arguing that suffering and its reduction is what we (should) care about.

I did so in my article on Tranquilism, so I broadly share your perspective!

I wouldn't go as far as what you're saying in endnote 9, though. I mean, I see some chance that you're right in the impractical sense of, "If someone gave up literally all they cared about in order to pursue ideal meditation training under ideal circumstances (and during the trainin... (read more)

1jbkjr
Lukas, thanks for taking the time to read and reply! I appreciate you reminding me of your article on Tranquilism—it's been a couple of years since I read it (during my fellowship with CLR), and I hadn't made a mental note of it making such a distinction when I did, so thanks for the reminder. While I agree that it's an open question as to how effective meditation is for alleviating suffering at scale (e.g. how easy it is for how many humans to reduce their suffering by how much with how much time/effort), I don't think it would require as much of a commitment as you seem to expect in the median case. Personally, I think it's likely that the median person would be able to make substantial progress in reducing suffering as a layperson, i.e. without becoming a monastic. Even if attaining a total and forevermore cessation of suffering is substantially more difficult/attainable by substantially fewer people in one lifetime, I don't think it's unreasonable to think that most people could suffer at least 50 percent less with dedicated mindfulness practice. I'm curious as to what might feed an opposing intuition for you! I'd be quite excited about empirical research that investigates the tractability and scalability of meditation for reducing suffering, in either case. (By the way, would it be alright if I ping you privately to set up a meeting? I've been a fan of your writing since becoming familiar with you during my time at CLR and would love a chance to pick your brain about SFE stuff and hear about what you've been up to lately!)

This would be a valid rebuttal if instruction-tuned LLMs were only pretending to be benevolent as part of a long-term strategy to eventually take over the world, and execute a treacherous turn. Do you think present-day LLMs are doing that? (I don't)

Or that they have a sycophancy drive. Or that, next to "wanting to be helpful," they also have a bunch of other drives that will likely win over the "wanting to be helpful" part once the system becomes better at long-term planning and orienting its shards towards consequentialist goals. 

On that latter model... (read more)

I thought the first paragraph and the bolded bit of your comment seemed insightful. I don't see why what you're saying is wrong – it seems right to me (but I'm not sure).

7habryka
(I didn't get anything out of it, and it seems kind of aggressive in a way that seems non-sequitur-ish, and also I am pretty sure mischaracterizes people. I didn't downvote it, but have disagree-voted with it)

I am not convinced MIRI has given enough evidence to support the idea that unregulated AI will kill everyone and their children.

The way you're expressing this feels like an unnecessarily strong bar. 

I think advocacy for an AI pause already seems pretty sensible to me if we accept the following premises: 

  • The current AI research paradigm mostly makes progress in capabilities before progress in understanding. 
    (This puts AI progress in a different reference class from most other technological progress, so any arguments with base rates from "tech
... (read more)
4Sen
The bar is very low for me: If MIRI wants to demand the entire world shut down an entire industry, they must be an active research institution actively producing agreeable papers. AI is not particularly unique even relative to most technologies. Our work on chemistry in the 1600's-1900's far outpaced our level of true understanding of chemistry, to the point where we only had a good model of an atom in the 20th century. And I don't think anyone will deny the potential dangers of chemistry. Other technologies followed a similar trajectory. We don't have to agree that the range is 20-80% at all, never mind the specifics of it. Most polls demonstrate researchers find around 5-10% chance of total extinction on the high end. MIRI's own survey finds a similar result! 80% would be insanely extreme. Your landscape of experts is, I'm guessing, your own personal follower list and not statistically viable.   

Would most existing people accept a gamble with a 20% chance of death in the next 5 years and an 80% chance of life extension and radically better technology? I concede that many would, but I think it's far from universal, and I wouldn't be too surprised if half of people or more think this isn't for them.

I personally wouldn't want to take that gamble (strangely enough I've been quite happy lately and my life has been feeling meaningful, so the idea of dying in the next 5 years sucks).

(Also, I want to flag that I strongly disagree with your optimism.)
 

8Matthew Barnett
For what it's worth, while my credence in human extinction from AI in the 21st century is 10-20%, I think the chance of human extinction in the next 5 years is much lower. I'd put that at around 1%. The main way I think AI could cause human extinction is by just generally accelerating technology and making the world a scarier and more dangerous place to live. I don't really buy the model in which an AI will soon foom until it becomes a ~god.
6Seth Herd
I like this framing. I think the more common statement would be 20% chance of death in 10-30 years, and 80% chance of life extension and much better technology that they might not live to see. I think the majority of humanity would actually take this bet. They are not utilitarians or longtermists. So if the wager is framed in this way, we're going full steam ahead.

we have found Mr Altman highly forthcoming

That's exactly the line that made my heart sink.

I find it a weird thing to choose to say/emphasize.

The issue under discussion isn't whether Altman hid things from the new board; it's whether he hid things from the old board a long while ago.

Of course he's going to seem forthcoming towards the new board at first. So, the new board having the impression that he was forthcoming towards them? This isn't information that helps us much in assessing whether to side with Altman vs the old board. That makes me think: why repo... (read more)

Followed immediately by: 

I too also have very strong concerns that we are putting a person whose highest stats are political maneuvering and deception, who is very high in power seeking, into this position. By all reports, you cannot trust what this man tells you.

2bhauth
Yes, but Zvi's earlier posts were more positive about Altman. I just picked a relatively recent post, written after the board fired him.

For me, the key question in situations when leaders made a decision with really bad consequences is, "How did they engage with criticism and opposing views?"

If they did well on this front, then I don't think it's at all mandatory to push for leadership changes (though certainly, the worse someone's track record gets, the more that speaks against them).

By contrast, if leaders tried to make the opposition look stupid or if they otherwise used their influence to dampen the reach of opposing views, then being wrong later is unacceptable.

Basically, I want to all... (read more)

I agree with what you say in the first paragraph. If you're talking about Ilya, which I think you are, I can see what you mean in the second paragraph, but I'd flag that even if he had some sort of plan here, it seems pretty costly and also just bad norms for someone with his credibility to say something that indicates that he thinks OpenAI is on track to do well at handling their great responsibility, assuming he were to not actually believe this. It's one thing to not say negative things explicitly; it's a different thing to say something positive that r... (read more)

It seems likely (though not certain) that they signed non-disparagement agreements, so we may not see more damning statements from them even if that's how they feel. Also, Ilya at least said some positive things in his leaving announcement, so that indicates either that he caved in to pressure (or too high agreeableness towards former co-workers) or that he's genuinely not particularly worried about the direction of the company and that he left more because of reasons related to his new project. 

Someone serious about alignment who sees dangers had better do what is safe and not be influenced by a non-disparagement agreement. It might cost them some job prospects, money, and possibly a lawsuit, but if history on Earth is on the line?

So I hope the reason is that he considers talking right NOW not strategically worth it. E.g., he might want to increase his chance of being hired by a semi-safety-serious company (more serious than OpenAI, but not serious enough to hire a proven whistleblower), where he can use his position better.

I agree: appealing to libertarianism shouldn't automatically win someone the argument on whether it's okay to still have factory farms.

The fact that Zvi thought he provided enough of a pointer to an argument there feels weird, in my opinion.

That said, maybe he was mostly focused on wanting to highlight that a large subset of people who are strongly against this ban (and may use libertarian arguments to argue for their position) are only against bans when it suits their agenda. So, maybe the point was in a way more about specific people's hypocrisy in how t... (read more)

3Andrew Burns
Zvi is talking about those people who use libertarianism as a gloss for "getting what they want." In other words, people who aren't into liberty per se, but only into liberty to the extent it satisfies their preferences. There probably is, and if there isn't, there should be, a word for people who invoke liberty this way. That way, when talking about the sort that, for instance, want children to be allowed to read the Bible in the classroom (because LIBERTY!) while simultaneously wanting to ban some book on trans-youth (because PARENTS RIGHTS), we can say: oh, yes, that (word) is at it again. I mean, hypocrite for sure, and perhaps gaslighter, but we need a better word. Well, if there is an existing word, please let me know. There are so many of these sorts out and about, they easily dwarf the population of libertarians.

I think one issue is that someone can be aware about a specific worldview's existence and even consider it a plausible worldview, but still be quite bad at understanding what it would imply/look like in practice if it were true. 

For me personally, it's not that I explicitly singled out the scenario that happened and assigned it some very low probability. Instead, I think I mostly just thought about scenarios that all start from different assumptions, and that was that.

For instance, when reading Paul's "What failure looks like" (which I had done multip... (read more)

1denkenberger
I did have some probability mass on AI boxing being relevant. And I still have some probability mass that there will be sudden recursive self-improvement. But I also had significant probability mass on AI being economically important, and therefore very visible. And with an acceleration of progress, I thought many people would be concerned about it. I don’t know as I would’ve predicted a particular chat-gpt moment (I probably would have guessed some large AI accident), but the point is that we should have been ready for a case when the public/governments became concerned about AI. I think the fact that there were some AI governance efforts before chat-gpt was due in large part to the people saying there could be slow take off, like Paul.

I lean towards agreeing with the takeaway; I made a similar argument here and would still bet on the slope being very steep inside the human intelligence level. 

In some of his books on evolution, Dawkins also said very similar things when commenting on Darwin vs Wallace, basically saying that there's no comparison, Darwin had a better grasp of things, justified it better and more extensively, didn't have muddled thinking about mechanisms, etc.

1francis kafka
I mean to some extent, Dawkins isn't a historian of science, presentism, yadda yadda but from what I've seen he's right here. Not that Wallace is somehow worse, given that of all the people out there he was certainly closer than the rest. That's about it

Very cool! I used to think Hume was the most ahead of his time, but this seems like the same feat if not better.

5dr_s
Democritus also has a decent claim to that for being the first to imagine atoms and materialism altogether.

Yeah, you need an enormous bankroll to play $10,000 tournaments. What a lot of pros do is sell action. Let's say you're highly skilled and have a, say, 125% expected return on investment. If you find someone with a big bankroll and they're convinced of your skills, you can sell them your action at a markup somewhere between 1 and 1.2 to incentivize them to make a profit. I'd say something like 1.1 markup is fairest, so you're paying them a good price to weather the variance for you. At 1.1 markup, they pay 1.1x whatever it costs you to buy into t... (read more)
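To make the arithmetic concrete, here is a minimal sketch with illustrative numbers only: it assumes "125% ROI" means each dollar of buy-in returns $1.25 in expectation, and that half the action is sold (`share_sold`), neither of which is specified above.

```python
# Staking arithmetic sketch (assumed numbers, not taken from the comment above).
buy_in = 10_000      # tournament buy-in
roi = 1.25           # assumption: $1 invested returns $1.25 in expectation
markup = 1.1         # backer pays 1.1x face value for the action they buy
share_sold = 0.5     # assumption: fraction of the player's action sold

backer_cost = share_sold * buy_in * markup                  # 5,500
backer_ev = share_sold * buy_in * roi - backer_cost         # 6,250 - 5,500 = +750
player_cost = buy_in - backer_cost                          # 4,500 out of pocket
player_ev = (1 - share_sold) * buy_in * roi - player_cost   # 6,250 - 4,500 = +1,750

print(f"Backer pays ${backer_cost:,.0f} for EV {backer_ev:+,.0f}")
print(f"Player pays ${player_cost:,.0f} for EV {player_ev:+,.0f}")
```

Under this reading of the numbers, a markup equal to the expected return multiple (1.25 here) would leave the backer with zero expected profit, so a markup between 1 and that figure splits the expected profit between player and backer.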

You also quote this part of the article:

Theo Boer, a healthcare ethics professor at Protestant Theological University in Groningen, served for a decade on a euthanasia review board in the Netherlands. “I entered the review committee in 2005, and I was there until 2014,” Boer told me. “In those years, I saw the Dutch euthanasia practice evolve from death being a last resort to death being a default option.” He ultimately resigned. 

I found a submission by this Theo Boer for the UK parliament, where he explains his reasons for now opposing euthanasia in ... (read more)

Assisted Suicide Watch

A psychiatrist overstepping their qualifications by saying “It’s never gonna get any better” (particularly when the source of the suffering is at least partly BPD, for which it's commonly known that symptoms can get better in someone's 40s) clearly should never happen.

However, I'd imagine that most mental health professionals would be extremely careful when making statements about whether there's hope for things to get better. In fact, there are probably guidelines around that.

Maybe it didn't happen this way at all: I notice I'm con... (read more)

8Lukas_Gloor
You also quote this part of the article: I found a submission by this Theo Boer for the UK parliament, where he explains his reasons for now opposing euthanasia in more detail. He writes: This is a "slope" of sorts, but I think it's not a bad one. The arguments for extending the practice all seem reasonable. What matters is, "are people suffering?" and, "are they right that there's not enough hope for them to justify continued suffering?" Regarding pressure/being pressured, I thought this part was interesting: I'd be curious to figure out why it is exactly that requests for euthanasia are higher in demographics where people tend to be better off/suffering less. That said, I'm not at all convinced that this would prove that there's something horribly wrong going on with these developments after legalization of assisted suicide. (Still, I'd be curious to investigate this further.) Reading this account, it feels to me like Theo Boer has a problem with death intrinsically, as opposed to only having a problem with death when a person has for-themselves good/strong reasons to continue to want to live. That's not an outlook I agree with. "Their own incapacity to face life's harder episodes" is a question-begging phrasing. For all we know, many people who choose assisted suicide would voluntarily choose to continue with their suffering if there was more at stake that they cared about! For instance, if they learned that by continuing to suffer, they'd solve world poverty, they might continue to suffer. It seems wrong, then, to say they're "incapable," when the real reason is more about how they don't want it enough. It's their life, so their decision. "Since the autonomous citizens are not under any other pressure than their own" – this is also an interesting perspective. He seems to be conceding that no matter how much society and relatives try to reassure chronically ill or disabled elderly people that they're still valued and cared about (something we absolutely m

If you know you have a winning hand, you do not want your opponent to fold, you want them to match your bet. So you kinda have to balance optimizing for the maximum pool at showdown with limiting the information you are leaking so there is a showdown. Or at least it would seem like that to me, I barely know the rules. 

This is pretty accurate.

For simplicity, let's assume you have a hand that has a very high likelihood of winning at showdown on pretty much any runout. E.g., you have KK on a flop that is AK4, and your opponent didn't raise you before the... (read more)

1quiet_NaN
Thanks, this is interesting.  From my understanding, in no-limit games, one would want to only have some fraction of ones bankroll in chips on the table, so that one can re-buy after losing an all-in bluff. (I would guess that this fraction should be determined by the Kelly criterion or something.) On the other hand, from browsing Wikipedia, it seems like many poker tournaments prohibit or limit re-buying after going bust. This would indicate that one has limited amounts of opportunity to get familiar with the strategy of the opponents (which could very well change once the stakes change).  (Of course, Kelly is kind of brutal with regard to gambling. In a zero sum game, the average edge is zero, so at least one participant should not be playing even from an EV perspective. But even under the generous assumption that you are 50% more likely than chance to win a 50 participant elimination tournament (e.g. because a third of the participants are actively trying to lose) (so your EV is 0.5 the buy-in) Kelly tells you to wager about 1% of your bankroll. So if the buy-in is 10k$ you would have to be a millionaire.)
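Spelling out the Kelly arithmetic in that parenthetical (a minimal sketch assuming a winner-take-all payout structure, which the comment doesn't state explicitly):

```python
# Kelly sketch for the 50-player tournament example above (winner-take-all assumed).
n = 50                      # participants
p = 1.5 / n                 # 50% more likely than chance to win -> 0.03
net_odds = n - 1            # winner-take-all: risk 1 buy-in to win 49 buy-ins
kelly_fraction = p - (1 - p) / net_odds   # ~0.0102, i.e. about 1% of bankroll
buy_in = 10_000
print(f"Kelly fraction: {kelly_fraction:.4f}")
print(f"Bankroll needed for one ${buy_in:,} buy-in: ${buy_in / kelly_fraction:,.0f}")
```

With these assumptions the required bankroll comes out at just under a million dollars, matching the comment's conclusion.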

I really liked this post! I will probably link to it in the future.

Edit: Just came to my mind that these are things I tend to think of under the heading "considerateness" rather than kindness, but it's something I really appreciate in people either way (and the concepts are definitely linked). 

4silentbob
Guess I'd agree. Maybe I was anchored a bit here by the existing term of computational kindness. :)

FWIW, one thing I really didn't like about how he came across in the interview is that he seemed to be engaged in framing the narrative one-sidedly in an underhanded way, sneakily rather than out in the open. (Everyone tries to frame the narrative in some way, but it becomes problematic when people don't point out the places where their interpretation differs from others, because then listeners won't easily realize that there are claims that they still need to evaluate and think about rather than just take for granted and something that everyone else alrea... (read more)

There are realistic beliefs Altman could have about what's good or bad for AI safety that would not allow Zvi to draw that conclusion. For instance: 

  • Maybe Altman thinks it's really bad for companies' momentum to go through CEO transitions (and we know that he believes OpenAI having a lot of momentum is good for safety, since he sees them as both adequately concerned about safety and more concerned about it than competitors).
  • Maybe Altman thinks OpenAI would be unlikely to find another CEO who understands the research landscape well enough while also be
... (read more)
6Seth Herd
This is great, thanks for filling in that reasoning. I agree that there are lots of plausible reasons Altman could've made that comment, other than disdain for safety.
7Lukas_Gloor
FWIW, one thing I really didn't like about how he came across in the interview is that he seemed to be engaged in framing the narrative one-sidedly in an underhanded way, sneakily rather than out in the open. (Everyone tries to frame the narrative in some way, but it becomes problematic when people don't point out the places where their interpretation differs from others, because then listeners won't easily realize that there are claims that they still need to evaluate and think about rather than just take for granted and something that everyone else already agrees about.)  He was not highlighting the possibility that the other side's perspective still has validity; instead, he was shrugging that possibility under the carpet. He talked as though (implicitly, not explicitly) it's now officially established or obviously true that the board acted badly (Lex contributed to this by asking easy questions and not pushing back on anything too much). He focused a lot on the support he got during this hard time and people saying good things about him (eulogy while still alive comparison, highlighting that he thinks there's no doubt about his character) and said somewhat condescending things about the former board (about how he thinks they had good intentions, said in that slow voice and thoughtful tone, almost like they had committed a crime) and then emphasized their lack of experience.  For contrast, here are things he could have said that would have made it easier for listeners to come to the right conclusions (I think anyone who is morally scrupulous about whether they're in the right in situations when many others speak up against them would have highlighted these points a lot more, so the absence of these bits in Altman's interview is telling us something.) * Instead of just saying that he believes the former board members came from a place of good intentions, also say if/whether he believes that some of the things they were concerned about weren't totally unreasona

Small edges are why there's so much money gambled in poker. 

It's hard to reach a skill level where you make money on 50% of nights, but it's not that hard to reach a point where you're "only" losing 60% of the time. (That's still significantly worse than playing roulette, but compared to chess competitions where hobbyists never win any sort of prize, you've at least got chances.) 

You criticize Altman for pushing ahead with dangerous AI tech, but then most of what you'd spend the money on is pushing ahead with tech that isn't directly dangerous. Sure, that's better. But it doesn't solve the issue that we're headed into an out-of-control future. Where's the part where we use money to improve the degree to which thoughtful high-integrity people (or prosocial AI successor agents with those traits) are able to steer where this is all going? 
(Not saying there are easy answers.) 

8bhauth
"This is about how to spend money on AI safety" isn't the point of the opening of the post. It's more: * Here's some stuff I'd talk about anyway (like thinking of the economy in terms of energy flows) and a convenient way to frame it that was in the news! * Wow, we could also spend money on not maximally accelerating AI! * I see a lot of people saying that AI is urgently needed to solve [problem] like global warming, but here is how you solve some problems by solving the problems. AI is something I've thought about a lot, but I think I've already posted everything about that that I want to, and people didn't seem to appreciate this that much.

I mean, personality disorders are all about problems in close interpersonal relationships (or lack of interest in such relationships, in schizoid personality disorder), and trust is always really relevant in such relationships, so I think this could be a helpful lens of looking at things. At the same time, I'd be very surprised if you could derive new helpful treatment approaches from this sort of armchair reasoning (even just at the level of hypothesis generation to be subjected to further testing).

Also, some of these seem a bit strained: 

  • Narcissisti
... (read more)

Dilemma:

  • If the Thought Assessors converge to 100% accuracy in predicting the reward that will result from a plan, then a plan to wirehead (hack into the Steering Subsystem and set reward to infinity) would seem very appealing, and the agent would do it.
  • If the Thought Assessors don’t converge to 100% accuracy in predicting the reward that will result from a plan, then that’s the very definition of inner misalignment!

    [...]

    The thought “I will secretly hack into my own Steering Subsystem” is almost certainly not aligned with the designer’s intention. So a
... (read more)
4Steven Byrnes
I read that sequence a couple months ago (in preparation for writing §2.7 here), and found it helpful, thanks. I agree that we’re probably on basically the same page. FYI Alex also has this post making a similar point. I think I agree, in that I’m somewhat pessimistic about plans wherein we want the “adult AI” to have object-level goal X, and so we find a reward function and training environment where that winds up happening. Not that such a plan would definitely fail (e.g. lots of human adults are trying to take care of their children), just that it doesn’t seem like the kind of approach that passes the higher bar of having a strong reason to expect success (e.g. lots of human adults are not trying to take care of their children). (See here for someone trying to flesh out this kind of approach.) So anyway, my take right now is basically: * If we want the “adult AGI” to be trying to do a particular thing (‘make nanobots’, or ‘be helpful towards its supervisor’, or whatever), we should replace (or at least supplement) a well-chosen reward function with a more interpretability-based approach; for example, see Plan for mediocre alignment of brain-like [model-based RL] AGI (which is a simplified version of Post 14 of this series) * Or we can have a similar relation to AGIs that we have to the next generation of humans: We don’t know exactly at the object level what they will be trying to do and why, but they basically have “good hearts” and so we trust their judgment. These two bullet points correspond to the “two paths forward” of Post 12 of this series.

Conditioned Taste Aversion (CTA) is a phenomenon where, if I get nauseous right now, it causes an aversion to whatever tastes I was exposed to a few hours earlier—not a few seconds earlier, not a few days earlier, just a few hours earlier. (I alluded to CTA above, but not its timing aspect.) The evolutionary reason for this is straightforward: a few hours is presumably how long it typically takes for a toxic food to induce nausea.

That explains why my brother no longer likes mushrooms. When we were little, he liked them and we ate mushrooms at a restaurant,... (read more)

Is that sort of configuration even biologically possible (or realistic)? I have no deep immunology understanding, but I think bad reactions to vaccines have little to nothing to do with whether you're up-to-date on previous vaccines. So far, I'm not sure we're good at predicting who reacts with more severe side effects than average (and even if we could, it's not like it's easy to tweak the vaccine, except for tradeoff-y things like lowering the vaccination dose). 

4Nathan Helm-Burger
Honestly, I don't know. I'm just trying to be open minded about the weird stuff that might become possible due to rapid advances in AI speeding up tech progress generally. Areas like biotech, where the complexity of the problem and our limited understanding are the bottlenecks will be more likely to progress rapidly than, say, energy or mass constrained areas of tech.

My point is that I have no evidence that he ended up reading most of the relevant posts in their entirety. I don't think people who read all the posts in their entirety should just go ahead and unilaterally dox discussion participants, but I feel like people who have only read parts of it (or only secondhand sources) should do it even less

Also, at the time, I interpreted Roko's "request for a summary" more as a way for him to sneer at people. His "summary" had a lot of loaded terms and subjective judgments in it. Maybe this is a style thing, but I f... (read more)

My point is that I have no evidence that he ended up reading most of the relevant posts in their entirety.

Indeed, because they were very long. That was Roko’s complaint!

I don’t think people who read all the posts in their entirety should just go ahead and unilaterally dox discussion participants, but I feel like people who have only read parts of it (or only secondhand sources) should do it even less.

I don’t think “how much of a post has someone read” has any bearing whatever on whether it’s proper to dox anyone.

Also, at the time, I interpreted Ro

... (read more)

See my comment here

Kat and Emerson were well-known in the community and they were accused of something that would cause future harm to EA community members as well. By contrast, Chloe isn't particularly likely to make future false allegations even based on Nonlinear's portrayal (I would say). It's different for Alice, since Nonlinear claim she has a pattern. (But with Alice, we'd at least want someone to talk to Nonlinear in private and verify how reliable they seem about negative info they have about Alice, before simply taking their word for it ba... (read more)

By contrast, Roko posted a 100 word summary of the Nonlinear incident that got some large number of net downvotes, so he seems to be particularly poorly informed about what even happened.

Roko posted a request for a summary—he offered his own current and admittedly poorly-informed understanding of the situation, by way of asking for a better version of same. (And he was right about the post he was commenting on being very long.) This is virtuous behavior, and the downvotes were entirely unwarranted.

Some conditions for when I think it's appropriate for an anonymous source to make a critical post about a named someone on the forum:

  • Is the accused a public person or do they run an organization in the EA or rationality ecosystem?
  • Or: Is the type of harm the person is accused of something that the community benefits from knowing?
  • Did someone who is non-anonymous and trusted in the community talk to the anonymous accuser and verify claims and (to some degree*) stake their reputation for them?

*I think there should be a role of "investigative reporter:" someone... (read more)

Very thoughtful post. I liked that you delved into this out of interest even though you aren't particularly involved in this community, but then instead of just treating it as fun but unproductive gossip, you used your interest to make a high-value contribution! 

It changed my mind in some places (I had a favorable reaction to the initial post by Ben; also, I still appreciate what Ben tried to do). 

I will comment on two points that I didn't like, but I'm not sure to what degree this changes your recommended takeaways (more on this below).

They [Kat

... (read more)

I appreciate the detailed response! 

I don't like that this sounds like this is only (or mostly) about tone.

The core of it, for me, is that Nonlinear was in a brutally difficult position. I've been on the receiving end of dogpiles from my own community before, and I know what it feels like. It's excruciating, it's terrifying, and you all-but see your life flashing before your eyes. Crisis communication is very, very, very difficult, particularly when people are already skeptical of you. Nonlinear's response to Ben was as he was on the verge of fundamen... (read more)

An organization gets applications from all kinds of people at once, whereas an individual can only ever work at one org. It's easier to discreetly contact most of the most relevant parties about some individual than it is to do the same with an organization.

I also think it's fair to hold orgs that recruit within the EA or rationalist communities to slightly higher standards because they benefit directly from association with these communities.

That said, I agree with habryka (and others) that 

I think if the accusations are very thoroughly falsified and

... (read more)

I agree in general, but think the force of this is weaker in this specific instance because NonLinear seems like a really small org. Most of the issues raised seem to be associated with in-person work and I would be surprised if NonLinear ever went above 10 in-person employees. So at most this seems like one order of magnitude in difference. Clearly the case is different for major corporations or orgs that directly interact with many more people. 

a) A lot of your points are specifically about Altman and the board, whereas many of my points started that way but then went into the abstract/hypothetical/philosophical. At least, that's how I meant it – I should have made this more clear. I was assuming, for the sake of the argument, that we're speaking of a situation where the person in the board's position found out that someone else is deceptive to their very core, with no redeeming principles they adhere to. So, basically what you're describing in your point "I" with the lizardpeople. I focused on t... (read more)

When I make an agreement to work closely with you on a crucial project,

I agree that there are versions of "agreeing to work closely together on the crucial project" where I see this as "speak up now or otherwise allow this person into your circle of trust." Once someone is in that circle, you cannot kick them out without notice just because you think you observed stuff that made you change your mind – if you could do that, it wouldn't work as a circle of trust.

So, there are circumstances where I'd agree with you. Whether the relationship between a board me... (read more)

6Ben Pace
1. I find the situation a little hard to talk about concretely because whatever concrete description I give will not be correct (because nobody involved is telling us what happened). Nonetheless, let us consider the most uncharitable narrative regarding Altman here, where members of the board come to believe he is a lizard, a person who is purely selfish and who has no honor. (To be clear I do not think this is accurate, I am using it for communication purposes.) Here are some rules. * Just because someone is a lizard, does not make it okay to lie to them * Just because someone is a lizard, does not make it okay to go back on agreements with them * While the lizard had the mandate to make agreements and commitments on behalf of your company, it is not now okay to disregard those agreements and commitments The situation must not be "I'll treat you honorably if I think you're a good person, but the moment I decide you're a lizard then I'll act with no honor myself." The situation must be "I will treat you honorably because it is right to be honorable." Otherwise the honor will seep out of the system as probabilities we assign to others' honor wavers. I think it is damaging to the trust people place in board members, to see them act with so little respect or honor. It reduces everyone's faith in one another to see people in powerful positions behave badly. II. I respect that in response to my disapproval of your statement, you took the time to explain in detail the reasoning behind your comment and communicate some more of your perspective on the relevant game theory. I think that generally helps, when folks are having conflicts, to examine openly the reasons why decisions were made and investigate those. And it also gives us more surface area to locate key parts of the disagreement. I still disagree with you. I think it was an easy-and-wrong thing to suggest that only people in the EA tribe would care about this important ethical principle I care about. But

Maybe, yeah. Definitely strongly agree with not telling the staff a more complete story seems to be bad for both intrinsic and instrumental reasons. 

I'm a bit unsure how wise it would be to tip Altman off in advance given what we've seen he can mobilize in support of himself. 

And I think it's a thing that only EAs would think up that it's valuable to be cooperative towards people who you're convinced are deceptive/lack integrity. [Edit: You totally misunderstood what I meant here; I was criticizing them for doing this too naively. I was not prais... (read more)

-1Ben Pace
Absolutely not. When I make an agreement to work closely with you on a crucial project, if I think you're deceiving me, I will let you know. I will not surprise backstab you and get on with my day. I will tell you outright and I will say it loudly. I may move quickly to disable you if it's an especially extreme circumstance but I will acknowledge that this is a cost to our general cooperative norms where people are given space to respond even if I assign a decent chance to them behaving poorly. Furthermore I will provide evidence and argument in response to criticism of my decision by other stakeholders who are shocked and concerned by it. Shame on you for suggesting only your tribe knows or cares about honoring partnerships with people after you've lost trust in them. Other people know what's decent too.

Hm, to add a bit more nuance, I think it's okay at a normal startup for a board to be comprised of people who are likely to almost always side with the CEO, as long as they are independent thinkers who could vote against the CEO if the CEO goes off the rails. So, it's understandable (or even good/necessary) for CEOs to care a lot about having "aligned" people on the board, as long as they don't just add people who never think for themselves.

It gets more complex in OpenAI's situation where there's more potential for tensions between CEO and the board. I mea... (read more)
