1 min read

2

This is a special post for quick takes by yanni kyriacos. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.
108 comments, sorted by Click to highlight new comments since:
Some comments are truncated due to high volume. (⌘F to expand all)Change truncation settings

Prediction: In 6-12 months people are going to start leaving Deepmind and Anthropic for similar sounding reasons to those currently leaving OpenAI (50% likelihood).

> Surface level read of what is happening at OpenAI; employees are uncomfortable with specific safety policies. 
> Deeper, more transferable, harder to solve problem; no person that is sufficiently well-meaning and close enough to the coal face at Big Labs can ever be reassured they're doing the right thing continuing to work for a company whose mission is to build AGI.

Basically, this is less about "OpenAI is bad" and more "Making AGI is bad". 

9James Payor
I think this is kinda likely, but will note that people seem to take quite a while before they end up leaving. If OpenAI (both recently and the first exodus) is any indication, I think it might take longer for issues to gel and become clear enough to have folks more-than-quietly leave.
6Neel Nanda
I'd be pretty surprised
3yanni kyriacos
Hey mate thanks for the comment. I'm finding "pretty surprised" hard to interpret. Is that closer to 1% or 15%?

I like the fact that despite not being (relatively) young when they died, the LW banner states that Kahneman & Vinge have died "FAR TOO YOUNG", pointing to the fact that death is always bad and/or it is bad when people die when they were still making positive contributions to the world (Kahneman published "Noise" in 2021!).

I like it too, and because your comment made me think about it, I now kind of wish it said "orders of magnitude too young"

2Neel Nanda
What banner?
2Neil
They took it down real quick for some reason.
5the gears to ascension
it's still there for me
3the gears to ascension
gone now
3habryka
Oh, it seemed like the kind of thing you would only keep up around the time of death, and we kept it up for a bit more than a day. Somehow it seemed inappropriate to keep it up for longer.
2Neil
This reminds me of when Charlie Munger died at 99, and many said of him "he was just a child". Less of a nod to transhumanist aspirations, and more to how he retained his sparkling energy and curiosity up until death. There are quite a few good reasons to write "dead far too young". 

[IMAGE] Extremely proud and excited to watch Greg Sadler (CEO of Good Ancestors Policy) and Soroush Pour (Co-Founder of Harmony Intelligence) speak to the Australian Federal Senate Select Committee on Artificial Intelligence. This is a big moment for the local movement. Also, check out who they ran after.

4habryka
(Looks like the image got lost, probably you submitted the comment before it fully finished uploading)
1yanni kyriacos
Thanks for the heads up! Should be there now :)

Something I'm confused about: what is the threshold that needs meeting for the majority of people in the EA community to say something like "it would be better if EAs didn't work at OpenAI"?

Imagining the following hypothetical scenarios over 2024/25, I can't predict confidently whether they'd individually cause that response within EA?

  1. Ten-fifteen more OpenAI staff quit for varied and unclear reasons. No public info is gained outside of rumours
  2. There is another board shakeup because senior leaders seem worried about Altman. Altman stays on
  3. Superalignment team is disbanded
  4. OpenAI doesn't let UK or US AISI's safety test GPT5/6 before release
  5. There are strong rumours they've achieved weakly general AGI internally at end of 2025

This question is two steps removed from reality.  Here’s what I mean by that.  Putting brackets around each of the two steps:

what is the threshold that needs meeting [for the majority of people in the EA community] [to say something like] "it would be better if EAs didn't work at OpenAI"?
 

Without these steps, the question becomes 

What is the threshold that needs meeting before it would be better if people didn’t work at OpenAI?

Personally, I find that a more interesting question.  Is there a reason why the question is phrased at two removes like that?  Or am I missing the point?

What does a "majority of the EA community" mean here? Does it mean that people who work at OAI (even on superalignment or preparedness) are shunned from professional EA events? Does it mean that when they ask, people tell them not to join OAI? And who counts as "in the EA community"? 

I don't think it's that constructive to bar people from all or even most EA events just because they work at OAI, even if there's a decent amount of consensus people should not work there. Of course, it's fine to host events (even professional ones!) that don't invite OAI people (or Anthropic people, or METR people, or FAR AI people, etc), and they do happen, but I don't feel like barring people from EAG or e.g. Constellation just because they work at OAI would help make the case, (not that there's any chance of this happening in the near term) and would most likely backfire. 

I think that currently, many people (at least in the Berkeley EA/AIS community) will tell you to not join OAI if asked. I'm not sure if they form a majority in terms of absolute numbers, but they're at least a majority in some professional circles (e.g. both most people at FAR/FAR Labs and at Lightcone/Lighthaven would p... (read more)

2Dagon
[ I don't consider myself EA, nor a member of the EA community, though I'm largely compatible in my preferences ] I'm not sure it matters what the majority thinks, only what marginal employees (those who can choose whether or not to work at OpenAI) think.  And what you think, if you are considering whether to apply, or whether to use their products and give them money/status. Personally, I just took a job in a related company (working on applications, rather than core modeling), and I have zero concerns that I'm doing the wrong thing. [ in response to request to elaborate: I'm not going to at this time.  It's not secret, nor is my identity generally, but I do prefer not to make it too easy for 'bots or searchers to tie my online and real-world lives together. ]

I recently discovered the idea of driving all blames into oneself, which immediately resonated with me. It is relatively hardcore; the kind of thing that would turn David Goggins into a Buddhist.

Gemini did a good job of summarising it:

This quote by Pema Chödron, a renowned Buddhist teacher, represents a core principle in some Buddhist traditions, particularly within Tibetan Buddhism. It's called "taking full responsibility" or "taking self-blame" and can be a bit challenging to understand at first. Here's a breakdown:

What it Doesn't Mean:

  • Self-Flagellation: This practice isn't about beating yourself up or dwelling on guilt.
  • Ignoring External Factors: It doesn't deny the role of external circumstances in a situation.

What it Does Mean:

  • Owning Your Reaction: It's about acknowledging how a situation makes you feel and taking responsibility for your own emotional response.
  • Shifting Focus: Instead of blaming others or dwelling on what you can't control, you direct your attention to your own thoughts and reactions.
  • Breaking Negative Cycles: By understanding your own reactions, you can break free from negative thought patterns and choose a more skillful response.

Analogy:

Imagine a pebble thrown ... (read more)

4TsviBT
Well, what if there's a good piece of code (if you'll allow the crudity) in your head, and someone else's bad behavior is geared at hacking/exploiting that piece of code? The harm done is partly due to that piece of code and its role in part of your reaction to their bad behavior. But the implication is that they should stop with their bad behavior, not that you should get rid of the good code. I believe you'll respond "Ah, but you see, there's more than two options. You can change yourself in ways other than just deleting the code. You could recognize how the code is actually partly good and partly bad, and refactor it; and you could add other code to respond skillfully to their bad behavior; and you can add other code to help them correct their behavior.". Which I totally agree with, but at this point, what's being communicated by "taking self-blame" other than at best "reprogram yourself in Good/skillful ways" or more realistically "acquiesce to abuse"?
4Selfmaker662
The Stoics put this idea in a much kinder way: control the controllable (specifically our actions and attitudes), accept the uncontrollable.  The problem is, people's could's are broken. I have managed to make myself much unhappier by thinking I can control my actions until I read Nate Soares' post I linked above. You can't, even in the everyday definition of control, forgetting about paradoxes of "free will".
1sweenesm
Nice write up on this (even if it was AI-assisted), thanks for sharing! I believe another benefit is Raising One's Self-Esteem: If high self-esteem can be thought of as consistently feeling good about oneself, then if someone takes responsibility for their emotions, recognizing that they can change their emotions at will, they can consistently choose to feel good about and love themselves as long as their conscience is clear. This is inline with "The Six Pillars of Self-Esteem" by Nathaniel Branden: living consciously, self-acceptance, self-responsibility, self-assertiveness, living purposefully, and personal integrity.

I don't know how to say this in a way that won't come off harsh, but I've had meetings with > 100 people in the last 6 months in AI Safety and I've been surprised how poor the industry standard is for people:

- Planning meetings (writing emails that communicate what the meeting will be about, it's purpose, my role)
- Turning up to scheduled meetings (like literally, showing up at all)
- Turning up on time
- Turning up prepared (i.e. with informed opinions on what we're going to discuss)
- Turning up with clear agendas (if they've organised the meeting)
- Running the meetings
- Following up after the meeting (clear communication about next steps)

I do not have an explanation for what is going on here but it concerns me about the movement interfacing with government and industry. My background is in industry and these just seem to be things people take for granted?

It is possible I give off a chill vibe, leading to people thinking these things aren't necessary, and they'd switch gears for more important meetings!

4habryka
When people reach out to me to meet without an obvious business context I tend to be quite lax about my standards (it's them who asked for my time, I am not going to also put in 30 minutes to prepare for the meeting). Not sure whether that's what you are running into, but I have a huge amount of variance in how much effort I put into different meetings.
1yanni kyriacos
Hey thanks for your thoughts! I'm not sure what you mean by "obvious business context", but I'm either meeting friends / family (not business) or colleagues / community members (business). Is there a third group that I'm missing? If you're actually talking about the second group, I wonder how much your and their productivity would be improved if you were less lax about it?
4habryka
By 'obvious business context' I mean something like "there is a clear business/mission proposition for why I am taking this meeting". My guess is mine and their productivity would be reduced because the meetings would be a bunch costlier and I would meet with fewer people.
1yanni kyriacos
It just occurred to me that maybe posting this isn't helpful. And maybe net negative. If I get enough feedback saying it is then I'll happily delete.

There have been multiple occasions where I've copy and pasted email threads into an LLM and asked it things like:

  1. What is X person saying
  2. What are the cruxes in this conversation?
  3. Summarise this conversation
  4. What are the key takeaways
  5. What views are being missed from this conversation

I really want an email plugin that basically brute forces rationality INTO email conversations.

1Johannes C. Mayer
In principle, you could use Whisper or any other ASR system with high accuracy to enforce something like this during a live conversation.
1yanni kyriacos
Hi Johannes! Thanks for the suggestion :) I'm not sure i'd want it in the middle of a video call, but maybe in a forum context like this could be cool?
1Johannes C. Mayer
Seems pretty good to me to have this in a video call to me. The main reason why don't immediately try this out is that I would need to write a program to do this.
1yanni kyriacos
That seems fair enough!

I have heard rumours that an AI Safety documentary is being made. Separate to this, a good friend of mine is also seriously considering making one, but he isn't "in" AI Safety. If you know who this first group is and can put me in touch with them, it might be worth getting across each others plans.

4William the Kiwi
There is a AI x-risk documentary currently being filmed. An Inconvenient Doom. https://www.documentary-campus.com/training/masterschool/2024/inconvenient-doom It covers some aspects on AI safety, but doesn't focus on it exactly.

More thoughts on "80,000 hours should remove OpenAI from the Job Board"

- - - - 

A broadly robust heuristic I walk around with in my head is "the more targeted the strategy, the more likely it will miss the target". 

Its origin is from when I worked in advertising agencies, where TV ads for car brands all looked the same because they needed to reach millions of people, while email campaigns were highly personalised.

The thing is, if you get hit with a tv ad for a car, worst case scenario it can't have a strongly negative effect because of its generic... (read more)

A judgement I'm attached to is that a person is either extremely confused or callous if they work in capabilities at a big lab. Is there some nuance I'm missing here?

6Ann
I would consider, for the sake of humility, that they might disagree with your assessment for actual reasons, rather than assuming confusion is necessary. (I don't have access to their actual reasoning, apologies.) Edit: To give you a toy model of reasoning to chew on - Say a researcher has a p(doom from AGI) of 20% from random-origin AGI; 30% from military origin AGI; 10% from commercial lab origin AGI (and perhaps other numbers elsewhere that are similarly suggestive). They estimate the chances we develop AGI (relatively) soon as roughly 80%, regardless of their intervention. They also happen to have a have a p(doom from not AGI) of 40% from combined other causes, and expect an aligned AGI to be able to effectively reduce this to something closer to 1% through better coordinating reasonable efforts. What's their highest leverage action with that world model?
2yanni kyriacos
Hi Ann! Thank you for your comment. Some quick thoughts: "I would consider, for the sake of humility, that they might disagree with your assessment for actual reasons, rather than assuming confusion is necessary." * Yep! I have considered this. The purpose of my post is to consider it (I am looking for feedback, not upvotes or downvotes). "They also happen to have a have a p(doom from not AGI) of 40% from combined other causes, and expect an aligned AGI to be able to effectively reduce this to something closer to 1% through better coordinating reasonable efforts." * This falls into the confused category for me. I'm not sure how you have a 40% p(doom) from something other than unaligned AGI. Could you spell out for me what could make such a large number?
4Ann
Here's a few possibilities: * They predict that the catastrophic tipping points from climate change and perhaps other human-caused environmental changes will cause knock-on effects that eventually add up to our extinction, and the policy struggles to change that currently seem like we will not be able to pull them off despite observing clear initial consequences in terms of fire, storm, and ocean heating. * They model a full nuclear exchange in the context of a worldwide war as being highly possible and only narrowly evaded so far, and consider the consequences of that to cause or at least be as bad as extinction. * They are reasonably confident that pandemics arising or engineered without the help of AI could, in fact, take out our species under favorable circumstances, and worry the battlefield of public health is currently slipping towards the favor of diseases over time. * Probably smaller contributors going forward:  They are familiar with other religious groups inclined to bring about the apocalypse and have some actual concern over their chance of success. (Probably U.S.-focused.) * They are looking at longer time frames, and are thinking of various catastrophes likely within the decades or centuries immediately after we would otherwise have developed AGI, some of them possibly caused by the policies necessary to not do so. * They think humans may voluntarily decide it is not worth existing as a species unless we make it worth their while properly, and should not be stopped from making this choice. Existence, and the world as it is for humans, is hell in some pretty important and meaningful ways. * They are not long-termists in any sense but stewardship, and are counting the possibility that everyone who exists and matters to them under a short-term framework ages and dies. * They consider most humans to currently be in a state of suffering worse than non-existence, the s-risk of doom is currently 100%, and the 60% not-doom is mostly optimism we can
-2RHollerith
Maybe their goal is sabotage. Maybe they enjoy deception or consider themselves to be comparatively advantaged at it.
1yanni kyriacos
Hi Richard! Thanks for the comment. It seems to me that might apply to < 5% of people in capabilities?
2RHollerith
My probability is more like 0.1%.
-3Thomas Kwa
Someone I know who works at Anthropic, not on alignment, has thought pretty hard about this and concluded it was better than alternatives. Some factors include * by working on capabilities, you free up others for alignment work who were previously doing capabilities but would prefer alignment * more competition on product decreases aggregate profits of scaling labs At one point some kind of post was planned but I'm not sure if this is still happening. I also think there are significant upskilling benefits to working on capabilities, though I believe this less than I did the other day.
1yanni kyriacos
Thanks for your comment Thomas! I appreciate the effort. I have some questions: * by working on capabilities, you free up others for alignment work who were previously doing capabilities but would prefer alignment I am a little confused by this, would you mind spelling it out for me? Imagine "Steve" took a job at "FakeLab" in capabilities. Are you saying Steve making this decision creates a Safety job for "Jane" at "FakeLab", that otherwise wouldn't have existed? * more competition on product decreases aggregate profits of scaling labs Again I am a bit confused. You're suggesting that if, for e.g., General Motors announced tomorrow they were investing $20 billion to start an AGI lab, that would be a good thing? 
2Thomas Kwa
* Jane at FakeLab has a background in interpretability but is currently wrangling data / writing internal tooling / doing some product thing because the company needs her to, because otherwise FakeLab would have no product and be unable to continue operating including its safety research. Steve has comparative advantage at Jane's current job. * It seems net bad because the good effect of slowing down OpenAI is smaller than the bad effect of GM racing? But OpenAI is probably slowed down-- they were already trying to build AGI and they have less money and possibly less talent. Thinking about the net effect is complicated and I don't have time to do it here. The situation with joining a lab rather than founding one may also be different.

[PHOTO] I sent 19 emails to politicians, had 4 meetings, and now I get emails like this. There is SO MUCH low hanging fruit in just doing this for 30 minutes a day (I would do it but my LTFF funding does not cover this). Someone should do this!

3niplav
The image is not showing.
3yanni kyriacos
Thanks for letting me know!

I expect (~ 75%) that the decision to "funnel" EAs into jobs at AI labs will become a contentious community issue in the next year. I think that over time more people will think it is a bad idea. This may have PR and funding consequences too.

3Garrett Baker
This has been a disagreement people have had for many years. Why expect it to come to a head this year?
1yanni kyriacos
More people are going to quit labs / OpenAI. Will EA refill the leaky funnel?

Three million people are employed by the travel (agent) industry worldwide. I am struggling to see how we don't lose 80%+ of those jobs to AI Agents in 3 years (this is ofc just one example). This is going to be an extremely painful process for a lot of people.

4faul_sname
I am struggling to see how we do lose 80%+ of these jobs within the next 3 years. Operationalizing this, I would give you 4:1 that the fraction (or raw number, if you'd prefer) of employees occupied as travel agents is over 20% of today's value, according to the Labor Force Statistics from the US Bureau of Labor Statistics Current Population Survey (BLS CPS) Characteristics of the Employed dataset. For reference, here are the historical values for the BLS CPS series cpsaat11b ("Employed persons by detailed occupation and age") since 2011 (which is the earliest year they have it available as a spreadsheet). If you want to play with the data yourself, I put it all in one place in google sheets here. As of the 2023 survey, about 0.048% of surveyed employees, and 0.029% of surveyed people, were travel agents. As such, I would be willing to bet at 4:1 that when the 2027 data becomes available, at least 0.0096% of surveyed employees and at least 0.0058% of surveyed Americans report their occupation as "Travel Agent". Are you interested in taking the opposite side of this bet? Edit: Fixed aritmetic error in the percentages in the offered bet

Please help me find research on aspiring AI Safety folk!

I am two weeks into the strategy development phase of my movement building and almost ready to start ideating some programs for the year.

But I want these programs to be solving the biggest pain points people experience when trying to have a positive impact in AI Safety .

Has anyone seen any research that looks at this in depth? For example, through an interview process and then survey to quantify how painful the pain points are?

Some examples of pain points I've observed so far through my interviews wit... (read more)

2quila
I'm not aware of research on this. LW has a poll feature that at least the mods can use (based on embedded emoji reacts, which could be used for pain levels), maybe ask the LW team to make one and a post for this

Big AIS news imo: “The initial members of the International Network of AI Safety Institutes are Australia, Canada, the European Union, France, Japan, Kenya, the Republic of Korea, Singapore, the United Kingdom, and the United States.”

https://www.commerce.gov/news/press-releases/2024/09/us-secretary-commerce-raimondo-and-us-secretary-state-blinken-announce

H/T @shakeel

4Daniel Murfet
It might be worth knowing that some countries are participating in the "network" without having formal AI safety institutes

The Australian Federal Government is setting up an "AI Advisory Body" inside the Department of Industry, Science and Resources.

IMO that means Australia has an AISI in:

9 months (~ 30% confidence)
12 months (~ 50%)
18 months (~ 80%)

:mega:  Attention: AI Safety Enthusiasts in Wellington (and New Zealand) :mega: 

I'm pleased to announce the launch of a brand new Facebook group dedicated to AI Safety in Wellington: AI Safety Wellington (AISW). This is your local hub for connecting with others passionate about ensuring a safe and beneficial future with artificial intelligence / reducing x-risk.To kick things off, we're hosting a super casual meetup where you can:

  • Meet & Learn: Connect with Wellington's AI Safety community.
  • Chat & Collaborate: Discuss career paths, upcoming events, and
... (read more)

AI Safety (in the broadest possible sense, i.e. including ethics & bias) is going be taken very seriously soon by Government decision makers in many countries. But without high quality talent staying in their home countries (i.e. not moving to UK or US), there is a reasonable chance that x/c-risk won’t be considered problems worth trying to solve. X/c-risk sympathisers need seats at the table. IMO local AIS movement builders should be thinking hard about how to keep talent local (this is an extremely hard problem to solve).

TIL that the words "fact" and "fiction" come from the same word: "Artifice" - which is ~ "something that has been created".

I am 90% sure that most AI Safety talent aren't thinking hard enough about what Neglectedness. The industry is so nascent that you could look at 10 analogous industries, see what processes or institutions are valuable and missing and build an organisation around the highest impact one. 

The highest impact job ≠ the highest impact opportunity for you!

 

I really like this ad strategy from BlueDot. 5 stars.

90% of the message is still x / c-risk focussed, but by including "discrimination" and "loss of social connection" they're clearly trying to either (or both);

  1. create a big tent
  2. nudge people that are in "ethics" but sympathetic to x / c-risk into x / c-risk

(prediction: 4000 disagreement Karma)

Huh, this is really a surprisingly bad ad. None of the things listed are things that IMO have much to do with why AI is uniquely dangerous (they are basically all misuse-related), and also not really what any of the AI Safety fundamentals curriculum is about (and as such is false advertising). Like, I don't think the AI Safety fundamentals curriculum has anything that will help with election interference or loss of social connection or less likelihood of war or discrimination? 

2yanni kyriacos
I should have been clearer - I like the fact that they're trying to create a larger tent and (presumably) win ethics people over. There are many reasons not also not like the ad. I would also guess that they have an automated campaign running with several (maybe dozens) of pieces of creative. Without seeing their analytics it would be impossible to know which performs the best, but it wouldn't surprise me if it was this one (lists work).

I always get downvoted when I suggest that (1) if you have short timelines and (2) you decide to work at a Lab then (3) people might hate you soon and your employment prospects could be damaged.

What is something obvious I'm missing here? 

One thing I won't find convincing is someone pointing at the Finance industry post GFC as a comparison. 

I believe the scale of unemployment could be much higher. E.g. 5% ->15% unemployment in 3 years.

2faul_sname
If you work in a generative AI lab a significant number of people already hate the work you're doing and would likely hate you specifically if your existence became salient to you, for reasons that are at best tangentially related to your contribution to existential risk. This is true regardless of what your timelines look like. But I don't understand the mechanism by which working in a frontier AI lab is supposed to damage your employment prospects. The set of people who hate you for causing technological unemployment is probably not going to intersect much with the set of people who are making hiring decisions. People who have a history of doing antisocial-but-profitable-for-their-employer stuff get hired all the time, and proudly advertise those profitable antisocial activities on their resumes. In the extreme, you could argue that working in a frontier AI lab could lead to total human obsolescence, which would harm your job prospects on account of there are no jobs anywhere for anyone. But that's like saying "crashing into an iceberg could cause a noticeable decrease in the number of satisfied diners in the dining saloon of the Titanic".
1yanni kyriacos
My instinct as to why people don't find it a compelling argument; 1. They don't have short timelines like me, and therefore chuck it out completely 2. Are struggling to imagine a hostile public response to 15% unemployment rates 3. Copium 

I think it is good to have some ratio of upvoted/agreed : downvotes/disagreed posts in your portfolio. I think if all of your posts are upvoted/high agreeance then you're either playing it too safe or you've eaten the culture without chewing first.

I'm pretty confident that a majority of the population will soon have very negative attitudes towards big AI labs. I'm extremely unsure about what impact this will have on the AI Safety and EA communities (because we work with those labs in all sorts of ways). I think this could increase the likelihood of "Ethics" advocates becoming much more popular, but I don't know if this necessarily increases catastrophic or existential risks.

Not sure if this type of concern has reached the meta yet, but if someone approached me asking for career advice, tossing up whether to apply for a job at a big AI lab, I would let them know that it could negatively affect their career prospects down the track because so many people now perceive such as a move as either morally wrong or just plain wrong-headed. And those perceptions might only increase over time. I am not making a claim here beyond this should be a career advice consideration.

When AI Safety people are also vegetarians, vegans or reducetarian, I am pleasantly surprised, as this is one (of many possible) signals to me they're "in it" to prevent harm, rather than because it is interesting.

Yesterday Greg Sadler and I met with the President of the Australian Association of Voice Actors. Like us, they've been lobbying for more and better AI regulation from government. I was surprised how much overlap we had in concerns and potential solutions:
1. Transparency and explainability of AI model data use (concern)

2. Importance of interpretability (solution)

3. Mis/dis information from deepfakes (concern)

4. Lack of liability for the creators of AI if any harms eventuate (concern + solution)

5. Unemployment without safety nets for Australians (concern)

6.... (read more)

Ilya Sutskever has left OpenAI https://twitter.com/ilyasut/status/1790517455628198322

If GPT5 actually comes with competent agents then I expect this to be a "Holy Shit" moment at least as big as ChatGPT's release. So if ChatGPT has been used by 200 million people, then I'd expect that to at least double within 6 months of GPT5 (agent's) release. Maybe triple. So that "Holy Shit" moment means a greater share of the general public learning about the power of frontier models. With that will come another shift in the Overton Window. Good luck to us all.

The catchphrase I walk around with in my head regarding the optimal strategy for AI Safety is something like: Creating Superintelligent Artificial Agents* (SAA) without a worldwide referendum is ethically unjustifiable. Until a consensus is reached on whether to bring into existence such technology, a global moratorium is required (*we already have AGI).

I thought it might be useful to spell that out.

What is your AI Capabilities Red Line Personal Statement? It should read something like "when AI can do X in Y way, then I think we should be extremely worried / advocate for a Pause*". 

I think it would be valuable if people started doing this; we can't feel when they're on an exponential, so its likely we will have powerful AI creep up on us.

@Greg_Colbourn just posted this and I have an intuition that people are going to read it and say "while it can do Y it still can't do X"

*in the case you think a Pause is ever optimal.

4Vladimir_Nesov
A Pause doesn't stop capabilities at the currently demonstrated level. In that sense, GPT-2 might've been a prudent threshold (in a saner civilization) for when to coordinate to halt improvement in semiconductor technology and to limit production capacity for better nodes (similarly to not building too many centrifuges for enriching uranium, as a first line of defense against a nuclear winter). If not GPT-2, when the power of scaling wasn't yet obvious to most, then certainly GPT-3. In our world, I don't see it happening other than in response to a catastrophe where there are survivors, the good outcomes our civilization might be capable of reaching lack dignity and don't involve a successful Pause. A Pause must end, so conditions for a Pause need to be sensitive to new developments that make Pausing no longer necessary, which could also be a problem in a survivable catastrophe world.

I'd like to see research that uses Jonathan Haidt's Moral Foundations research and the AI risk repository to forecast whether there will soon be a moral backlash / panic against frontier AI models.

Ideas below from Claude:

# Conservative Moral Concerns About Frontier AI Through Haidt's Moral Foundations

## Authority/Respect Foundation
### Immediate Concerns
- AI systems challenging traditional hierarchies of expertise and authority
- Undermining of traditional gatekeepers in media, education, and professional fields
- Potential for AI to make decisions traditiona... (read more)

I beta tested a new movement building format last night: online networking. It seems to have legs.

V quick theory of change: 
> problem it solves: not enough people in AIS across Australia (especially) and New Zealand are meeting each other (this is bad for the movement and people's impact).
> we need to brute force serendipity to create collabs.
> this initiative has v low cost

quantitative results:
> I purposefully didn't market it hard because it was a beta. I literally got more people that I hoped for
> 22 RSVPs and 18 attendees
> this s... (read more)

Is anyone in the AI Governance-Comms space working on what public outreach should look like if lots of jobs start getting automated in < 3 years? 

I point to Travel Agents a lot not to pick on them, but because they're salient and there are lots of them. I think there is a reasonable chance in 3 years that industry loses 50% of its workers (3 million globally).

People are going to start freaking out about this. Which means we're in "December 2019" all over again, and we all remember how bad Government Comms were during COVID.

Now is the time to start ... (read more)

I’d like to quickly skill up on ‘working with people that have a kind of neurodivergence’.

This means;

  • understanding their unique challenges
  • how I can help them get the best from/for themselves
  • but I don’t want to sink more than two hours into this

Can anyone recommend some online content for me please?

2mesaoptimizer
I recommend messaging people who seem to have experience doing so, and requesting to get on a call with them. I haven't found any useful online content related to this, and everything I've learned in relation to social skills and working with neurodivergent people, I learned by failing and debugging my failures.
1yanni kyriacos
Thanks for the feedback! I had a feeling this is where I'd land :|

Something bouncing around my head recently ... I think I agree with the notion that "you can't solve a problem at the level it was created".

A key point here is the difference between "solving" a problem and "minimising its harm".

  • Solving a problem = engaging with a problem by going up a level from which is was createwd
  • Minimising its harm = trying to solve it at the level it was created

Why is this important? Because I think EA and AI Safety have historically focussed (and has their respective strengths in) harm-minimisation.

This applies obviously the micro. ... (read more)

[IMAGE] there is something about the lack overlap between these two audiences that makes me uneasy. WYD?

1metachirality
I think working on safety roles at capabilities orgs is mostly mutually exclusive with a pause, so I don't think this is that remarkable.

A piece of career advice I've given a few times recently to people in AI Safety, which I thought worth repeating here, is that AI Safety is so nascent a field that the following strategy could be worth pursuing:

1. Write your own job description (whatever it is that you're good at / brings joy into your life).

2. Find organisations that you think need thing that job but don't yet have it. This role should solve a problem they either don't know they have or haven't figured out how to solve.

3. Find the key decision maker and email them. Explain the (their) pro... (read more)

4mesaoptimizer
There's generally a cost to managing people and onboarding newcomers, and I expect that offering to volunteer for free is usually a negative signal, since it implies that there's a lot more work than usual that would need to be done to onboard this particular newcomer. Have you experienced otherwise? I'd love to hear some specifics as to why you feel this way.

I've decided to post something very weird because it might (in some small way) help shift the Overton Window on a topic: as long as the world doesn't go completely nuts due to AI, I think there is a 5%-20% chance I will reach something close to full awakening / enlightenment in about 10 years. Something close to this: 

Very quick thoughts on setting time aside for strategy, planning and implementation, since I'm into my 4th week of strategy development and experiencing intrusive thoughts about needing to hurry up on implementation;

  • I have a 52 week LTFF grant to do movement building in Australia (AI Safety)
  • I have set aside 4.5 weeks for research (interviews + landscape review + maybe survey) and strategy development (segmentation, targeting, positioning),
  • Then 1.5 weeks for planning (content, events, educational programs), during which I will get feedback from others on th
... (read more)

I have an intuition that if you tell a bunch of people you're extremely happy almost all the time (e.g. walking around at 10/10) then many won't believe you, but if you tell them that you're extremely depressed almost all the time (e.g. walking around at 1/10) then many more would believe you. Do others have this intuition? Keen on feedback.

1cubefox
In my experience, people with mania (the opposite of depression) tend to exhibit more visible symptoms, like talking a lot and very loudly, laughing more than the situation warrants, appearing overconfident etc. While people with depression are harder to notice, except in severe cases were they can't even get out of bed. So if someone doesn't have symptoms of mania, it is likely they aren't manic. Of course it is possible there are extremely happy people who aren't manic, but equally it is also possible that there are extremely unhappy people who aren't depressed. Since the latter seems rare, the former seem also rare.
1bideup
If many more people are extremely happy all the time than extremely depressed all the time, the bunch of people you describe would be managing their beliefs rationally. And indeed I think that’s probably the case.

Two jobs in AI Safety Advocacy that AFAICT don't exist, but should and probably will very soon. Will EAs be the first to create them though? There is a strong first mover advantage waiting for someone -

1. Volunteer Coordinator - there will soon be a groundswell from the general population wanting to have a positive impact in AI. Most won't know how to. A volunteer manager will help capture and direct their efforts positively, for example, by having them write emails to politicians

2. Partnerships Manager - the President of the Voice Actors guild reached out... (read more)

Help clear something up for me: I am extremely confused (theoretically) how we can simultaneously have:

1. An Artificial Superintelligence

2. It be controlled by humans (therefore creating misuse of concentration of power issues)

My intuition is that once it reaches a particular level of power it will be uncontrollable. Unless people are saying that we can have models 100x more powerful than GPT4 without it having any agency??

4Mitchell_Porter
You could have a Q&A superintelligence that is passive and reactive - it gives the best answer to a question, on the basis of what it already knows, but it takes no steps to acquire more information, and when it's not asked a question, it just sits there... But any agent that uses it, would de facto become a superintelligence with agency. 
3Davidmanheim
This is one of the key reasons that the term alignment was invented and used instead of control; I can be aligned with the interests of my infant, or my pet, without any control on their part.

Something someone technical and interested in forecasting should look into:  can LLMs reliably convert peoples claims into a % of confidence through sentiment analysis? This would be useful for Forecasters I believe (and rationality in general)

Be the meme you want to see in the world (screenshot).

[GIF] A feature I'd love on the forum: while posts are read back to you, the part of the text that is being read is highlighted. This exists on Naturalreaders.com and would love to see it here (great for people who have wandering minds like me).

I think acting on the margins is still very underrated. For e.g. I think 5x the amount of advocacy for a Pause on capabilities development of frontier AI models would be great. I also think in 12 months time it would be fine for me to reevaluate this take and say something like 'ok that's enough Pause advocacy'.

Basically, you shouldn't feel 'locked in' to any view. And if you're starting to feel like you're part of a tribe, then that could be a bad sign you've been psychographically locked in.

I went to buy a ceiling fan with a light in it recently. There was one on sale that happened to also tick all my boxes, joy! But the salesperson warned me "the light in this fan can't be replaced and only has 10,000 hours in it.  After that you'll need a new fan. So you might not want to buy this one." I chuckled internally and bought two of them, one for each room.

2Viliam
So basically the lights will only run out after a time sufficient to become a master at something. If you become a master at replacing lights, problem solved. :D 

"alignment researchers are found to score significantly higher in liberty (U=16035, p≈0)" This partly explains why so much of the alignment community doesn't support PauseAI!

"Liberty: Prioritizes individual freedom and autonomy, resisting excessive governmental control and supporting the right to personal wealth. Lower scores may be more accepting of government intervention, while higher scores champion personal freedom and autonomy..." 
https://forum.effectivealtruism.org/posts/eToqPAyB4GxDBrrrf/key-takeaways-from-our-ea-and-alignment-research-surveys... (read more)

[+][comment deleted]10