All of chanamessinger's Comments + Replies

Leopold Aschenbrenner is starting a cross between a hedge fund and a think tank for AGI. I have read only the sections of Situational Awareness most relevant to this project, and I don't feel like I come close to understanding all the implications, so I could end up being quite wrong. Indeed, I’ve already updated towards a better and more nuanced understanding of Aschenbrenner's points, in ways that have made me less concerned than I was to begin with. But I want to say publicly that the hedge fund idea makes me nervous.

Before I give my re... (read more)

It sounds from this back-and-forth like we should assume that the Anthropic leadership who left OAI (so Dario and Daniela Amodei, Jack Clark, Sam McCandlish, others?) are still under NDA, because it was probably mutual. Does that sound right to others?

Oh! I think you're right, thanks!

I feel pretty sympathetic to the desire not to do things by text; I suspect you get much more practiced and checked over answers that way.

3antanaclasis
Another big thing is that you can’t get tone-of-voice information via text. The way that someone says something may convey more to you than what they said, especially for some types of journalism.

I suspect you get much more practiced and checked over answers that way.

In some contexts this would be seen as obviously a good thing.  Specifically, if the thing you're interested in is the ideas that your interviewee talks about, then you want them to be able to consider carefully and double-check their facts before sending them over.

The case where you don't want that would seem to be the case where your primary interest is in the mental state of your interviewee, or where you hope to get them to stumble into revealing things they would want to hide.

which privacy skills you are able to execute.


This link goes to a private google doc, just fyi.

1Mateusz Bagiński
Wouldn't a DM be a more proper way to point this out?
2Raemon
lol that is amazingly terrible. That doc was a memo at a private retreat that a) is not actually that private, but b) is mostly just a repackaging of this: https://www.lesswrong.com/posts/rz73eva3jv267Hy7B/can-you-keep-this-confidential-how-do-you-know

This is great!

I really like this about slack:

  • If you aren’t maintaining this, err on the side of cultivating this rather than doing high-risk / high-reward investments that might leave you emotionally or financially screwed.
    • (or, if you do those things, be aware I may not help you if it fails. I am much more excited about helping people that don’t go out of their way to create crises)


Seems like a good norm and piece of advice.

I'm confused how much I should care whether an impact assessment is commissioned by some organization. The main thing I generally look for is whether the assessment / investigation is independent. The argument is that because AISC is paying for it, that will influence the assessors? 

6habryka
My guess is it matters a lot, even if people aspire towards independence. I would update if someone has a long track record of clearly neutral-seeming reports for financial compensation, but I think in the absence of such a track record, my prior would be that people are very rarely capable of making strong negative public statements about people who are paying them.
6Linda Linsefors
This depends on how much you trust the actors involved. I know that Remmelt and I asked for an honest evaluation and did not try to influence the result. But you don't know this. Remmelt and I obviously believe in AISC, otherwise we would not keep running these programs. But since AISC has been chronically understaffed (like most non-profit initiatives), we have not had time to do a proper follow-up study. When we asked Arb to do this assessment, it was in large part to test our own beliefs. So far nothing surprising has come out of the investigation, which is reassuring. But if Arb found something bad, I would not want them to hide it. Here are some other evaluations of AISC (and other things) that were not commissioned by us. I think for both of them, they did not even talk to someone from AISC before posting, although for the second link this was only due to miscommunication.

  • Takeaways from a survey on AI alignment resources — EA Forum (effectivealtruism.org)
  • Thoughts on AI Safety Camp — LessWrong

I have not read most of what there is to read here, just jumping in on "illegal drugs" ---> ADHD meds. Chloe's comment spoke to weed as the illegal drug on her mind.

1Rebecca
Yeah, Ben should have said illicit not illegal, because they are illegal to bring across the border except if you have a valid prescription, even if the place you purchased them didn't require a prescription. But I wouldn't consider it an unambiguous falsehood; the following is mostly a sliding scale of frustrating ambiguity:

  1. ‘asked Alice to illegally bring Schedule II medication into the country’ [edit: entirely correct according to NL’s stating of the facts]
  2. ‘asked Alice to illegally bring Schedule II drugs into the country’ [some intermediate version, still completely factually correct but would be eliding the difference between meth and Adderall]
  3. ‘asked Alice to bring illegal drugs across the border’ [frustratingly bad choice of words that gives people a much worse impression than is accurate, from memory basically the thing that Ben said]

To clarify, this is specifically in the context "Kat requested that Alice bring a variety of illegal drugs across the border for her." Chloe didn't come into it.

"AI has immense potential, but also immense risks. AI might be misused by China, or get out of control. We should balance the needs for innovation and safety." I wouldn't call this lying (though I agree it can have misleading effects, see Issue 1).


Not sure where this slots in, but there's also a sense in which this contains a missing positive mood about how unbelievably good (aligned) AI could or will be, and how much we're losing by not having it earlier.

Interesting how many of these are "democracy / citizenry-involvement" oriented. Strongly agree with 18 (whistleblower protection) and 38 (simulate cyber attacks).

20 (good internal culture), 27 (technical AI people on boards) and 29 (three lines of defense) sound good to me, I'm excited about 31 if mandatory interpretability standards exist. 

42 (on sentience) seems pretty important but I don't know what it would mean.

4ryan_greenblatt
This is super late, but I recently posted: Improving the Welfare of AIs: A Nearcasted Proposal
2Zach Stein-Perlman
Assuming you mean the second 42 ("AGI labs take measures to limit potential harms that could arise from AI systems being sentient or deserving moral patienthood")-- I also don't know what labs should do, so I asked an expert yesterday and will reply here if they know of good proposals...

The top 6 of the ones in the paper (the ones I think got >90% somewhat or strongly agree, listed below) seem pretty similar to me - are there important reasons people might support one over another?

  • Pre-deployment risk assessments
  • Evaluations of dangerous capabilities
  • Third-party model audits
  • Red teaming
  • Pre-training risk assessments
  • Pausing training of dangerous models
2Zach Stein-Perlman
I think 19 ideas got >90% agreement. I agree the top ideas overlap. I think reasons one might support some over others depend on the details. 

Curious if you have any updates!

2jacquesthibs
Working on a new grant proposal right now. Should be sent this weekend. If you’d like to give feedback or have a look, please send me a DM! Otherwise, I can send the grant proposal to whoever wants to have a look once it is done (still debating about posting it on LW).

Outside of that, there has been a lot of progress on the Cyborgism discord (there is a VSCode plugin called Worldspider that connects to the various APIs, and there has been more progress on Loom). Most of my focus has gone towards looking at the big picture and keeping an eye on all the developments. Now, I have a better vision of what is needed to create an actually great alignment assistant and have talked to other alignment researchers about it to get feedback and brainstorm.

However, I’m spread way too thin and will request additional funding to get some engineer/builder to start building the ideas out so that I can focus on the bigger picture and my alignment work. If I can get my funding again (previous funding ended last week), then my main focus will be building out the system I have in mind for accelerating alignment work + continuing to work on the new agenda I put out with Quintin and others. There’s some other stuff I’d like to do, but those are lower priority or will depend on timing.

It’s been hard to get the funding application done because things are moving so fast and I’m trying not to build things that will be built by default. And I’ve been talking to some people about the possibility of building an org so that this work could go a lot faster.

ChatGPT gives some interesting analysis when asked, though I think not amazingly accurate. (The sentence I gave it, from here, is a weird example, though.)

Does it say anything about AI risk that is about the real risks? (Have not clicked the links, the text above did not indicate to me one way or another).

5MaxRa
The report mentioned "harm to the global financial system [and to global supply chains]" somewhere as examples, which I found noteworthy for being very large-scale harms and therefore plausibly requiring the kind of AI systems that the AI x-risk community is most worried about.
2Evan R. Murphy
I'm not sure if the core NIST standards go into catastrophic misalignment risk, but Barrett et al.'s supplemental guidance on the NIST standards does. I was a reviewer on that work, and I think they have more coming (see link in my first comment on this post for their first part).

This is great, and speaks to my experience as well. I have my own frames that map onto some of this but don't hit some of the things you've hit and vice versa. Thanks for writing!

Is this something Stampy would want to help with?


https://www.lesswrong.com/posts/WXvt8bxYnwBYpy9oT/the-main-sources-of-ai-risk

2plex
It's definitely something Stampy would want to link to, and if those authors wanted to maintain their list on Stampy rather than LessWrong that would be welcome, though I could imagine them wanting to retain editing restrictions. Edit: Added a link to Stampy.

I think that incentivizes self-deception about probabilities. Also, probabilities below 10^-10 are pretty unusual, so I'd expect that constraint to cause very little to happen.

Thanks! 

When you say "They do, however, have the potential to form simulacra that are themselves optimizers, such as GPT modelling humans (with pretty low fidelity right now) when making predictions"

do you mean things like "write like Ernest Hemingway"?

2Jozdien
Yep.  I think it happens on a much lower scale in the background too - like if you prompt GPT with something like the occurrence of an earthquake, it might write about what reporters have to say about it, simulating various aspects of the world that may include agents without our conscious direction.

Is it true that current image systems like Stable Diffusion are non-optimizers? How should that change our reasoning about how likely it is that systems become optimizers? How much of a crux is "optimizeriness" for people?

7Jozdien
My take is centred more on current language models, which are also non-optimizers, so I'm afraid this won't be super relevant if you're already familiar with the rest of this and were asking specifically about the context of image systems.

Language models are simulators of worlds sampled from the prior representing our world (insofar as the totality of human text is a good representation of our world), and don't have many of the properties we would associate with "optimizeriness". They do, however, have the potential to form simulacra that are themselves optimizers, such as GPT modelling humans (with pretty low fidelity right now) when making predictions. One danger from this kind of system that isn't itself an optimizer is the possibility of instantiating deceptive simulacra that are powerful enough to act in ways that are dangerous to us (I'm biased here, but I think this section from one of my earlier posts does a not-terrible job of explaining this).

There's also the possibility of these systems becoming optimizers, as you mentioned. This could happen either during training (where the model at some point during training becomes agentic and starts to deceptively act like a non-optimizer simulator would - I describe this scenario in another section from the same post), or could happen later, as people try to use RL on it for downstream tasks. I think what happens here mechanistically at the end could be one of a number of things - the model itself completely becoming an optimizer, an agentic head on top of the generative model that's less powerful than the previous scenario at least to begin with, a really powerful simulacrum that "takes over" the computational power of the simulation, etc.

I'm pretty uncertain about the numbers I would assign to either outcome, but the latter seems pretty likely (although I think the former might still be a problem), especially with the application of powerful RL for tasks that benefit a lot from consequentialist reasoning. …

Why do people keep saying we should maximize log(odds) instead of odds? Isn't each 1% of survival equally valuable?

1JakubK
Paul's comment here is relevant, but I'm also confused.
3sen
I don't know why other people say it, but I can explain why it's nice to say it.

  • log P(x) behaves nicely in comparison to P(x) when it comes to placing iterated bets. When you maximize P(x), you're susceptible to high-risk, high-reward scenarios, even when they lead to failure with probability arbitrarily close to 1. The same is not true when maximizing log P(x). I'm cheating here since this only really makes sense when big-P refers to "principal" (i.e., the thing growing or shrinking with each bet) rather than "probability".
  • p(x) doesn't vary linearly with the controls we typically have, so calculus intuition tends to break down when used to optimize p(x). Log p(x) does usually vary linearly with the controls we typically have, so we can apply more calculus intuition to optimizing it. I think this happens because of the way we naturally think of "dimensions of" and "factors contributing to" a probability and the resulting quirks of typical maximum entropy distributions.
  • Log p(x) grows monotonically with p(x) whenever x is possible, so the result is the same whether you argmax log p(x) or p(x).
  • p(x) is usually intractable to calculate, but there's a slick trick to approximate it using the evidence lower bound (ELBO), which requires dealing with log p(x) rather than p(x) directly. Saying log p(x) calls that trick to mind more easily than saying just p(x).
  • All the cool papers do it.
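[Editor's note: to make the first bullet concrete, here is a minimal sketch in Python, with made-up numbers not taken from the comment above, of why maximizing expected log wealth rather than expected wealth matters once bets are iterated. The all-in policy maximizes expected wealth on each round but almost surely goes bust; the Kelly-style policy, which maximizes expected log wealth, compounds. As the comment notes, this reading only applies if P is a principal that grows or shrinks with each bet, not a probability.]

```python
import random

# Minimal sketch with hypothetical numbers: repeated even-odds bets that win
# with probability 0.6. "All-in" stakes everything each round (maximizes
# expected wealth per round); "Kelly" stakes the fraction that maximizes
# expected log wealth, which for an even-odds bet is 2p - 1.

def median_final_wealth(stake_fraction, p_win=0.6, rounds=100, trials=10_000):
    """Median wealth after `rounds` bets, staking `stake_fraction` of wealth each time."""
    finals = []
    for _ in range(trials):
        wealth = 1.0
        for _ in range(rounds):
            bet = wealth * stake_fraction
            wealth += bet if random.random() < p_win else -bet
        finals.append(wealth)
    finals.sort()
    return finals[len(finals) // 2]

kelly_fraction = 2 * 0.6 - 1  # 0.2 for a 60% even-odds bet

print("all-in (max E[wealth]):    ", median_final_wealth(1.0))             # ~0: a single loss is ruin
print("Kelly  (max E[log wealth]):", median_final_wealth(kelly_fraction))  # grows over the 100 rounds
```

(Exact Kelly output varies run to run, but it stays well above the starting wealth of 1, while the all-in median is essentially always 0.)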

In addition to Daniel's point, I think an important piece is probabilistic thinking - the AGI will execute not based on what will happen but on what it expects to happen. What probability is acceptable? If none, it should do nothing.

1reallyeli
I don't think this is an important obstacle — you could use something like "and act such that your P(your actions over the next year lead to a massive disaster) < 10^-10." I think Daniel's point is the heart of the issue.

Nice! Added these to the wiki on calibration: https://www.lesswrong.com/tag/calibration

Oh, whoops. I took from this later tweet in the thread that they were talking.

After years of tinkering and incremental progress, AIs can now play Diplomacy as well as human experts.[6]


Maybe this happened in 2022: https://twitter.com/polynoamial/status/1580185706735218689

2Daniel Kokotajlo
That's no-press Diplomacy: Diplomacy without the talking. Doesn't count IMO.

Here's the git! https://github.com/SonOfLilit/calibrate?fbclid=IwAR2vBZ8IWfMgHTPla0CbohCUIqmrMUl-XEcYIWhKUrJ4ZRfH2Eg7Z7Zf1J4

I will talk to the developer about it being open source - I think that was both of our ideals.

Do you know how to do this kind of thing? I'd be happy to pay you for your time.

1Stephen Bennett
I haven't worked on any browser extensions before (not sure what language they're written in), but I do know javascript well enough. We can probably work something out!

This seems interesting to me but I can't yet latch onto it. Can you give examples of secrets being one or the other?

Are you distinguishing between "secrets where the existence of the secret is a big part of the secret" and "secrets where it's not"?

2Drake Morrison
I think that's the gist of it. I categorize them as Secret and Private, where Secret information is something I deny knowing (and therefore fails to pass the onion test), and Private information is something that people can know exists, even if I won't tell them what it is (thereby passing the onion test). Also, see this which I found relevant.

Why would they be jokes?

Don't know what you mean in the latter sentence.

1Martin Vlach
Thanks for the links, as they clarified a lot for me. The names of the tactics/techniques sounded strange to me, and after unsuccessful googling for their meanings I started to believe it was a play with your readers; sorry if this suspicion of mine seemed rude. The second part was curiosity to explore some potential cases of "What could we bet on?".

Conversational moves in EA / Rationality that I like for epistemics

  • “So you are saying that”
  • “But I’d change my mind if”
  • “But I’m open to push back here”
  • “I’m curious for your take here”
  • “My model says”
  • “My current understanding is…”
  • “...I think this because…”
  • “...but I’m uncertain about…”
  • “What could we bet on?”
  • “Can you lay out your model for me?”
  • “This is a butterfly idea”
  • “Let’s do a babble”
  • “I want to gesture at something / I think this gestures at something true”
-1Martin Vlach
Can I bet the last 3 points are a joke? Anyway, do we have a method to find checkpoints or milestones for betting on progress against a certain problem (e.g. AI development safety, global warming)?

This is why LessWrong needs the full suite of emoji reacts.

I meant signposting to indicate things like saying "here's a place where I have more to say, but not in this context" during, for instance, a conversation, so that I'm truthfully saying there's more to the story.

Yeah, I think "intentionally causing others to update in the wrong direction" and "leaving them with their priors" end up pretty similar (if you don't make strong distinctions between action and omission, which I think this test at least partially rests on) if you have a good model of their priors (which I think is potentially the hardest part here).

Kind is one of the four adjectives in your description of Iron Hufflepuff.

5Duncan Sabien (Deactivated)
Ah, gotcha. (Also "lol"/"whoops.") "There is something in here of Iron Hufflepuff" is not meant to equal "All of Iron Hufflepuff is in here." I agree the above does not represent kindness much. Tenacious is the bit that's coming through most strongly, and also if I were rewriting the lists today I would include "principled" or "consistent" or "conscientious" as a strong piece of Hufflepuff, and that's very much on display here.

Hm, Keltham has a lot of good qualities here, but kind doesn't seem among them.

2Duncan Sabien (Deactivated)
... seems like a non-sequitur; can you connect the dots for me?

Sounds scary, but thank you for the model of what's actually going on!

Oh woah! Thanks for linking.

True! 65 Watts! That would really be something.

Unfortunately I'm not seeing anything close to that on the Amazon UK site :/

Might be bad search skills, though.

1Brendan Long
It seems like there are fewer of them, but searches like "60w led corn bulb e26" find a few results: https://www.amazon.co.uk/s?k=60w+led+corn+bulb+e26&i=lighting

Your link's lightbulbs have a bayonet style, not the E27 threading :) Thanks for the other link! Amazon says currently unavailable.

ETA: Found some, will add to post

Tried to buy those, didn't have any luck finding ones that fit nicely into my sockets! (An embarrassing mistake I didn't describe in detail is buying corn bulbs that turned out to be...mini?) If you have an Amazon UK link for ones with E27 threading, that would be awesome.

ETA: Having looked, it seems not all corn bulbs are brighter than the ones I have, though I have now found 2000 lumen ones. I don't know if corn bulbs are still better if they have lower lumens. I would guess not?

ETA 2: The link above does have E27 if you click through the multiple listings in the same link, wasn't obvious to me at first, thanks!

1Brendan Long
I think you can find 6000 lumen E26/27 corn bulbs relatively easily: https://www.amazon.com/dp/B07T3GQB5J/ref=cm_sw_r_apan_glt_i_1X8ZFHR91SCX5R63RQWW?psc=1 This brand also has 10,000 lumen versions if you're willing to use an adapter, but it's probably easier to just use two of the E26/27 version. One downside compared to your current setup is that these are 60-100W, so I'd be a little worried about covering them with paper lanterns.
1Derek M. Jones
Click on the green text, or use Amazon UK's search box; Google ads displays a 4000 lumen bulb.

I saw people discussing the forecasting success of this on Twitter, and some were saying that the intelligence agencies actually called this right. Does anyone know an easy link to what those agencies were saying?

Context: https://twitter.com/ClayGraubard/status/1496699988801433602?s=20&t=mQ8sAzMRppI8Pr44O38M3w

https://twitter.com/ClayGraubard/status/1496866236973658112?s=20&t=mQ8sAzMRppI8Pr44O38M3w

3lc
The thing about intelligence agencies is that they are really good at insider trading.
-8[anonymous]

I definitely find it helpful to be surrounded by people who will do this for me and help me cultivate a habit of it over time. The case for it being very impactful is if people do a one-time thing, like apply for something or put themselves in the running for something that they otherwise wouldn't have that makes a big difference. The ones that are about accountability (Can I remind you about that in a week?) also are sort of a conscientiousness loan, which can be cheap since it can be easier to check in on other people than to do it for yourself. 

It is definitely important to have a sense of who you're talking to and what they need (law of equal and opposite advice). For what it's worth, 5-10 and 13 are aimed to be disproportionately helpful for people who have trouble doing things (depending on the reason).
