It sounds from this back and forth like we should assume that the Anthropic leadership who left OAI (so Dario and Daniela Amodei, Jack Clark, Sam McCandlish, others?) are still under NDA, because it was probably mutual. Does that sound right to others?
Oh! I think you're right, thanks!
I feel pretty sympathetic to the desire not to do things by text; I suspect you get much more practiced and checked-over answers that way.
In some contexts this would be seen as obviously a good thing. Specifically, if the thing you're interested in is the ideas that your interviewee talks about, then you want them to be able to consider carefully and double-check their facts before sending them over.
The case where you don't want that seems to be where your primary interest is in the mental state of your interviewee, or where you hope to get them to stumble into revealing things they would want to hide.
This is great!
I really like this about slack:
- If you aren’t maintaining this, err on the side of cultivating this rather than doing high-risk / high-reward investments that might leave you emotionally or financially screwed.
- (or, if you do those things, be aware I may not help you if it fails. I am much more excited about helping people that don’t go out of their way to create crises)
Seems like a good norm and piece of advice.
I'm confused about how much I should care whether an impact assessment is commissioned by the organization being assessed. The main thing I generally look for is whether the assessment / investigation is independent. The argument is that because AISC is paying for it, that will influence the assessors?
I have not read most of what there is to read here, just jumping in on "illegal drugs" ---> ADHD meds. Chloe's comment spoke to weed as the illegal drug on her mind.
To clarify, this is specifically in the context "Kat requested that Alice bring a variety of illegal drugs across the border for her." Chloe didn't come into it.
"AI has immense potential, but also immense risks. AI might be misused by China, or get out of control. We should balance the needs for innovation and safety." I wouldn't call this lying (though I agree it can have misleading effects, see Issue 1).
Not sure where this slots in, but there's also a sense in which this contains a missing positive mood about how unbelievably good (aligned) AI could or will be, and how much we're losing by not having it earlier.
Interesting how many of these are "democracy / citizenry-involvement" oriented. Strongly agree with 18 (whistleblower protection) and 38 (simulate cyber attacks).
20 (good internal culture), 27 (technical AI people on boards) and 29 (three lines of defense) sound good to me, I'm excited about 31 if mandatory interpretability standards exist.
42 (on sentience) seems pretty important but I don't know what it would mean.
The top 6 of the ones in the paper (the ones I think got >90% somewhat or strongly agree, listed below) seem pretty similar to me - are there important reasons people might support one over another?
Curious if you have any updates!
ChatGPT gives some interesting analysis when asked, though I think it's not amazingly accurate. (The sentence I gave it, from here, is a weird example, though.)
Does it say anything about AI risk that is about the real risks? (I haven't clicked the links; the text above didn't indicate one way or the other to me.)
This is great, and speaks to my experience as well. I have my own frames that map onto some of this but don't hit some of the things you've hit and vice versa. Thanks for writing!
Is this something Stampy would want to help with?
https://www.lesswrong.com/posts/WXvt8bxYnwBYpy9oT/the-main-sources-of-ai-risk
I think that incentivizes self-deception on probabilities. Also, probabilities below 10^-10 are pretty unusual, so I'd expect that to cause very little to happen.
Thanks!
When you say "They do, however, have the potential to form simulacra that are themselves optimizers, such as GPT modelling humans (with pretty low fidelity right now) when making predictions"
do you mean things like "write like Ernest Hemingway"?
Is it true that current image systems like stable diffusion are non-optimizers? How should that change our reasoning about how likely it is that systems become optimizers? How much of a crux is "optimizeriness" for people?
Why do people keep saying we should maximize log(odds) instead of odds? Isn't each 1% of survival equally valuable?
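To make the comparison concrete (illustrative numbers of my own): both of the moves below add "one more percent of survival," but a log-odds maximizer cares much more about the second one, since $\ln\frac{0.51}{0.49} - \ln\frac{0.50}{0.50} \approx 0.04$ while $\ln\frac{0.99}{0.01} - \ln\frac{0.98}{0.02} \approx 0.70$.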
In addition to Daniel's point, I think an important piece is probabilistic thinking - the AGI will execute not based on what will happen but on what it expects to happen. What probability is acceptable? If none, it should do nothing.
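As a rough sketch of the threshold I have in mind (the utilities here are placeholders, not anything from the original discussion): act iff $p \cdot U(\text{success}) + (1-p) \cdot U(\text{failure}) > U(\text{do nothing})$; if no achievable $p$ clears that bar, the expected-value calculation says to do nothing.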
Have you written about your update to slow takeoff?
Nice! Added these to the wiki on calibration: https://www.lesswrong.com/tag/calibration
After years of tinkering and incremental progress, AIs can now play Diplomacy as well as human experts.[6]
Maybe this happened in 2022: https://twitter.com/polynoamial/status/1580185706735218689
Let me know if you have a cheerful price for this!
Here's the repo! https://github.com/SonOfLilit/calibrate
I will talk to the developer about it being open source - I think that was the ideal for both of us.
Do you know how to do this kind of thing? I'd be happy to pay you for your time.
This seems interesting to me but I can't yet latch onto it. Can you give examples of secrets being one or the other?
Are you distinguishing between "secrets where the existence of the secret is a big part of the secret" and "secrets where it's not"?
One of my feature requests! Just hard to do.
Why would they be jokes?
Don't know what you mean in the latter sentence.
Conversational moves in EA / Rationality that I like for epistemics
This is why LessWrong needs the full suite of emoji reacts.
Title changed!
I meant signposting to indicate things like saying "here's a place where I have more to say, but not in this context" during, for instance, a conversation, so that I'm truthfully saying there's more to the story.
Yeah, I think "intentionally causing others to update in the wrong direction" and "leaving them with their priors" end up pretty similar (if you don't make strong distinctions between action and omission, which I think this test at least partially rests on) if you have a good model of their priors (which I think is potentially the hardest part here).
Kind is one of the four adjectives in your description of Iron Hufflepuff.
Hm, Keltham has a lot of good qualities here, but kind doesn't seem among them.
Sounds scary, but thank you for the model of what's actually going on!
Oh woah! Thanks for linking.
True! 65 Watts! That would really be something.
Unfortunately I'm not seeing anything close to that on the Amazon UK site :/
Might be bad search skills, though.
Your link's lightbulbs have a bayonet fitting, not E27 threading :) Thanks for the other link! Amazon says it's currently unavailable.
ETA: Found some, will add to post
Tried to buy those, didn't have any luck finding ones that fit nicely into my sockets! (An embarrassing mistake I didn't describe in detail is buying corn bulbs that turned out to be...mini?) If you have an Amazon UK link for ones with E27 threading, that would be awesome.
ETA: Having looked, it seems not all corn bulbs are brighter than the ones I have, though I have now found 2000-lumen ones. I don't know if corn bulbs are still better if they have lower lumens. I would guess not?
ETA 2: The link above does have E27 if you click through the multiple listings in the same link, wasn't obvious to me at first, thanks!
I saw people on Twitter discussing forecasting success on this, and some were saying that the intelligence agencies actually called it right. Does anyone know an easy link to what those agencies were saying?
Context: https://twitter.com/ClayGraubard/status/1496699988801433602?s=20&t=mQ8sAzMRppI8Pr44O38M3w
https://twitter.com/ClayGraubard/status/1496866236973658112?s=20&t=mQ8sAzMRppI8Pr44O38M3w
Nice! Welcome!
I definitely find it helpful to be surrounded by people who will do this for me and help me cultivate a habit of it over time. The case for it being very impactful is if it gets people to do a one-time thing, like applying for something or putting themselves in the running for something they otherwise wouldn't have, and that makes a big difference. The ones that are about accountability ("Can I remind you about that in a week?") are also a sort of conscientiousness loan, which can be cheap since it can be easier to check in on other people than to do it for yourself.
It is definitely important to have a sense of who you're talking to and what they need (law of equal and opposite advice). For what it's worth, 5-10 and 13 are aimed to be disproportionately helpful for people who have trouble doing things (depending on the reason).
Leopold Aschenbrenner is starting a cross between a hedge fund and a think tank for AGI. I have read only the sections of Situational Awareness most relevant to this project, and I don't feel nearly like I understand all the implications, so I could end up being quite wrong. Indeed, I’ve already updated towards a better and more nuanced understanding of Aschenbrenner's points, in ways that have made me less concerned than I was to begin with. But I want to say publicly that the hedge fund idea makes me nervous.