A "Core Views on AI Safety" post is now available at https://www.anthropic.com/index/core-views-on-ai-safety
(Linkpost for that is here: https://www.lesswrong.com/posts/xhKr5KtvdJRssMeJ3/anthropic-s-core-views-on-ai-safety.)
I’ve run Hamming circles within CFAR contexts a few times, and once outside of them. Tips from that outside run:
Timing can be tricky here! If you do four 20-minute turns with breaks, and you're doing this in an evening, then by the time you get to the last person, people might be tired.
Especially so if you started with the Hamming Questions worksheet exercise (linked as a prereq at the top of the post).
I think next time I would drop to 15 each, and keep the worksheet.
I appreciate the concept of "Numerical-Emotional Literacy". In fact, this is what I personally think/feel the "rationalist project" should be. To the extent that I am a "rationalist", what I mean by that is precisely this: knowing what I value, and pursuing numerical-emotional literacy around it, is important to me.
To make in-line adjustments, grab a copy of the spreadsheet (https://www.microcovid.org/spreadsheet) and do anything you like to it!
There is now a Wired article about this tool and the process of creating it: https://www.wired.com/story/group-house-covid-risk-points/
I think the reporter did a great job of capturing what an "SF group house" is like and how to live a kind of "high IQ / high EQ" rationalist-inspired life, so this might be a thing one could send to friends/family about "how we do things".
It's not just Dario, it's a larger subset of OpenAI splitting off: "He and a handful of OpenAI colleagues are planning a new project, which they tell us will probably focus less on product development and more on research. We support their move and we’re grateful for the time we’ve spent working together."
I heard someone wanted to know about usage statistics for the microcovid.org calculator. Here they are!
Sorry to leave you hanging for so long Richard! This is the reason why in the calculator we ask about "number of people typically near you at a given time" for the duration of the event. (You can also think of this as a proxy for "density of people packed into the room".) No reports like that that I'm aware of, alas!
Want to just give credit to all the non-rationalist coauthors of microcovid.org! (7 non-rationalists and 2 "half-rationalists"?)
I've learned a LOT about the incredible power of trusted collaborations between "hardcore epistemics" folks and much more pragmatic folks with other skillsets (writing, UX design, medical expertise with ordinary people as patients, etc). By our powers combined we were able to build something usable by non-rationalist-but-still-kinda-quantitative folks, and are on our way to something usable by "normal people" 😲.
We've been able to...
Also, don't forget to factor in "kicking off a chain of onwards infections" into your COVID avoidance price somehow. You can't stop at valuing "cost of COVID to *me*".
We don't really know how to do this properly yet, but see discussion here: https://forum.effectivealtruism.org/posts/MACKemu3CJw7hcJcN/microcovid-org-a-tool-to-estimate-covid-risk-from-common?commentId=v4mEAeehi4d6qXSHo#No5yn8nves7ncpmMt
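For concreteness, here's a minimal sketch of one way to account for the chain, assuming (purely for illustration) that each infection seeds an average of r_eff further infections and that the series converges; this is my own toy framing, not the calculator's model:

```python
# Toy sketch of pricing in onward infections; not microcovid.org's actual model.
def total_infection_cost(my_cost, r_eff, others_weight=1.0):
    """Expected cost of my own infection plus the chain it would seed.

    With effective reproduction number r_eff < 1, the expected number of
    downstream infections is r_eff + r_eff**2 + ... = r_eff / (1 - r_eff).
    `others_weight` is how much I value someone else's infection relative
    to my own (an ethical input, not an epidemiological one).
    """
    assert 0 <= r_eff < 1, "series only converges if each case infects <1 other on average"
    downstream = r_eff / (1 - r_eff)
    return my_cost + others_weight * downstream * my_cost

# Example: valuing my own case at $10,000 with an assumed r_eff of 0.5
# adds one expected downstream infection and doubles the total.
print(total_infection_cost(my_cost=10_000, r_eff=0.5))  # 20000.0
```

The hard part, as the linked thread discusses, is choosing r_eff and others_weight; that's exactly the part we don't know how to do properly yet.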
Sadly nothing useful. As mentioned here (https://www.microcovid.org/paper/2-riskiness#fn6) we think it's not higher than 10%, but we haven't found anything to bound it further.
"I've heard people make this claim before but without explaining why. [...] the key risk factors for a dining establishment are indoor vs. outdoor, and crowded vs. spaced. The type of liquor license the place has doesn't matter."
I think you're misunderstanding how the calculator works. All the saved scenarios do is fill in the parameters below. The only substantial difference between "restaurant" and "bar" is that we assume bars are places people speak loudly. That's all. If the bar you have in mind isn't like that, just change the parameters.
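To illustrate what "just change the parameters" means (with made-up multipliers; the real ones are in the calculator and whitepaper), a saved scenario is essentially just a bundle of parameter values like this:

```python
# Toy sketch only: the baseline and multipliers below are invented for
# illustration; see the microcovid.org whitepaper for the real numbers.
BASELINE_RISK_PER_PERSON_HOUR = 0.06          # hypothetical indoor, unmasked baseline
VOLUME_MULTIPLIER = {"silent": 0.2, "normal": 1.0, "loud": 5.0}  # assumed values

def scenario_risk(people_nearby, hours, volume):
    return people_nearby * hours * BASELINE_RISK_PER_PERSON_HOUR * VOLUME_MULTIPLIER[volume]

# "Restaurant" and "bar" presets would differ only in the volume parameter:
print(scenario_risk(people_nearby=5, hours=1.5, volume="normal"))  # restaurant-like
print(scenario_risk(people_nearby=5, hours=1.5, volume="loud"))    # bar-like, 5x higher here
```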
entry-level leadership
It has become really salient to me recently that good practice involves lots of prolific output in low-stakes throwaway contexts. Whereas a core piece of EA and rationalist mindsets is steering towards high-stakes things to work on, and treating your outputs as potentially very impactful and not to be thrown away. In my own mind “practice mindset” and “impact mindset” feel very directly in tension.
I have a feeling that something around this mindset difference is part of why world-saving orientation in a community might be correlated with inadequate opportunities for low-stakes leadership practice.
Here's another further-afield steelman, inspired by blameless postmortem culture.
When debriefing / investigating a bad outcome, it's better for participants to expect not to be labeled as "bad people" (implicitly or explicitly) as a result of coming forward with information about choices they made that contributed to the failure.
More social pressure against publicly admitting that one contributed to a bad outcome leads to systematic hiding/obfuscation of information about why people are making those choices (e.g. incentives). And we nee...
Another distinction I think is important, for the specific example of "scientific fraud vs. cow suffering" as a hypothetical:
Science is a terrible career for almost any goal other than actually contributing to the scientific endeavor.
I have a guess that "science, specifically" as a career-with-harmful-impacts in the hypothetical was not specifically important to Ray, but that it was very important to Ben. And that if the example career in Ray's "which harm is highest priority?" thought experiment had been "high-frequ...
You're right that I'd respond to different cases differently. Doing high frequency trading in a way that causes some harm - if you think you can do something very good with the money - seems basically sympathetic to me, in a sufficiently unjust society such as ours.
Any info good (including finance and trading) is on some level pretending to involve stewardship over our communal epistemics, but the simulacrum level of something like finance is pretty high in many respects.
One distinction I see getting elided here:
I think one's limited resources (time, money, etc) are a relevant question in one's behavior, but a "goodness budget" is not relevant at all.
For example: In a world where you could pay $50 to the electric company to convert all your electricity to renewables, or pay $50 more to switch from factory to pasture-raised beef, then if someone asks "hey, your household electrical bill is destroying the environment, why didn't you choose the green option", a relevant reply is "becaus...
The recent EA Meta Fund announcement linked to this post (https://www.centreforeffectivealtruism.org/blog/the-fidelity-model-of-spreading-ideas), which highlights another, parallel approach: in addition to picking expressions of ideas that fail gracefully, prefer transmission methods that preserve nuance.
If you have ovaries/uterus, a non-zero interest in having kids with your own gametes, and you're at least 25 or so: Get a fertility consultation.
They do an ultrasound and a blood test to estimate your ovarian reserve. Until you either try to conceive or get measurements like these, you don't know whether you have normal fertility for your age or whether your fertility is already declining.
This is important information to know, in order to make later informed decisions (such as when and whether to freeze your eggs, when to start looking for a...
Two observations:
Nod. Definitely open to better versions of the question that carve at more useful joints. (With a caveat that the question is more oriented towards "what are the easiest street lamps to look under" than "what is the best approximation")
So, I guess my return question is: do you have suggestions for subfields of "AI capabilities research" to focus on or exclude, such that the result more reliably points at "AGI", and for which you think public data is likely to exist? (Or some other way to carve up the AI research space.)
It does seem...
Important updates to your model:
Our collective total years of human experience come to ~119 times the age of the universe. (The universe is 13.8 billion years old, versus ~1.65 trillion total human experience-years so far.)
Also: with 7.44 billion people alive right now, we collectively experience the age of the universe every ~2 years (https://twitter.com/karpathy/status/850772106870640640?lang=en)
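The arithmetic behind both figures, for anyone who wants to check it:

```python
# Quick check of the two claims above, using the figures quoted.
UNIVERSE_AGE_YEARS = 13.8e9
TOTAL_EXPERIENCE_YEARS = 1.65e12
PEOPLE_ALIVE = 7.44e9

print(TOTAL_EXPERIENCE_YEARS / UNIVERSE_AGE_YEARS)  # ~119.6 universe-ages of experience so far
print(UNIVERSE_AGE_YEARS / PEOPLE_ALIVE)            # ~1.85 years per collective universe-age
```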
I hadn't read that link on the side-taking hypothesis of morality before, but I note that if you find that argument interesting, you would like Gillian Hadfield's book "Rules for a Flat World". She talks about law (not "what courts and congress do" but broadly "the enterprise of subjecting human conduct to rules") and emphasizes that law is similar to norms/morality, except in addition there is a canonical place that "the rules" get posted and also a canonical way to obtain a final arbitration about questio...
FWIW, this claim doesn't match my intuition, and googling around, I wasn't able to quickly find any papers or blog posts supporting it.
"Explaining and Harnessing Adversarial Examples" (Goodfellow et al. 2014) is the original demonstration that "Linear behavior in high-dimensional spaces is sufficient to cause adversarial examples".
I'll emphasize that high-dimensionality is a crucial piece of the puzzle, which I haven't seen you bring up yet. You may already be aware of this, but I'll emphasize it anyway: the usu...
As the dimension increases, a decision-boundary hyperplane that has 1% test error rapidly gets extremely close to the equator of the sphere
What does the center of the sphere represent in this case?
(I'm imagining the training and test sets consisting of points in a high-dimensional space, and the classifier as drawing a hyperplane to mostly separate them from each other. But I'm not sure what point in this space would correspond to the "center", or what sphere we'd be talking about.)
The central argument can be understood from the intuitions presented in "Counterintuitive Properties of High Dimensional Space", in the section titled "Concentration of Measure".
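For anyone who wants to see that concentration effect directly, here's a small Monte Carlo sketch (my own illustration of the intuition from that link): on the unit sphere, the fraction of uniformly sampled points lying within a thin band around any equator rapidly approaches 1 as the dimension grows.

```python
import numpy as np

# Monte Carlo illustration of concentration of measure: almost all of a
# high-dimensional unit sphere's surface lies within a thin band |x_1| < 0.05
# around the equator.
rng = np.random.default_rng(0)
band = 0.05
for d in (3, 30, 300, 3000):
    pts = rng.normal(size=(5_000, d))
    pts /= np.linalg.norm(pts, axis=1, keepdims=True)   # uniform points on the unit sphere
    print(d, np.mean(np.abs(pts[:, 0]) < band))          # fraction near the equator -> 1
```

That's the sense in which a low-error decision hyperplane ends up close to the equator: that's where essentially all of the sphere's mass sits.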
Thanks for this link, that is a handy reference!
When evaluating whether there is a broad base of support, I think it's important to distinguish "one large-scale funder" from "narrow overall base of support". Before the Arnold Foundation's funding, the Reproducibility Project had a broad base of committed participants contributing their personal resources and volunteering their time.
To add some details from personal experience: In late 2011 and early 2012, the Reproducibility Project was a great big underfunded labor of love. Brian Nosek had outlined a plan to replicate ~50 studies - ...
COI: I work at Anthropic
I confirmed internally (which felt personally important for me to do) that our partnership with Palantir is still subject to the same terms outlined in the June post "Expanding Access to Claude for Government":
...