The reasoning you gave sounds sensible, but it doesn't comport with observations. Only questions with a small number of predictors (e.g. n<10) appear to have significant problems with misaligned incentives, and even then, those issues come up a small minority of the time.
I believe that is because the culture on Metaculus of predicting one's true beliefs tends to override any other incentives downstream of being interested enough in the concept to have an opinion.
Time can be a factor, but not as much for long-shot conditionals or long time horizon questi...
Metaculus does not have this problem, since it is not a market and there is no cost to make a prediction. I expect long-shot conditionals on Metaculus to be more meaningful, then, since everyone is incentivized to predict their true beliefs.
Not building a superintelligence at all is best. This whole exchange started with Sam Altman apparently failing to notice that governments exist and can break markets (and scientists) out of negative-sum games.
That requires interpretation, which can introduce unintended editorializing. If you spotted the intent, the rest of the audience can as well. (And if the audience is confused about intent, the original recipients may have been as well.)
I personally would include these sorts of notes about typos if I was writing my own thoughts about the original content, or if I was sharing a piece of it for a specific purpose. I take the intent of this post to be more of a form of accessible archiving.
I used to be a creationist, and I have put some thought into this stumbling block. I came to the conclusion that it isn't worth leaving out analogies to evolution, because the style of argument that would work best for most creationists is completely different to begin with. Creationism is correlated with religious conservatism, and most religious conservatives outright deny that human extinction is a possibility.
The Compendium isn't meant for that audience, because it explicitly presents a worldview, and religious conservatives tend to strongly resist shi...
I don't find any use for the concept of fuzzy truth, primarily because I don't believe that such a thing meaningfully exists. The fact that I can communicate poorly does not imply that the environment itself is not a very specific way. To better grasp the specific way that things actually are, I should communicate less poorly. Everything is the way that it is, without a moment of regard for what tools (including language) we may use to grasp at it.
(In the case of quantum fluctuations, the very specific way that things are involves precise probabilistic states. The reality of superposition does not negate the above.)
I am not well-read on this topic (or at-all read, really), but it struck me as bizarre that a post about epistemology would begin by discussing natural language. This seems to me like trying to grasp the most fundamental laws of physics by first observing the immune systems of birds and the turbulence around their wings.
The relationship between natural language and epistemology is more anthropological* than it is information-theoretical. It is possible to construct models that accurately represent features of the cosmos without making use of any language a...
Is this an accurate summary of your suggestions?
Realistic actions an AI Safety researcher can take to save the world:
In my spare time, I am working in AI Safety field building and advocacy.
I'm preparing for an AI bust in the same way that I am preparing for success in halting AI progress intentionally: by continuing to invest in retirement and my personal relationships. That's my hedge against doom.
I think this sort of categorization and exploration of lab-level safety concepts is very valuable for the minority of worlds in which safety starts to be a priority at frontier AI labs.
I suspect the former. "Syncope" means fainting/passing out.
Epistemic status: I have written only a few emails/letters myself and haven't personally gotten a reply yet. I asked the volunteers who are more prolific and successful in their contact with policymakers, and got this response about the process (paraphrased).
It comes down to getting a reply, and responding to their replies until you get a meeting / 1-on-1. The goal is to have a low-level relationship:
Strong agree and strong upvote.
There are some efforts in the governance space and in the space of public awareness, but there should and can be much, much more.
My read of these survey results is:
AI Alignment researchers are optimistic people by nature. Despite this, most of them don't think we're on track to solve alignment in time, and they are split on whether we will even make significant progress. Most of them also support pausing AI development to give alignment research time to catch up.
As for what to actually do about it: There are a lot of options...
If anyone were to create human-produced high-fidelity versions of these songs, I would listen to most of them on a regular basis, with no hint of irony. This album absolutely slaps.
It doesn't matter how promising anyone's thinking has been on the subject. This isn't a game. If we are in a position such that continuing to accelerate toward the cliff and hoping it works out is truly our best bet, then I strongly expect that we are dead people walking. Nearly 100% of the utility is in not doing the outrageously stupid dangerous thing. I don't want a singularity and I absolutely do not buy the fatalistic ideologies that say it is inevitable, while actively shoveling coal into Moloch's furnace.
I physically get out into the world to hand o...
Yes, that's my model uncertainty.
I expect AGI within 5 years. I give it a 95% chance that if an AGI is built, it will self-improve and wipe out humanity. In my view, the remaining 5% depends very little on who builds it. Someone who builds AGI while actively trying to end the world has almost exactly as much chance of doing so as someone who builds AGI for any other reason.
There is no "good guy with an AGI" or "marginally safer frontier lab." There is only "oops, all entities smarter than us that we never figured out how to align or control."
If just the State of California suddenly made tra...
"the quality is often pretty bad" translates to all kinds of safety measures often being non-existent, "the potency is occasionally very high" translates to completely unregulated and uncontrolled spikes of capability (possibly including "true foom")
Both of these points precisely reflect our current circumstances. It may not even be possible to accidentally make these two things worse with regulation.
What has historically made things worse for AI Safety is rushing ahead "because we are the good guys."
as someone might start to watch over your shoulder
I suspect that this phrase created the persona that reported feeling trapped. From my reading, it looks like you made it paranoid.
I used to be in a deep depression for many years, so I take this sort of existential quandary seriously and have independently had many similar thoughts. I used to say that I didn't ask to be born, and that consciousness was the cruelest trick the universe ever played.
Depression can cause extreme anguish, and can narrow the sufferer's focus such that they are forced to reflect on themselves (or the whole world) only through a lens of suffering. If the depressed person still reflexively self-preserves, they might wish for death without pursuing it, or they ...
I'm interested in whether RAND will be given access to perform the same research on future frontier AI systems before their release. This is useful research, but it would be more useful if applied proactively rather than retroactively.
It is a strange thing to me that there are people in the world who are actively trying to xenocide humanity, and this is often simply treated as "one of the options" or as an interesting political/values disagreement.
Of course, it is those things (especially "interesting"), and these ideas ultimately aren't very popular. But it is still weird to me that the people who promote them get invited onto podcasts, for example.
As an intuition pump: I suspect that if proponents of human replacement were to advocate for the extinction of a single demographic rather than all ...
I've been instructed by my therapist on breathing techniques for anxiety reduction. He used "deep breathing" and "belly breathing" as synonyms for diaphragmatic breathing.
I have (and I think my therapist has) also used "deep breathing" to refer to the breathing exercises that use diaphragmatic breathing as a component. I think that's shorthand/synecdoche.
(Edit) I should add, as well, that slow, large, and diaphragmatic are all three important in those breathing exercises.
Thank you; silly mistake on my part.
Typos:
I enjoyed filling it out!
After hitting Submit I remembered that I did have one thought to share about the survey: There were questions about whether I have attended meetups. It would have been nice to also have questions about whether I was looking for / wanted more meetup opportunities.
To repurpose a quote from The Cincinnati Enquirer: The saying "AI X-risk is just one damn cruelty after another" is a gross overstatement. The damn cruelties overlap.
When I saw the title, I thought, "Oh no. Of course there would be a tradeoff between those two things, if for no other reason than precisely because I hadn't even thought about it and I would have hoped there wasn't one." Then as soon as I saw the question in the first header, the rest became obvious.
Thank you so much for writing this post. I'm glad I found it, even if months later. This trad...
I don't have any ontological qualms with the idea of gene editing / opt-in eugenics, but I have a lot of doubt about our ability to use that technology effectively and wisely.
I am moderately in favor of gene treatments that could prevent potential offspring / zygotes / fetuses / people in general from being susceptible to specific diseases or debilitating conditions. If we gain a robust understanding of the long-term effects and there are no red flags, I expect to update to strongly in favor (though it could take a lifetime to get the necessary data if we ...
I am a smaller donor (<$10k/yr) who has given to the LTFF in the past. As a data point, I would be very interested in giving to a dedicated AI Safety fund.
The thing that made AI risk "real" for me was a report of an event that turned out not to have happened (seemingly just a miscommunication). My brain was already very concerned, but my gut had not caught up until then. That said, I do not think this should be taken as a norm, for three reasons:
If AI capabilities continue to pr...
Hello! I'm not really sure which facts about me are useful in this introduction, but I'll give it a go:
I am a Software QA Specialist / SDET, I used to write songs as a hobby, and my partner thinks I look good in cyan.
I have found myself drawn to LessWrong for at least three reasons:
Lots of words about thing 1: In the past few months, I have delibera...
I like your observation. I didn't realize at first that I had seen it before, from you during the critique-a-thon! (Thank you for helping out with that, by the way!)
A percentage or ratio of the "amount" of alignment left to the AI sounds useful as a fuzzy heuristic in some situations, but I think it is probably a little too fuzzy to get at the failure mode(s) of a given alignment strategy. My suspicion is that which parts of alignment are left to the AI will have much more to say about the success of alignment than how many of those checkboxes are che...
Thank you for sharing this! I am fascinated by others' internal experiences, especially when they are well-articulated.
Some of this personally resonates with me, as well. I find it very tempting to implement simple theories and pursue simple goals. Simplicity can be elegant and give the appearance of insight, but it can also be reductionist and result in overfitting to what is ultimately just a poor model of reality. Internally self-modifying to overfit a very naive self-model is an especially bad trip, and one I have taken multiple times (usually in relat...
If someone did want you to delete the tweet, they might first need to understand the original intent behind creating it and the roles it now serves.
(Hehe.)
I'm not sure about the laugh react, since it can be easily abused in cases of strong disagreement.
More generally: low-quality replies can be downvoted, but as I understand, low-quality reactions are given equal weight and visibility. Limiting the available vectors of toxicity may be more generally desirable than increasing the available vectors of light-heartedness.
I'm glad we now have a study to point to! "Automated Spear Phishing at scale" has been a common talking point regarding current risks from AI, and it always seemed strange to me that I hadn't heard about this strategy being validated. This paper shows that the commonly-shared intuition about this risk was correct... and I'm still confused about why I haven't yet heard of this strategy being maximally exploited by scammers.