pausing means moving AI underground, and from what I can tell that would make it much harder to do safety research
I would be overjoyed if all AI research were driven underground! The main source of danger is the fact that there are thousands of AI researchers, most of whom are free to communicate and collaborate with each other. Lone researchers or small underground cells of researchers who cannot publish their results would be vastly less dangerous than the current AI research community even if there are many lone researchers and many small underground ...
Impressive performance by the chatbot.
Maybe "motto" is the wrong word. I meant words / concepts to use in a comment or in a conversation.
"Those companies that created ChatGPT, etc? If allowed to continue operating without strict regulation, they will cause an intelligence explosion."
All 3 of the other replies to your question overlook the crispest consideration: namely, it is not possible to ensure the proper functioning of even something as simple as a circuit for division (such as we might find inside a CPU) through testing alone: there are too many possible inputs (too many pairs of possible 64-bit divisors and dividends) to test in one lifetime even if you make a million perfect copies of the circuit and test them in parallel.
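To make the scale concrete, here is a quick back-of-the-envelope calculation (the throughput figures below are generous assumptions of mine, not measurements of any real test rig):

```python
# Rough sketch: how long would exhaustive testing of a 64-bit divider take,
# assuming a million perfect copies each running a billion tests per second?
pairs = 2**64 * 2**64                      # possible (dividend, divisor) pairs: about 3.4e38
copies = 1_000_000                         # a million copies tested in parallel
tests_per_second_per_copy = 1_000_000_000  # an optimistic billion tests per second each

seconds = pairs / (copies * tests_per_second_per_copy)
years = seconds / (60 * 60 * 24 * 365)
print(f"{years:.2e} years")                # about 1.1e16 years: vastly longer than a lifetime
```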
Let us consider very briefly what else besides testing an engineer might do to ensure (or "verify" as the ...
Let me reassure you that there’s more than enough protein available in plant-based foods. For example, here’s how many grams of protein there are in 100 grams of meat
That is misleading because most foods are mostly water, including the (cooked) meats you list, but the first 4 of the plant foods you list have had their water artificially removed: soy protein isolate; egg white, dried; spirulina algae, dried; baker’s yeast.
Moreover, the human gut digests and absorbs a larger fraction of animal protein than of plant protein. Part of the reason for this is the plant prote...
The very short answer is that the people with the most experience in alignment research (Eliezer and Nate Soares) say that without an AI pause lasting many decades the alignment project is essentially hopeless because there is not enough time. Sure, it is possible the alignment project succeeds in time, but the probability is really low.
Eliezer has said that AIs based on the deep-learning paradigm are probably particularly hard to align, so it would probably help to get a ban or a long pause on that paradigm even if research in other paradigms continues, b...
I'm going to use "goal system" instead of "goals" because a list of goals is underspecified without some method for choosing which goal prevails when two goals "disagree" on the value of some outcome.
wouldn’t we then want AI to improve its own goals to achieve new ones that have increased effectiveness and improve the value of the world?
That is contradictory: the AI's goal system is the single source of truth for how effective any change in the world is and for how much of an improvement it is.
I would need a definition of AGI before I could sensibly answer those questions.
ChatGPT is already an artificial general intelligence by the definition I have been using for the last 25 years.
I think the leaders of the labs have enough private doubts about the safety of their enterprise that if an effective alignment method were available to them, they would probably adopt the method (especially if the group that devised the method does not seem particularly to care who gets credit for having devised it). I.e., my guess is that almost all of the difficulty is in devising an effective alignment method, not in getting the leading lab to adopt it. (Making 100% sure that the leading lab adopts it is almost impossible, but acting in such a way that the l...
Your question would have been better without the dig at theists and non-vegans.
However, whereas the concept of an unaligned general intelligence has the advantage of being a powerful, general abstraction, the HMS concept has the advantage of being much easier to explain to non-experts.
The trouble with the choice of phrase "hyperintelligent machine sociopath" is that it gives the other side of the argument an easy rebuttal, namely, "But that's not what we are trying to do: we're not trying to create a sociopath". In contrast, if the accusation is that (many of) the AI labs are trying to create a machine smarter than people, then t...
An influential LW participant, Jim Miller, who I think is a professor of economics, has written here that divestment does little good because any reduction in the stock price caused by pulling the investments can be counteracted by profit-motivated actors. For publicly-traded stocks, there is a robust supply of profit-motivated actors scanning for opportunities. I am eager for more discussion on this topic.
I am alarmed to see that I made a big mistake in my previous comment: where I wrote that "contributing more money to AI-safety charities has almost n...
money generated by increases in AI stock could be used to invest in efforts into AI safety, which receives comparably less money
In the present situation, contributing more money to AI-safety charities has almost no positive effects and does almost nothing to make AI "progress" less dangerous. (In fact, in my estimation, the overall effect of all funding for alignment research so far has been to make the situation a little worse, by publishing insights that will tend to be usable by capability researchers without making non-negligible progress towards an eventua...
So let's do it first, before the evil guys do it, but let's do it well from the start!
The trouble is no one knows how to do it well. No one knows how to keep an AI aligned as the AI's capabilities start exceeding human capabilities, and if you believe experts like Eliezer and Connor Leahy, it is very unlikely that anyone is going to figure it out before the lack of this knowledge causes human extinction or something equally dire.
It is only a slight exaggeration to say that the only thing keeping the current crop of AI systems from killing us all (or kil...
The ruling coalition can disincentivize the development of a semiconductor supply chain outside the territories it controls by selling worldwide only semiconductors that use "verified boot" technology to make it really hard to use them to run AI workloads, similar to how it is really hard even for the best jailbreakers to jailbreak a modern iPhone.
Out of curiosity, would you agree with this being the most plausible path, even if you disagree with the rest of my argument?
The most plausible story I can imagine quickly right now is the US and China fight a war and the US wins and uses some of the political capital from that win to slow down the AI project, perhaps through control over the world's leading-edge semiconductor fabs plus pressuring Beijing to ban teaching and publishing about deep learning (to go with a ban on the same things in the West). I believe that basically all the leading-edge fa...
People worked on capabilities for decades, and never got anywhere until recently, when the hardware caught up, and it was discovered that scaling works unexpectedly well.
If I believed that, then maybe I'd believe (like you seem to do) that there is no strong reason to believe that the alignment project cannot be finished successfully before the capabilities project creates an unaligned super-human AI. I'm not saying scaling and hardware improvement have not been important: I'm saying they were not sufficient: algorithmic improvements were quite necessary fo...
But what's even more unlikely, is the chance that $200 billion on capabilities research plus $0.1 billion on alignment research is survivable, while $210 billion on capabilities research plus $1 billion on alignment research is deadly.
This assumes that alignment success is the most likely avenue to safety for humankind whereas like I said, I consider other avenues more likely. Actually there needs to be a qualifier on that: I consider other avenues more likely than the alignment project's succeeding while the current generation of AI researchers remai...
we're probably doomed in that case anyways, even without increasing alignment research.
I believe we're probably doomed anyways.
I think even you would agree that P(1) > P(2)
Sorry to disappoint you, but I do not agree.
Although I don't consider it quite impossible that we will figure out alignment, most of my hope for our survival is in other things, such as a group taking over the world and then using their power to ban AI research. (Note that that is in direct contradiction to your final sentence.) So for example, if Putin or Xi were dictator of t...
I don't agree that the probability of alignment research succeeding is that low. 17 years or 22 years of trying and failing is strong evidence against it being easy, but doesn't prove that it is so hard that increasing alignment research is useless.
People worked on capabilities for decades, and never got anywhere until recently, when the hardware caught up, and it was discovered that scaling works unexpectedly well.
There is a chance that alignment research now might be more useful than alignment research earlier, though there is uncertainty in everything.
W...
AI safety spending is only $0.1 billion while AI capabilities spending is $200 billion. A company which adds a comparable amount of effort on both AI alignment and AI capabilities should speed up the former more than the latter
There is very little hope IMHO in increasing spending on technical AI alignment because (as far as we can tell based on how slow progress has been on it over the last 22 years) it is a much thornier problem than AI capability research and because most people doing AI alignment research don't have a viable story about how they are ...
Good reply. The big difference is that in the Cold War, there was no entity with the ability to stop the 2 parties engaged in the nuclear arms race, whereas in the current situation, instead of hoping for the leading labs to come to an agreement, we can lobby the governments of the US and the UK to shut the leading labs down or at least nationalize them. Yes, that still leaves an AI race between the developed nations, but the US and the UK are democracies, and most voters don't like AI whereas the main concern of the leaders of China and Russia is to avoid r...
Eliezer thinks (as do I) that technical progress in alignment is hopeless without first improving the pool of prospective human alignment researchers (e.g., via human cognitive augmentation).
Whose track record of AI predictions would you like to see evaluated?
Whoever has the best track record :)
Your specific action places most of its hope for human survival on the entities that have done the most to increase extinction risk.
But I’m more thinking about what work remains.
It depends on how they did it. If they did it by formalizing the notion of "the values and preferences (coherently extrapolated) of (the living members of) the species that created the AI", then even just blindly copying their design without any attempt to understand it has a very high probability of getting a very good outcome here on Earth.
The AI of course has to inquire into and correctly learn about our values and preferences before it can start intervening on our behalf, so one way such a blind copying ...
I’ve also experienced someone discouraging me from acquiring technical AI skills for the purpose of pursuing a career in technical alignment because they don’t want me to contribute to capabilities down the line. They noted that most people who skill up to work on alignment end up working in capabilities instead.
I agree with this. I.e., although it is possible for individual careers in technical alignment to help the situation, most such careers have the negative effect of speeding up the AI juggernaut without any offsetting positive effects. I.e., the fewer people trained in technical alignment, the better.
Patenting an invention necessarily discloses the invention to the public. (In fact, incentivizing disclosure was the rationale for the creation of the patent system.) That is a little worrying because the main way the non-AI tech titans (large corporations) have protected themselves from patent suits has been to obtain their own patent portfolios and then enter into cross-licensing agreements with the holders of the other patent portfolios. Ergo, the patent-trolling project you propose could incentivize the major labs to disclose inventions that they are c...
Your argument about corporate secrets is sufficient to change my mind on activist patent trolling being a productive strategy against AI X-risk.
The part about funding would need to be solved with philanthropy. I don't believe that org exists, but I don't see why it couldn't.
I'm still curious whether there are other cases in which activist patent trolling can be a good option, such as animal welfare, chemistry, public health, or geoengineering (i.e., fracking).
With a specific deadline and a specific threat of a nuclear attack on the US.
The transfer should be made in January 2029
I think you mean in January 2029, or earlier if the question resolves before the end of 2028; otherwise there would be no need to introduce the CPI into the bet to keep things fair (or predictable).
You do realize that by "alignment", the OP (John) is not talking about techniques that prevent an AI that is less generally capable than a capable person from insulting the user or expressing racist sentiments?
We seek a methodology for constructing an AI that either ensures that the AI turns out not to be able to easily outsmart us or (if it does turn out to be able to easily outsmart us) ensures, or at least makes it likely, that it won't kill us all or do some other terrible thing. (The former is not researched much compared to the latter, but I felt the n...
Although the argument you outline might be an argument against ever fully trusting tests (usually called "evals" on this site) that this or that AI is aligned, alignment researchers have other tools in their toolbox besides running tests or evals.
It would take a long time to explain these tools, particularly to someone unfamiliar with software development or a related field like digital-electronics design. People make careers in studying tools to make reliable software systems (and reliable digital designs).
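To give just a flavor (this is my own toy illustration, not a description of any particular lab's or chipmaker's practice): instead of enumerating every input, an engineer states the defining property of the component and then checks or, better, formally proves it. The sketch below merely spot-checks the defining property of integer division on random 64-bit inputs; a formal-verification tool would establish it for all inputs.

```python
# Toy illustration of stating and checking a property rather than enumerating inputs.
import random

def divider(dividend: int, divisor: int) -> tuple[int, int]:
    """Stand-in for the circuit under verification."""
    return dividend // divisor, dividend % divisor

def spot_check_property(trials: int = 100_000) -> None:
    for _ in range(trials):
        a = random.getrandbits(64)
        b = random.getrandbits(64) or 1           # avoid a zero divisor
        q, r = divider(a, b)
        # The defining property of integer division; a formal tool would prove
        # this for every possible input rather than sampling.
        assert a == q * b + r and 0 <= r < b

spot_check_property()
print("property held on all sampled inputs")
```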
The space shuttle is steered by changing the dire...
The strongest sign I know of that an attack is coming is firm evidence that Russia or China is evacuating her cities.
Another sign that would get me to flee immediately (to a rural area of the US: I would not try to leave the country) is a threat by Moscow that Moscow will launch an attack unless Washington takes action A (or stops engaging in activity B) before specific time T.
Western Montana is separated from the missile fields by mountain ranges and by the prevailing wind direction, and it is in fact considered by Joel Skousen to be the best place in the continental US to ride out a nuclear attack. Being too far away from population centers to be walkable by refugees is the main consideration for Skousen.
Skousen also likes the Cumberland Plateau because refugees are unlikely to opt to walk up the escarpment that separates the Plateau from the population centers to its south.
The overhead is mainly the "fixed cost" of engineering something that works well, which suggests re-using some of the engineering costs already incurred in making it possible for a person to make a hands-free phone call on a smartphone.
Off-topic: most things (e.g., dust particles) that land in an eye end up in the nasal cavity, so I would naively expect that protecting the eyes would be necessary to protect oneself fully from respiratory viruses:
https://www.ranelle.com/wp-content/uploads/2016/08/Tear-Dust-Obstruction-1024x485.jpg
Does anyone care to try to estimate how much the odds ratio of getting covid (O(covid)) decreases when we intervene by switching from a "half-mask" respirator such as the ones pictured here to a "full-face" respirator (which protects the eyes)?
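To show the shape of the calculation I have in mind (a toy linear-dose model; every number below is a made-up placeholder, not an empirical estimate):

```python
# Toy model: at low doses, treat the odds of infection as roughly proportional
# to the total exposure dose, split into an inhalation component (which the
# half-mask already filters) and an ocular component (which only the full-face
# respirator blocks). All parameter values are placeholders.

def odds_ratio_full_vs_half(inhalation_dose: float = 1.0,
                            ocular_dose: float = 0.3,
                            ocular_reduction: float = 0.9) -> float:
    """Approximate O(covid | full-face) / O(covid | half-mask) under the toy model."""
    dose_half = inhalation_dose + ocular_dose
    dose_full = inhalation_dose + ocular_dose * (1 - ocular_reduction)
    return dose_full / dose_half   # only meaningful in the low-dose, low-probability regime

print(odds_ratio_full_vs_half())   # about 0.79 with the placeholder numbers above
```

The interesting empirical questions are of course the size of the ocular contribution and of the reduction, which I don't know.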
The way it is now, when one lab has an insight, the insight will probably spread quickly to all the other labs. If we could somehow "drive capability development into secrecy," that would drastically slow down capability development.
Malice is a real emotion, and it is a bad sign (but not a particularly strong sign) if a person has never felt it.
Yes, letting malice have a large influence on your behavior is a severe character flaw, that is true, but that does not mean that never having felt malice or being incapable of acting out of malice is healthy.
Actually, it is probably rare for a person never to act out of malice: it is probably much more common for a person to just be unaware of his or her malicious motivations.
The healthy psychological organization is to be tempted to act maliciously now and...
their >90% doom disagrees with almost everyone else who thinks seriously about AGI risk.
The fact that your next sentence refers to Rohin Shah and Paul Christiano, but no one else, makes me worry that for you, only alignment researchers are serious thinkers about AGI risk. Please consider that anyone whose P(doom) is over 90% is extremely unlikely to become an alignment researcher (or to remain one if their P(doom) became high when they were an alignment researcher) because their model will tend to predict that alignment research is futile or that it act...
I disagree. I think the fact that our reality branches a la Everett has no bearing on our probability of biogenesis.
Consider a second biogenesis that happened recently enough and far away enough that light (i.e., information, causal influence) has not had enough time to travel from it to us. We know such regions of spacetime "recent enough and far away enough" exist and in principle could host life, but since we cannot observe a sign of life or a sign of lack of life from them, they are not relevant to our probability of biogenesis whereas by your logic, they are relevant.
When you write,
a draft research paper or proposal that frames your ideas into a structured and usable format,
who is "you"?
I know you just said that you don't completely trust Huberman, but just today, Huberman published a 30-minute video titled "Master your sleep and be more alert when awake". I listened to it (twice) to refresh my memory and to see if his advice changed.
He mentions yellow-blue (YB) contrasts once (at https://www.youtube.com/watch?v=lIo9FcrljDk&t=502s) and at least thrice he mentions the desirability of exposure to outdoor light when the sun is at a low angle (close to the horizon). As anyone can see by looking around at dawn and again at mid-day, at dawn...
This post paints a partially inaccurate picture. IMHO the following is more accurate.
Unless otherwise indicated, the following information comes from Andrew Huberman. Most comes from Huberman Lab Podcast #68. Huberman opines on a great many health topics. I want to stress that I don't consider Huberman a reliable authority in general, but I do consider him reliable on the circadian rhythm and on motivation and drive. (His research specialization for many years was the former, and for many years he has successfully used various interventions to improve his o...
“Game theoretic strengthen-the-tribe perspective” is a completely unpersuasive argument to me. The psychological unity of humankind OTOH is persuasive when combined with the observation that this unitary psychology changes slowly enough that the human mind’s robust capability to predict the behavior of conspecifics (and manage the risks posed by them) can keep up.
Although I agree with another comment that Wolfram has not "done the reading" on AI extinction risk, my being able to watch his face while he confronts some of the considerations and arguments for the first time made it easier, not harder, for me to predict where his stance on the AI project will end up 18 months from now. It is hard for me to learn anything about anyone by watching them express a series of cached thoughts.
Near the end of the interview, Wolfram says that he cannot do much processing of what was discussed "in real time", which strongly sugge...
Some people are more concerned about S-risk than extinction risk, and I certainly don't want to dismiss them or imply that their concerns are mistaken or invalid, but I just find it a lot less likely that the AI project will lead to massive human suffering than its leading to human extinction.
the public seems pretty bought-in on AI risk being a real issue and is interested in regulation.
There's a huge gulf between people's expressing concern about AI to pollsters and the kind of regulations and shutdowns that would actually avert extinction. The people...
The CNS contains dozens of "feedback loops". Any intervention that drastically alters the equilibrium point of several of those loops is generally a bad idea unless you are doing it to get out of some dire situation, e.g., seizures. That's my recollection of Huberman's main objection put into my words (because I don't recall his words).
Supplementing melatonin is fairly unlikely to have (much of) a permanent effect on the CNS, but you can waste a lot of time by temporarily messing up CNS function for the duration of the melatonin supplementation (because a p...
Are Eliezer and Nate right that continuing the AI program will almost certainly lead to extinction or something approximately as disastrous as extinction?
Good points, which in part explains why I think it is very very unlikely that AI research can be driven underground (in the US or worldwide). I was speaking to the desirability of driving it underground, not its feasibility.