Richard Hollerith. 15 miles north of San Francisco. hruvulum@gmail.com
My probability that AI research will end all human life is .92. It went up drastically when Eliezer started going public with his pessimistic assessment in April 2022. Until then, my confidence in MIRI (and my knowledge that MIRI has enough funding to employ many researchers) was keeping my probability down to about .4. (I am glad I found out about Eliezer's assessment.)
Currently I am willing to meet with almost anyone on the subject of AI extinction risk.
Last updated 26 Sep 2023.
The strongest sign I know of that an attack is coming is firm evidence that Russia or China is evacuating its cities.
Another sign that would get me to flee immediately (to a rural area of the US: I would not try to leave the country) is a threat by Moscow that it will launch an attack unless Washington takes some action A (or stops engaging in some activity B) before a specific time T.
Western Montana is separated from the missile fields by mountain ranges and by the prevailing wind direction, and is in fact considered by Joel Skousen to be the best place in the continental US to ride out a nuclear attack. Skousen's main consideration is being too far away from population centers for refugees to reach on foot.
Skousen also likes the Cumberland Plateau because refugees are unlikely to opt to walk up the escarpment that separates the Plateau from the population centers to its south.
The overhead is mainly the "fixed cost" of engineering something that works well, which suggests re-using some of the engineering work already done to make it possible for a person to make a hands-free phone call on a smartphone.
Off-topic: most things (e.g., dust particles) that land in an eye end up in the nasal cavity, so I would naively expect that protecting the eyes would be necessary to protect oneself fully from respiratory viruses:
https://www.ranelle.com/wp-content/uploads/2016/08/Tear-Dust-Obstruction-1024x485.jpg
Does anyone care to try to estimate how much the odds of getting covid (O(covid)) decrease when we intervene by switching from a "half-mask" respirator such as the ones pictured here to a "full-face" respirator (which protects the eyes)?
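(In case it helps, here is the quantity I have in mind, written out using the standard definition of odds; the notation is just my own shorthand and nothing here is specific to respirators:)

$$O(\text{covid}) = \frac{P(\text{covid})}{1 - P(\text{covid})}, \qquad \text{quantity sought} = \frac{O(\text{covid} \mid \text{full-face})}{O(\text{covid} \mid \text{half-mask})}$$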
The way it is now, when one lab has an insight, the insight will probably spread quickly to all the other labs. If we could somehow "drive capability development into secrecy," that would drastically slow down capability development.
Malice is a real emotion, and it is a bad sign (but not a particularly strong sign) if a person has never felt it.
Yes, letting malice have a large influence on your behavior is a severe character flaw, but that does not mean that never having felt malice or being incapable of acting out of malice is healthy.
Actually, it is probably rare for a person never to act out of malice: it is probably much more common for a person to just be unaware of his or her malicious motivations.
The healthy arrangement is to be tempted to act maliciously now and then, but to be good at noticing when that is happening and to ignore or resist the temptation most of the time (out of a desire to be a good person).
I expect people to disagree with this answer.
"their >90% doom disagrees with almost everyone else who thinks seriously about AGI risk."
The fact that your next sentence refers to Rohin Shah and Paul Christiano, but no one else, makes me worry that for you, only alignment researchers count as serious thinkers about AGI risk. Please consider that anyone whose P(doom) is over 90% is extremely unlikely to become an alignment researcher (or to remain one if their P(doom) became high while they were an alignment researcher) because their model will tend to predict that alignment research is futile or that it actually increases P(doom).
There is a comment here (which I probably cannot find again) by someone who was in AI research in the 1990s, realized that the AI project is actually quite dangerous, and changed careers to something else. I worry that you are not counting people like him as people who have thought seriously about AGI risk.
I disagree. I think the fact that our reality branches a la Everett has no bearing on our probability of biogenesis.
Consider a second biogenesis that happened recently enough and far away enough that light (i.e., information, causal influence) has not had enough time to travel from it to us. We know that such "recent enough and far away enough" regions of spacetime exist and in principle could host life, but since we cannot observe any sign of life or of the absence of life from them, they are not relevant to our probability of biogenesis, whereas by your logic they would be.
Although the argument you outline might be an argument against ever fully trusting tests (usually called "evals" on this site) that this or that AI is aligned, alignment researchers have other tools in their toolbox besides running tests or evals.
It would take a long time to explain these tools, particularly to someone unfamiliar with software development or a related field like digital-electronics design. People make careers in studying tools to make reliable software systems (and reliable digital designs).
The space shuttle was steered by changing the direction in which its rocket nozzles pointed relative to the rest of the shuttle. If at any point in powered flight one of those nozzles had pointed a few degrees away from the direction it was supposed to point in, the shuttle would have been lost and everyone aboard would have died. The pointing or aiming of these rocket nozzles was under software control. How did the programmers at NASA make this software reliable? Not merely through testing!
Those programmers at NASA relied on their usual tools (basically an engineering-change-order culture) and did not need the more advanced tool called "formal verification". Intel turned to formal verification to make sure its microprocessors contained no flaw that would necessitate another expensive recall, after it spent 475 million dollars in 1994 on a famous product recall to fix the so-called Pentium FDIV bug.
Note that FDIV refers to division of (floating-point) numbers and that it is not possible in one human lifetime to test all possible dividends and divisors to ensure that a divider circuit is operating correctly. I.e., the "impossible even theoretically" argument you outline would have predicted that it is impossible to ensure the correct operation of even something as simple as a divider implemented in silicon, and yet in the 30 years since the 1994 recall, Intel has avoided another major recall of any of its microprocessors caused by a mistake in its digital designs.
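To put a rough number on "not possible in one human lifetime" (my own back-of-envelope arithmetic, assuming 64-bit double-precision operands; the Pentium's floating-point unit actually worked with 80-bit extended precision internally, which only makes the count larger):

$$2^{64} \times 2^{64} = 2^{128} \approx 3.4 \times 10^{38} \text{ operand pairs}$$

$$\frac{3.4 \times 10^{38} \text{ pairs}}{10^{9} \text{ divisions per second}} \approx 3.4 \times 10^{29} \text{ seconds} \approx 10^{22} \text{ years}$$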
"Memory allocation" errors (e.g., use-after-free errors) are an important source of bugs and security vulnerabilities in software, and testing has for decades been an important way to find and eliminate these errors (Valgrind probably being the most well-known framework for doing the testing) but the celebrated new programming language Rust completely eliminates the need for testing for this class of programming errors. Rust replaces testing with a more reliable methodology making use of a tool called a "borrow checker". I am not asserting that a borrow checker will help people create an aligned super-intelligence: I am merely pointing at Rust and its borrow checker as an example of a methodology that is superior to testing for ensuring some desirable property (e.g., the absence of use-after-free errors that an attacker might be able to exploit) of a complex software artifact.
In summary, aligning a superhuman AI is humanly possible given sufficiently careful and talented people. The argument for stopping frontier AI research (or pausing it for 100 years) depends on considerations other than the "impossible even theoretically" argument you outline.
Methodologies that are superior to testing take time to develop. For example, the need for a better methodology to prevent "memory allocation" errors was recognized in the 1970s. Rust and its borrow checker are the result of a line of investigation inspired by a seminal paper published in 1987, but Rust has become a realistic option for most programming projects only within roughly the last 10 years. And an alignment methodology that continues to be reliable even when the AI becomes superhumanly capable is a much taller order than a methodology for preventing use-after-free errors and related memory-allocation errors.