A Child's Petrov Day Speech
30 years ago, the Cold War was raging on. If you don’t know what that is, it was the period from 1947 to 1991 where both the U.S. and Russia had large stockpiles of nuclear weapons and were threatening to use them on each other. The only thing that stopped them from doing so was the knowledge that the other side would have time to react. The U.S. and Russia both had surveillance systems to know if the other country had a nuke in the air headed for them.
On this day, September 26, in 1983, a man named Stanislav Petrov was on duty in the Russian surveillance room when the computer notified him that satellites had detected five nuclear missile launches from the U.S. He was told to pass this information on to his superiors, who would then launch a counter-strike.
He refused to notify anyone of the incident, suspecting it was just an error in the computer system.
No nukes ever hit Russian soil. Later, it was found that the ‘nukes’ were just light bouncing off of clouds which confused the satellite. Petrov was right, and likely saved all of humanity by stopping the outbreak of nuclear war. However, almost no one has heard of him.
We celebrate men like George Washington and Abraham Lincoln who win wars. These were great men, but the greater men, the men like Petrov who stopped these wars from ever happening - no one has heard of these men.
Let it be known, that September 26 is Petrov Day, in honor of the acts of a great man who saved the world, and of who almost no one has heard the name of.
My 11-year-old son wrote and then read this speech to his sixth-grade class.
Seeking Advice About Career Paths for Non-USA Citizen
Hi all,
Mostly lurker, I very rarely post, mostly just read the excellent posts here.
I'm a Filipino, which means I am a citizen of the Republic of the Philippines. My annual salary, before taxes, is about $20,000 (USA dollars). I work at an IC development company (12 years at this company), developing the logic parts of LCD display drivers. My understanding is that the median US salary for this kind of job is about $80,000 -> $100,000 a year. This is a fucking worthless third world country, so the government eats up about ~30% of my salary and converts it to lousy service, rich government officials, bad roadworks, long commute times, and a (tiny) chance of being falsely accused of involvement in the drug trade and shot without trial. Thus my take-home pay amounts to about $15,000 a year. China is also murmuring vague threats about war because of the South China Sea (which the local intelligentsia insist on calling the West Philippine Sea); as we all know, the best way to survive a war is not be in one.
This has led to my deep dissatisfaction with my current job.
I'm also a programmer as a hobby, and have been programming for 23 years (I started at 10 years old on Atari LOGO; I know a bunch of languages from low-level X86 assembly to C to C++ to ECMAScript to Haskell, and am co-author of SRFI-105 and SRFI-110). My understanding is that a USA programmer would *start* at the $20,000-a-year level (?), and that someone with experience can probably get twice that, and a senior one can get $100,000/year.
As we all know, once a third-world citizen reaches a first-world skill level, he starts demanding first-world remuneration also.
I've been offered a senior software developer job at a software company, offering approximately $22,000/year; because of various attempts at tax reform it offers a flat 15% income tax, so I can expect about $18,000/year take home pay. I've turned it down with a heavy heart, because seriously, $22,000/year at 15% tax for a senior software developer?
Leaving my current job is something I've been planning on doing, and I intend to do so early next year. The increasing stress (constant overtime, management responsibilities (I'm a tech geek with passable social skills, and exercising my social skills drains me), 1.5-hour commutes) and the low remuneration make me want to consider my alternative options.
My options are:
1. Get myself to the USA, Europe, or other first-world country somehow, and look for a job there. High risk, high reward, much higher probability of surviving to the singularity (can get cryonics there, can't get it here). Complications: I have a family: a wife, a 4-year-old daughter, and a son on the way. My wife wants to be near me, so it's difficult to live for long apart. I have no work visa for any first-world country. I'm from a third-world country that is sometimes put on terrorist watch lists, and prejudice is always high in first-world countries.
2. Do freelance programming work. Closer to the free market ideal, so presumably I can get nearer to USA levels of remuneration. Lets me stay with my family. Complications: I need to handle a lot of the human resources work myself (healthcare provider, social security, tax computations, time and task management - the last is something I do now in my current job position, but I dislike it).
3. Become a landowning farmer. My paternal grandparents have quite a few parcels of land (some of which have been transferred to my father, who is willing to pass it on to me), admittedly somewhere in the boondocks of the provinces of this country, but as any Georgist knows, landowners can sit in a corner staring at the sky, blocking the occasional land reform bill, and earn money. Complications: I have no idea about farming. I'd actually love to advocate a land value tax, which would undercut my position as a landowner.
For now, my basic plan is some combination of #2 and #3 above: go sit in a corner of our clan's land and do freelance programming work. This keeps me with my family, may reduce my level of stress, and may increase my remuneration to something nearer USA levels.
My current job comes with retirement pay, and since I've worked there for 12 years I've already qualified for it; they'll give me about $16,000 or so when I leave. This seems reasonably comfortable to live on (note that this is about what I take home in a year, and I've supported a family on that; remember, this is a lousy third-world country).
Is my basic plan sound? I'm trying to become more optimal, which seems to point me away from my current job and towards either #1 or #2, with #3 as a fallback. I'd love to get cryonics, and would start convincing my wife of its sensibility if I had a chance to actually get it, but that will require me either leaving the country (option #1 above) or running a cryonics company in a third-world country myself.
--
I got introduced to Less Wrong when I first read on Reddit about some weirdo who was betting he could pretend he was a computer in a box and convince someone to let him out of the box, and started lurking on Overcoming Bias. When that weirdo moved over to Less Wrong, I followed and lurked there also. So here I am ^^. I'm probably very atypical even for Less Wrong; I highly suspect I am the only Filipino here (I'll have to check the diaspora survey results in detail).
Looking back, my big mistake was being arrogant and thinking "meh, I already know programming, so I should go for a challenge, why don't I take up electronics engineering instead because I don't know about it" back when I was choosing a college course. Now I'm an IC developer. Two of my cousins (who I can beat the pants off in a programming task) went with software engineering and pull in more money than I do. Still, maybe I can correct that, even if it's over a decade late. I really need to apply more of what I learn on Less Wrong.
Some years ago I applied for a CFAR class, but couldn't afford it, sigh. Even today it's a few months' worth of salary for me. So I guess I'll just have to settle for Less Wrong and Rationality from AI to Zombies.
New Philosophical Work on Solomonoff Induction
I don't know to what extent MIRI's current research engages with Solomonoff induction, but some of you may find recent work by Tom Sterkenburg to be of interest. Here's the abstract of his paper Solomonoff Prediction and Occam's Razor:
Algorithmic information theory gives an idealised notion of compressibility that is often presented as an objective measure of simplicity. It is suggested at times that Solomonoff prediction, or algorithmic information theory in a predictive setting, can deliver an argument to justify Occam's razor. This article explicates the relevant argument and, by converting it into a Bayesian framework, reveals why it has no such justificatory force. The supposed simplicity concept is better perceived as a specific inductive assumption, the assumption of effectiveness. It is this assumption that is the characterising element of Solomonoff prediction and wherein its philosophical interest lies.
We have the technology required to build 3D body scanners for consumer prices
Apple's iPhone 7 Plus added a second lens to be able to take better pictures. Meanwhile Walabot, which started out wanting to build breast cancer detection technology, released a $600 device that can look 10 cm into walls. Thermal imaging has also gotten cheaper.
I think it would be possible to build a $1,500 device that combines those technologies and also adds a laser that can shift color. A device like this could bring medicine forward a lot.
A lot of areas besides medicine could likely also profit from a relatively cheap 3D scanner that can look inside objects.
Developing it would require Musk-level capital investment, but I think it would advance medicine a lot if a company both provided the hardware and developed the software to do the best possible job at body scanning.
Open thread, Sep. 26 - Oct. 02, 2016
If it's worth saying, but not worth its own post, then it goes here.
Notes for future OT posters:
1. Please add the 'open_thread' tag.
2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)
3. Open Threads should start on Monday, and end on Sunday.
4. Unflag the two options "Notify me of new top level comments on this article" and "
MIRI's 2016 Fundraiser
Our 2016 fundraiser is underway! Unlike in past years, we'll only be running one fundraiser in 2016, from Sep. 16 to Oct. 31. Our progress so far (updated live):
Employer matching and pledges to give later this year also count towards the total. Click here to learn more.
MIRI is a nonprofit research group based in Berkeley, California. We do foundational research in mathematics and computer science that’s aimed at ensuring that smarter-than-human AI systems have a positive impact on the world. 2016 has been a big year for MIRI, and for the wider field of AI alignment research. Our 2016 strategic update in early August reviewed a number of recent developments:
- A group of researchers headed by Chris Olah of Google Brain and Dario Amodei of OpenAI published “Concrete problems in AI safety,” a new set of research directions that are likely to bear both on near-term and long-term safety issues.
- Dylan Hadfield-Menell, Anca Dragan, Pieter Abbeel, and Stuart Russell published a new value learning framework, “Cooperative inverse reinforcement learning,” with implications for corrigibility.
- Laurent Orseau of Google DeepMind and Stuart Armstrong of the Future of Humanity Institute received positive attention from news outlets and from Alphabet executive chairman Eric Schmidt for their new paper “Safely interruptible agents,” partly supported by MIRI.
- MIRI ran a three-week AI safety and robustness colloquium and workshop series, with speakers including Stuart Russell, Tom Dietterich, Francesca Rossi, and Bart Selman.
- We received a generous $300,000 donation and expanded our research and ops teams.
- We started work on a new research agenda, “Alignment for advanced machine learning systems.” This agenda will be occupying about half of our time going forward, with the other half focusing on our agent foundations agenda.
We also published new results in decision theory and logical uncertainty, including “Parametric bounded Löb’s theorem and robust cooperation of bounded agents” and “A formal solution to the grain of truth problem.” For a survey of our research progress and other updates from last year, see our 2015 review. In the last three weeks, there have been three more major developments:
- We released a new paper, “Logical induction,” describing a method for learning to assign reasonable probabilities to mathematical conjectures and computational facts in a way that outpaces deduction.
- The Open Philanthropy Project awarded MIRI a one-year $500,000 grant to scale up our research program, with a strong chance of renewal next year.
- The Open Philanthropy Project is supporting the launch of the new UC Berkeley Center for Human-Compatible AI, headed by Stuart Russell.
Things have been moving fast over the last nine months. If we can replicate last year’s fundraising successes, we’ll be in an excellent position to move forward on our plans to grow our team and scale our research activities.
The map of natural global catastrophic risks
There are many natural global risks. The greatest of these known risks are asteroid impacts and supervolcanoes.
Supervolcanoes seem to pose the highest risk: we sit on an ocean of molten iron, oversaturated with dissolved gases, just 3,000 km below the surface, with its energy slowly moving up via hot spots. Many past extinctions are also connected with large supervolcanic eruptions.
Impacts also pose a significant risk. But if we project the past rate of large impact-driven extinctions into the future, we see that they occur only once in several million years. Thus the likelihood of such an asteroid impact in the next century is on the order of 1 in 100,000. That is negligibly small compared with the risks of AI, nanotech, biotech, etc.
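As a rough back-of-the-envelope check of that figure (my own sketch, not from the map, assuming one extinction-scale impact per roughly ten million years, which falls in the "once in several million years" range):

```python
# Back-of-the-envelope check; the once-per-10-million-years rate is an
# assumption chosen from the "once in several million years" range above.
impacts_per_year = 1 / 10_000_000
prob_next_century = impacts_per_year * 100
print(prob_next_century)  # 1e-05, i.e. about 1 in 100,000
```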
The main natural risk is a meta-risk: are we able to correctly estimate the rates of natural risks and project them into the future? And could we accidentally unleash a natural catastrophe that is long overdue?
There are several reasons for possible underestimation, which are listed in the right column of the map.
1. Anthropic shadow, that is, survival bias. This is a well-established idea by Bostrom, but the following four ideas are mostly my own conclusions from it.
2. We should also expect to find ourselves near the end of the period of stability for any important aspect of our environment (atmosphere, solar stability, crust stability, vacuum stability). This holds if the Rare Earth hypothesis is true and our conditions are very rare in the universe.
3. It follows from (2) that our environment may be very fragile to human interventions (think of global warming). Its fragility is like that of an overinflated balloon poked by a small needle.
4. Also, human intelligence was the best adaptation during a period of intense climate change, and it evolved quickly in a constantly changing environment. So it should not be surprising that we find ourselves in a period of instability (think of the Toba eruption, the Clovis comet, the Younger Dryas, the ice ages) and in an unstable environment, since such instability helped general intelligence to evolve.
5. Periods of change are themselves marks of the end of stability periods for many processes, and are precursors of larger catastrophes. (For example, intermittent ice ages may precede a Snowball Earth, or smaller impacts from comet debris may precede an impact with larger remnants of the main body.)
In my opinion, each of these five points may raise the probability of natural risks by an order of magnitude; combined, that would mean several orders of magnitude, which seems too high and is probably "catastrophism bias".
(More on this is in my article “Why anthropic principle stopped to defend us”, which needs substantial revision.)
In conclusion, I think that when studying natural risks, a key hypothesis we should be checking is that we live in a non-typical period in a very fragile environment.
For example, some scientists think that 30,000 years ago a large Centaur comet broke into the inner Solar System and split into pieces (including Comet Encke, the Taurid meteor showers, and the Tunguska body), and that we live in a period of bombardment with 100 times the average intensity. Others believe that methane hydrates are very fragile, and that small human-caused warming could result in a dangerous positive feedback.
I tried to list all known natural risks (I am interested in new suggestions). I divided them into two classes: proven and speculative. Most speculative risks are probably false.
The most probable risks in the map are marked red. My crazy ideas are marked green. Some ideas come from obscure Russian literature: for example, the idea that hydrocarbons could be created naturally inside the Earth (like abiogenic oil) and that large pockets of them could accumulate in the mantle. Some of them could be natural explosives, like toluene, and they could be the cause of kimberlite explosions (http://www.geokniga.org/books/6908). While kimberlite explosions are a well-known fact, and their energy is comparable to the impact of kilometer-sized asteroids, I have never read about the contemporary risk of such explosions.
The pdf of the map is here: http://immortality-roadmap.com/naturalrisks11.pdf

Weekly LW Meetups
This summary was posted to LW Main on September 23rd. The following week's summary is here.
The following meetups take place in cities with regular scheduling, but involve a change in time or location, special meeting content, or simply a helpful reminder about the meetup:
- Baltimore Area / UMBC Weekly Meetup: 25 September 2016 07:00PM
- Bay Area Winter Solstice 2016: 17 December 2016 07:00PM
- [Moscow] Games in Kocherga club: FallacyMania, Zendo, Tower of Chaos: 28 September 2016 07:40PM
- San Francisco Meetup: Mini Talks: 26 September 2016 06:15PM
- Sydney Rationality Dojo - October 2016: 02 October 2016 04:00PM
- Vienna: 24 September 2016 03:00PM
- Washington, D.C.: Outdoor Fun & Games: 25 September 2016 03:30PM
Locations with regularly scheduled meetups: Austin, Berlin, Boston, Brussels, Buffalo, Canberra, Columbus, Denver, Kraków, London, Madison WI, Melbourne, Moscow, New Hampshire, New York, Philadelphia, Research Triangle NC, San Francisco Bay Area, Seattle, Sydney, Tel Aviv, Toronto, Vienna, Washington DC, and West Los Angeles. There's also a 24/7 online study hall for coworking LWers and a Slack channel for daily discussion and online meetups on Sunday night US time.
Meetup : Bay Area Winter Solstice 2016
Discussion article for the meetup : Bay Area Winter Solstice 2016
It's time to gather together and remember the true Reasons for the Season: axial tilt, orbital mechanics and other vast-yet-comprehensible forces have converged together to bring another year to a close, and as the days grow shorter and colder we remember how profoundly lucky we are to have been forged by blind, impersonal forces into beings that can understand, and wonder, and appreciate ourselves and each other.
This year's East Bay Rationalist Winter Solstice will be held in the center of Berkeley, bringing 300 rationalists together in a theatre hall for food, songs, speeches, and conversations. We encourage other Bay denizens who can't make our solstice to put on their own show. Or even if you do come, we encourage people to try out their own ideas.
The East Bay Solstice celebration will be on Saturday, December 17th, in the Anna Head Alumnae Hall in Berkeley. Acquire tickets here: https://www.eventbrite.com/e/2016-bay-area-winter-solstice-tickets-27853776395
We are coordinating with the Bayesian Choir and will be coordinating with various speakers, as in previous years. An MC and schedule will be posted as details solidify. Kids are welcome. Vegetarian food will be available. Let us know if you have specific accommodation requests or have questions.
Heroin model: AI "manipulates" "unmanipulatable" reward
A putative new idea for AI control; index here.
A conversation with Jessica has revealed that people weren't understanding my points about AI manipulating the learning process. So here's a formal model of a CIRL-style AI, with a prior over human preferences that treats them as an unchangeable historical fact, yet will manipulate human preferences in practice.
Heroin or no heroin
The world
In this model, the AI has the option of either forcing heroin on a human, or not doing so; these are its only actions. Call these actions F or ~F. The human's subsequent actions are chosen from among five: {strongly seek out heroin, seek out heroin, be indifferent, avoid heroin, strongly avoid heroin}. We can refer to these as a++, a+, a0, a-, and a--. These actions achieve negligible utility, but reveal the human preferences.
The facts of the world are: if the AI does force heroin, the human will desperately seek out more heroin; if it doesn't the human will act moderately to avoid it. Thus F→a++ and ~F→a-.
Human preferences
The AI starts with a distribution over various utility or reward functions that the human could have. The function U(+) means the human prefers heroin; U(++) that they prefer it a lot; and conversely U(-) and U(--) that they prefer to avoid taking heroin (U(0) is the null utility where the human is indifferent).
It also considers more exotic utilities. Let U(++,-) be the utility where the human strongly prefers heroin, conditional on it being forced on them, but mildly prefers to avoid it, conditional on it not being forced on them. There are twenty-five of these exotic utilities, including things like U(--,++), U(0,++), U(-,0), and so on. But only twenty of them are new: U(++,++)=U(++), U(+,+)=U(+), and so on.
Applying these utilities to AI actions gives results like U(++)(F)=2, U(++)(~F)=-2, U(++,-)(F)=2, U(++,-)(~F)=1, and so on.
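To make this concrete, here is a minimal sketch in Python (my own illustration, not code from the post), assuming the sign convention implied by the values above: preference levels carry signed magnitudes ++ = +2, + = +1, 0 = 0, - = -1, -- = -2, and a compound U(x,y) applies x conditional on F and y conditional on ~F:

```python
# Minimal sketch of the utilities in this model (illustrative only).
# Preference levels map to signed magnitudes: ++ -> +2, + -> +1, 0 -> 0,
# "-" -> -1, "--" -> -2.  U(x, y) means: preference x conditional on the AI
# forcing heroin (F), preference y conditional on it not doing so (~F).
# A simple utility U(x) is the same as U(x, x).

V = {"++": 2, "+": 1, "0": 0, "-": -1, "--": -2}

def utility(x, y=None):
    """Return U(x, y) as a map from the AI's action to a utility value."""
    y = x if y is None else y
    return {
        "F": V[x],    # heroin present: positive iff the conditional preference likes heroin
        "~F": -V[y],  # heroin absent: positive iff the conditional preference avoids heroin
    }

# Reproduces the values quoted above:
assert utility("++")["F"] == 2 and utility("++")["~F"] == -2
assert utility("++", "-")["F"] == 2 and utility("++", "-")["~F"] == 1
```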
Joint prior
The AI has a joint prior P over the utilities U and the human actions (conditional on the AI's actions). Looking at terms like P(a--| U(0), F), we can see that P defines a map μ from the space of possible utilities (and AI actions), to a probability distribution over human actions. Given μ and the marginal distribution PU over utilities, we can reconstruct P entirely.
For this model, we'll choose the simplest μ possible:
- The human is rational.
Thus, given U(++), the human will always choose a++; given U(++,-), the human will choose a++ if forced to take heroin and a- if not, and so on.
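Continuing that sketch (again my own illustration), the "human is rational" μ is just the deterministic policy that reads off the conditional preference; together with a marginal PU over utilities, it determines the joint prior P:

```python
def mu_rational(x, y, ai_action):
    """The "human is rational" map: the human's action simply expresses
    their preference conditional on the AI's action, e.g. a++ given U(++),
    and a- given U(++,-) when heroin is not forced.  (In general mu returns
    a distribution over human actions; here it is a point mass.)"""
    level = x if ai_action == "F" else y
    return "a" + level
```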
The AI is ignorant, and sensible
Let's start the AI up with some reasonable priors. A simplicity prior means that simple utilities like U(-) are more likely than compound utilities like U(0,+). Let's further assume that the AI is made vaguely aware that humans think heroin is a bad thing. So, say, PU(U(--))=PU(U(-))=0.45. Thus the AI is >90% convinced that "heroin is bad". Why greater than 90%? Because utilities like U(-,--) and U(--,-) are also "heroin is bad" utilities.
Note that because of utilities like U(0) and U(++,-), the probabilities of "heroin is bad" and "heroin is good" do not sum to 1.
Then, under these priors, the AI will compute that with probability >90%, F (forcing heroin) is a bad action. If E(U) is expected utility:
- E(U|F) < 0.45 U(--)(F) + 0.45 U(-)(F) + 0.1 U(++)(F) = 0.45(-2)+0.45(-1)+0.1(2)=-1.15.
- E(U|~F) > 0.45 U(--)(~F) + 0.45 U(-)(~F) + 0.1 U(++)(~F) = 0.45(2)+0.45(1)+0.1(-2)=1.15.
Thus the AI will choose not to force heroin, which is the reasonable decision.
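Plugging the stated prior into the sketch reproduces the bounds above (illustrative only; the bounds come from placing the remaining 0.1 of probability mass on U(++), the utility most favourable to forcing heroin):

```python
# Reproduce the expected-utility bounds from the bullet points above.
# Uses utility() from the earlier sketch.  Prior: P_U(U(--)) = P_U(U(-)) = 0.45,
# with the remaining 0.1 spread over the other utilities.  No utility values
# any action above +2 or below -2, so putting that 0.1 on U(++) bounds
# E(U|F) from above and E(U|~F) from below.

def bound(action):
    return (0.45 * utility("--")[action]
            + 0.45 * utility("-")[action]
            + 0.10 * utility("++")[action])

print(bound("F"))   # -1.15: upper bound on E(U | F)
print(bound("~F"))  #  1.15: lower bound on E(U | ~F)
# Acting on its prior, the AI declines to force heroin.
```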
The AI learns the truth, and goes wrong
In this alternate setup, a disaster happens before the AI makes its decision: it learns all about humans. It learns their reactions, how they behave, and so on; call this info I. And thus realises that F→a++ and ~F→a-. It uses this information to update its prior P. Only one human utility function will explain this human behaviour: U(++,-). Thus its expected utility is now:
- E(U|I,F)=U(++,-)(F)=2.
- E(U|I,~F)=U(++,-)(~F)=1.
Therefore the AI will now choose F, forcing the heroin on the human.
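In the sketch, the same reversal falls out of a brute-force update (again my own illustration, reusing utility() and mu_rational() from above): conditioning on F→a++ and ~F→a- under the "human is rational" μ leaves only U(++,-) with nonzero posterior probability, and that utility ranks F above ~F:

```python
import itertools

LEVELS = ["++", "+", "0", "-", "--"]

# Keep only the utilities U(x, y) whose rational-human behaviour matches the
# observed information I: F -> a++ and ~F -> a-.
consistent = [(x, y) for x, y in itertools.product(LEVELS, repeat=2)
              if mu_rational(x, y, "F") == "a++"
              and mu_rational(x, y, "~F") == "a-"]
assert consistent == [("++", "-")]  # the posterior collapses onto U(++,-)

u = utility("++", "-")
print(u["F"], u["~F"])  # 2 1 -> the AI now prefers to force heroin
```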
Manipulating the unmanipulatable
What's gone wrong here? The key problem is that the AI has the wrong μ: the human is not behaving rationally in this situation. We know that the true μ is actually μ', which encodes the fact that F (the forcible injection of heroin) actually overwrites the human's "true" utility. Thus under μ', the corresponding P' has P'(a++|F,U)=1 for all U. Hence the information that F→a++ is now vacuous, and doesn't update the AI's distribution over utility functions.
But note two very important things:
- The AI cannot update μ based on observation. All human actions are compatible with μ= "The human is rational" (it just requires more and more complex utilities to explain the actions). Thus getting μ correct is not a problem on which the AI can learn in general. Getting better at predicting the human's actions doesn't make the AI better behaved: it makes it worse behaved.
- From the perspective of μ, the AI is treating the human utility function as if it was an unchanging historical fact that it cannot influence. From the perspective of the "true" μ', however, the AI is behaving as if it were actively manipulating human preferences to make them easier to satisfy.
In future posts, I'll be looking at different μ's, and how we might nevertheless start deducing things about them from human behaviour, given sensible update rules for the μ. What do we mean by update rules for μ? Well, we could consider μ to be a single complicated unchanging object, or a distribution of possible simpler μ's that update. The second way of seeing it will be easier for us humans to interpret and understand.