I guess this is analogous to "magic" as used in http://lesswrong.com/lw/ix/say_not_complexity/
Still, good post, and definitely closer to real life than Eliezer's. Thanks!
Some of your black box examples seem unproblematic. I agree that all you need to trust that a toaster will toast bread is an induction from repeated observation that bread goes in and toast comes out.
(Although, if the toaster is truly a black box about which we know absolutely NOTHING, then how can we induce that it will not suddenly start shooting out popsicles or little green leprechauns when the year 2017 arrives? In reality, a toaster is nothing close to a black box. It is more like a gray box. Even if you think you know nothing about how a toaster works, you really do know quite a bit about it by virtue of being a reasonably intelligent adult who understands a little general physics--enough to know that a toaster is never going to start shooting out leprechauns. In fact, I would wager that there are very few true "black boxes" in the world, but rather many gray boxes of varying shades.)
However, the tax accountant and the car mechanic seem more problematic as examples of black boxes, because there is intelligent agency behind them--agency that can analyze YOUR source code, determine the extent to which you treat these services as black boxes, and adjust its output accordingly. For example, how do you know that your car will be fixed if you bring it to the mechanic? If the mechanic knows that you consider automotive repair a complete black box, the mechanic has an incentive to deliberately screw up the alignment or the transmission or something else that would necessitate more repairs in the future, and you would have no way of telling where those problems came from. Or the mechanic could simply lie about how much the repairs should cost, and how would you know any better? Ditto for the tax accountant.
The tax accountant and the car mechanic are a bit like AIs...except AIs would presumably be much more capable of scanning our source code and taking advantage of our ignorance of their black-box internals.
Here's another metaphor: in my mind, the problem of humanity confronting AI is a bit like the problem a mentally retarded billionaire would face.
Imagine that you are a mentally retarded person with the mind of a two-year-old who has suddenly come into possession of a billion dollars, in a society where there is no state or higher authority to regulate or enforce any sort of morality or to make sure that things are "fair." How are you going to ensure that your money is managed in your interest? How can you keep your money from being outright stolen from you?
I would assert that there would be, in fact, no way at all for you to have your money employed in your interest. Consider:
*Do you hire a money manager (a financial advisor, a bank, a CEO...any sort of money manager)? What would keep this money manager from taking all of your money and running away with it? (Remember, there is no higher authority to punish the money manager in this scenario.) If you were as smart as or smarter than the money manager, you could probably track them down and take your money back. But you are not as smart as the money manager; you are a mentally retarded person with the mind of a toddler. And if you did happen to be as smart as the money manager, then the money manager would be redundant in the first place: you would just manage your own money.
*Do you try to manage your money on your own? Remember, you have the mind of a two-year-old. The best you can do is stumble around on the floor and say "Goo-goo-gah-gah." What are you going to be able to do with a billion dollars?
Neither solution in this metaphor is satisfactory.
In this metaphor:
*The two-year-old billionaire is humanity.
*The lack of a higher authority symbolizes the absence of a God to punish an AI.
*The money manager is the AI.
If an AI is a black box, then you are screwed. If an AI is not a black box, then what do you need the AI for?
Humans only work as black boxes (or rather, gray boxes) because we have an instinctual desire to be altruistic to other humans: we don't take advantage of each other. (This does not apply equally to all people. Sociopaths and tribalistic people would happily take advantage of strangers, and I would allege that a world civilization made up entirely of such people would be deeply dysfunctional.)
So, here's how we might keep an AI from becoming a total black box, while still allowing it to do useful work:
Let it run for a minute in a room unconnected to the Internet. Afterwards, hire a hundred million programmers to trace out exactly what the AI was doing in that minute by looking at a readout of its most base-level code.
To any one of these programmers, the parts of the AI outside that programmer's special area of expertise will seem like a black box. But, through communication, humanity could pool its specialized investigations into each part of the AI's running code and sketch out an overall picture of whether its computations were on a friendly trajectory or not.
"trust that a toaster will toast bread"
Yes, this is a retrospective example. Once I already know what happens, I can say that a toaster makes bread into toast. If you start to make predictive examples, things get more complicated, as you have mentioned.
It still helps to have an understanding of what you don't know. And in the case of AI, an understanding of what you are deciding not to know (for now) can help you consider the risk involved in playing with an AI of unclear potential.
I.e., "AI with defined CEV -> what happens next -> humans are fine" seems like a bad process from which to expect a good outcome. Now maybe we can work on a better process for defining CEV.
When we want to censor an image, we put a black box over the area we want to censor. In a similar sense, we can purposely censor our own knowledge. This comes in particularly handy when thinking about things that might be complicated but that we don't need to know.
A deliberate black box around how toasters work would look like this:
bread -> black box -> toast
Not all processes need to be understood; for now, a black box can be a placeholder for the future.
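Here's a minimal sketch in Python of what that placeholder might look like (the BlackBox class and the toaster stand-in are my own invention, purely illustrative): we commit to knowing only what goes in and what comes out, and trust is built from the record of repeated observations.

```python
class BlackBox:
    """Wraps a process we deliberately choose not to inspect."""

    def __init__(self, name, process):
        self.name = name
        self._process = process   # the internals: deliberately left unexamined
        self.observations = []    # evidence gathered from repeated use

    def run(self, x):
        y = self._process(x)
        self.observations.append((x, y))
        return y

toaster = BlackBox("toaster", lambda bread: "toast")
print(toaster.run("bread"))    # -> toast
print(toaster.observations)    # -> [('bread', 'toast')]: the inductive evidence
```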
With the power provided to us by a black box, we can identify what we don't know. We can say: Hey! I don't know how a toaster works, but it would take about two hours to work it out. If I ever did want to work it out, I could just spend the two hours. Until then, I have saved myself two hours. With other, more time-burdensome fields this works even better. Say, tax.
Need to file tax -> black box accountant -> don't need to file my tax because I got the accountant to do it for me.
I know I could file my own tax, but that might take 100-200 hours of learning everything an accountant knows about tax. (It might also be 10 hours, depending on your country and its tax system.) For now, I can assume that hiring an accountant saved me a number of hours of doing it myself. So - Winning!
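As a back-of-the-envelope check (all numbers hypothetical; substitute your own):

```python
# Is the accountant black box worth keeping closed?
hours_to_learn_tax = 100   # time to unpack the black box yourself
hourly_value = 30          # what an hour of your time is worth ($)
accountant_fee = 300       # what the accountant charges ($)

cost_of_diy = hours_to_learn_tax * hourly_value   # $3000 of your time
savings = cost_of_diy - accountant_fee            # $2700
print(f"Keeping the black box closed saves ~${savings}")
```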
Take car repairs. On the one hand, you could do it yourself and unpack the black box; on the other, you could trade your existing currency $$ (which you already traded your time to earn) for someone else's skills and time to repair the car. The system looks like this:
Broken car -> black box mechanic -> working car
By deliberately not knowing how it works, we can tap out of even trying to figure it out for now. The other advantage is that we can keep track of not just what we know in terms of black boxes but, more importantly, what we don't know. We can build better maps by knowing what we don't know.
Computers:
Logic gates -> Black box computeryness -> www.lesswrong.com
Or maybe it's like this: (for more advanced users)
Computers:
Logic gates -> flip flops -> Black box CPU -> black box GPU -> www.lesswrong.com
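If you ever do want to unpack one layer of that pipeline, the "logic gates -> flip flops" step is small enough to sketch. Here is a toy Python simulation of an SR latch (a basic flip-flop) built from two NOR gates; a deliberately simplified illustration, not how real hardware is wired or clocked:

```python
def nor(a: int, b: int) -> int:
    """A NOR logic gate."""
    return int(not (a or b))

def sr_latch(s: int, r: int, q: int) -> int:
    """Two cross-coupled NOR gates; returns the new stored bit."""
    for _ in range(3):         # iterate so the feedback loop settles
        q_bar = nor(s, q)
        q = nor(r, q_bar)
    return q

q = 0
q = sr_latch(s=1, r=0, q=q)    # set   -> q == 1
q = sr_latch(s=0, r=0, q=q)    # hold  -> q stays 1: one bit of memory
q = sr_latch(s=0, r=1, q=q)    # reset -> q == 0
print(q)                       # -> 0
```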
The black-box system happens to also have a meme about it:
Step 1. Get out of bed
Step 2. Build AGI
Step 3. ?????
Step 4. Profit
Only now we have a name for deliberately skipping over how step 3 works.
Another useful system:
Dieting
Food in (weight goes up) -> black box human body -> energy out (weight goes down)
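As a toy calculation over that box's inputs and outputs (using the popular ~3500 kcal per pound rule of thumb, itself a crude approximation that hides plenty of black-box complexity):

```python
# Crude energy-balance sketch; all numbers hypothetical.
kcal_in_per_day = 2000    # food in
kcal_out_per_day = 2500   # energy out
days = 30

deficit = (kcal_out_per_day - kcal_in_per_day) * days
pounds_lost = deficit / 3500          # ~3500 kcal per pound of body fat
print(f"~{pounds_lost:.1f} lb over {days} days")   # -> ~4.3 lb
```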
Make your own black box systems in the comments.
Meta: short post, 1.5 hours to write, edit and publish. Felt it was an idea that provides useful ways to talk about things. I needed it to explain something to someone; now all can enjoy!
My Table of contents has my other writings in it.
All suggestions and improvements welcome!