
Comment author: Viliam 10 February 2017 11:10:21AM 1 point [-]

any area where other people have well-informed opinions

It could be tricky to find out which areas are like this.

I mean, if I disagree with most people, I am probably going to suspect that their opinions are not well-informed, so I guess the core problem is assessing the "well-informedness" of people I disagree with.

For example, if I believe that someone is highly intelligent and has a lot of experience in X, and their explanation makes sense to me, or they are willing to listen to my arguments and then tell me where specifically I made a mistake... I would be quite willing to move my opinion towards theirs. No controversy here.

But there are also situations where people have strong opinions for wrong reasons, and they consider themselves well-informed because they talked to each other and read each other's blogs or books.

Comment author: Davidmanheim 13 February 2017 07:15:46PM 0 points [-]

It's strange for me to consider that you'd take the inside view when evaluating the informed-ness of others.

The usually correct solution is to realize that both your opinion and theirs are likely based on some incorrect assumptions. You just don't know which - and so this is a useful quick heuristic.

A Quick Confidence Heuristic; Implicitly Leveraging "The Wisdom of Crowds"

1 Davidmanheim 10 February 2017 12:54AM

Let's say you have well-informed opinions on a variety of topics. Without information about your long-term accuracy in each given area, how confident should you be in those opinions?

Here's a quick heuristic for any area where other people have well-informed opinions about the same topics: your confidence should be a function of the distance of your estimate from the average opinion and the standard deviation of those opinions. I'll call this the wisdom-of-crowds-confidence level, because it can be justified by the empirical observation that the average of even uninformed guesses is typically a better predictor than most individual predictions.
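As a rough illustration, here is a minimal sketch of the heuristic in code. The particular discount function is an assumption of this sketch; the heuristic only says confidence should fall as your estimate moves away from the crowd average, measured in units of the crowd's standard deviation.

```python
# A minimal sketch of the wisdom-of-crowds-confidence heuristic described above.
# The specific discount function (1 / (1 + z)) is an illustrative choice, not
# something specified in the post.
from statistics import mean, stdev

def crowd_confidence(my_estimate: float, others: list) -> float:
    """Return a 0-1 confidence score: 1.0 at the crowd mean, decaying as the
    estimate's distance from the mean (in crowd standard deviations) grows."""
    mu, sigma = mean(others), stdev(others)
    if sigma == 0:
        return 1.0 if my_estimate == mu else 0.0
    z = abs(my_estimate - mu) / sigma
    return 1.0 / (1.0 + z)

# Example: the same crowd of estimates, scored against two different opinions.
crowd = [9.5, 10.2, 10.8, 11.0, 9.9]
print(crowd_confidence(12.0, crowd))  # far from the mean -> low confidence
print(crowd_confidence(10.3, crowd))  # near the mean -> high confidence
```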

Why does this make sense?

The Aumann agreement theorem implies that rational discussants can, given enough patience and introspection, pass messages about their justifications until they eventually converge. Given that informed opinions share most evidence, the difference between the opinions is likely due to specific unshared assumptions or evidence. If that evidence were shared, the answer would land somewhere near the middle, unless the vast majority of the unshared assumptions were piled up on the same side. (This is why I was going to call the heuristic Aumann-confidence, but I don't think it quite fits.)

Unless you have a strong reason to assume you are a privileged observer - trading on inside information, or much better calibrated than other observers - there is no reason to expect this unshared evidence to be biased. And while this appears to contradict the conservation of expected evidence theorem, it is actually something of a consequence of it: we need to update on the knowledge that there is unshared evidence leading the other person to make their own claim.
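For reference, this is the conservation of expected evidence identity being invoked, in its standard form (the statement of the theorem, not anything specific to this post):

```latex
% Before observing evidence E, the expectation of your posterior equals your
% prior, so you cannot predictably update in a known direction:
\[
  \mathbb{E}\left[ P(H \mid E) \right]
  \;=\; \sum_{e} P(E = e)\, P(H \mid E = e)
  \;=\; P(H).
\]
```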

This is where things get tricky - we need to make assumptions about joint distributions on unshared evidence. Suffice it to say that unless we have reason to believe our unshared evidence or assumptions are much stronger than theirs, we should end up near the middle. And that goes back to a different, earlier assumption - that others are also well informed.
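One way to make "end up near the middle" concrete is a weighted combination of the two estimates. This is a sketch of my own framing rather than a formula from the post; relative_strength stands in for your judgment of how strong your unshared evidence is compared to theirs.

```python
# A minimal sketch: weight your estimate against the crowd average according
# to how strong you judge your unshared evidence to be. With no reason to
# privilege either side (relative_strength = 1.0), you land exactly midway.

def updated_estimate(my_estimate: float, crowd_mean: float,
                     relative_strength: float = 1.0) -> float:
    """relative_strength: strength of your unshared evidence relative to the
    crowd's (1.0 = no reason to think either side's evidence is stronger)."""
    w = relative_strength / (relative_strength + 1.0)
    return w * my_estimate + (1.0 - w) * crowd_mean

print(updated_estimate(14.0, 10.0))                         # 12.0 - the middle
print(updated_estimate(14.0, 10.0, relative_strength=3.0))  # 13.0 - favor yourself
```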

Now that we’ve laid out the framework, though, we can sketch the argument.

  1. We can expect that our opinion should shift towards the average once we know what the average is, even without exploring other people's unshared assumptions and data. How far it should shift depends on how good our assumptions and data are compared to theirs.
  2. Even if we have strong reasons for thinking that we understand why others hold the assumptions they do, they presumably feel the same way about us.
  3. And why do you think your unshared evidence and assumptions are so great anyways, huh? Are you special or something?

Anyways, those are my thoughts.

Comments?

Comment author: BiasedBayes 30 January 2017 09:35:27PM 0 points [-]

Thanks for the link; the title is a bit misleading, though ("most empirical questions..").

Comment author: Davidmanheim 08 February 2017 04:12:02PM 0 points [-]

Really? I'd expect that very few questions outside of the hard sciences are resolvable, given model uncertainty, the limits on running RCTs, and the limited samples available from finite populations with insufficient variation in the variables of interest.

Do you disagree?
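To make the "appropriately under-powered" part concrete, here is a standard back-of-the-envelope sample-size calculation (the effect sizes are illustrative assumptions of mine, not figures from the thread or the linked post):

```python
# Normal-approximation sample size per group for a two-sample comparison of
# means, at alpha = 0.05 (two-sided) and 80% power.
from statistics import NormalDist

def n_per_group(effect_size_d: float, alpha: float = 0.05, power: float = 0.8) -> float:
    """Approximate n per group needed to detect a standardized effect of size d."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2 / effect_size_d ** 2

print(round(n_per_group(0.5)))   # ~63 per group: feasible
print(round(n_per_group(0.1)))   # ~1,570 per group: often impractical
print(round(n_per_group(0.02)))  # ~39,000 per group: effectively unresolvable
```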

Comment author: Qiaochu_Yuan 22 January 2017 08:54:29AM *  0 points [-]

This argument more or less assumes its conclusion; after all, if it weren't the case that 1 - 0.999... were zero, then it would be some positive number x, so you could pick epsilon = x.
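For reference, a short formalization of this point (my reading of the comment; the argument being critiqued is not quoted here):

```latex
% Let d = 1 - 0.999...  The critiqued argument asserts:
%   (for every epsilon > 0, d < epsilon)  therefore  d = 0.
% But the premise already rules out every positive value of d: if d = x > 0,
% taking epsilon = x would give x < x. So asserting the premise is asserting
% the conclusion; the real work is establishing d < epsilon, e.g. via the
% partial sums:
\[
  1 - \sum_{k=1}^{n} 9 \cdot 10^{-k} \;=\; 10^{-n} \;<\; \epsilon
  \quad\text{whenever}\quad n > \log_{10}(1/\epsilon).
\]
```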

Comment author: Davidmanheim 23 January 2017 08:52:05PM 0 points [-]

And in certain constructions, epsilon is a distinct number - so it's actually fallacious without going back to the definitions!

Comment author: waveman 21 January 2017 10:06:33PM *  3 points [-]

Slightly off topic to the main point of the article, which is how to deal with not understanding something.

For anyone who wants to fully understand this, read "Baby Rudin": Rudin's Principles of Mathematical Analysis.

Real numbers can be defined as equivalence classes of Cauchy sequences of rationals. The sequences 1, 1, 1, 1, ... and 0.9, 0.99, 0.999, ... are in the same equivalence class, and so they name the same real number. People often get caught up in the assumption that two different sequences (or two different decimal representations) must be different numbers.
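A minimal sketch of that equivalence, under the Cauchy-sequence construction described above:

```latex
% Two Cauchy sequences of rationals (a_n), (b_n) name the same real number
% iff a_n - b_n -> 0. Here:
\[
  a_n = 1, \qquad
  b_n = 1 - 10^{-n} = 0.\underbrace{9\cdots 9}_{n\text{ nines}}, \qquad
  a_n - b_n = 10^{-n} \longrightarrow 0,
\]
\[
  \text{so } (a_n) \sim (b_n), \text{ i.e. } 1 \text{ and } 0.999\ldots
  \text{ are the same real number.}
\]
```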

Comment author: Davidmanheim 23 January 2017 08:50:33PM 1 point [-]

If you study a bit more, though, it stops being necessarily true: https://arxiv.org/abs/1307.7392

Comment author: Davidmanheim 23 January 2017 08:47:15PM 0 points [-]

You left out a possibility: a statement whose truth depends on something outside your realm of knowledge. In this case, it's true for real numbers, but false for surreal numbers.

[Link] Most empirical questions are unresolvable: The good, the bad, and the appropriately under-powered

3 Davidmanheim 23 January 2017 08:35PM
Comment author: Flinter 16 January 2017 07:28:34PM *  1 point [-]

Ok. I skimmed it, and I think I understand your post well enough (if not, I'll read deeper!). What I am introducing into the dialogue is a theoretical and conceptually stable unit of value. I am saying: let's address the problems stated in your articles as if the problem of defining our base unit were already solved - as if such a unit exists, is agreed upon, and is stable for all time.

So here is an example from one of your links:

"Why is alignment hard?

Why expect that this problem is hard? This is the real question. You might ordinarily expect that whoever has taken on the job of building an AI is just naturally going to try to point that in a relatively nice direction. They’re not going to make evil AI. They’re not cackling villains. Why expect that their attempts to align the AI would fail if they just did everything as obviously as possible?

Here’s a bit of a fable. It’s not intended to be the most likely outcome. I’m using it as a concrete example to explain some more abstract concepts later.

With that said: What if programmers build an artificial general intelligence to optimize for smiles? Smiles are good, right? Smiles happen when good things happen."

Do we see how we can solve this problem now? We simply optimize the AI system for value, and everyone is happy.

If someone creates "bad" AI we could measure that, and use the measurement for a counter program.

Comment author: Davidmanheim 23 January 2017 04:22:00PM 0 points [-]

Another point - "What I am introducing into the dialogue is a theoretical and conceptually stable unit of value."

I argued that, without a full and perfect system model, creating perfectly aligned metrics, like a unit of value, is impossible. (To be fair, I really argued that point in the follow-up piece: www.ribbonfarm.com/2016/09/29/soft-bias-of-underspecified-goals/ ) So if our model for human values is simplified in any way, it's impossible to guarantee convergence to the same goal without a full and perfect system model to test it against.

Comment author: Davidmanheim 23 January 2017 03:41:01PM 0 points [-]

"If someone creates "bad" AI we could measure that, and use the measurement for a counter program."

(I'm just going to address this point in this comment.) The space of potential bad programs is vast - and the opposite of a disastrous values misalignment is almost always a different values misalignment, not alignment.

In two dimensions, think of a misaligned wheel; it's very unlikely to be exactly 180 degrees (or 90 degrees) away from proper alignment. Pointing the car in a relatively nice direction is better than pointing it straight at the highway divider wall - but even a slight misalignment will eventually lead to going off-road. And the worry is that we need to have a general solution before we allow the car to get to 55 MPH, much less 100+. But you argue that we can measure the misalignment. True! If we had a way to measure the angle between the wheel's alignment and the correct one, we could ignore the wheel's angle itself and simply minimize the misalignment - which means the measure of divergence implicitly contains the correct alignment.

For an AI value function, the same is true. If we had a measure of misalignment, we could minimize it. The tricky part is that we don't have such a metric, and any correct such metric would be implicitly equivalent to solving the original problem. Perhaps this is a fruitful avenue, since recasting the problem this way can help - and it's similar to some of the approaches I've heard Dario Amodei mention regarding value alignment in machine learning systems. So it's potentially a good insight, but insufficient on its own.
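A tiny sketch of that last point (my own illustration, with hypothetical names): any divergence measure you could minimize has to be computed against the correct target, so writing the measure down presupposes that you already have the target.

```python
# Minimal illustration: a misalignment metric only exists relative to the
# correct target, so minimizing it presupposes knowing that target.

def misalignment(wheel_angle_deg: float, correct_angle_deg: float) -> float:
    """Divergence from the correct alignment, in degrees."""
    return abs(wheel_angle_deg - correct_angle_deg)

# Minimizing requires passing in correct_angle_deg - the metric implicitly
# contains the answer. A hypothetical value_misalignment(policy, human_values)
# would likewise require a full specification of human_values, which is the
# original alignment problem.
best_angle = min(range(-45, 46), key=lambda a: misalignment(a, correct_angle_deg=0.0))
print(best_angle)  # 0
```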

Comment author: Davidmanheim 16 January 2017 06:48:31PM 0 points [-]

I think what you're missing is that metrics are difficult - I've written about that point in a number of contexts: www.ribbonfarm.com/2016/06/09/goodharts-law-and-why-measurement-is-hard/

There are more specific metric/goal problems with AI; Eliezer wrote about this: https://intelligence.org/2016/12/28/ai-alignment-why-its-hard-and-where-to-start/ - and Dario Amodei has been working on it as well: https://openai.com/blog/faulty-reward-functions/ - and there is a lot more in this vein!
