
You're looking at Less Wrong's discussion board. This includes all posts, including those that haven't been promoted to the front page yet. For more information, see About Less Wrong.

[Link] Citizen scientist space projects

1 morganism 28 October 2016 09:56PM

Counterfactual do-what-I-mean

1 Stuart_Armstrong 27 October 2016 01:54PM

A putative new idea for AI control; index here.

The counterfactual approach to value learning could possibly be used to allow natural language goals for AIs.

The basic idea is that when the AI is given a natural language goal like "increase human happiness" or "implement CEV", it is not to figure out what these goals mean, but to follow what a pure learning algorithm would establish these goals as meaning.

This would be safer than a simple figure-out-the-utility-you're-currently-maximising approach, but it still has a few drawbacks. Firstly, the learning algorithm itself has to be effective (in particular, modifying human understanding of the words should be ruled out, and the learning process must avoid concluding that the simpler interpretations are always better). Secondly, humans don't yet know what these words mean outside our usual comfort zone, so the "learning" task also involves the AI extrapolating beyond what we know.

Internal Race Conditions

1 SquirrelInHell 23 October 2016 01:23PM

Time start: 14:40:36

I

You might be familiar with the concept of a 'bug', as introduced by CFAR. By using the computer programming analogy, it frames any problem you might have in your life as something fixable... even more - as something to be fixed, something such that fixing it or thinking about how to fix it is the first thing that comes to mind when you see such a problem, or 'bug'.

Let's try another analogy in the same style, with something called 'race conditions' in programming. A race condition is a particular type of bug that is typically very hard to find and fix ('debug'). It occurs when two or more parts of the same program 'race' to access some data, resource, decision point etc., in a way that is not controlled by any organised principle.

For example, imagine that you have a document open in an editor program. You make some changes and give a command to save the file. While this operation is in progress, you drag and drop the same file in a file manager, moving it to another hard drive. In this case, depending on timing, on the details of the programs, and on the operating system that you are using, you might get different results. The old version of the file might be moved to the new location, while the new one is saved in the old location. Or the file might get saved first, and then moved. Or the saving operation might end in an error, or in a truncated or otherwise malformed file on the disk.

If you knew enough details about the situation, you could in fact work out exactly what would happen. But the margin of error in your own handling of the software is so big that you cannot do this in practice (e.g. you'd need to know the exact millisecond when you press buttons etc.). So in practice, the outcome is random, depending on how the events play out on a scale smaller than you can directly control (e.g. minute differences in timing, strength of reactions etc.).
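To make the save-versus-move example concrete, here is a minimal sketch in Python. It does not touch a real file system: the paths `A/doc` and `B/doc`, the four step functions, and the split of each operation into exactly two steps are all assumptions made for illustration. The sketch enumerates every order in which the steps of the two operations can interleave and collects the distinct end states.

```python
from itertools import combinations

# Hypothetical model: the "file system" is a dict mapping path -> content.
# "Save" is truncate-then-write; "move" is copy-then-delete.

def truncate(fs):
    fs["A/doc"] = ""            # saving starts by emptying the file

def write(fs):
    fs["A/doc"] = "new"         # then writes the new content

def copy(fs):
    if "A/doc" in fs:
        fs["B/doc"] = fs["A/doc"]   # moving starts by copying to the other drive

def delete(fs):
    fs.pop("A/doc", None)       # then removes the original

SAVE = [truncate, write]
MOVE = [copy, delete]

def interleavings(a, b):
    """Yield every merge of a and b that keeps each list's internal order."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        ai, bi = iter(a), iter(b)
        yield [next(ai) if i in positions else next(bi) for i in range(n)]

def outcomes():
    """Run every schedule from the same start state; return distinct end states."""
    results = set()
    for schedule in interleavings(SAVE, MOVE):
        fs = {"A/doc": "old"}
        for step in schedule:
            step(fs)
        results.add(tuple(sorted(fs.items())))
    return results

if __name__ == "__main__":
    for state in sorted(outcomes()):
        print(dict(state))
```

Even in this toy model, the six possible schedules produce several different end states, including "saved, then moved", "old version moved while the new one stays behind", and a truncated copy at the destination. Which one you get depends only on timing.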

II

What is the analogy in humans? One place where, if you look hard, you'll see this pattern a lot is the relation between emotions and conscious decision making.

E.g., a classic failure mode is a "commitment to emotions", which goes like this:

  • I promise to love you forever
  • however if I commit to this, I will have doubts and less freedom, which will generate negative emotions
  • so I'll attempt to fall in love faster than my doubts grow
  • let's do this anyway, why don't we?

The problem here is a typical emotional "race condition": there is a lot of variability in the outcome, depending on how events play out. There could be a "butterfly effect", in which e.g. a single weekend trip together could determine the fate of the relationship, by creating a swing up or down, which would give one side of emotions a head start in the race.

III

Another typical example is making a decision about continuing a relationship:

  • when I spend time with you, I like you more
  • when I like you more, I want to continue our relationship
  • when we have a relationship, I spend more time with you

As you can see, there is a loop in the decision process. This cannot possibly end well.

A wild emotional rollercoaster is probably around the least bad outcome of this setup.

IV

So how do you fix race conditions?

By creating structure.

By following principles which compute the result explicitly, without unwanted chaotic behaviour.

By removing loops from decision graphs.

First and foremost, by recognizing that leaving a decision to a race condition is strictly worse than any decision process that we consciously design, even if this process is flipping a coin (at least you know the odds!).
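On the programming side of the analogy, "creating structure" usually means making the order of access explicit. The following is a minimal sketch, assuming a shared counter as the contested resource (the function `run` and its parameters are made up for illustration): several threads update the same value, and a lock ensures that only one of them does so at a time, so the result no longer depends on timing.

```python
import threading

def run(workers=4, increments=10_000):
    """Increment a shared counter from several threads, guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def work():
        nonlocal total
        for _ in range(increments):
            # The lock is the "organised principle": only one thread at a
            # time may read and update the shared value.
            with lock:
                total += 1

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total

if __name__ == "__main__":
    print(run())
```

With the lock in place, the final count is always `workers * increments`. Without it, the read-modify-write steps of different threads could interleave and some updates could be lost.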

Example: deciding to continue the relationship.

Proposed solution (arrows represent influence):

(1) controlled, long-distance emotional evaluation -> (2) systemic decision -> (3) day-to-day emotions

The idea is to remove the loop by organising emotions into two groups: those that are directly influenced by the decision or its consequences (3), and more distant "evaluation" emotions (1). A possibility to feel emotions as in (1) can be created by pre-deciding a time to be alone and judge the situation from more distance, e.g. "after 6 months of this relationship I will go for a 2 week vacation to my aunt in France, and think about it in a clear-headed way, making sure I consider emotions about the general picture, not day-to-day things like physical affection etc.".
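The difference between the looped decision process and the proposed one can be checked mechanically. Below is a small sketch, assuming each process is written down as a graph of "X influences Y" edges (the node names and the helper `evaluation_order` are illustrative, not part of any established method). It uses the standard library's `graphlib` (Python 3.9+) to find an order in which everything can be evaluated, and reports `None` when a loop makes that impossible.

```python
from graphlib import TopologicalSorter, CycleError

def evaluation_order(influences):
    """Return an order in which nodes can be evaluated, or None if there is a loop.

    `influences` maps each node to the set of nodes it influences.
    """
    # TopologicalSorter expects node -> predecessors, so invert the edges.
    predecessors = {}
    for src, targets in influences.items():
        predecessors.setdefault(src, set())
        for dst in targets:
            predecessors.setdefault(dst, set()).add(src)
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None

# The looped process: each step feeds the next, and the last feeds the first.
looped = {
    "time together": {"liking"},
    "liking": {"wanting to continue"},
    "wanting to continue": {"time together"},
}

# The proposed structure: (1) -> (2) -> (3), with no edge back to (1).
structured = {
    "distant evaluation": {"systemic decision"},
    "systemic decision": {"day-to-day emotions"},
}

if __name__ == "__main__":
    print(evaluation_order(looped))
    print(evaluation_order(structured))
```

The looped graph has no valid evaluation order, while the structured one can be evaluated in exactly the order the arrows suggest.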

V

There is much to write on this topic, so please excuse my brevity (esp. in the last part, giving some examples of systemic thinking about emotions) - there is easily enough content about this to fill a book (or two). But I hope I gave you some idea.

Time end: 15:15:42

Writing stats: 31 minutes, 23 wpm, 133 cpm

Weekly LW Meetups

0 FrankAdamek 28 October 2016 03:47PM

Your Truth Is Not My Truth

0 Bound_up 28 October 2016 01:35PM

Can someone help me dissolve this, and give insight into how to proceed with someone who says this?

 

What are they saying, exactly? That the set of beliefs in their head that they use to make decisions is not the same set of beliefs that you use to make decisions?

 

Could I say something like "Yes, that's so, but how do you know that your truth matches what is in the real world? Is there some way to know that your truth isn't only true for you, and not actually true for everybody?"

 

I'm trying to get a feel for what they mean by "true" in this case, since it's obviously not "matching reality."

[Link] Slashdot: Study Finds Little Lies Lead To Bigger Ones

0 Gunnar_Zarncke 26 October 2016 06:53AM

[Link] Scientists Create AI Program That Can Predict Human Rights Trials With 79 Percent Accuracy

0 Gunnar_Zarncke 26 October 2016 06:47AM

Philosophical theory with an empirical prediction

-1 mgin 28 October 2016 04:14PM

I have a philosophical theory which implies some things empirically about quantum physics, and I was wondering if anyone knowledgeable on the subject could give me some insight.

It goes something like this:

As an anathema to reductionists, quarks (and by "quarks" I just mean whatever the fundamental particles of the universe are) are not governed by simple rules a la Conway's Game of Life; rather, all of metaphysics goes into their behavior.

The reductionist basically reduces metaphysics to the simple rules that govern quarks. Fundamentally there is no other identity or causality, and everything else is just emergent from that. Anything we want to call "real" that we deal with in ordinary experience does not have any metaphysical identity or causal efficacy of its own; it's just an illusion produced by tons of atoms bouncing around. If the universe is akin to Conway's Game of Life, then I don't think the things we see around us are actually what we think they are. They don't have any real identity on a metaphysical level, but rather they are just patterns of particles in motion, governed by mathematically simple rules.

But suppose there actually is metaphysical identity and causal power in the things around us. The place I can see for that is in the unknown rules governing quarks: they are not mathematically simple rules, but are literally where all of metaphysics is contained. Quarks entangle together according to high-level concepts corresponding to the things we see around us, including a person's identity, and have not the mathematically simple causal powers of Conway's Game of Life, but the causal powers of the identity of the high-level agent.

The empirical question is this: do we observe the fundamental particles of the universe behaving according to mathematically simple rules, or do they seem to behave in complex/unpredictable ways depending on how they are entangled / what they are interacting with?

 

Adding an example to clarify:

The behavior of the quarks corresponds to the identity of the things we see around us. The things we see around us are constituted by quarks - but the question is, are these quarks behaving mindlessly as billiard balls, or is their behavior the result of complex rules corresponding to the identity of the thing they form?

In other words, suppose we're talking about a living ant, are the quarks which constitute that ant behaving according to simple mathematical rules like billiard balls, and the whole concept of there being an "ant" is just an illusion produced by these particles bouncing around, or are these quarks constituting the ant actually behaving "ant-like"?

Is the causal behavior of the ant determined by the billiard-ball interactions of quarks bouncing around, or does the causal behavior actually originate in the identity of the ant, with the quark interactions being decided according to its nature?

What I'm saying is that there metaphysically is such a thing as an ant: when quarks "get together as an ant", they behave differently, they behave ant-like. Given that much is unknown about exactly why quarks behave the way they do, why is this ruled out: that when they "get together as an ant", they behave ant-like?

Basically the idea is, when it comes to the interactions of the quarks constituting the ant with the quarks constituting the things the ant interacts with, the behavior of those interactions is determined not by simple, universal rules of quark behavior, but by the rules of quark behavior that are in effect "when the quarks are an ant".

To further clarify this example:

This is framed in general terms, because I don't actually know any quantum physics, but I'm talking about the fundamental physical particles ("quarks", for lack of a better term), and their behavior at the quantum level - behavior which we don't fully understand. So one could say in general terms, sometimes the quarks "swerve left" and other times they "swerve right", and we don't exactly know why they do that in any given case.

So the question is, suppose the behavior of quarks in general is not determined by simple, universal laws of quark behavior, e.g. "always swerve left 50% of the time", but rather, there are metaphysically real and physically meaningful "quark groups", like if a bunch of quarks are entangled together in a group constituting what we'd observe to be an ant, then quarks in that quark group behave differently. So for example, the quarks in that "ant quark group" might always swerve left when they interact with another quark group of a different kind.

Trying to find a short story

-1 mgin 25 October 2016 02:27AM

It's a story about a boy who is into science and transhumanism, and a girl he told about all these crazy things that were going to happen. He dies, and all of the things he talked about start to happen. She ends up floating around Saturn, remembering him.

Either he or she was in a wheelchair. He was dying, and he was disappointed to be dying because of all the cool stuff that was going to happen that she was going to be around for, and some of it had to do with whatever problem she had that was going to get fixed.

Please help me find this story if you can.
