You're looking at Less Wrong's discussion board. This includes all posts, including those that haven't been promoted to the front page yet. For more information, see About Less Wrong.

[Link] arguman.org, an argument analysis platform

1 dyokomizo 19 October 2015 03:46PM

I recently found out about argumanIt's an online tool to dissect arguments and structure agreement and refutation.

It seems like something that's been discussed about in LW some times in the past.

AI: requirements for pernicious policies

7 Stuart_Armstrong 17 July 2015 02:18PM

Some have argued that "tool AIs" are safe(r). Recently, Eric Drexler decomposed AIs into "problem solvers" (eg calculators), "advisors" (eg GPS route planners), and actors (autonomous agents). Both solvers and advisors can be seen as examples of tools.

People have argued that tool AIs are not safe. It's hard to imagine a calculator going berserk, no matter what its algorithm is, but it's not too hard to come up with clear examples of dangerous tools. This suggests the solvers vs advisors vs actors (or tools vs agents, or oracles vs agents) is not the right distinction.

Instead, I've been asking: how likely is the algorithm to implement a pernicious policy? If we model the AI as having an objective function (or utility function) and algorithm that implements it, a pernicious policy is one that scores high in the objective function but is not at all what is intended. A pernicious function could be harmless and entertaining or much more severe.

I will lay aside, for the moment, the issue of badly programmed algorithms (possibly containing its own objective sub-functions). In any case, to implement a pernicious function, we have to ask these questions about the algorithm:

  1. Do pernicious policies exist? Are there many?
  2. Can the AI find them?
  3. Can the AI test them?
  4. Would the AI choose to implement them?

The answer to 1. seems to be trivially yes. Even a calculator could, in theory, output a series of messages that socially hack us, blah, take over the world, blah, extinction, blah, calculator finishes its calculations. What is much more interesting is some types of agents have many more pernicious policies than others. This seems the big difference between actors and other designs. An actor AI in complete control of the USA or Russia's nuclear arsenal has all sort of pernicious policies easily to hand; an advisor or oracle has much fewer (generally going through social engineering), a tool typically even less. A lot of the physical protection measures are about reducing the number of sucessfull pernicious policies the AI has a cess to.

The answer to 2. is mainly a function of the power of the algorithm. A basic calculator will never find anything dangerous: its programming is simple and tight. But compare an agent with the same objective function and the ability to do an unrestricted policy search with vast resources... So it seems that the answer to 2. does not depend on any solver vs actor division, but purely on the algorithm used.

And now we come to the big question 3., whether the AI can test these policies. Even if the AI can find pernicious policies that rank high on its objective function, it will never implement them unless it can ascertain this fact. And there are several ways it could do so. Let's assume that a solver AI has a very complicated objective function - one that encodes many relevant facts about the real world. Now, the AI may not "care" about the real world, but it has a virtual version of that, in which it can virtually test all of its policies. With a detailed enough computing power, it can establish whether the pernicious policy would be effective at achieving its virtual goal. If this is a good approximation of how the pernicious policy would behave in the real world, we could have a problem.

But extremely detailed objective functions are unlikely. But even simple ones can show odd behaviour if the agents gets to interact repeatedly with the real world - this is the issue with reinforcement learning. Suppose that the agent attempts a translation job, and is rewarded on the accuracy of its translation. Depending on the details of what the AI knows and who choose the rewards, the AI could end up manipulating its controllers, similarly to this example. The problem is that one there is any interaction, all the complexity of humanity could potentially show up in the reward function, even if the objective function is simple.

Of course, some designs make this very unlikely - resetting the AI periodically can help to alleviate the problem, as can choosing more objective criteria for any rewards. Lastly on this point, we should mention the possibility that human R&D, by selecting and refining the objective function and the algorithm, could take the roll of testing the policies. This is likely to emerge only in cases where many AI designs are considered, and the best candiates are retained based on human judgement.

Finally we come to the question of whether the AI will implement the policy if it's found it and tested it. You could say that the point of FAI is to create an AI that doesn't choose to implement pernicious policies - but, more correctly, the point of FAI is to ensure that very few (or zero) pernicious policies exist in the first place, as they all score low on the utility function. However, there are a variety of more complicated designs - satisficers, agents using crude measures - where the questions of "Do pernicious policies exist?" and "Would the AI choose to implement them?" could become quite distinct.

 

Conclusion: a more through analysis of AI designs is needed

A calculator is safe, because it is a solver, it has a very simple objective function, with no holes in the algorithm, and it can neither find nor test any pernicious policies. It is the combination of these elements that makes it almost certainly safe. If we want to make the same claim about other designs, neither "it's just a solver" or "it's objective function is simple" would be enough; we need a careful analysis.

Though, as usual, "it's not certainly safe" is a quite distinct claim from "it's (likely) dangerous", and they should not be conflated.

[Link] 3 Short Walking Breaks Can Reverse Harm From 3 Hours of Sitting

16 Gunnar_Zarncke 10 September 2014 10:26AM

I found the below link which is in the spirit of Lifestyle interventions to increase longevity:

3 Short Walking Breaks Can Reverse Harm From 3 Hours of Sitting"

The /.-summary:

Medical researchers have been steadily building evidence that prolonged sitting is awful for your health. One major problem is that blood can pool in the legs of a seated person, causing arteries to start losing their ability to control the rate of blood flow. A new experimental study (abstract) has discovered it's quite easy to negate these detrimental health effects: all you need to do is take a leisurely, 5-minute walk for every hour you sit. "The researchers were able to demonstrate that during a three-hour period, the flow-mediated dilation, or the expansion of the arteries as a result of increased blood flow, of the main artery in the legs was impaired by as much as 50 percent after just one hour. The study participants who walked for five minutes for each hour of sitting saw their arterial function stay the same — it did not drop throughout the three-hour period. Thosar says it is likely that the increase in muscle activity and blood flow accounts for this."

One way to incorporate this into ones habits is to use WorkRave.

 

 

 

Make evidence charts, not review papers? [Link]

14 XiXiDu 04 September 2011 01:26PM

How do you get on top of the literature associated with a controversial scientific topic? For many empirical issues, the science gives a conflicted picture. Like the role of sleep in memory consolidation, the effect of caffeine on cognitive function, or the best theory of a particular visual illusion. To form your own opinion, you’ll need to become familiar with many studies in the area.

You might start by reading the latest review article on the topic. Review articles provide descriptions of many relevant studies. Also, they usually provide a nice tidy story that seems to bring the literature all together into a common thread- that the author’s theory is correct! Because of this bias, a review article may not help you much to make an independent evaluation of the evidence. And the evidence usually isn’t all there. Review articles very rarely describe, or even cite, all the relevant studies. Unfortunately, if you’re just getting started, you can’t recognize which relevant studies the author didn’t cite.

[...]

Hal Pashler and I have created, together with Chris Simon of the Scotney Group who did the actual programming, a tool that addresses these problems. It allows one to create systematic reviews of a topic, without having to write many thousands of words, and without having to weave all the studies together with a narrative unified by a single theory. You do it all in a tabular form called an ‘evidence chart’. Evidence charts are an old idea, closely related to the “analysis of competing hypotheses” technique. Our evidencechart.com website is fully functioning and free to all, but it’s in beta and we’d love any feedback.

More: alexholcombe.wordpress.com/2010/09/02/make-evidence-charts-not-review-papers/

Example: What is the role of sleep on hippocampus-dependent memory consolidation?

I thought this was an interesting idea. Do you think it would be possible and useful to create an evidence chart for risks from AI, existential risks in general and other topics on lesswrong?

Link: The Uncertain Future - "The Future According to You"

6 XiXiDu 13 January 2011 05:44PM

Visualizing "The Future According to You"

The Uncertain Future is a future technology and world-modeling project by the Singularity Institute for Artificial Intelligence. Its goal is to allow those interested in future technology to form their own rigorous, mathematically consistent model of how the development of advanced technologies will affect the evolution of civilization over the next hundred years. To facilitate this, we have gathered data on what experts think is going to happen, in such fields as semiconductor development, biotechnology, global security, Artificial Intelligence and neuroscience. We invite you, the user, to read about the opinions of these experts, and then come to your own conclusion about the likely destiny of mankind.

Link: theuncertainfuture.com