In partially observable environments, stochastic policies can be optimal

5 Stuart_Armstrong 19 July 2016 10:42AM

I always had the informal impression that optimal policies were deterministic (choosing the best option, rather than some mix of options). Of course, this is not the case when facing other agents, but I had the impression it would hold when facing the environment rather than other players.

But stochastic policies can also be needed if the environment is partially observable, at least if the policy is Markov (memoryless). Consider the following POMDP (partially observable Markov decision process):

There are two states, 1a and 1b, and the agent cannot tell which one they're in. Action A in state 1a, or action B in state 1b, gives a reward of -R and keeps the agent in the same place. Action B in state 1a, or action A in state 1b, gives a reward of R and moves the agent to the other state.

The two deterministic policies, always A and always B, earn -R every turn except possibly the first, while the stochastic policy 0.5A + 0.5B earns an expected 0 per turn.

Of course, if the agent can observe the reward, the environment is no longer partially observable (though we can imagine the reward is delayed until later). And the general policy of "alternate A and B" is more effective than the 0.5A + 0.5B policy. Still, that stochastic policy is the best of the memoryless policies available in this POMDP.
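The gap between these policies is easy to check with a quick simulation. This is a sketch with R = 1; the state names, policy names, and helper functions below are my own, not from the post:

```python
import random

R = 1.0

def step(state, action):
    """One transition of the two-state POMDP: in state 'a' the rewarding
    action is B (+R, move to 'b'); in state 'b' it is A (+R, move to 'a').
    The other action gives -R and leaves the state unchanged."""
    good = 'B' if state == 'a' else 'A'
    if action == good:
        return ('b' if state == 'a' else 'a'), R
    return state, -R

def average_return(policy, steps=100_000, seed=0):
    """Average per-turn reward of a (possibly stochastic) policy, starting in 'a'."""
    rng = random.Random(seed)
    state, total = 'a', 0.0
    for _ in range(steps):
        state, reward = step(state, policy(rng))
        total += reward
    return total / steps

always_A = lambda rng: 'A'                   # deterministic memoryless policy
mixed = lambda rng: rng.choice(['A', 'B'])   # stochastic 0.5A + 0.5B

def make_alternating():
    """Memoryful 'alternate A and B' policy, for comparison."""
    last = ['B']
    def policy(rng):
        last[0] = 'A' if last[0] == 'B' else 'B'
        return last[0]
    return policy

print(average_return(always_A))            # -1.0: locked into the bad action
print(average_return(mixed))               # close to 0
print(average_return(make_alternating()))  # close to +1
```

The deterministic policy gets trapped at -R per turn, the 0.5A + 0.5B policy averages 0, and the memoryful alternating policy does better still, matching the ordering in the post.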

[Stub] Ontological crisis = out of environment behaviour?

8 Stuart_Armstrong 13 January 2016 03:10PM

One problem with AI is the possibility of ontological crises: an AI discovering that its fundamental model of reality is flawed, and being unable to cope safely with that change. Another problem is out-of-environment behaviour: an AI that has been trained to behave very well in a specific training environment messes up when introduced to a more general environment.

It suddenly occurred to me that these might in fact be the same problem in disguise. In both cases, the AI has developed certain ways of behaving in reaction to certain regular features of their environment. And suddenly they are placed in a situation where these regular features are absent - either because they realised that these features are actually very different from what they thought (ontological crisis) or because the environment is different and no longer supports the same regularities (out-of-environment behaviour).

In a sense, both these errors may be seen as imperfect extrapolation from partial training data.

Blind Spot: Malthusian Crunch

4 bokov 18 October 2013 01:48PM

In an unrelated thread, one thing led to another and we got onto the subject of overpopulation and carrying capacity. I think this topic needs a post of its own.

TLDR mathy version:

let f(m, t) be the population that can be supported using the fraction m of Earth's theoretical resource limit that we can exploit at technology level t

let t = k(x) be the technology level at year x

let p(x) be the population at year x

What conditions must the constant m and the functions f(m, k(x)), k(x), and p(x) satisfy in order to ensure that f(m, k(x)) - p(x) > 0 for all x > today()? What empirical data are relevant to estimating the probability that these conditions are all satisfied?
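As a toy illustration of this setup (the growth rates, starting values, and functional forms below are invented for the sketch, not estimates of anything):

```python
def k(x):
    """Hypothetical technology level: 2% annual growth, x = years from today."""
    return 1.02 ** x

def f(m, t):
    """Hypothetical carrying capacity: proportional to the exploitable
    fraction m and the technology level t, scaled so f(1, 1) = 10 billion."""
    return 10e9 * m * t

def p(x):
    """Hypothetical population: 1% annual growth from 8 billion."""
    return 8e9 * 1.01 ** x

def margin(m, x):
    """f(m, k(x)) - p(x): positive means carrying capacity still exceeds
    population, i.e. no Malthusian Crunch at year x."""
    return f(m, k(x)) - p(x)

# With these particular numbers, capacity grows faster than population, so the
# margin stays positive over a 200-year horizon; if k(x) grew more slowly than
# p(x), the margin would eventually go negative.
print(all(margin(0.9, x) > 0 for x in range(200)))  # True
```

The interesting empirical questions are which real-world shapes of k(x) and p(x) keep the margin positive, which is exactly what the question above asks.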

Long version:

Here I would like to explore the evidence for and against the possibility that the following assertions are true:

  1. Without human intervention, the carrying capacity of our environment (broadly defined1) is finite while there are no *intrinsic* limits on population growth.
  2. Therefore, if the carrying capacity of our environment is not extended at a sufficient rate to outpace population growth and/or population growth does not slow to a sufficient level that carrying capacity can keep up, carrying capacity will eventually become the limit on population growth.
  3. Abundant data from zoology show that the mechanisms by which carrying capacity limits population growth include starvation, epidemics, and violent competition for resources. If the momentum of population growth carries it past the carrying capacity, an overshoot occurs, meaning that the population size doesn't just settle at a sustainable level but rather plummets drastically, sometimes to the point of extinction.
  4. The above three assertions imply that human intervention (expanding the carrying capacity of our environment in various ways and limiting our birth rates in various ways) is what we have to rely on to prevent the above scenario; let's call it the Malthusian Crunch.
  5. Just as the Nazis have discredited eugenics, mainstream environmentalists have discredited (at least among rationalists) the concept of finite carrying capacity by giving it a cultish stigma. Moreover, solutions that rely on sweeping, heavy-handed regulation have received so much attention (perhaps because the chain of causality is easier to understand) that to many people they seem like the *only* solutions. Finding these solutions unpalatable, they instead reject the problem itself. And by they, I mean us.
  6. The alternative most environmentalists either ignore or outright oppose is deliberately trying to accelerate the rate of technological advancement to increase the "safety zone" between expansion of carrying capacity and population growth. Moreover, we are close to a level of technology that would allow us to start colonizing the rest of the solar system. Obviously any given niche within the solar system will have its own finite carrying capacity, but it will be many orders of magnitude higher than that of Earth alone. Expanding into those niches won't prevent die-offs on Earth, but will at least be a partial hedge against total extinction and a necessary step toward eventual expansion to other star systems.

Please note: I'm not proposing that the above assertions must be true, only that they have a high enough probability of being correct that they should be taken as seriously as, for example, grey goo:

Predictions about the dangers of nanotech made in the 1980s have shown no signs of coming true. Yet there is no known logical or physical reason why they can't come true, so we don't ignore them. We calibrate how much effort should be put into mitigating the risks of nanotechnology by asking what observations should make us update the likelihood we assign to a grey-goo scenario. We approach mitigation strategies from an engineering mindset rather than a political one.

Shouldn't we hold ourselves to the same standard when discussing population growth and overshoot? Substitute in some other existential risks you take seriously. Which of them have an expectation2 of occurring before a Malthusian Crunch? Which of them have an expectation of occurring after?

 

Footnotes:

1: By carrying capacity, I mean finite resources such as easily extractable ores, water, air, EM spectrum, and land area. Certain very slowly replenishing resources such as fossil fuels and biodiversity also behave like finite resources on a human timescale. I also include non-finite resources that expand or replenish at a finite rate such as useful plants and animals, potable water, arable land, and breathable air. Technology expands carrying capacity by allowing us to exploit all resources more efficiently (paperless offices, telecommuting, fuel efficiency), open up reserves that were previously not economically feasible to exploit (shale oil, methane clathrates, high-rise buildings, seasteading), and accelerate the renewal of non-finite resources (agriculture, land reclamation projects, toxic waste remediation, desalinization plants).

2: This is a hard question. I'm not asking which catastrophe is the most likely to happen ever while holding everything else constant (the possible ones will be tied for 1 and the impossible ones will be tied for 0). I'm asking you to mentally (or physically) draw a set of survival curves, one for each catastrophe, with the x-axis representing time and the y-axis representing the fraction of Everett branches where that catastrophe has not yet occurred. Now, which curves are the upper bound on the curve representing the Malthusian Crunch, and which curves are the lower bound? This is how, in my opinion (as an aging researcher and biostatistician, for whatever that's worth), you think about hazard functions, including those for existential hazards. Keep in mind that some hazard functions change over time because they are conditioned on other events or because they are cyclic in nature. This means that the thing most likely to wipe us out in the next 50 years is not necessarily the same as the thing most likely to wipe us out in the 50 years after that. I don't have a formal answer for how to transform that into optimal allocation of resources between mitigation efforts, but that would be the next step.
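The survival-curve picture in this footnote can be sketched numerically. The two hazard functions below are arbitrary examples I chose for illustration, not estimates of any real risk:

```python
import math

def survival(hazard, t, dt=0.01):
    """S(t) = exp(-integral from 0 to t of hazard(u) du), approximated by a
    left Riemann sum: the fraction of branches where the catastrophe has not
    yet occurred by time t."""
    integral = sum(hazard(i * dt) for i in range(int(t / dt))) * dt
    return math.exp(-integral)

constant_hazard = lambda u: 0.01      # a steady background risk per year
rising_hazard = lambda u: 0.0002 * u  # a risk that grows as some trend continues

# At a 50-year horizon the constant hazard is the bigger threat (lower survival);
# by 200 years the rising hazard has overtaken it. The most likely catastrophe
# depends on the time horizon, as the footnote argues.
print(survival(constant_hazard, 50), survival(rising_hazard, 50))
print(survival(constant_hazard, 200), survival(rising_hazard, 200))
```

A time-varying or cyclic hazard just changes the shape of the function passed in; the survival curve and the horizon-dependent ranking fall out the same way.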

 

The Rosenhan Experiment

3 chaosmosis 14 September 2012 10:31PM

I haven't seen any links to this on Lesswrong yet, and I just discovered it myself. It's extremely interesting, and has a lot of implications for how the way people perceive and think of others is largely determined by their environmental context. It's also a fairly good indictment of presumably common psychiatric practices, although those practices are presumably outdated by now. Maybe some of you are already familiar with it, but I thought I'd mention it and post a link for those of you who aren't.

There's probably newer research on this, but I don't have time to investigate it at the moment.

http://en.wikipedia.org/wiki/Rosenhan_experiment

Roommate interest and coordination thread

9 patrickscottshields 02 August 2012 09:22AM

This thread is for the discussion of options for people interested in changing their living environments some time in the next year or so. It's a place to:

  • Share your situation to get an outside view
  • Get on the radar of potential roommates
  • Discuss existing communities or places that may be a good fit
  • Describe what you're looking for in a living environment
  • Post your procedure for deciding where to live
  • Coordinate with others to find compatible roommates
  • Discuss which factors are relevant to deciding where to live
  • Post resources or data relevant to deciding where to live

Whether you're graduating from college, moving for a new job, or looking to further optimize your living environment for other reasons, talking with others can help you identify options, catch inaccurate beliefs or poor reasoning, meet potential roommates, and more. Thanks to everyone who contributes!

(This thread has been on my mind for a while. Reading this recent roommate-seeking post inspired me to actually write and post it. I'll post my own situation in the comments below.)

To discuss the concept of this thread (rather than participating in the thread's intended discussion), please reply to this comment. Credit goes to the open transactions thread and group rationality diary for some of the style and wording of this post.