
Comment author: Elo 31 July 2015 04:52:46AM *  4 points [-]

I travelled to a different city for a period of a few days and realised I should actively avoid trying to gather geographical information (above a rough sense) to free up my brain space for more important things. Then I realised I should do that near home as well.

Two-part question:

  1. What do you outsource that is common and uncommon among people that you know?
  2. What should you be avoiding keeping in your brain that you currently are? (some examples might be birthdays, what day of the week it is, city-map-location, schedules/calendars, task lists, shopping lists)

And while we are at it: What automated systems have you set up?

Comment author: Kaj_Sotala 04 August 2015 12:45:44PM 1 point [-]

I was under the impression that "brain space" was unlimited for all practical intents and purposes, and that having more stuff in your brain might actually even make extra learning easier - e.g. I've often heard it said that a person loses fluid intelligence when they age, but this is compensated by them having more knowledge that they can connect new things with. Do you know of studies to the contrary?

Comment author: So8res 30 July 2015 05:08:23PM *  7 points [-]

Thanks for the reply, Jacob! You make some good points.

Why not test safety long before the system is superintelligent? - say when it is a population of 100 child like AGIs. As the population grows larger and more intelligent, the safest designs are propagated and made safer.

I endorse eli_sennesh's response to this part :-)

This again reflects the old 'hard' computer science worldview, and obsession with exact solutions.

I am not under the impression that there are "exact solutions" available, here. For example, in the case of "building world-models," you can't even get "exact" solutions using AIXI (which does Bayesian inference using a simplicity prior in order to guess what the environment looks like; and can never figure it out exactly). And this is in the simplified setting where AIXI is large enough to contain all possible environments! We, by contrast, need to understand algorithms which allow you to build a world model of the world that you're inside of; exact solutions are clearly off the table (and, as eli_sennesh notes, huge amounts of statistical modeling are on it instead).
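The kind of inference AIXI does can be illustrated with a toy sketch (this is emphatically not AIXI itself, which requires incomputable Solomonoff induction; the hypothesis set, description lengths, and likelihoods below are all made up for illustration): Bayesian updating over a finite set of candidate environments, weighted by a simplicity prior of 2^(-description length).

```python
# Toy illustration (not AIXI): Bayesian inference over a finite set of
# candidate environments, weighted by a simplicity prior 2^(-length).
# Hypotheses and numbers here are hypothetical, chosen for illustration.

def simplicity_prior(hypotheses):
    """Weight each hypothesis proportionally to 2^(-description_length)."""
    weights = {h: 2.0 ** -length for h, (length, predict) in hypotheses.items()}
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}

def update(posterior, hypotheses, observation):
    """Bayesian update: reweight each hypothesis by how well it predicted."""
    new = {h: posterior[h] * hypotheses[h][1](observation) for h in posterior}
    total = sum(new.values())
    return {h: w / total for h, w in new.items()}

# Two candidate environments: a shorter "all-zeros" program and a longer
# "all-ones" one. Each maps an observation to a predicted probability.
hypotheses = {
    "all-zeros": (3, lambda obs: 0.99 if obs == 0 else 0.01),
    "all-ones":  (5, lambda obs: 0.99 if obs == 1 else 0.01),
}

posterior = simplicity_prior(hypotheses)   # simplicity favours "all-zeros"
for obs in [1, 1, 1]:                      # but the evidence favours "all-ones"
    posterior = update(posterior, hypotheses, obs)
```

Even in this trivial setting the agent only ever has a posterior over environments, never an exact identification; real agents embedded in the world face a strictly harder version of the same problem.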

I would readily accept a statistical-modeling-heavy answer to the question of "but how do you build multi-level world-models from percepts, in principle?"; and indeed, I'd be astonished if you avoided it.

Perhaps you read "we need to know how to do X in principle before we do it in practice" as "we need a perfect algorithm that gives you bit-exact solutions to X"? That's an understandable reading; my apologies. Let me assure you again that we're not under the illusion you can get bit-exact solutions to most of the problems we're working on.

For example - perhaps using lots and lots of computing power makes the problem harder instead of easier. How could that be? Because with lots and lots of compute power, you are naturally trying to extrapolate the world model far far into the future, where it branches enormously [...]

Hmm. If you have lots and lots of computing power, you can always just... not use it. It's not clear to me how additional computing power can make the problem harder -- at worst, it can make the problem no easier. I agree, though, that algorithms for modeling the world from the inside can't just extrapolate arbitrarily, on pain of exponential complexity; so whatever it takes to build and use multi-level world-models, it can't be that.

Perhaps the point where we disagree is that you think these hurdles suggest that figuring out how to do things we can't yet do in principle is hopeless, whereas I'm under the impression that these shortcomings highlight places where we're still confused?

In response to comment by So8res on MIRI's Approach
Comment author: Kaj_Sotala 02 August 2015 11:28:18PM *  4 points [-]

Hmm. If you have lots and lots of computing power, you can always just... not use it. It's not clear to me how additional computing power can make the problem harder -- at worst, it can make the problem no easier.

Additional computing power might not make the problem literally harder, but the assumption of limitless computing power might direct your attention towards wrong parts of the search space.

For example, I suspect that the whole question about multilevel world-models might be something that arises from conceptualizing intelligence as something like AIXI, which implicitly assumes that there's only one true model of the world. It can do this because it has infinite computing power and can just replace its high-level representation of the world with one where all high-level predictions are derived from the basic atom-level interactions, something that would be intractable for any real-world system to do. Instead real-world systems will need to flexibly switch between different kinds of models depending on the needs of the situation, and use lower-level models in situations where the extra precision is worth the expense of extra computing time. Furthermore, those lower-level models will have been defined in terms of what furthers the system's goals, as defined on the higher-levels: it will pay preferential attention to those features of the lower-level model that allow it to further its higher-level goals.
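The "switch between models depending on the needs of the situation" idea can be sketched as a cost-aware model selection rule (a minimal, hypothetical sketch; the model names, precision numbers, and costs are all invented for illustration):

```python
# Hypothetical sketch of cost-aware model selection: use the cheapest
# model whose precision is adequate for the current decision, paying for
# a lower-level (more precise, more expensive) model only when needed.

from dataclasses import dataclass

@dataclass
class Model:
    name: str
    precision: float   # how fine-grained its predictions are (0..1)
    cost: float        # compute cost per query

def choose_model(models, required_precision):
    """Cheapest model meeting the precision the situation demands."""
    adequate = [m for m in models if m.precision >= required_precision]
    if adequate:
        return min(adequate, key=lambda m: m.cost)
    return max(models, key=lambda m: m.precision)  # best we can do

models = [
    Model("folk-physics", precision=0.3, cost=1.0),
    Model("rigid-body",   precision=0.7, cost=10.0),
    Model("molecular",    precision=0.99, cost=1e6),
]

everyday = choose_model(models, required_precision=0.2)  # cheap model suffices
lab_work = choose_model(models, required_precision=0.9)  # forced to a lower level
```

The point of the sketch is only that model choice is itself a resource-allocation decision, which an agent with unlimited compute never has to face.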

In the AIXI framing, the question of multilevel world-models is "what happens when the AI realizes that the true world model doesn't contain carbon atoms as an ontological primitive". In the resource-limited framing, that whole question isn't even coherent, because the system has no such thing as a single true world-model. Instead the resource-limited version of how to get multilevel world-models to work is something like "how to reliably ensure that the AI will create a set of world models in which the appropriate configuration of subatomic objects in the subatomic model gets mapped to the concept of carbon atoms in the higher-level model, while the AI's utility function continues to evaluate outcomes in terms of this concept regardless of whether it's using the lower- or higher-level representation of it".
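The resource-limited framing of the question can be made concrete with a toy sketch (all the structures and names below are hypothetical, standing in for vastly more complicated models): a mapping from low-level configurations to a high-level concept, with a utility function defined over the concept so that it evaluates either representation consistently.

```python
# Hypothetical sketch: "carbon" exists in a high-level model directly,
# and is recovered from a low-level model by a mapping. The utility
# function is defined over the high-level concept, so it gives the same
# verdict whichever representation the agent is currently using.

def low_to_high(low_world):
    """Map a low-level configuration to high-level concepts."""
    carbon = sum(1 for nucleus in low_world["nuclei"] if nucleus["protons"] == 6)
    return {"carbon_atoms": carbon}

def utility(high_world):
    """Utility is defined over the high-level concept only."""
    return high_world["carbon_atoms"]

# The same situation, described at two levels:
low_world = {"nuclei": [{"protons": 6}, {"protons": 8}, {"protons": 6}]}
high_world = {"carbon_atoms": 2}

same_verdict = utility(high_world) == utility(low_to_high(low_world))
```

The open problem, on this framing, is ensuring that the learned mapping reliably picks out the intended concept, not writing the mapping down by hand as done here.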

As an aside, this reframed version seems like the kind of question that you would need to solve in order to have any kind of AGI in the first place, and one which experimental machine learning work would seem the best suited for, so I'd assume it to get naturally solved by AGI researchers even if they weren't directly concerned with AI risk.

In response to comment by LawrenceC on MIRI's Approach
Comment author: So8res 30 July 2015 09:55:29PM 5 points [-]

It's cited a lot in MIRI's writing because it's the first example that pops to my mind, and I'm the one who wrote all the writings where it appears :-p

For other examples, see maybe "Artificial Evolution in the Physical World" (Thompson, 1997) or "Computational Genetics, Physiology, Metabolism, Neural Systems, Learning, Vision, and Behavior or PolyWorld: Life in a New Context." (Yaeger, 1994). IIRC.

In response to comment by So8res on MIRI's Approach
Comment author: Kaj_Sotala 31 July 2015 08:58:59AM *  3 points [-]

Note that always citing only one example easily gives the impression that it's the only example you know of, or that this is an isolated special case, so at least briefly mentioning the existence of others could be better.

Comment author: Stuart_Armstrong 15 July 2015 09:53:31AM 0 points [-]

Because FAI's can change themselves very effectively in ways that we can't.

It might be that a human brain running as computer software would have the same issues.

Comment author: Kaj_Sotala 15 July 2015 01:02:16PM *  2 points [-]

Because FAI's can change themselves very effectively in ways that we can't.

Doesn't mean the FAI couldn't remain genuinely uncertain about some value question, or consider it not worth solving at this time, or run into new value questions due to changed circumstances, etc.

All of those could prevent reflective equilibria, while still being compatible with the ability for extensive self-modification.

Comment author: shminux 13 July 2015 03:15:44PM 10 points [-]

As you mention, so far every attempt by humans to have a self-consistent value system (the process also known as decompartmentalization) results in less-than-desirable outcomes. What if the end goal of having a thriving long-lasting (super-)human(-like) society is self-contradictory, and there is no such thing as both "nice" and "self-referentially stable"? Maybe some effort should be put into figuring out how to live, and thrive, while managing the unstable self-reference and possibly avoid convergence altogether.

Comment author: Kaj_Sotala 15 July 2015 06:59:46AM 10 points [-]

A thought I've been thinking about lately, derived from a reinforcement learning view of values, and also somewhat inspired by Nate's recent post on resting in motion: value convergence seems to suggest a static endpoint, with some set of "ultimate values" that we'll eventually reach and hold ever after. But so far societies have never reached such a point, and if our values are an adaptation to our environment (including the society and culture we live in), then it would suggest that as long as we keep evolving and developing, our values will keep changing and evolving with us, without there being any meaningful endpoint.

There will always (given our current understanding of physics) be only a finite amount of resources available, and unless we either all merge into one enormous hivemind or get turned into paperclips, there will likely be various agents with differing preferences on what exactly to do with those resources. As the population keeps changing and evolving, the various agents will keep acquiring new kinds of values, and society will keep rearranging itself to a new compromise between all those different values. (See: the whole history of the human species so far.)

Possibly we shouldn't so much try to figure out what we'd prefer the final state to look like, but rather what we'd prefer the overall process to look like.

(The bias towards trying to figure out a convergent end-result for morality might have come from LW's historical tendency to talk and think in terms of utility functions, which implicitly assume a static and unchanging set of preferences, glossing over the fact that human preferences keep constantly changing.)

Comment author: Wei_Dai 14 July 2015 03:39:50AM 2 points [-]

Anyone know more about this proposal from IDSIA?

Technical Abstract: "Whenever one wants to verify that a recursively self-improving system will robustly remain benevolent, the prevailing tendency is to look towards formal proof techniques, which however have several issues: (1) Proofs rely on idealized assumptions that inaccurately and incompletely describe the real world and the constraints we mean to impose. (2) Proof-based self-modifying systems run into logical obstacles due to Löb's theorem, causing them to progressively lose trust in future selves or offspring. (3) Finding nontrivial candidates for provably beneficial self-modifications requires either tremendous foresight or intractable search.

Recently a class of AGI-aspiring systems that we call experience-based AI (EXPAI) has emerged, which fix/circumvent/trivialize these issues. They are self-improving systems that make tentative, additive, reversible, very fine-grained modifications, without prior self-reasoning; instead, self-modifications are tested over time against experiential evidence and slowly phased in when vindicated or dismissed when falsified. We expect EXPAI to have high impact due to its practicality and tractability. Therefore we must now study how EXPAI implementations can be molded and tested during their early growth period to ensure their robust adherence to benevolence constraints.

I did some searching but Google doesn't seem to know anything about this "EXPAI".
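The "tentative, additive, reversible, very fine-grained modifications" scheme the abstract describes can be sketched as a simple trial loop (a hypothetical sketch of my own, not anything from the proposal; the system, its parameter, and the evaluation function are all invented for illustration):

```python
# Hypothetical sketch of the EXPAI-style scheme in the quoted abstract:
# a tentative, reversible self-modification is phased in only when
# experience vindicates it, and dismissed when it tests worse.
import random

def trial_modification(system, modification, evaluate, trials=20):
    """Tentatively apply a modification; keep it only if it tests better."""
    baseline = sum(evaluate(system) for _ in range(trials)) / trials
    candidate = modification(system)                 # additive, reversible
    score = sum(evaluate(candidate) for _ in range(trials)) / trials
    return candidate if score > baseline else system  # vindicate or dismiss

rng = random.Random(0)
system = {"step_size": 1.0}

def evaluate(s):
    # Noisy performance measure with an optimum at step_size == 0.5.
    return -abs(s["step_size"] - 0.5) + rng.gauss(0, 0.01)

def halve_step(s):
    return {**s, "step_size": s["step_size"] / 2}

system = trial_modification(system, halve_step, evaluate)  # improvement: kept
system = trial_modification(system, halve_step, evaluate)  # regression: reverted
```

Note how this sidesteps the Löbian proof obstacles the abstract mentions by replacing proof with empirical testing, at the cost of only catching modifications whose harms show up during the trial period.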

Comment author: Kaj_Sotala 14 July 2015 06:29:15AM 2 points [-]

I didn't find anything on EXPAI either, but there's the PI's list of previous publications. At least his Bounded Seed-AGI paper sounds somewhat related:

Abstract. Four principal features of autonomous control systems are left both unaddressed and unaddressable by present-day engineering methodologies: (1) The ability to operate effectively in environments that are only partially known at design time; (2) A level of generality that allows a system to re-assess and redefine the fulfillment of its mission in light of unexpected constraints or other unforeseen changes in the environment; (3) The ability to operate effectively in environments of significant complexity; and (4) The ability to degrade gracefully— how it can continue striving to achieve its main goals when resources become scarce, or in light of other expected or unexpected constraining factors that impede its progress. We describe new methodological and engineering principles for addressing these shortcomings, that we have used to design a machine that becomes increasingly better at behaving in underspecified circumstances, in a goal-directed way, on the job, by modeling itself and its environment as experience accumulates. The work provides an architectural blueprint for constructing systems with high levels of operational autonomy in underspecified circumstances, starting from only a small amount of designer-specified code—a seed. Using value-driven dynamic priority scheduling to control the parallel execution of a vast number of lines of reasoning, the system accumulates increasingly useful models of its experience, resulting in recursive self-improvement that can be autonomously sustained after the machine leaves the lab, within the boundaries imposed by its designers. A prototype system named AERA has been implemented and demonstrated to learn a complex real-world task—real-time multimodal dialogue with humans—by on-line observation. Our work presents solutions to several challenges that must be solved for achieving artificial general intelligence.

Comment author: chaosmage 13 July 2015 03:05:06PM *  3 points [-]
Couldn't another class of solutions be that resolutions of inconsistencies cannot reduce the complexity of the agent's morality? I.e. morality has to be (or tend to become) not only (more) consistent, but also (more) complex, sort of like an evolving body of law rather than like the Ten Commandments?

Comment author: Kaj_Sotala 13 July 2015 05:14:16PM 1 point [-]

morality has to be (or tend to become) not only (more) consistent, but also (more) complex

It's not clear to me that one can usefully distinguish between "more consistent" and "less complex".

Suppose that someone felt that morality dictated one set of behaviors for people of one race, and another set of behaviors for people of another race. Eliminating that distinction to have just one set of morals that applied to everyone might be considered by some to increase consistency, while reducing complexity.

That said, it all depends on what formal definition one adopts for consistency in morality: this doesn't seem to me a well-defined concept, even though people talk about it as if it was. (Clearly it can't be the same as consistency in logic. An inconsistent logical system lets you derive any conclusion, but even if a human is inconsistent WRT some aspect of their morality, it doesn't mean they wouldn't be consistent in others. Inconsistency in morality doesn't make the whole system blow up the way logical inconsistency does.)
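The contrast with logical inconsistency can be made precise: in classical logic a single contradiction lets you derive any proposition at all (the principle of explosion), as in this small Lean sketch:

```lean
-- Principle of explosion: from P and ¬P together, any Q follows.
example (P Q : Prop) (h : P) (hn : ¬P) : Q :=
  absurd h hn
```

Nothing analogous holds for a person with two conflicting moral intuitions, which is why "consistency" in morality needs its own definition.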

Comment author: DeVliegendeHollander 09 July 2015 02:11:06PM -2 points [-]

Taboo "psychedelic". What kind of images are they?

Comment author: Kaj_Sotala 09 July 2015 03:25:29PM 3 points [-]

Comment author: Squark 03 June 2015 06:18:06PM 0 points [-]

I am not sure what you mean by "not even wrong". My interpretation of consequentialism is just following an algorithm which is designed to maximize a certain utility function. Of course you can get different kinds of consequentialism by using different utility functions. In what sense is it "not even wrong"? Or do you consider a narrower definition of consequentialism?

Comment author: Kaj_Sotala 08 July 2015 06:23:05PM 1 point [-]

I am not sure what you mean by "not even wrong".

I didn't answer this at first because I had difficulties putting my intuition to words. But here's a stab at it:

Suppose that at first, people believe that there is a God who has defined some things as sinful and others as non-sinful. And they go about asking questions like, "is brushing my teeth sinful or not", and this makes sense given their general set of beliefs. And a theologian could give a "yes" or "no" answer to that, which could be logically justified if you assumed some specific theology.

Then they learn that there is actually no God, but they still go about asking "is brushing my teeth sinful or not". And this no longer makes sense even as a question, because the definition of "sin" came from a specific theology which assumed the existence of God. And then a claim like "here's a theory which shows that brushing teeth is always sinful" would not even be wrong, because it wasn't making claims about any coherent concept.

Now consequentialists might say that "consequentialism is the right morality everyone should follow", but under this interpretation this wouldn't be any different from saying that "consequentialism is the right theory about what is sinful or not".

Comment author: Kaj_Sotala 07 July 2015 04:52:11PM 11 points [-]

I think that the psychedelic images that DeepDream produces today are just the start of what we'll see with this kind of technology. Wrote a bit about the ways in which it could be used for increasing image quality, putting artists out of work, and of course, generating porn.

