Short answer: some goals incentivize general intelligence, which incentivizes tracking lots of abstractions and also includes the ability to pick up and use basically-any natural abstractions in the environment at run-time.
Longer answer: one qualitative idea from the Gooder Regulator Theorem is that, for some goals in some environments, the agent won't find out until later what its proximate goals are. As a somewhat-toy example: imagine playing a board game or video game in which you don't find out the win conditions until relatively late into the game. There's still a lot of useful stuff to do earlier on - instrumental convergence means that e.g. accumulating resources and gathering information and building general-purpose tools are all likely to be useful for whatever the win condition turns out to be.
As I understand this argument, even if an agent's abstractions depend on its goals, it doesn't matter, because disparate agents will develop similar instrumental goals due to instrumental convergence. Those goals involve understanding and manipulating the world, and thus require natural abstractions. (And there's the further claim that a general intelligence can in fact pick up any natural abstraction as the need arises.)
That covers instrumental goals, but what about final goals? These can be arbitrary, per the orthogonality thesis. Even if an agent develops a set of natural abstractions for instrumental purposes, if it has non-natural final goals, it will need to develop a supplementary set of non-natural goal-dependent abstractions to describe them as well.
When it comes to an AI modeling human abstractions, it does seem plausible to me that humans' lowest-level final goals/values can be described entirely in terms of natural abstractions, because they were produced by natural selection and so had to support survival & reproduction. It's a bit less obvious to me that this still applies to high-level cultural values (would anyone besides a religious Jew naturally develop the abstraction of "kosher animal"?). In any case, if it's sufficiently important for the AI to model human behavior, it will develop these abstractions for instrumental purposes.
Going the other direction, can humans understand, in terms of our abstractions, those that an AI develops to fulfill its final goals? I think not necessarily, or at least not easily. An unaligned or deceptively aligned mesa-optimizer could have an arbitrary mesa-objective, with no compact description in terms of human abstractions. This matters if the plan is to retarget an AI's internal search process. Identifying the original search target seems like a relevant intermediate step. How else can you determine what to overwrite, and that you won't break things when you do it?
I claim that humans have that sort of "general intelligence". One implication is that, while there are many natural abstractions which we don't currently track (because the world is big, and I can't track every single object in it), there basically aren't any natural abstractions which we can't pick up on the fly if we need to. Even if an AI develops a goal involving molecular squiggles, I can still probably understand that abstraction just fine once I pay attention to it.
This conflates two different claims: (1) that a generally intelligent agent can pick up, on the fly, any natural abstraction it needs for its own purposes, and (2) that it can understand, in terms of the abstractions it already has, an abstraction some other agent developed in pursuit of that agent's goals.
The second doesn't follow from the first. In general, if a new abstraction isn't formulated in terms of lower-level abstractions you already possess, integrating it into your world model (i.e. understanding it) is hard. You first need to understand the entire tower of prerequisite lower-level abstractions it relies on, and that might not be feasible for a bounded agent. This is true whether or not all these abstractions are natural.
In the first case, you have some implicit goal that's guiding your observations and the summary statistics you're extracting. The fundamental reason the second case can be much harder relates to this post's topic: the other agent's implicit goal is unknown, and the space of possible goals is vast. The "ideal gas" toy example misleads here. In that case, there's exactly one natural abstraction (P, V, T), no useful intermediate abstraction levels, and the individual particles are literally indistinguishable, making any non-natural abstractions incoherent. Virtually any goal routes through one abstraction. A realistic general situation may have a huge number of equally valid natural abstractions pertaining to different observables, at many levels of granularity (plus an enormous bestiary of mostly useless non-natural abstractions). A bounded agent learns and employs only the tiny subset of these that helps achieve its goals. Even if all generally intelligent agents share the same potential instrumental goals, and could therefore in principle learn the same natural abstractions, they won't actually do so unless they share the same actual instrumental goals.
Unfortunately I am busy from 2-5 on Sundays, but I would certainly like to attend a future Yale meetup at some other time.
In 2002, Wizards of the Coast put out Star Wars: The Trading Card Game designed by Richard Garfield.
As Richard modeled the game after a miniatures game, it made use of many six-sided dice. In combat, a card's damage was determined by how many six-sided dice it rolled. Wizards chose to stop producing the game due to poor sales. One of the contributing factors identified through market research was that gamers seemed to dislike six-sided dice in their trading card games.
Here's the kicker. When you dug deeper into the comments, they equated dice with "lack of skill." But the game rolled huge numbers of dice, which greatly increased the consistency. (What I mean by this is that if you rolled a million dice, your chance of averaging close to 3.5 is much higher than if you rolled ten.) Players, though, equated lots of dice rolling with the game being "more random," even though that contradicts the actual math.
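A quick simulation makes that parenthetical concrete. This is just a minimal illustrative sketch (the die counts and trial count are arbitrary choices of mine, not anything from the original article): it estimates how far the per-game average roll typically lands from the expected value of 3.5 as the number of dice grows.

```python
import random

def average_roll(num_dice: int) -> float:
    """Average value of num_dice fair six-sided dice."""
    return sum(random.randint(1, 6) for _ in range(num_dice)) / num_dice

def typical_deviation(num_dice: int, trials: int = 2_000) -> float:
    """Average distance of the per-game mean roll from the expected 3.5."""
    return sum(abs(average_roll(num_dice) - 3.5) for _ in range(trials)) / trials

for n in (1, 10, 100, 1_000):
    print(f"{n:>5} dice: mean roll typically lands within {typical_deviation(n):.3f} of 3.5")
```

The typical deviation shrinks roughly like 1/sqrt(n), which is exactly why rolling more dice makes a game's outcomes less swingy, not more.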
Why is there that knee-jerk rejection of any effort to "overthink" pop culture? Why would you ever be afraid that looking too hard at something will ruin it? If the government built a huge, mysterious device in the middle of your town and immediately surrounded it with a fence that said, "NOTHING TO SEE HERE!" I'm pretty damned sure you wouldn't rest until you knew what the hell that was -- the fact that they don't want you to know means it can't be good.
Well, when any idea in your brain defends itself with "Just relax! Don't look too close!" you should immediately be just as suspicious. It usually means something ugly is hiding there.
"How is it possible! How is it possible to produce such a thing!" he repeated, increasing the pressure on my skull, until it grew painful, but I didn't dare object. "These knobs, holes...cauliflowers -" with an iron finger he poked my nose and ears - "and this is supposed to be an intelligent creature? For shame! For shame, I say!! What use is a Nature that after four billion years comes up with THIS?!"
Here he gave my head a shove, so that it wobbled and I saw stars.
"Give me one, just one billion years, and you'll see what I create!"
That's certainly true. It seems to me that in this case, sbenthall was describing entities more akin to Google than to the Yankees or to the Townsville High School glee club; "corporations" is over-narrow but accurate, while "organizations" is over-broad and imprecise.
I think that, as a general rule, specific examples and precise language improve an argument.
I get the sense that "organization" is more or less a euphemism for "corporation" in this post. I understand that the term could have political connotations, but it's hard (for me at least) to easily evaluate an abstract conclusion like "many organizations are of supra-human intelligence and strive actively to enhance their cognitive powers" without trying to generate concrete examples. Imprecise terminology inhibits this.
When you quote lukeprog saying
It would be a kind of weird corporation that was better than the best human or even the median human at all the things that humans do. [Organizations] aren’t usually the best in music and AI research and theorem proving and stock markets and composing novels.
should the word "corporation" in the first sentence be "[organization]"?
The typing quirks actually serve a purpose in the comic. Almost all communication among the characters takes place through chat logs, so the system provides a handy way to visually distinguish who's speaking. They also reinforce each character's personality and thematic associations - for example, the character quoted above (Aranea) is associated with spiders, arachnids in general, and the zodiac sign of Scorpio.
Unfortunately, all that is irrelevant in the context of a Rationality Quote.
Thanks for the great writeup.
Typo: I think you meant to write distributed, not local, codes. A local code is the opposite of superposition.