This post is inspired by Chimera wanting to talk about "Levels of Organization in General Intelligence."

In the paper, Eliezer mentions giving a seed AI a sensory modality for code.  And I think it'll be fun to figure out just what that means.


Code is a pretty simple environment compared to the cool stuff humans can do with sight and hearing, so I'd like to start by giving a picture of a totally different sensory modality in a simple environment - vision in robot soccer.

The example system described in this paper can be used to find and track ball and robot movement during a soccer match.  For simplicity, let's imagine that the AI is just looking down at the field from above using a single camera.  The camera is sending it a bunch of signals, but it's not a sensory modality yet, because it's not integrated with the rest of the AI.  At least the following tasks need to be implemented as part of the program before the AI can see:


  • Finding unusual pixels quickly.
  • Finding shapes of the same color in the picture.
  • Using object properties (what the tops of the robots look like) to collect shapes into physical objects.
  • Using a mapping between the picture and real space to get real-space coordinates of objects.
  • Correlating with previous positions and motion to track objects and figure out which is which (the last two steps are sketched below).
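To make the last two steps concrete, here's a minimal sketch in Python.  Everything in it is an assumption rather than the paper's actual method: H is a made-up image-to-field homography, and to_field and track are hypothetical names for the mapping and matching steps.

import numpy as np

# Assumed input: blob centroids in pixel coordinates, already extracted
# by the color/shape steps above.
H = np.array([[0.01, 0.00, -3.2],
              [0.00, 0.01, -2.4],
              [0.00, 0.00, 1.0]])  # placeholder image-to-field calibration

def to_field(px, py):
    # Map an image point into real-space field coordinates.
    x, y, w = H @ np.array([px, py, 1.0])
    return x / w, y / w

def track(prev, detections, max_jump=0.5):
    # Greedy nearest-neighbour association of this frame's detections
    # (already in field coordinates) with last frame's robot positions.
    assigned = {}
    for robot_id, (ox, oy) in prev.items():
        dists = [((x - ox) ** 2 + (y - oy) ** 2) ** 0.5 for x, y in detections]
        if dists and min(dists) < max_jump:
            assigned[robot_id] = detections[dists.index(min(dists))]
    return assigned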


This work seems a lot like compression - it takes a big hunk of image data and turns it into a petite morsel of robot locations and orientations.  But evaluated as compression, it's awful, since so much information, like the texture of the robots, gets discarded.  Rather than calling it compression, it's more like translation from the language of images to the language the AI thinks in.  The AI doesn't need to think about the exact texture of the robots, and so the sensory modality doesn't pass it on.
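To put rough numbers on "awful as compression" (the frame size and robot count are invented for illustration):

frame_bits = 640 * 480 * 24        # one raw RGB camera frame
output_bits = 10 * 3 * 32          # ten robots times (x, y, theta) as floats
print(frame_bits // output_bits)   # ~7680:1 reduction - and irreversible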


So how do we do something like that with code?  We want something that takes raw code as an input and outputs a language for the AI to think about code in.  Additionally, as mentioned in LOGI, this language can be used by the AI to imagine code, so it should contain the necessary ideas.  It should discard a lot of low-level information, the way the vision system extracts the coordinates and lets the AI ignore the texture.  It might also shuffle the information around to keep track of the function of the code, the way the vision system maps points from the camera onto real space.

For some code, this might be done with nested black boxes - finding groups of code and replacing them with black boxes that say what they do and what they're connected to, then finding groups of black boxes and replacing those groups with what they do and what other groups they're connected to.  The big problem for this approach is figuring out how to label what a piece of code does.  Finding a short description of a piece of code is straightforward, but just condensing the code is not enough; the program needs to be able to separate function from form and throw out the form information.  Ideally, once our program had removed as much form as it could, the AI would be able in principle to rewrite the program just from the functional description.
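As a toy version of the first layer of boxes - assuming the code being inspected is Python, and leaning on the standard-library ast module - one could keep each function's interface and connections while discarding its body:

import ast

def black_boxes(source):
    # Collapse each function into a "black box": its inputs and the other
    # functions it connects to, with the body (the form) thrown away.
    boxes = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            calls = {c.func.id for c in ast.walk(node)
                     if isinstance(c, ast.Call) and isinstance(c.func, ast.Name)}
            boxes[node.name] = {"inputs": [a.arg for a in node.args.args],
                                "connects_to": sorted(calls)}
    return boxes

Labeling what each box does, rather than just what it touches, is exactly the hard part described above.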


Unfortunately, this is where the problem gets hard for me.  I have some thoughts, like "follow the flow of the program" and "check if locally arbitrary choices will cause nonlocal problems if changed." But first I'd like to check that I'm sort of on the right track, and then maybe forge ahead.

28 comments

Code is incompressible. Optimally efficient compiled code cannot be described in fewer bits than the code itself occupies.

(The fact that we write code in highly compressible text form is a concession to human language-processing facilities. Humans wrote code in pure binary form well before we invented more convenient programming languages for doing it in.)

(The compiled binary for your Web browser is almost certainly compressible, by about a factor of two to one. This is in part because it contains lots of text in natural languages, which are very compressible; and in part because it is not optimally efficiently compiled. Compilers can be tuned to optimize for speed or for space; you can guess which we usually use.)
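Claims like these are easy to spot-check.  A rough yardstick, assuming zlib as the compressor:

import zlib, random

text = open(__file__, "rb").read()  # this script's own source text
rand = bytes(random.getrandbits(8) for _ in range(len(text)))

print(len(zlib.compress(text)) / len(text))  # source text: well below 1.0
print(len(zlib.compress(rand)) / len(rand))  # random bytes: about 1.0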

The notion of "sensory modality", or of senses at all, ultimately implies compressibility. For human survival purposes, the natural world is relatively compressible. I don't need all the bits that describe the ground or the wall to know that they are solid, opaque, and have other relatively simple but useful properties. I don't need all the bits about a tiger to respond to it as a dangerous animal. However, a single bit astray in a piece of code can mean the difference between something that runs and something that crashes … or that lets the bad guys take over your machine and use it to tile the solar system with Viagra spam.

So I am unconvinced that a "sensory modality for code" can exist, except insofar as this simply means a superior ability to reason logically and mathematically: to understand code qua code, not as the subject of the sort of compression and approximation involved in human (or robot) senses.

Code is incompressible. Optimally efficient compiled code cannot be described in fewer bits than the code itself occupies.

A dubious claim. Do you have a reference?

A dubious claim. Do you have a reference?

My interpretation is that that's how he's defining "optimally efficient": code that cannot be described in fewer bits than itself.

My interpretation is that that's how he's defining "optimally efficient": code that cannot be described in fewer bits than itself.

What - by the same machine it is being run on? That's a pretty strange way to use the term "efficient" - and not a very portable way to measure how compressible code is.

This sort of processing shouldn't be thought of as compression.  Like a visual system that lets the AI "sense" the object coordinates in a video feed, the stuff done by a sensory modality for code would just throw away most of the information.

This sort of processing shouldn't be thought of as compression.  Like a visual system that lets the AI "sense" the object coordinates in a video feed, the stuff done by a sensory modality for code would just throw away most of the information.

That is how "lossy" compression works.

True. This is a sort of lossy compression. Or several layers of lossy compression if we want the AI to be able to look at different levels of structure. But lossy compression is the huge space of everything that throws away some information and reduces message size - I think it takes more work to find the stuff we want within that space than it takes to note that we want lossy compression.

And considered as compression, even lossy compression (or co-compression with the AI's source code, or any such guff), a sensory modality is probably suboptimal, because it has to serve other purposes like making it convenient for the AI to think about code.

Visual systems — and other aspects of our cognition about physical objects — exploit the fact that physical objects are highly redundant.

For my purposes as a human, every square foot of a house wall is pretty much identical to every other square foot of it. One piece of wallboard is pretty much the same as any other, and minuscule differences are irrelevant: removing an atom or a crystal from the wallboard will not alter it much. So it is safe, both physically and epistemologically, to visually encode a whole wall as a bounded surface of a single color, material, and texture, rather than as a huge number of tiny individual constituent particles. There are of course many differences between one square foot of wall and the next, but those discrepancies are highly unlikely to be of any particular consequence.

The same is not true for code. Code is not highly redundant. One kilobyte of code is not pretty much the same as any other kilobyte of code; and minuscule differences are significant: altering a single bit can completely change the outcome of running the code. (Indeed, programmers try hard to reduce the redundancy of code; introducing redundancy into code is "cut & paste programming" which is considered a very bad practice.) Tiny differences in code are of substantial consequence.

With physical objects, most of the atomic-level details cancel out. With code, all of the bit-level details matter.
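A literal one-bit example: in ASCII, '+' is 0x2B and '*' is 0x2A, a single bit apart, yet the two functions below compute entirely different things.

def f(a, b):
    return a + b   # '+' = 0b00101011

def g(a, b):
    return a * b   # '*' = 0b00101010

assert f(2, 3) == 5 and g(2, 3) == 6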

I like the example Oscar handed me: the sieve of Eratosthenes can be "compressed" as "a program that finds primes."

The bit-level details matter to a CPU. A human given the instruction "write a program that finds prime numbers" wouldn't start with bit 1 and then go to bit 2 and bit 3. They would start with high-level structure, maybe a key algorithm, and then move to fine details to make the high-level structure work. Another human looking at the code would be able to see this structure, see the key algorithm, and probably wouldn't remember all the fine details.

A human given the instruction "write a program that finds prime numbers" must use their background knowledge of prime numbers and algorithms to derive the sieve of Eratosthenes or an equivalent algorithm.

You can study a piece of code, discern that it is an algorithm for finding primes, and label it as such.  However, you cannot just glance at a new piece of code and recognize it as prime-finding code, rather than defective prime-finding code; that takes close inspection, not compression.  This does not lend itself to the kind of optimizations suggested by the expression "sensory modality", i.e. analogy to human sensory processing, which compresses away lots of details — even relevant details, which is why we have optical illusions, etc.

That sort of thing is quickly checkable, though - the trouble is if it looks like it finds primes, does something else, and that something else isn't one of the hypotheses generated.  Which could definitely make this much less useful if it was common.

I dunno, maybe you're right and a holistic approach is necessary. But I feel like I understand code modularly rather than holistically, and though I sometimes lose track of details I'm usually pretty good at knowing which ones will be important later.

Ever hear of the Underhanded C Contest?

It's easy to understand code modularly when it was written straightforwardly. It's a lot harder when it's spaghetti code ... or when it's actually intentionally tricky. Past a certain point there's no choice but to actually step through each possible branch of the code — which can be done, but it's hardly the sort of automatic layers of approximation and pattern-matching that our senses use.

Someone linked me to the 2008 contest once.  I'm more familiar with the Obfuscated C Contest, which would also be a problem.  But I think it's fine to only be able to "intuit" about straightforward-ish code.  The main use, after all, would be to understand code written by your own programmers in order to self-improve.

The word "code" might be misleading, evoking images of things that look like text. It should really read "computation".

Humans make everyday use of the famous Böhm-Jacopini result that showed that any computable function could be expressed in terms of sequence, iteration, and alternation.  This has allowed us to restrict our attention to a tiny fraction of all possible program texts.  However, an AI with a sensory modality for computation would have no such limitation; it would make sense directly of any kind of description of a computation.

It gets murky when we think about what "computation" has meant, historically, for humans, because sometimes there has been a lot of meaning in the interpreter compared to what there is in the language.
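A toy illustration of that point: the "program" below is nearly meaningless data on its own; the run function supplies all the semantics.  (Both names are mine, for illustration.)

PROGRAM = [("push", 2), ("push", 3), ("add",), ("print",)]

def run(program):
    # The interpreter is where the meaning lives.
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
        elif op == "print":
            print(stack.pop())

run(PROGRAM)  # prints 5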

Would an AI with a sensory modality for computation be able to make sense instantly of sophisticated programs in Piet?  Would their being represented in pictures rather than text make no difference to it?

Interesting - if you could talk more about this, that would be great. To help, here's a big bunch o' questions :D

What would the input and outputs for the computation-processing be? What sort of language should be used by the AI to think high-level thoughts about the computation - a tree structure like the nested boxes? Since full generality is probably computationally intractable, what sort of things could it focus on? What sort of information would it throw away?

Being halfway through the LOGI paper now, I must retract the grandparent; it seems that Eliezer does mean "code" in the straightforward sense, i.e. source or compiled code, e.g.:

When considering which features to extract, the question I would ask is not "What regularities are found in code?" but rather "What feature structure is needed for the AI to perceive two identical algorithms with slightly different implementations as 'the same piece of code'?" Or more concretely: "What features does this modality need to extract to perceive the recursive algorithm for the Fibonacci sequence and the iterative algorithm for the Fibonacci sequence as 'the same piece of code'?" [...] Could a sensory modality for code look at two sets of interpreted bytecodes (or other program listing), completely different on a byte-by-byte basis, and see these two listings as the "same" algorithm in two slightly different "orientations"?
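For concreteness, here is such a pair in Python (stand-ins for the byte-by-byte-different listings the quote describes) - byte-for-byte different, but the "same piece of code" in the sense that matters:

def fib_rec(n):
    # Recursive form: mirrors the mathematical definition.
    return n if n < 2 else fib_rec(n - 1) + fib_rec(n - 2)

def fib_iter(n):
    # Iterative form: same input-output mapping, very different text.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a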

The difference between code and computation is that code (i.e. program text) is merely one particular way of expressing a computation with particular properties. Perceiving the properties of that computation is what I imagine the job of a "sensory modality for code" would be. (Just like the job of vision is more general than extracting properties of any one visual field.)

One way to think about this is to consider the towers of Hanoi puzzle.

ToH is a relatively simple computation, some of its salient features are "intuitive" even to humans. The elementary recursive solution can be expressed as "move all but the bottom disk to a storage peg, so that the bottom disk can be moved at once to the target peg". I'd suppose that anyone with some programming experience in modern languages will directly perceive the feature "recursion". (But maybe not a Cobol programmer of old?)
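That recursive solution, as a minimal sketch:

def hanoi(n, source, target, spare):
    if n == 0:
        return
    hanoi(n - 1, source, spare, target)   # clear everything above the bottom disk
    print("move disk", n, "from", source, "to", target)
    hanoi(n - 1, spare, target, source)   # rebuild the stack on the target peg

Counting the print calls recovers the 2^n - 1 move cost alluded to in the next paragraph.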

However it takes some cognitive work to get at subtler features, like computational cost of the solution (e.g. in number of moves), or the existence and "shape" of the non-recursive algorithm, or the Sierpiński Triangle showing up in a graph representation.

So as a first approximation of what it would feel like to have a sensory modality for code, I might imagine being able to directly intuit these properties, merely by "looking at" a description of the rules for moving disks around.

That seems to involve "easily perceiving" solutions to NP problems (general proofs of properties or existence and shapes of algorithms), and I'm not sure what simplifications could be used to avoid this without getting a ton of false negatives. Also, how would this help the AI think high-level thoughts about computation?

asr:

Worse than that. Most of the properties you care about in code aren't NP. NP is the set of decision problems such that a "yes" answer can be verified in polynomial time, given a witness string. Properties like "this program is secure/deterministic/terminates" don't, in general, have short proofs. Many of the properties you care about are undecidable if you assume unlimited memory, and intractable even if you don't.
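The standard diagonalization sketch of why a general "terminates" decider can't exist - halts here is the hypothetical oracle, which is precisely the thing being shown impossible:

def halts(p, x):
    # Hypothetical oracle: would return True iff p(x) terminates.
    # No total implementation can exist - that is the point.
    raise NotImplementedError

def paradox(p):
    if halts(p, p):
        while True:   # oracle says p(p) halts, so do the opposite
            pass
    return            # oracle says p(p) loops, so halt immediately

# paradox(paradox) would halt exactly when halts(paradox, paradox) says
# it doesn't - so no program can implement halts correctly.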

In contrast, the human visual system, as I understand it, mostly does constant-time work, like edge detection, checking for color differences, etc.

how would this help the AI think high-level thoughts about computation

I'm checking out of the discussion temporarily while I reread the LOGI paper. I want to make sure I have the proper context to think of the above question.

It seems that the quickest way to say "this code loops from 1 to 10" really is just

for(i=1;i<=10;i++){ ... }

If the high-level description of the code accurately describes what the code does, then the high-level description is just code.

But... the high-level description might be shorter code!

For instance, suppose prime(n) is a long piece of code using a variation on the Sieve of Eratosthenes.  Then we could describe what this code does by giving a short program that finds primes by brute force.  Each subroutine of the code has a "describing program" attached to it that does the same thing but is much simpler.
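A toy version of that pairing (primes_spec and primes_fast are my names, for illustration): the short, slow program pins down the function; the longer sieve is the form being described.

def primes_spec(n):
    # Short "describing program": brute-force trial division.
    return [k for k in range(2, n) if all(k % d for d in range(2, k))]

def primes_fast(n):
    # The longer, faster code being described: a simple sieve.
    is_prime = [True] * max(n, 2)
    is_prime[0] = is_prime[1] = False
    for k in range(2, int(n ** 0.5) + 1):
        if is_prime[k]:
            for m in range(k * k, n, k):
                is_prime[m] = False
    return [k for k in range(n) if is_prime[k]]

assert primes_spec(50) == primes_fast(50)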

This sort of processing should not accurately describe the code.  In order to do its job it should throw away tons of information - obliterating the difference between a for and a while loop, ignoring how most variables are handled, and at the highest level (both of description and of proficiency) just labeling the sieve of Eratosthenes as "a program that finds primes."

Maybe it would be most human-friendly to imagine looking at code and "just knowing" what it does, the same way we "just know" that a splotch of color is a ball and how the ball is moving.

Code is a pretty simple environment compared to the cool stuff humans can do with sight and hearing, so I'd like to start by giving a picture of a totally different sensory modality in a simple environment - vision in robot soccer.

To refute the claim that code is a simple environment, consider that robot soccer can be played entirely in simulation using a rendering algorithm to generate visual input for the robot control programs. (This way of doing things has advantages at least for early prototyping.)

asr:

I suspect you hit hard theoretical limits fairly quickly.  Many questions about programs are undecidable in general.  This tends not to be a major problem for working programmers, because humans tend to write programs that humans can reason about, and we keep comments, annotations, etc. around to help that understanding.

But that doesn't mean it's possible to reason about arbitrary correct and useful code.  There might be a one-way transform that converts arbitrary intelligible programs into a likely-unintelligible form.  Program shrouding with cryptographic guarantees, as it were.

Good point. But I think it's fine to not be prepared for arbitrary code, similar to how visual systems aren't prepared for arbitrary visual input. Human-style code currently dominates, and if AIs want to write in a different style they'll have to figure out how to think about it anyhow - the only obscured code would be deliberate.

Code is a pretty simple environment compared to the cool stuff humans can do with sight and hearing

Are you a programmer?

Certainly not a professional one. Maybe I'm projecting - humans are a bit more adapted to visual processing. On the other hand, code is static and discrete where light is dynamic and continuous.

asr:

code is static and discrete

This isn't really true.  Large programs often make heavy use of dynamic loading; some programs, such as the JVM, make heavy use of dynamic code generation.  The program you see can be quite different from the program that runs.  If the program has a code-injection vulnerability, they can be unrelated.
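A tiny illustration of that gap, using Python's runtime code generation as a stand-in for JVM-style machinery (the function name triple is invented):

source = "def triple(x): return x * 3"
namespace = {}
exec(source, namespace)         # the code that runs is assembled at runtime
print(namespace["triple"](5))   # 15 - the definition only exists at runtime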