Disclaimer: I don't necessarily believe this is a good candidate for building strong(er) AI, but it is interesting nonetheless

Also, when I say 'more intelligent' or 'less intelligent', and generally when I talk about intelligence here, I mean it in the sense of I know it when I see it[1], and not in any sort of rigorous sense

And another: if I do something that doesn't match 'normal LW user behavior', please tell me. I find it quite difficult to course-correct without diving in, so to speak, and so I need as much direct feedback as possible now that I have dived in. I'm feedback-hungry. For example, if you thought this post needed more elaboration, or could be shorter or longer, etc., that is good information and worth commenting on, in my opinion. (meta: my opinion here might also be subject to this same kind of correction)


DAKs and questions about them

Here's the idea: you have a Dumb AI Kernel (DAK), an AI that has very few abilities and isn't very intelligent at all. Then you provide it with tools[2] that do very particular things, that themselves aren't very intelligent (if at all), but that extend the DAK's abilities and increase its intelligence. Composed together, the DAK plus its tools form an AI Aggregate (AIA). This may or may not be a possible or feasible way to make intelligent systems
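To make this concrete, here is a minimal sketch (in Python) of what the composition might look like. Every name here -- Tool, DAK, step, Reverser -- is something I made up for illustration; it is not a proposed design:

```python
from typing import Any, Dict, Protocol


class Tool(Protocol):
    """A self-contained module with one narrow, not-very-intelligent capability."""
    def call(self, request: Any) -> Any: ...


class DAK:
    """A deliberately dumb kernel: all it does is pick a named tool and forward data to it."""
    def __init__(self, tools: Dict[str, Tool]):
        self.tools = tools

    def step(self, tool_name: str, data: Any) -> Any:
        # The kernel contributes no capability of its own; everything interesting lives in the tools
        return self.tools[tool_name].call(data)


class Reverser:
    """A trivial example tool."""
    def call(self, request: str) -> str:
        return request[::-1]


# The AIA is just the kernel plus whatever tools it has been handed
aia = DAK({"reverse": Reverser()})
print(aia.step("reverse", "hello"))  # -> "olleh"
```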

The idea at a different level is: if you can make an AI more intelligent by providing it with new modules -- modules that aren't themselves considered intelligent -- then is the reverse true: can you remove modules and make a less intelligent, but still intelligent, AI? If so, the DAK is the smallest AI that is still an AI once all other intelligence-increasing modules have been removed. Removing anything else makes the DAK no longer intelligent, or no longer functional altogether

Related to this, you can imagine a superintelligent AI in the future creating temporary mesa-optimizers, and budding off smaller sub-AIs to tackle particular problems. At what point can you have an AI that is actually really stupid do the exact same thing?

Here are a bunch of assorted questions related to DAKs and AIAs:

  • Does an AIA always increase in intelligence when new, orthogonal[3] tools are added?
  • What does it mean for an AIA to become smarter? If that isn't a useful question: what metrics can be used as a way to more or less heuristically quantify an AIA's intelligence?
  • Is it possible to make a DAK small enough that its exact behaviors can be optimized for in a meta-optimizer, given standard tools?
  • Just how small can a DAK be? Is it possible to make an absolutely minimal DAK that looks like effectively little more than a while(true) loop?
  • What does a DAK look like? What does it actually do? What type of program is it? (ie, like: a searcher, planner, logic solver, constrained system solver, etc). What does the DAK look like in abstract? (in the sense of: a search program looks through a list of items and picks the one that matches a query)
  • Does the DAK learn? Probably. But if it does, does learning have to be an internal function, or can it be delegated to tools? And if it can, then does the DAK have to learn how to use those tools?
  • There must be some core DAK functionalities; what are they? How flexible (in the sense of: how many different ways can we make the same thing) is this architecture?
  • What function does a DAK play in the system orthogonal to the function of its tools, in an abstract sense? Can the DAK itself be reduced / replaced with a module that cannot be said to be intelligent alone?
  • If a human-level-intelligence AIA built from a sub-human-intelligence DAK is possible, does that imply all intelligence is primarily a compositional phenomenon? (ie: in the sense of composing orthogonal modules into a more complex system). Or does intelligence still primarily arise from monolithic architectures?
  • Is a superintelligent AIA built from a sub-human-intelligence DAK easier to align and control?
  • Are humans like AIAs? And if so, what does our DAK-equivalent look like? (I guess it would be more like a DIK in this case)
  • Is it (always / never / ever) possible for a human to look into a DAK while it is operating and make sense of what it is doing?
  • Does each provided tool have the same interface(s)[4]? The DAK obviously must dictate the interface(s) for the tools it uses because they must be compatible with it
  • If intelligence can always be defined in a very narrow sense, algorithmically, then do all DAKs take a similar form?
  • In a system like this, are there other datapaths[5] between modules / tools? Is the DAK simply the point where data is routed between other modules?
  • Can tools be DAKs themselves, and take their own tools? Is it easier to make a hierarchical, nested / fractal AIA system smarter than it is to make a regular AIA smarter?
  • How much internal state do DAKs' tools have?
  • When is it better to increase DAK capabilities and when is it better to increase tool capabilities? In terms of intelligence? In terms of safety?
  • What components of intelligence are / should be added as tools and which are part of the DAK's core functionality? Is memory a part of the DAK's core functionality?
  • Which behaviors[6] are / should be best attributed to the DAK's core functionality, and which to tools?
  • Presumably, every datapath must go through the DAK unless the DAK specifically sets modules up to interact directly. Are more advanced DAK-tool interactions like that always a product of having already bootstrapped the AIA to a higher intelligence, or can the DAK set them up immediately?
  • Are introspective abilities[7] available in core DAK functionalities? Or are those added via tools?
  • Is it always safe to give an AIA a new tool? Or should some kind of training-tools be used that don't have potentially harmful / effective side-effects?
  • How does giving an AIA a new tool affect its behavior?
  • If AIAs are significantly easier to control, and they can be used to create arbitrary AIs, should we aim to create AIAs instead of using other architectures?
  • Are DAK-based systems easier to implement with ANNs or with GOFAI[8] / symbolic architectures?
  • In the case of AIAs trained a la ANNs, what does training an AIA look like? After adding a new tool to an already-trained AIA, what does retraining look like?
  • Are other systems that compound intelligence, but don't involve DAKs, possible?
  • Are there any hidden synchronization issues in AIAs? (eg: processes that can only be parallelized in a certain way)
  • Is it possible for DAKs to utilize other AIAs as tools (synergistically)?
  • What proportion of the questions above are implementation specific?[9]

So far, I am imagining a DAK the way some psychologists believe storytelling is a fundamental form of human thought: everything gets passed through the DAK. Consequently, a DAK reinterprets all of the data it sees to match its internal functionality, however simple that functionality is, and inevitably every potentially superintelligent decision it makes is colored by that functionality

An example of tool composition: the DAK queries an associative database tool (which has just simple query and append interfaces) with an object Q; the result R is handled superficially the same way by the DAK and used as input to a spatial path-finding tool, which returns a path object P; then P is sent (by the DAK) into an output device[10] control tool, which returns P': a slightly modified version of P representing the active state of navigation along the planned route; and so on. I imagine objects like P would be used instead of persistent internal state within each tool, but it seems much more intuitive to give each tool internal state
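A toy version of that pipeline, with every class and method name invented for illustration, might look like the sketch below. The point is only that the DAK's role reduces to handing Q, R, P, and P' from one tool to the next:

```python
from typing import Any, List, Optional, Tuple


class AssociativeDB:
    """Stand-in for the associative database tool (query and append only)."""
    def __init__(self) -> None:
        self._items: List[str] = []

    def append(self, item: str) -> None:
        self._items.append(item)

    def query(self, q: str) -> Optional[str]:
        # Return the first stored item mentioning the query string, if any
        return next((item for item in self._items if q in item), None)


class PathFinder:
    """Stand-in for the spatial path-finding tool."""
    def find_path(self, goal: Any) -> List[str]:
        return ["start", "corridor", f"goal: {goal}"]


class OutputController:
    """Stand-in for the output-device control tool; annotates the path with navigation state."""
    def actuate(self, path: List[str]) -> List[Tuple[str, str]]:
        return [("pending", step) for step in path]


# The DAK does nothing clever here: it only hands each tool's result to the next tool
db, planner, motors = AssociativeDB(), PathFinder(), OutputController()
db.append("charging station in room B")

r = db.query("charging")     # R, the query result
p = planner.find_path(r)     # P, the planned path
p_prime = motors.actuate(p)  # P', the path annotated with navigation state
print(p_prime)
```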

Similar ideas

Some other interesting, but not necessarily functional or useful, ideas similar to this:

  • Societies of AI: bunches of not-very-intelligent intelligences interact to make a more intelligent entity. Like multiple more or less orthogonal DAKs interacting to make one AIA
  • Hub AI: where a simple routing 'hub' routes information between many simple and not-very-intelligent AIs. Like tools and DAKs switch places
  • Automatically routing tools: a system composed of modules that themselves are definitely not intelligent, but that determine their own datapaths when inserted into the system, and form an intelligent system through their composition

The opposite idea to a DAK / AIA might be called something like a Smart AI Monolith (SAM). This would look like a bunch of highly coupled modules whose individual, internal behaviors are highly correlated with one another. This isn't necessarily a more or less likely-to-work architecture than the AIA architecture, but it certainly is a much more likely-to-be-used one, because of the lack of composability / modularity of ANNs and the requirement that the entire ANN be correlated / trained all at once, which appears to eliminate any possible composability (I write more about this in a moment)

Why something like an AIA may be an important future technology

The most prominent AI success in recent years was GPT-3, which operates exclusively on text. More recently, we saw OpenAI move into combined text-and-image models with DALL-E. Similarly, more and more research is being published on ANN models that combine multiple types of data, especially for the purpose of generating one type of data (like images) from another (text)

This makes a lot of sense. Consider: GPT-3 used most or all of the text on the regular internet in training. GPT-4 would use all of it, and require even more to train well. But a hypothetical GPT-N that can consume and produce images, audio, text, etc. would have a much larger pool of training data to pull from. Having mostly orthogonal modules, one for each type of input, before the inputs reach wherever they are combined internally, is likely necessary for such a system. Such a system, with adapters from each type of input data to an internal representation, could benefit from the modularity of those adapters. And I mean modularity in the sense of: can I add or remove this module and have the system still function?
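A rough sketch of what I mean by modular adapters, with toy encoders standing in for real ones (all of the names are mine, not any existing API):

```python
from typing import Any, Callable, Dict, List

# Each adapter maps one kind of raw input into the same internal representation
# (here, just a list of floats standing in for an embedding)
Adapter = Callable[[Any], List[float]]


def text_adapter(text: str) -> List[float]:
    return [float(len(text))]  # toy stand-in for a real text encoder


def image_adapter(pixels: List[float]) -> List[float]:
    return [float(sum(pixels))]  # toy stand-in for a real image encoder


adapters: Dict[str, Adapter] = {"text": text_adapter, "image": image_adapter}


def encode(modality: str, raw: Any) -> List[float]:
    # The combining core only ever sees the shared internal format,
    # so adapters can be added or removed without touching the core
    return adapters[modality](raw)


print(encode("text", "hello"))
print(encode("image", [0.1, 0.9, 0.5]))
```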

Imagine: a system that can compose multiple input streams, and produce multiple output streams, depending on what functional modules are actually connected to it. Like: a future AI consultant for a company walks / rolls / beams into their server room and connects to a specialized interface with the company servers, one that gives it information from across the company related to whatever it's doing, and lets the AI do what it needs to do in a way that is more closely matched to its internal function

But composability and modularity aren't something that comes easily, or at all, to ANNs in particular (the SOTA architecture apparently closest to strong AI right now), and retraining the entire system whenever a new module is attached or reattached is probably untenable. But if a core subsystem is trained to use external modules as tools, and the external modules are individually trained (or untrained) to do whatever they do (generate an image from text, translate between languages, etc.), then the core subsystem could potentially recognize when a new module is attached, and what to do with it, without needing to be retrained
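The plain-software analogue of this is ordinary dispatch: new tools are attached at runtime and the core's own code (or, in the ANN case, its weights) never changes. The sketch below is only an illustration of that pattern, not a claim about how a trained core would actually recognize a new module:

```python
from typing import Any, Callable, Dict


class Core:
    """A core subsystem that only knows how to dispatch requests to named tools."""
    def __init__(self) -> None:
        self.tools: Dict[str, Callable[[Any], Any]] = {}

    def attach(self, name: str, tool: Callable[[Any], Any]) -> None:
        # Attaching a tool extends what the aggregate can do,
        # but the core itself is not modified or retrained
        self.tools[name] = tool

    def handle(self, name: str, data: Any) -> Any:
        return self.tools[name](data)


core = Core()
core.attach("upper", str.upper)
print(core.handle("upper", "compose me"))            # COMPOSE ME

core.attach("word_count", lambda s: len(s.split()))  # added later; the core is unchanged
print(core.handle("word_count", "compose me"))       # 2
```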

Composability of ANNs in particular is, for me, verging on a holy grail of ANN architecture. Having highly composable ANN modules that are effectively plug-and-play, and modularity even just in terms of training -- where you can train one section of your model separately from other sections -- would probably allow much, much larger models, and potentially much shorter training times. I suspect it would be like going from analog computers to networked digital computers. I have worked some on topics related to this and have gotten some positive results, though there is a lot of work still to be done. I will post what I've found / what I think about this, if people want me to
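One very modest existing version of 'training one section separately' is freezing already-trained parameters while optimizing a newly attached module, for example in PyTorch. This is a standard transfer-learning trick, not the full plug-and-play composability I'm describing, and the modules and dimensions below are arbitrary:

```python
import torch
import torch.nn as nn

# An already-trained "core" (weights here are random just for the sketch)
core = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))

# A newly attached module for some new input type
new_adapter = nn.Sequential(nn.Linear(2048, 512), nn.ReLU())

# Freeze the core so only the new adapter's parameters are updated
for p in core.parameters():
    p.requires_grad = False

opt = torch.optim.Adam(new_adapter.parameters(), lr=1e-4)

for step in range(100):
    x = torch.randn(32, 2048)      # stand-in batch from the new input type
    target = torch.randn(32, 512)  # stand-in training signal
    loss = nn.functional.mse_loss(core(new_adapter(x)), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
```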

Footnotes


  1. Intelligence is quite difficult to define. I have found that an 'I know it when I see it' definition isn't satisfying, but is quite helpful for getting the question of definition out of the way. I have been trying to pioneer a concept I call a semantic domain, and such domains' semantics (which is a slight overloading of the term 'semantics'); in this case, intelligence has primarily human internal semantics and poor human communication semantics. I may write up and describe semantic domains and this overloaded sense of semantics in another post. Regardless, there are certain concepts like this that are apparently best defined or described with 'I know it when I see it'. Like the color red, for example: it doesn't seem to be best defined as a particular wavelength, or by describing it in terms of red objects, but rather: "you have to experience it, but when you do, you know it when you see it" ↩︎

  2. Tools in this document are equivalent to self-contained modules with a simple-enough interface, and hopefully with low / nonexistent coupling to the DAK ↩︎

  3. In roughly the same sense as linearity in mathematics: two functional modules A and B are orthogonal if the functionality of B isn't reproducible in any way by the functionality of A, and vice versa. In general I mean it more loosely: B's functionality cannot be produced easily by A's functionality. eg: a dedicated FEM solver and a solver for linear systems are orthogonal because one (mostly) cannot reproduce the other ↩︎

  4. I mean interface in the sense of object oriented programming's interfaces. Like: a well-defined set of patterns that other objects can use to interact with that thing. Think: user interface. When I talk about modules that usually means: as-defined black boxes that have some interfaces that are used for interaction with those modules ↩︎

  5. Datapaths: the paths data take through the system by being copied into new memory locations, used as parameters in functions, and handed off between functional modules ↩︎

  6. For example, the behavior 'writing' apparently involves lots of different functionalities in humans, like: remembering what you wrote, and where; knowing the language you're writing in; knowing which sequences of words look right, and which look wrong; making decisions about what to write immediately, about what to write at longer time-scales, and making plans to write those things; converting plans into actions that are mutually consistent; mapping general concepts at a large scale down to actionable representations; accessing experiences and memories in a coherent-enough way that they can be converted into words; expanding high-level behaviors into sequences of muscle movements, contextualized by hand position, eye orientation, etc; and remembering all of the tiny details that go into all of these things. These sub-behaviors can presumably be decomposed into the same mostly-orthogonal modules used to build an AIA ↩︎

  7. I generally consider nonpreferential access to (all?) internal data / state, treated and processed as if it were external data, a likely candidate for how introspection may work internally ↩︎

  8. Good old fashioned AI / symbolic AI. Despite falling out of favor, it may have an important place in future strong(er) AI systems. It could also conceivably be used to create a strong AI all by itself, though it would obviously have to be developed more (and the related unknown unknowns and known unknowns discovered and resolved) to get to that point ↩︎

  9. This must be the most horrible question I keep coming across ↩︎

  10. I generally stick to the definitions that: input devices are the things that create data, and input data is exposed on simple interfaces, probably as a set of tensors; while output devices are parameterized by output data that control how the output devices are actuated ↩︎

Comments

possibly related literature if you haven't seen it: Comprehensive AI Services