tl;dr: It seems noteworthy that "deepware" has strong connotations of "it involves magic", while the same is not true for AI in general.
I would like to point out one thing regarding the software vs AI distinction that is confusing me a bit. (I view this as complementing, rather than contradicting, your post.)
As we go along the progression "Tools > Machines > Electric > Electronic > Digital", most[1] of the examples can be viewed as automating a reasonably-well-understood process, on a progressively higher level of abstraction.[2]
[For example: A hammer does basically no automation. > A machine like a lawn-mower automates a rigidly-designed rotation of the blades. > An electric kettle automates the process of heating water. > An electronic calculator automates calculating algorithms that we understand, but can do it for much larger inputs than we could handle. > An algorithm like Monte Carlo tree search automates an abstract reasoning process that we understand, but can apply it to a wide range of domains.]
But then it seems that this progression does not neatly continue to the AI paradigm. Or rather, some things that we call AI can be viewed as a continuation of this progression, while others can't (or would constitute a discontinuous jump).
[For example, approaches like "solving problems using HCH" (minus the part where you use unknown magic to obtain a black box that imitates the human) can be viewed as automating a reasonably-well-understood process (of solving tasks by decomposing & delegating them). But there are also other things that we call AI that are not well described as a continuation of this progression --- or that perhaps constitute a rather extreme jump. Deep learning, for example, automates the not-well-understood process of "stare at many things, then use magic to generalise". Another example is abstract optimisation, which automates the not-well-understood process of "search through many potential solutions and pick the one that scores the best according to an objective function". And there are examples that lie somewhere in between --- for example, AlphaZero is mostly a quite well-understood process, but it does involve some opaque deep learning.]
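(To make the "abstract optimisation" example concrete, here is a minimal sketch of that process -- the function names and the toy objective are mine, purely for illustration. The point is that the procedure itself is easy to state and automate, even though nothing in it represents *why* the winning candidate is any good.)

```python
import random

def optimise(objective, sample_candidate, n_trials=10_000):
    """Search through many potential solutions and pick the one that
    scores best according to an objective function."""
    best, best_score = None, float("-inf")
    for _ in range(n_trials):
        candidate = sample_candidate()
        score = objective(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best

# Toy usage: find a point near (3, -1) without "understanding" the problem at all.
best_point = optimise(
    objective=lambda p: -((p[0] - 3) ** 2 + (p[1] + 1) ** 2),
    sample_candidate=lambda: (random.uniform(-10, 10), random.uniform(-10, 10)),
)
print(best_point)
```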
I suppose we could refer to the distinction as "does it involve magic?". It then seems noteworthy that "deepware" has strong connotations of magic, while the same isn't true for all types of AI.[3]
Or perhaps just "many"? I am not quite sure; this would require going through more examples, and I was intending for this to be a quick comment.
To be clear, I am not super-confident that this progression is a legitimate phenomenon. But for the sake of argument, let's say it is.
An interesting open question is how large a hit to competitiveness we would suffer if we restricted ourselves to systems that only involve a small amount of magic.
Yeah, this is a pretty interesting twist in the progression, and one which I failed to see coming as a teenager learning about AI. I looked at the trend from concrete to abstract -- from machine-code to structured programming to ever-more-abstract high-level programming languages -- and I thought AI would look like the highest-level programming language one could imagine.
In some sense this is not wrong. Telling the machine what to do in plain natural language is the highest-level programming language one could imagine.
However, naive extrapolation of ever-more-sophisticated programming languages might lead one to anticipate convergence between compilers and computational linguistics, such that computers would be understanding natural language with sophisticated but well-understood parsing algorithms, converting natural-language statements to formal representations resembling logic, and then executing the commands via similarly sophisticated planning algorithms.
The reality is that computational linguistics itself has largely abandoned the idea that we can make a formal grammar which captures natural language; the best way to parse a bunch of English is, instead, to let machine learning "get the idea" from a large number of hand-parsed examples! Rather than bridging the formal-informal divide by fully formalizing English grammar, it turns out to be easier to formalize informality itself (ie, mathematically specify a model of messy neural network learning) and then throw the formalized informality at the problem!
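(As a toy illustration of "formalising informality": instead of hand-writing grammar rules, you mathematically specify a learning procedure and hand it labelled examples. Everything below -- the words, the features, and the use of logistic regression as a stand-in for a neural network -- is made up purely to keep the sketch short.)

```python
import math
import random

# Hand-labelled examples instead of hand-written grammar rules:
# (word, 1 if it is a noun, else 0). Purely illustrative data.
examples = [("dog", 1), ("table", 1), ("idea", 1), ("cat", 1),
            ("walked", 0), ("quickly", 0), ("greenly", 0), ("jumped", 0)]

def features(word):
    # Crude hand-chosen features; how to *combine* them is what gets learned.
    return [1.0, float(word.endswith("ly")), float(word.endswith("ed")), len(word) / 10]

weights = [0.0, 0.0, 0.0, 0.0]

def predict(word):
    z = sum(w * f for w, f in zip(weights, features(word)))
    return 1 / (1 + math.exp(-z))   # probability that `word` is a noun

# The "formalised informality": a mathematically specified learning process
# (stochastic gradient descent on log-loss), not a formal grammar of English.
for _ in range(5000):
    word, label = random.choice(examples)
    error = predict(word) - label
    weights = [w - 0.1 * error * f for w, f in zip(weights, features(word))]

print(predict("house"), predict("slowly"))   # high vs. low noun-probability
```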
Weird stuff.
However, at some point I did get the idea and make the update. I think it was at the 2012 AGI conference, where someone was presenting a version of neural networks which was supposed to learn interpretable models, due to the individual neurons implementing interpretable functions of their inputs, rather than big weighted sums with a nonlinear transform thrown in. It seemed obvious that the approach would be hopeless: as the models got larger and larger, they would be no more interpretable than any other form of neural network. I had the startling realization that this same argument seems to apply to anything, no matter how logic-like the underlying representation: it will become an opaque mess as it learns the high complexity of the real world.
Agreed.
It seems relevant, to the progression, that a lot of human problem solving -- though not all -- is done by the informal method of "getting exposed to examples and then, somehow, generalising". (And I likewise failed to appreciate this, though I am not sure until when.) This suggests that if we want to build AI that solves things in ways similar to how humans solve them, "magic"-involving "deepware" is a natural step. (Whether building AI in the image of humans is desirable is a different topic.)
New nomenclature may be helpful, but after 43 years in this crazy field, I dare say the preponderance of new nomenclature has only served to further obfuscate and cloud understanding. Here in the wild west of early LLMs and "AI" (I fully agree with your analysis of the inappropriateness of this term), it might be warranted, but perhaps a little early.
Good nomenclature is grounded in terminology or phraseology that points to its origin(s), and thereby provides some semblance of where it came from or what it's describing. The world has enough inane software-naming barbarisms like "Chocolatey", but 'Deepware' does in fact ring appropriate.
That is, in hopes that it doesn't get 'adjectivised' into abject disgrace by pr0n-minded dilettantes.
Good article.
The other critical thing that the current terms seem to miss is the deep and inscrutable nature of the systems
I have one comment here. Electronic systems that have a fault, and low-level software constructs with a sporadic fault, can feel inscrutable when you are trying to chase down the bug. This also happens when you have poor layering in a software stack: the stack becomes 'deep' and difficult to debug because the layers are fused together, and so a bug can be at any layer (i.e. it could be application / runtime libs / userspace driver / kernel driver / comm protocol / PCIe switch / firmware, or a silicon bug). SWEs can separate systems into isolated layers using various techniques; the layering for TCP/IP is a famous example.
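(To illustrate that kind of layer separation with a toy sketch -- not any particular real stack, and all the names below are made up -- each layer only talks to the one beneath it through a narrow interface, so you can swap in a fake lower layer and localise a bug to a single layer.)

```python
from typing import Protocol

class Transport(Protocol):
    def send(self, payload: bytes) -> None: ...

class RealNic:
    """Bottom layer: would actually touch hardware (stubbed out here)."""
    def send(self, payload: bytes) -> None:
        print(f"NIC sending {len(payload)} bytes")

class Framing:
    """Middle layer: only knows about the Transport interface below it."""
    def __init__(self, transport: Transport) -> None:
        self.transport = transport
    def send_message(self, msg: str) -> None:
        data = msg.encode()
        self.transport.send(len(data).to_bytes(4, "big") + data)

class RecordingTransport:
    """Fake lower layer, used to test Framing in complete isolation."""
    def __init__(self) -> None:
        self.sent = []
    def send(self, payload: bytes) -> None:
        self.sent.append(payload)

fake = RecordingTransport()
Framing(fake).send_message("hello")
assert fake.sent[0][4:] == b"hello"   # any failure here is a Framing bug, not a NIC bug
```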
Early electronics, using less reliable gates, were also like this. Vacuum tubes blew often.
LLMs are effectively trained on a bunch of trash - whatever human text they were trained on - and so there's a lot of dirt in there. Unreliable behavior is always going to be the norm, same as early electronics.
"Deepware" trained on synthetic data may be better. (have an LLM generate a bunch of text with a consistent voice and grounded reasoning, and train another one on that, as an example. this is also bootstrapping -> early unreliable electronics controlled the machines that made their successors)
There also may be a way to layer 'deepware' into accountable and separable systems.
With that said, yes it's a new paradigm. Just don't assume the problems of today are eternal to the field.
Summary: { dirty human text is similar to impure materials in early electronics, and deep networks that perform all functions have no layer separation and are not debuggable, similar to early software. Neither of these is the standard now. }
Good point, and I agree that it's possible that what I see as essential features might go away - "floppy disks" turned out to be a bad name when they ended up inside hard plastic covers, and "deepware" could end up the same - but I am skeptical that they will.
I agree that early electronics were buggy until we learned to build them reliably - and perhaps we can solve this for gradient-descent-based learning, though many are skeptical of that, since many of the problems have been shown to be pretty fundamental. I also agree that any system is inscrutable until you understand it, but unlike early electronics, no one understands these massive lists of numbers that produce text, and human brains can't build them; they can only program a process to grow them. (Yes, composable NNs could solve some of this, as you point out when mentioning separable systems, but I still predict they won't be well understood, because the components individually are still deepware.)
I can say from experience that no one "understands" complex software and hardware systems. Any little bug can take weeks to isolate and patch, and you end up in calls with multiple domain specialists. For rare non-deterministic bugs, you will never find the cause in the lifetime of the project.
This means you need to use black-box methods for testing and reliability analysis. You empirically measure how often it fails; you don't know how often it will fail by looking at the design.
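(A minimal sketch of that kind of black-box reliability measurement -- the flaky system here is just a stand-in: you run the thing many times, count failures, and report a rate with an error bar, without ever looking inside.)

```python
import math
import random

def estimate_failure_rate(system_under_test, n_trials=10_000):
    """Black-box reliability estimate: run the system repeatedly, count
    failures, and never look at how it works inside."""
    failures = sum(1 for _ in range(n_trials) if not system_under_test())
    p = failures / n_trials
    # 95% confidence half-width (normal approximation to the binomial).
    half_width = 1.96 * math.sqrt(p * (1 - p) / n_trials)
    return p, half_width

# Stand-in for an opaque system that fails roughly 2% of the time.
flaky_system = lambda: random.random() > 0.02
print(estimate_failure_rate(flaky_system))   # e.g. (0.0198, 0.0027)
```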
These same methods apply mostly unchanged to AI. The only major difference I see with AI is the possibility of coordinated failure, where multiple AI systems conspire to all fail at the same time. This implies that AI alignment may ultimately be based around the simple idea of reducing complex AI systems that can fail in a coordinated way to simpler AI systems that fail independently. (Note this doesn't mean it will be simple to do. See what humans have to do to control fire: just 3 ingredients you have to keep apart, yet you need all this infrastructure.)
This is the "pinnacle" of software engineering at present : rip apart your software from this complex, coupled thing to a bunch of separate simpler things that are each testable, and a modern flavor is to do things like make every element in a complex GUI use a separate copy of the JavaScript libraries.
Perhaps I'm missing some key things here. While I can see the point that calling ML/AI software is likely both something of a misclassification and a bit misleading for many purposes, I'm not really finding the approach helpful.
My (albeit naive and uninformed) view is that the distinction is really between an algorithm and software. Before ML, my take is, algorithms were very specialized elements of software, e.g., efficient sort functions. The big move seems to have been into the world of generalized algorithms that let software kind of do things on its own, rather than just what a programmer specifically "tells" the software to do.
I agree that previous methods for analysing software largely break down for LLMs and other ML systems trained on lots of data. Some software tooling can be used effectively in compiling, deploying, hosting, and managing models, but what goes on within them or how to tune them is different. I'm unhappy with the name choice though. Not only for the reason given by Gerald Monroe that the "deepness" or inscrutability was also present in previous technologies, but also because it doesn't lead to an easy adjective. Consider this:
I think it may be too early to give the new level a new name, as we don't know what the key adjective will be. It could be generative, predictive, even autonomous. Just going by what we see right now, generative seems like a good candidate. It's even used in many contexts and fits the pattern: Generative software.
It should be called A-ware, short for Artificial-ware, given the already massive popularity of the term "Artificial Intelligence" to designate "trained-rather-than-programmed" systems.
It also seems more likely to me that future products will contain some AI sub-parts and some traditional-software sub-parts (rather than being wholly one or the other), with one or the other utilized depending on context. We could call such a system Situationally A-ware.
A few weeks ago, I (David) tried to argue that AI wasn’t software. In retrospect, I think I misstated the case by skipping some inferential steps. Based on feedback from that article and on an earlier version of this one, and with a large assist by Abram Demski, I’m going to try again.
The best response to my initial post, by @gjm, explained the point more succinctly and better than I had: “An AI system is software in something like the same way a human being is chemistry.” And yes, of course the human body is chemistry. So in that sense, I was wrong - arguing that AI isn’t software is, in some sense, arguing that the human body isn’t chemistry. But the point was that we don’t think about humans in terms of chemistry.
“The Categories Were Made For Man, Not Man For The Categories.” That is, what we call things is based on inference about which categories should be used, and what features are definitive or not. And any categorization has multiple purposes, so the question of whether to categorize things together or separately collapses many questions together.
The prior essay spent time arguing that software and AI are different. The way software is developed is different from the way AI is developed, and the way software behaves and how it fails is different from how AI behaves and fails. Here, I’ll add two more dimensions: that the people and the expertise for AI are different than those for software, and that AI differs from software in ways similar to how software differs from hardware. In between those two, I’ll introduce a conceptual model from Abram Demski, explaining that AI is a different type of tool than software, which captures much more of the point.
Based on that, we’ll get to the key point that was obscured initially: if AI is a different type of thing, what does that imply about it - and less importantly, what should we name the new category?
Who does what?
If we ask a stranger what they do, and they say chemistry, we would be surprised to learn that they were a medical doctor. On the other hand, if someone is a medical doctor, we expect them to know a reasonable amount about biochemistry.
I have a friend who was interested in business, but did a bachelor’s in scientific computational methods. He went on to get an MBA, and did research on real-time pricing in electrical markets - a field where his background was essential. He told me once that as an undergrad, he managed As in his classes, but got weird looks when he asked what a compiler was, and how he was supposed to run code on his computer. He wasn’t a computer scientist, he was just using computer science. Computational numerical methods were a great tool for him, and useful for understanding financial markets, but he certainly wouldn’t tell people he was a computer scientist or mathematician. These two domains are connected, but not the same.
Returning to the earlier question, software and AI are connected. If someone says they do software development, we would be surprised if they mainly published AI research. And this goes both ways. The skills needed to do AI research or build AI systems sometimes require a familiarity with software development, but other times they do not. There are people who do prompt engineering for language models who can’t write any code - and their contributions are nonetheless absolutely vital to making many AI systems work. There are people who do mathematical analysis of deep learning, who can explain the relationship between different activation functions and model structures and how that affects how they converge, and who also don’t write code. People who write code may or may not work with AI, but everyone who does prompt engineering for LLMs or mathematical analysis of deep learning is doing work with AI.
What Kind of Tool is AI?
Abram suggests that we can make a rough accounting of shifting technological paradigms as follows: Tools > Machines > Electric > Electronic > Digital.
Each of these is largely but not entirely a subset of the prior level. Yes, there are machines that aren’t really tools, say, because they are toys, and yes, there are electric or electronic systems that aren’t machines in the mechanical or similar senses. Despite this, we can see a progression - not that when machines were invented people stopped using tools, or that digital devices replaced earlier devices, but that they are different.
What makes each category conceptually different? Each shift in paradigm is somewhat different, but we do see a progression. We might still ask what defines this progression, or what changes between levels. A full account would need its own essay, or book, and the devil is in the details, but some common themes here are increasing complexity, increasing automation, (largely) diminishing size of functional components, and asking less from humans (first, less time and energy; later, less information).
The shift from "electric" to "electronic" seems complex, but as electrical components got smaller and more refined, there was a shift away from merely handling energy and toward using electricity for information processing. If I ask you to fill in the blank in "electric ____" you might think of an electric lightbulb, electric motor, or electric kettle, appliances which focus primarily on converting electricity to another form of energy. And if I ask you to fill in the blank in "electronic ____" you might think of an electronic calculator, electronic thermometer, or electronic watch. In each case, these devices are more about information than physical manipulation. However, this is not a shift from using electric current to using electrons, as one early reader suggested. Both use electricity, but we start to see a distinction or shift in conceptual approaches from "components" like resistors, magnets, motors, and transistors, to "circuits'' which chain components together in order to implement some desired logic.
Shifting from the electronic paradigm to the digital one, we see the rise of a hardware/software distinction. Pong was (iirc) designed as a circuit, not programmed as software -- but video games would soon make the switch. And "programming" emerges as an activity separate from electrical engineering, or circuit design. "Programmers" think about things like algorithms, logic, and variables with values. Obviously all of these are accomplished in ways logically equivalent to a circuit, but the conceptual model changed.
Hardware, software, and... deepware?
In a comment, Abram noted that a hardware enthusiast could argue against making a software/hardware distinction. The idea of "software", the argument goes, is misleading because it distracts from the physical reality. Even software is still present physically, as magnetic states in the computer’s hard drive, or in the circuits. And obviously, software doesn't do anything hardware can't do, since software doing something is just hardware doing it. This could be considered different from previous distinctions between levels; a digital calculator is doing something an electric device can’t, while an electric kettle is just doing what another machine does by using electricity instead of some chemical fuel.
But Abram pointed out that thinking in this way will not be a very good way of predicting reality. The hypothetical hardware enthusiast would not be able to predict the rise of the "programmer" profession, or the great increase in complexity of things that machines can do thanks to "programming".
The argument is that machine learning is a shift of comparable importance, such that it makes more sense to categorize generative AI models as "something else" in much the same way that software is not categorized as hardware (even though it is made of physical stuff).
It is more helpful to think of modern AI as a paradigm shift in the same way that the shift from "electronic" (hardware) to "digital" (software) was a paradigm shift. In other words: the digital age has led to the rise of generative AI, in much the same way that the electric age enabled the rise of electronics. One age doesn’t end, and we're still using electricity for everything (indeed, for even more things), but "electric" stopped being the most interesting abstraction. Now, a shift to deep learning and AI means that things like "program", "code", and "algorithm" are starting not to be the best or most relevant abstraction either.
Is this really different?
When seeing the above explanation, @gjm commented that “I suppose you could say a complicated Excel spreadsheet monstrosity is ‘software’ but it's quite an unusual kind of software and the things you do to improve or debug it aren't the same as the ones you do with a conventional program. AI is kinda like these but more so.”
The question is whether "more so" is an evolutionary or revolutionary change. Yes, toasters are different from generators, and the types of things you do to improve or debug them are different, but there is no conceptual shift. You do not need new conceptual tools to understand and debug spreadsheets, even if they are the types of horrendous monstrosities I've worked with in finance. On the other hand, there have obviously been smaller paradigm shifts within the larger umbrella of software, from assembly to procedural programming to object-oriented programming and so on. And these did involve conceptual shifts and new classes of tools; type systems and type checking were a new concept when shifting from machine-code programming in assembly to more abstract programming languages, even though bits, bytes, and words were conceptually distinct in machine code.
It could be debated which shifts should be considered separate paradigms, but the shift to deep learning required a new set of conceptual tools. We need to go back to physics to understand why electronic circuits work; they don’t really work just as analogies to mechanical systems. @gjm explained this clearly: “We've got a bunch of things supervening on one another: laws of physics, principles of electronics, digital logic, lower-level software, higher-level software, neural network, currently-poorly-understood structures inside LLMs, something-like-understanding, something-like-meaning. Most of the time, in order to understand a higher-level thing it isn't very useful to think in terms of the lower-level things.”
This seems to hit on the fundamental difference. When a new paradigm supervenes on a previous one, it doesn't just add to it, or logically follow. Instead, the old conceptual models fail, and you need new concepts. So type theory, for example, can be understood via concepts that are coherent with the terms we already use to talk about logic and debugging in earlier programming. On the other hand, programming, as opposed to circuit design or electronics, required a more fundamental regrounding. The new paradigm did not build further on physics and extend mathematical approaches previously used for analog circuit design. Instead, it required the development of new mathematical formalisms and approaches - finite state machines and Turing completeness for programs, first-order predicate logic for databases, and similar. The claim here is that deep learning requires a similar rethinking, not just building conceptual tools on top of those we already have.
What’s in a Name?
Terminology can be illuminating or obscuring, and naming the next step in technological progress is tricky. Electronics is used as a different word than electric, but it’s not as though electrons are more specifically involved; static electricity, resistors, PN junctions, and circuits all involve electrons. Similarly, “software” is not a great name to describe a change from electronic components to data, but both terms stuck. (I clearly recall trying to explain to younger students that they were called floppy disks because the old ones were actually floppy; now, the only thing remaining of that era is the icon that my kids don’t recognize as representing a physical object.)
Currently, we seem to have moved from calling these new methods and tools “machine learning” to calling them “AI.” Both indicate something about how this isn’t software but something different, yet neither term really captures the current transition. The product created by machine learning isn’t that the machine was learning; it’s that the derived model can do certain things on the basis of what it infers from data. Many of those things are (better or different versions of) normal types of statistical inference, including categorization, but not all of them. And calling ML statistics misses the emergent capabilities of GANs, LLMs, diffusion models, and similar.
On the other hand, current “AI” is rightly considered neither artificial nor intelligent. It’s not completely artificial, because other than a few places like self-play training for GANs, it’s trained on human expertise and data. In that way, it’s more clearly machine learning (via imitating humans). It’s also not currently intelligent in many senses, because it’s non-agentic and very likely not conscious. And in either case, it’s definitely not what people envisioned decades ago when they spoke about “AI.”
The other critical thing that the current terms seem to miss is the deep and inscrutable nature of the systems. There are individuals who understand at least large sections of every large software project, which is necessary for development, but the same is not, and need never be, true for deep learning models. Even to the extent that interpretability or explainability is successful, the systems are far more complex than humans can fully understand. I think that “deepware” captures some of this, and I am indebted to @Oliver Sourbut for the suggestion.
Conclusion
Deep learning models use electricity and run on computers that can be switched on and off, but are not best thought of as electric brains. Deep learning models run on hardware, but are not best thought of as electronic brains. Deep learning models are executed as software instructions, but are not best thought of as software brains. And in the same way, software built on top of these deep learning models to create “AI systems” provides a programming interface for the models. But these are not designed systems; they are inscrutable models grown on vast datasets.
It is tempting to think of the amalgam of a deep learning model and the software as a software product. Yes, it is accessed via API, run by software, on hardware, with electricity, but instead of thinking of software, hardware, or electrical systems, we need to see them for what they are. That doesn’t necessarily mean the best way of thinking about them is as inscrutable piles of linear algebra, or as shoggoths, or as artificial intelligence, but it does mean seeing them as something different, and not getting trapped in the wrong paradigm.
Thanks to @gjm for the initial comment and the resulting discussion, to @zoop for his disagreements, and to both for their feedback on an earlier draft. Thanks to @Gerald Monroe and @noggin-scratcher for pushback and conversation on the original post. Finally, thanks to @Daniel Kokotajlo for initially suggesting "deepnets" and again to @Oliver Sourbut for suggesting "deepware".