No, no, a thousand times no. This is a huge step backwards for UI. This is taking us back to the UI of old-school text adventure games, where I have to guess what specific words the inscrutable interpreter is looking for in order to do what I want it to do.
I do not want the system to "do what I mean", I want the system to do exactly what I tell it to do. In practice, every system that attempts to "do what the user means" ends up becoming an extremely janky command-line system with syntax that makes tcsh look sane.
Instead of constantly chasing new interaction paradigms like novelty-addicted squirrels, I would much rather UI designers and developers spend time improving the performance and organization of existing systems.
The AI understands what you mean by your words, rather than you having to put thought into what keywords to use, since it understands natural language.
This sounds like it would be “natural” to use, but it would not be, because translating intention into language is cognitively effortful, and very unnatural for a very wide array of action types.
I often do not think in words about what I want to do, or want done. Indeed I often don’t think about doing the thing at all, I just do it, and insofar as there’s cognition to be done, it’s done as part of the action, transparently.
Having to translate everything into words would dramatically narrow the cognitive bandwidth between me and the effects I can accomplish with my various technological tools.
A lot of people, including me, sometimes think in words, and can otherwise translate into words effortlessly, so I don't think it's a rule that people would have to think about it too much.
Eventually, I think, everyone would acclimatize, and instead of effortlessly doing the thing, they would learn to effortlessly command the AI.
It's an interesting point I hadn't considered before.
Edit: I also like how both our comments are correctness-strong-downvoted by a single person, yet we more or less contradict each other. Oh, well.
Well, while it’s unlikely that we’re both right, so long as our views are not literally logical negations of each other it is surely possible for us to both be wrong…
In addition to what Said Achmiz wrote, I would also add that an AI that unerringly knows what I mean is a superhuman intelligence.
People have to clarify their instructions to other people all the time, and in a non-trivial number of instances, the person giving the instruction gets frustrated and says something to the effect of, "It would be faster if I'd just done it myself."
You may not realize it, but our existing UI/UX paradigms are mainly from the 1970s
For some historical context which really shocked me, behold The Mother of All Demos from 1968: YouTube video, Wikipedia page. It's a 100-minute video, but you can skip lots and watch at >>1x speed.
Quote from Wikipedia:
"The Mother of All Demos" is a name retroactively applied to a landmark computer demonstration...
The live demonstration featured the introduction of a complete computer hardware and software system... The 90-minute presentation demonstrated for the first time many of the fundamental elements of modern personal computing: windows, hypertext, graphics, efficient navigation and command input, video conferencing, the computer mouse, word processing, dynamic file linking, revision control, and a collaborative real-time editor. Engelbart's presentation was the first to publicly demonstrate all of these elements in a single system. The demonstration was highly influential and spawned similar projects at Xerox PARC in the early 1970s. The underlying concepts and technologies influenced both the Apple Macintosh and Microsoft Windows graphical user interface operating systems in the 1980s and 1990s.
Perhaps we ought to start thinking about adding novelties like natural language AI assistants after our current UIs match those shown in that video.
The underlying concepts and technologies influenced both the Apple Macintosh and Microsoft Windows graphical user interface operating systems in the 1980s and 1990s.
As far as ordinary users are concerned, graphical UI is from the late eighties or early nineties, when it could be implemented affordably. It was arguably ahead of its time in the seventies and sixties.
I continue to expect that I will prefer to control my computer with formal grammars - I have spent significant time using Caster speech recognition with Dragon, and I'm sure I'll keep doing so with OpenAI Whisper. Nobody has ever beaten the CLI and nobody ever will [edit: to clarify, I believe it is a weakness of current AI that it struggles to take formal grammar seriously].
Now, if an AI could automatically generate interfaces that standardize messy UIs into coherent formal grammars that fit comfortably in a CLI workflow, that would be amazing...
Also, I want a CLI that shows me what commands are available at any step.
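Here's a toy sketch of what I mean: a hand-written grammar (every command name below is made up for illustration; a real system would generate the grammar from the application's actions) and a loop that prints the valid next words at each step.

```python
# Toy sketch: a CLI that shows the valid next words at every step.
# The grammar is a hand-written nested dict with made-up command names;
# a real system would derive it from the application's actions.

COMMANDS = {
    "wifi": {"on": None, "off": None, "status": None},
    "volume": {"up": None, "down": None, "mute": None},
    "timer": {"start": None, "cancel": None},
}

def next_words(tokens, grammar=COMMANDS):
    """Return the valid next words after `tokens`, [] if the tokens already
    form a complete command, or None if they don't match the grammar."""
    node = grammar
    for tok in tokens:
        if not isinstance(node, dict) or tok not in node:
            return None
        node = node[tok]
    return sorted(node) if isinstance(node, dict) else []

def repl():
    print("commands:", ", ".join(sorted(COMMANDS)), "('quit' to exit)")
    while True:
        tokens = input("> ").split()
        if tokens == ["quit"]:
            break
        options = next_words(tokens)
        if options is None:
            print("no such command")
        elif options:
            print("next:", ", ".join(options))
        else:
            print("ok, would run:", " ".join(tokens))

if __name__ == "__main__":
    repl()
```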
Yeah, eventually keyboard CLIs will be beaten, but even with Whisper I expect to sometimes prefer the keyboard. It's just that hard to beat CLIs.
(Inspired by a spirited discussion on #lesswrong)
You may not realize it, but our existing UI/UX paradigms are mainly from the 1970s, when human-oriented computing was still in its dark ages. Menus, forms, check boxes, search bars, file systems, you name it. They are fine tools, polished by decades of doing the best we could with the limited computing power we had, but they are still very much dated. Various attempts to introduce better UX have fallen flat due to limitations of algorithms, memory and compute. Bold attempts like Microsoft Bob and Clippy failed because they made the user experience worse, not better.
Things are different now. Natural language processing in limited domains is good enough, or nearly good enough, to provide a spoken or typed DWIM ("do what I mean") interface without forcing the user to go through levels of menus (usually poorly designed by the developers). This is not to replace the menus or the check boxes, but to add another way to do something. Here are some things you might say or type in an ever-present "do box" (not just a "search box"), some of which already sort of work on some gadgets:
Google Assistant, Siri, and their ilk can already recognize and interpret many of those, and execute some if asked just right, but it is still an afterthought most of the time, compared to the traditional way of doing things. My point is that the technology is now good enough to make a natural language interface a convenient way to do things that would otherwise require digging through menus and settings, or searching for solutions online. But the UI/UX mindset is still stuck in the old days, and it is basically hit-or-miss.
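As a very rough sketch of what a "do box" might look like under the hood (nothing here is a real API; the action names are invented, and the keyword matcher merely stands in for an actual language model), the flow would be: free-form text in, a structured action out, and only then execute it through the same code path the menus already use.

```python
# Rough sketch of a "do box": free-form text in, a structured action out,
# then dispatch through the same code path the menus already use.
# Every action name here is made up for illustration, and the keyword
# matcher stands in for whatever NLP model a real implementation would call.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Action:
    name: str       # e.g. "set_wifi"
    argument: str   # e.g. "off"

def parse_request(text: str) -> Optional[Action]:
    """Map free-form text to a structured Action, or None if we can't tell.
    A real do box would call a language model here instead of keyword rules."""
    t = text.lower()
    if "wifi" in t or "wi-fi" in t:
        return Action("set_wifi", "off" if "off" in t else "on")
    if "dark mode" in t:
        return Action("set_dark_mode", "off" if "off" in t else "on")
    return None

def handle(text: str) -> str:
    action = parse_request(text)
    if action is None:
        return f"Sorry, I don't know how to: {text!r}"
    # The do box is another way in, not a replacement for menus and check boxes,
    # so it should invoke the same underlying setting-change code they do.
    return f"executing {action.name}({action.argument})"

if __name__ == "__main__":
    for request in ["turn the wifi off, please", "enable dark mode", "order a pizza"]:
        print(f"{request!r} -> {handle(request)}")
```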
This is a bit ironic, given that this UI/UX is front and center in sci-fi movies and shows, where, in addition to pressing some weird symbols on a colorful Star Trek touch panel, one can also say "Computer, find all Ferengi ships nearby and list those that recently stopped at Risa" or "set course for Starbase 23" or "recalibrate the tachyon sensors".
My hope is that the UI/UX paradigm will evolve to take advantage of the new AI capabilities and will require less memorization and digging through the forest of settings hidden in the bowels of each app or device.
Of course, the next logical step is a version of Scott Alexander's Whispering Earring, though maybe not as sinister.