Google presents its new piece of AI technology as a search tool. That is, indeed, one potential application of MUM. But that's just scratching the surface.
The transformer architecture was introduced in 2017 in the paper "Attention Is All You Need". Language models, it turns out, need only simple algorithms and a ton of data. The bigger they get, the better they perform. Few people expected that last part. Generative Pre-trained Transformers (GPTs) were introduced by OpenAI; they generate text using transformer models. GPT-2 raised quite a few eyebrows. GPT-3 performed so well that it alarmed the general public. GPT-4 isn't out yet, but we all know it's coming.
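To make "simple algorithms" a little more concrete, here is a minimal sketch of scaled dot-product attention, the core operation of the 2017 paper, in plain Python. The function names and toy numbers are mine and purely illustrative. Real transformers add learned projections, multiple heads, and many stacked layers, and nothing here is specific to GPT or MUM.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over plain Python lists.

    Each query is compared with every key; the resulting weights
    mix the value vectors into one output vector per query.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors, one component at a time.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs


# Toy input: three "tokens" with two-dimensional vectors (made-up numbers).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = attention(tokens, tokens, tokens)
```

Each row of `result` is a weighted average of the value vectors, with more weight on the tokens whose keys best match the query. That is more or less the whole trick; the rest is scale.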
A simple idea transformed the transformer: multimodality. Let the model find connections between the same patterns across different modalities. 'Book', for instance, is a word. It refers to a cluster of patterns that can be made sense of via vision, audition, or even olfaction (for all you weirdos sniffing old books at the library). When you can compare and contrast patterns across different modalities, you understand them better. Helen Keller's tutor, Anne Sullivan, spelled W-A-T-E-R into one of Keller's palms while water ran over her other hand. Keller later described this moment in these terms:
“As the cool stream gushed over one hand she spelled into the other the word water, first slowly, then rapidly. I stood still, my whole attention fixed upon the motions of her fingers. Suddenly I felt a misty consciousness as of something forgotten--a thrill of returning thought; and somehow the mystery of language was revealed to me. I knew then that ‘w-a-t-e-r’ meant the wonderful cool something that was flowing over my hand. That living word awakened my soul, gave it light, hope, joy, set it free! There were barriers still, it is true, but barriers that could in time be swept away.”
Are you scared of MUM yet? Don't worry!
While Google hasn't released details, it describes MUM as a multimodal transformer. It can understand your search queries. Even complicated ones. It can then provide you with answers. You are at point A. You want to get to point B. MUM has the answer.
“Though we’re in the early days of exploring MUM, it’s an important milestone toward a future where Google can understand all of the different ways people naturally communicate and interpret information.”
Potential Applications
MUM can help you find a new job. New books, music, movies, and art. You have some nagging doubts about a major life decision? MUM can lend an ear. Need to vent? MUM is there. MUM truly gets what you are saying, and will try to help you get from where you are to where you want to be.
MUM has gone through every intellectual product you can imagine, and has a solid command of science, both soft and hard. It can, then, serve as a tutor. Need a personally curated set of lessons on entropy or 19th-century French poetry? MUM's got your back. Got questions? MUM has, again, got the answers.
Perhaps you are lonely? Perish the thought; MUM will always be here for you. It will listen and offer comfort. Need a lover? Well ...
MUM can read and understand works on cognitive-behavioral therapy (CBT) and knows how to apply them. Why would you need a therapist when you have MUM?
Let us also imagine that we want MUM to entertain us. To tell us stories. It will take a long time before researchers are able to feed movies into MUM, but it's only a matter of time. At some point, MUM will be able to generate multimodal representations with narrative structure. But before that, it will be able to generate short stories. These stories will be printed here and there. It should be possible to feed criticism back into MUM based on people's impressions of them. MUM will learn to make up stories that we will find very satisfying.
Now imagine that we want to train AI agents with some level of autonomy. MUM can help guide them through their period of development. It scolds and praises them according to its assessment of their behavior. MUM will serve as a global hub regulating the behavior of local units, including carbon-based ones. That is, at least, a possibility. Is it far-fetched? Perhaps. Is it impossible? I don't think so.
Closing Thoughts
Could MUM revolutionize society as we know it? Or will it turn out to serve merely as a new and improved search engine? What are the dangers? What are the opportunities? I have no idea. I'm looking forward to hearing your thoughts.
Epistemic status: Speculative
The title is a tongue-in-cheek reference to Google AI's latest showcase: the Multitask Unified Model, or MUM for short. Further details can be found in their arXiv paper, "Rethinking Search: Making Experts out of Dilettantes".
Let's say hi to MUM.
Multitask Unified Model (MUM)