o1 has shown a strange behavior where it sometimes thinks in Mandarin while processing English prompts, then translates the result back to English for the output. It occurred to me that humans might be able to use the same trick, speeding up conscious thought. [1]
What makes Mandarin useful for this is that it:
- Has compact tokens (see the rough token-count sketch after this list)
- Has compact grammar
- Has abundant training material online
- Can be used on computer systems easily (Unicode support)
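To give a rough sense of what "compact tokens" means here, the sketch below counts tokens for an English sentence and a Mandarin sentence with approximately the same meaning. The example sentences, the use of the tiktoken library, and the choice of encoding are my own assumptions for illustration; nothing here is measured from o1 itself.

```python
# Rough comparison of token counts for an English sentence and a Mandarin
# sentence with approximately the same meaning. Requires the `tiktoken`
# package; "o200k_base" is the encoding used by recent OpenAI models.
import tiktoken

enc = tiktoken.get_encoding("o200k_base")

samples = {
    "English": "I would like to learn a new language specifically for thinking.",
    "Mandarin": "我想专门学一门新语言，用来思考。",
}

for name, text in samples.items():
    token_ids = enc.encode(text)
    print(f"{name}: {len(text)} characters -> {len(token_ids)} tokens")
```

Token counts vary with the encoding and the sentences chosen, so this only gestures at density; it says nothing about whether denser tokens make thought faster.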
| Language[2] | English | Mandarin | Toki Pona[3] | Latin | Google Translate Intermediate[4] |
|---|---|---|---|---|---|
| Compact Tokens | ❌ | ✅ | ✅ | ❌ | ✅✅ |
| Compact Grammar | ❌ | ✅ | ❌ | ✅ | ✅✅ |
| Significant Training Material | ✅✅ | ✅✅ | ❌ | ✅ | ❌ |
| Software Support | ✅✅ | ✅ | ❌ | ✅ | ❌ |
| Human Learning Ability | ✅ | ✅ | ✅✅✅ | ✅ | ❌ |
As a Jew, I learned the Hebrew alphabet (but not the vocabulary) while studying for my Bar Mitzvah, and as a student of US public education I had the choice in secondary school of learning Spanish, French, or German, in addition to learning English natively. I chose German, and I am very unlikely to switch or stop learning it, but I wonder whether it would be useful to learn a new language specifically to think in. That goal imposes different requirements than learning a language the traditional way: reading and writing would matter much less, while knowing many different words and using correct grammar would matter much more.
Brain-machine interfaces installed in one's brain, ending the need for language altogether, would be a major improvement to human society, but intentional control of thought[5] via language could bring much the same effect. Aside from the normal cognitive benefits of being bilingual or multilingual, would it be useful to learn some new language (or a conlang built for this purpose) specifically to have one's conscious thought in?
- ^
https://techcrunch.com/2025/01/14/openais-ai-reasoning-model-thinks-in-chinese-sometimes-and-no-one-really-knows-why/
- ^
These ratings are not empirical or quantitative in any way; they are just my general impressions of each language. The other ideas expressed in this post are severable from this chart.
- ^
An intentionally from-scratch constructed language (conlang) with a very small vocabulary and character set
- ^
The intermediate language used by Google Translate: all messages are first translated into this language, then translated into the output language. This method requires only two models per language (one into and one out of the intermediate), rather than the quadratically growing number needed for a dedicated model between every language pair; for example, 100 languages would need roughly 200 pivot models versus about 9,900 direct pair models (see the small counting sketch after these footnotes).
- ^
The idea of not learning certain words, as a way to make certain concepts slower to conceive, has occurred to me, but it seems like a bad idea for obvious reasons.
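For footnote 4's claim about model counts, here is a toy sketch of the arithmetic; the language codes are invented for illustration, and nothing here reflects how Google Translate is actually implemented.

```python
# Toy counting argument: with n languages, a dedicated model for every ordered
# (source, target) pair grows quadratically, while routing everything through
# one intermediate language needs only two models per language.
languages = ["en", "zh", "de", "es", "fr", "he", "la", "tok"]
n = len(languages)

direct_models = n * (n - 1)  # one model per ordered language pair
pivot_models = 2 * n         # into the intermediate and out of it, per language

print(f"{n} languages: {direct_models} direct pair models vs {pivot_models} via an intermediate")
```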
On a more substantive note:
Not sure if this is exactly what you had in mind, since it's fictional transhumanist tech, but I was reminded of this passage from Richard Ngo's recent short story The Gentle Romance:
That last link goes to Kurt Vonnegut on the 8 “shapes” of stories. The story is that Vonnegut wrote a master’s thesis on the shapes of stories and submitted it to the anthropology department at the University of Chicago, which rejected it. Here's a YouTube video of him talking about it; below is an infographic from that article:
That said, Richard's Vonnegut-inspired fictional tech is about communicating narratives efficiently, not precise facts or statistics. For that, Gwern's On the Existence of Powerful Natural Languages persuaded me that you can't really have powerful general-purpose conlangs that boost cognition across a wide variety of domains.