o1 has shown a strange behavior where it thinks in Mandarin while processing English prompts, then translates the results back into English for the output. I realized that humans might be able to use the same approach to speed up conscious thought.[1]
What makes Mandarin useful for this is that it:
- Has compact tokens
- Has compact grammar
- Has abundant training material online
- Can be used on computer systems easily (Unicode support)
| Language[2] | English | Mandarin | Toki Pona[3] | Latin | Google Translate Intermediate[4] |
|---|---|---|---|---|---|
| Compact Tokens | ❌ | ✅ | ✅ | ❌ | ✅✅ |
| Compact Grammar | ❌ | ✅ | ❌ | ✅ | ✅✅ |
| Significant Training Material | ✅✅ | ✅✅ | ❌ | ✅ | ❌ |
| Software Support | ✅✅ | ✅ | ❌ | ✅ | ❌ |
| Human Learning Ability | ✅ | ✅ | ✅✅✅ | ✅ | ❌ |
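The chart above is impressionistic (see footnote 2), but the "Compact Tokens" row is the easiest one to spot-check. Here is a minimal Python sketch, assuming the tiktoken library and an arbitrary sample sentence pair; the result depends heavily on the tokenizer and the text, so treat it as a way to test the claim rather than a confirmation of it.

```python
# Minimal sketch: count characters and BPE tokens for the same sentence in
# English and Mandarin. Assumes `pip install tiktoken`; the sentence pair is
# an arbitrary example, and results vary by tokenizer and text.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # a common OpenAI BPE encoding

samples = {
    "English": "Knowledge is power, and time is money.",
    "Mandarin": "知识就是力量，时间就是金钱。",  # rough equivalent of the line above
}

for language, text in samples.items():
    tokens = enc.encode(text)
    print(f"{language}: {len(text)} characters, {len(tokens)} tokens")
```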
As a Jew, I learned the Hebrew alphabet (but not the vocabulary) to study for my Bar Mitzvah, and as a student of US public education, I had the choice in secondary school to learn Spanish, French, or German in addition to my native English. I chose German, and I am very unlikely to switch or stop learning it, but I wonder whether it would be useful to learn a new language specifically to think in. This would pose somewhat different requirements than learning a language the traditional way: reading and writing would matter much less, while a broad vocabulary and correct grammar would matter much more.
Brain-machine interfaces installed in one's brain, ending the need for language altogether, would be a major improvement to human society, but intentional control of thought[5] via language could bring much the same effect. Aside from the normal cognitive benefits of being bilingual or multilingual, would it be useful to learn some new language (or a conlang built for this purpose) specifically to have conscious thought in?
- ^
https://techcrunch.com/2025/01/14/openais-ai-reasoning-model-thinks-in-chinese-sometimes-and-no-one-really-knows-why/
- ^
These ratings are not empirical or quantitative in any way, just my general impressions. The other ideas expressed in this post are severable from this chart.
- ^
An intentionally constructed from-scratch language (conlang) with a very limited character set and vocabulary.
- ^
The intermediate language used by Google Translate: all messages are first translated into this language, then translated into the output language. This method requires only 2 models per language, rather than the quadratically growing number needed for a model between each language pair (a quick counting sketch follows these footnotes).
- ^
The idea of not learning certain words, as a way to make certain concepts slower to conceive, has occurred to me, but this seems like a bad idea for obvious reasons.
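As a quick check of the model-count arithmetic in footnote 4, here is a minimal sketch; it is a counting exercise only, not a description of Google's actual system.

```python
# Counting sketch for footnote 4: with n languages, dedicated models for every
# ordered (source, target) pair grow quadratically, while routing everything
# through one intermediate language needs only two models per language.
def pairwise_models(n: int) -> int:
    return n * (n - 1)  # one model per ordered language pair

def pivot_models(n: int) -> int:
    return 2 * n  # per language: one model into the intermediate, one out of it

for n in (5, 20, 100):
    print(f"{n:>3} languages: {pairwise_models(n):>5} pairwise models "
          f"vs {pivot_models(n):>3} with an intermediate language")
```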