I would normally publish this on my blog, but I thought that LessWrong people might be interested in this topic. This sequence is about my experience creating a logical language, Sekko (still a work in progress).
What's a logical language?
Languages can be divided into two categories: natural languages, which arise... naturally, and constructed languages, which are designed. Constructed languages in turn fall into two categories: artlangs, which are designed as works of art or as supplements to a fictional world (think Quenya or Klingon), and engelangs (engineered languages), which are designed to fulfill a specific purpose.
Loglangs (logical languages) are a subset of engelangs. Not all engelangs are loglangs -- Toki Pona is a non-loglang engelang.
Despite the name, what constitutes a "loglang" is rather undefined. It's more of a "vibe" than anything. Still, I'll try to lay out various properties that loglangs have:
- Adherence to a logic system. Most logical languages have second-order logic. Toaq and Lojban have plural logic; Eberban has singular logic.
- Self-segregating morphology (SSM). SSM refers to a morphology (scheme for making word forms) such that word boundaries can be unambiguously determined. Several schemes exist: Lojban uses stress and consonant clusters, Toaq uses tone contours, Eberban and Sekko use arbitrary phoneme classification. Sekko's SSM will be discussed in later posts.
- Elimination of non-coding elements. Many natural languages have features which add complication but do not encode information. Examples include German grammatical gender, Latin's five declensions, or English number and person agreement (e.g. "I am" vs "You are" vs "He is"; "It is" vs "They are"). Loglangs aim to remove non-coding grammatical elements like these.
- Exceptionless grammar and syntactic unambiguity. Many natural languages have irregular forms. Examples include
- Machine parsability. Basically all loglangs are able to be parsed to check for syntactic correctness and correct scope of grammatical structures. Most loglangs (Loglan, Lojban, Toaq, and Eberban) have Parsing Expression Grammar (PEG) parsers.
- Ergonomics and extensibility. Loglangs try to have as ergonomic a grammar as possible -- one that gives access to the most semantic space for the least amount of grammatical complexity. For example, in basically all loglangs there are really only two parts of speech (aside from particles): predicates and arguments. Adjectives in English can simply be taken to be copular verbs (e.g. "X is beautiful").
- Audiovisual isomorphism (AVI). Loglangs will usually have a script or writing system (usually one based on Latin script comes first, and a completely novel script is devised later) that has audiovisual isomorphism. This means that speech and writing should correspond to each other phonetically -- the text must encode speech precisely (i.e. words are spelled as they are said). AVI only requires that the phonemic (significant or distinguished) speech features are encoded. For example, if your language does not distinguish between short and long vowels, it is unnecessary to have an encoding for vowel length.
Phonemic differences are those for which a language has pairs of words differing only in that aspect. For example, English distinguishes between "pat" and "bat". We may therefore conclude that English distinguishes voicing on the bilabial plosive (the p and b sounds). Mandarin does not -- all of its plosives are voiceless (rather, it distinguishes on aspiration, which English does not do). Another example is English "thin" and "thing" -- this pair shows that English distinguishes between the alveolar nasal /n/ and the velar nasal /ŋ/.
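The minimal-pair test described above can be sketched in a few lines of Python. This is only an illustration: the function name `minimal_pairs` and the tiny hand-transcribed lexicon are assumptions made for this example, and the transcriptions are rough (one symbol per phoneme).

```python
from itertools import combinations

def minimal_pairs(lexicon):
    """Return (word_a, word_b, phoneme_a, phoneme_b) for every pair of words
    whose phoneme sequences have equal length and differ in exactly one position."""
    pairs = []
    for (w1, p1), (w2, p2) in combinations(lexicon.items(), 2):
        if len(p1) != len(p2):
            continue  # this sketch ignores pairs that differ by an added/dropped phoneme
        diffs = [(a, b) for a, b in zip(p1, p2) if a != b]
        if len(diffs) == 1:
            pairs.append((w1, w2, diffs[0][0], diffs[0][1]))
    return pairs

# Hypothetical toy lexicon: rough phonemic transcriptions, one symbol per phoneme.
LEXICON = {
    "pat": ("p", "æ", "t"),
    "bat": ("b", "æ", "t"),
    "thin": ("θ", "ɪ", "n"),
    "thing": ("θ", "ɪ", "ŋ"),
}

for w1, w2, a, b in minimal_pairs(LEXICON):
    print(f"{w1} / {w2}: /{a}/ vs /{b}/ is a phonemic contrast")
```

Each pair it reports is evidence that the two differing sounds are separate phonemes in the language the lexicon comes from. Finding no pair for two sounds is only weak evidence that they are not distinguished, since the lexicon may simply be too small.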
Why logical languages?
Frankly, it was not the logic aspect of loglangs that attracted me to them, but the presence of parsers and exceptionless grammar. I'm interested in the "structure". I'm not a subscriber to the Sapir-Whorf hypothesis, but I did find that learning Lojban was much, much easier than learning a natural language.
Lojban, such as it is (it has many, many problems), was still much more regular than any natural language -- a common way to learn is to participate in conversations with a dictionary open in another tab, knowing only the grammar. The regular grammar means that even if you don't know what a word means, you know its syntactic role. You do not have to pay attention to little natural language quirks, such as the "come-go" difference in English (both meaning "to move/travel") or the "kureru-ageru" difference in Japanese (both meaning "to give").
There is also the syntactic ambiguity in a sentence such as "Do you want to drink tea or coffee?". The joke answer is "Yes", since it's ambiguous whether the question is a true-or-false question or a choice question. In Lojban (and other loglangs), there is no ambiguity, since the syntax of those two questions is different.
Parser: BPFK Lojban
TRUE OR FALSE.
.i xu do djica lonu do pinxe lo tcati .a lo ckafi
Is the following statement true?: You desire the event of you drinking tea OR coffee.
Possible answers:
go'i
The previous statement is true.
nago'i
The previous statement is false.
CHOICE
.i do djica lonu do pinxe lo tcati ji lo ckafi
You desire the event of you drinking tea ??? coffee. (where ??? asks for a logical connective)
Possible answers:
.e
Both.
.enai
The former, but not the latter. (tea only)
na.e
Not the former, but the latter. (coffee only)
na.enai
Neither.
.a
OR (one or the other, or both)
.o
XNOR (both, or neither)
.onai
XOR (one or the other)
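One way to see how the answers to the choice question relate to each other is to model each connective as a Boolean function of the two facts "you want tea" and "you want coffee". The sketch below does only that, following the English glosses given above. It is not a Lojban parser, and the names `CONNECTIVES` and `true_answers` are made up for this example.

```python
# Each possible answer to the "ji" question, modelled as a Boolean function
# of (tea, coffee), following the English glosses listed above.
CONNECTIVES = {
    ".e":      lambda tea, coffee: tea and coffee,          # both
    ".enai":   lambda tea, coffee: tea and not coffee,      # tea only
    "na.e":    lambda tea, coffee: (not tea) and coffee,    # coffee only
    "na.enai": lambda tea, coffee: not tea and not coffee,  # neither
    ".a":      lambda tea, coffee: tea or coffee,           # OR
    ".o":      lambda tea, coffee: tea == coffee,           # XNOR
    ".onai":   lambda tea, coffee: tea != coffee,           # XOR
}

def true_answers(tea, coffee):
    """Return the connectives that would be truthful answers in this situation."""
    return [name for name, fn in CONNECTIVES.items() if fn(tea, coffee)]

# Print the truthful answers for all four situations.
for tea in (True, False):
    for coffee in (True, False):
        answers = " ".join(true_answers(tea, coffee))
        print(f"tea={tea!s:5} coffee={coffee!s:5} -> {answers}")
```

Every situation has more than one truthful answer. The first four connectives each pin down exactly one situation, while .a, .o and .onai are weaker answers that leave some possibilities open.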
I don't think any loglang is going to replace natlangs anytime soon -- this is just a hobby. It's very pleasing to speak a logical language, and I often wish that English had support for some of the constructs in the logical languages I speak (or, at least, know about).
Sekko, my loglang
I will be publishing documentation on my work-in-progress logical language Sekko in this sequence. All of the documentation published so far should be treated as temporary. It is likely that I'll make sweeping changes to one or more parts of the grammar -- some of it hasn't been made yet. Future posts will likely invalidate past posts. I plan on restructuring it once I've written something on each topic. I'm planning on using mdBook and GitHub Pages, similar to Eberban.
I have tried to write and annotate the documentation posts so that you need little background to understand them. I plan to split this initial documentation into a reference grammar and a teaching course, kept separate (the latter may be split further based on the sort of background you already have). I have also annotated the documentation with analogies for readers who already know Lojban or Toaq. Please feel free to make suggestions on design.
I have a few thoughts about designing new languages.
Generally, backreferencing is often quite complicated. Words like "he", "her", "this" and "that" can often only be interpreted based on context. In German, grammatical gender often carries information that helps to make such backreferences clear. If I remember correctly, backreferencing was quite complicated in Lojban.
One of the bigger problems of Lojban is that it was designed by focusing on having words for a specific list of known concepts. Part of what a good language allows is making up new words to describe new concepts. All good science involves making up new terms to describe observed phenomena.
Ideally, you have a way to easily create new terms that are understood.
If you take English, for example, you have existing pairs like see - watch and hear - listen, but you can't easily get the same distinction for a word like think. When learning to meditate, that distinction is useful, as doing the think/see is okay but think/watch is to be avoided. I know a person who said that Esperanto's ability to easily coin a word for the new concept allowed him to have a conversation in which meditation started to make sense to him, when he previously didn't get the point.
By the same token, English has student and teacher, which is similar to child and parent, but has no easy way to say the equivalent of sibling for the first pair, although it exists for the second. There's also no equivalent of cousin. You could design a language with a structure that easily gives you ways to extend all sorts of other contexts in a similar way.
Similar to those relations I think that spatial concepts could be a lot better.
If a language is bad for what you want to talk about, you run into a lot of Motte and Bailey issues. It's a sign of a good language when you are able to be precise and clarify what you mean. The way the English language overloads "to feel" makes it really hard to speak well about a lot of distinctions. I don't have a good way to ask about feel!physical sensation, feel!emotion and feel!mood (and the see/watch distinction for each one...).
When it comes to preventing people from misunderstanding each other, it's helpful if a person who hears a single phoneme wrong doesn't hear another existing word that means something completely different. In informatics, there's the concept of error-correcting codes, which make messages resilient against errors. A language that's fully a priori, especially, can think carefully about how to use the available combinations of phonemes to assign words in a way that's resilient to a few errors.
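As a rough sketch of that idea, you could check a candidate lexicon for word pairs that are only one misheard phoneme apart, using Hamming distance as the measure. Everything here is an assumption for illustration: the function names, the one-character-per-phoneme spelling, and the made-up word forms, which are not words of any real language.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length words")
    return sum(x != y for x, y in zip(a, b))

def confusable_pairs(words, min_distance=2):
    """Return pairs of same-length words that are closer than min_distance,
    i.e. pairs where too few misheard phonemes turn one word into the other."""
    return [
        (a, b)
        for a, b in combinations(words, 2)
        if len(a) == len(b) and hamming(a, b) < min_distance
    ]

# Hypothetical lexicon of made-up forms, one character per phoneme.
WORDS = ["pata", "bata", "kilo", "sumi", "sumo"]
print(confusable_pairs(WORDS))
```

A lexicon for which this returns an empty list has no two same-length words separated by a single substitution. The sketch ignores dropped or inserted phonemes and treats all substitutions as equally likely, which real mishearings are not.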
I gave an example of my friend having an experience where Esperanto allowed him to have a conversation about meditation that he couldn't easily have had in English or German, which are the other languages he speaks.
Lojban put little effort into it as evidenced by having words for individual cardinal directions instead of going for a more systematic approach.
When it comes to family relations and also for things like lover/metamour, you would model them mathematical...