I would normally publish this on my blog, but I thought that LessWrong readers might be interested in this topic. This sequence is about my experience creating a logical language, Sekko (still a work in progress).
What's a logical language?
Languages can be divided into two categories: natural languages, which arise... naturally, and constructed languages, which are designed. Constructed languages in turn fall into two categories: artlangs, which are designed as works of art or as supplements to a fictional world (think Quenya or Klingon), and engelangs (engineered languages), which are designed to fulfill a specific purpose.
Loglangs (logical languages) are a subset of engelangs. Not all engelangs are loglangs -- Toki Pona is a non-loglang engelang.
Despite the name, what constitutes a "loglang" is rather undefined. It's more of a "vibe" than anything. Still, I'll try to lay out various properties that loglangs have:
- Adherence to a logic system. Most logical languages have second-order logic. Toaq and Lojban have plural logic; Eberban has singular logic.
- Self-segregating morphology (SSM). SSM refers to a morphology (scheme for making word forms) such that word boundaries can be unambiguously determined. Several schemes exist: Lojban uses stress and consonant clusters, Toaq uses tone contours, Eberban and Sekko use arbitrary phoneme classification. Sekko's SSM will be discussed in later posts.
- Elimination of non-coding elements. Many natural languages have features which add complication but do not encode information. Examples include German grammatical gender, Latin's five declensions, and English number and person agreement (e.g. "I am" vs. "You are" vs. "He is"; "It is" vs. "They are"). Loglangs aim to remove non-coding grammatical elements like these.
- Exceptionless grammar and syntactic unambiguity. Many natural languages have irregular forms. Examples include
- Machine parsability. Basically all loglangs are able to be parsed to check for syntactic correctness and correct scope of grammatical structures. Most loglangs (Loglan, Lojban, Toaq, and Eberban) have Parsing Expression Grammar (PEG) parsers.
- Ergonomics and extensibility. Loglangs try to have as ergonomic a grammar as possible -- one that gives access to the most semantic space for the least grammatical complexity. For example, basically all loglangs have only two parts of speech (aside from particles): predicates and arguments. English adjectives can simply be treated as copular verbs (e.g. "X is beautiful").
- Audiovisual isomorphism (AVI). Loglangs will usually have a script or writing system with audiovisual isomorphism (usually there is first one based on Latin script, and later a completely novel script is devised). This means that speech and writing should correspond to each other phonetically: the text must encode speech precisely (i.e. words are spelled as they are said). AVI only requires that the phonemic (significant or distinguished) speech features are encoded. For example, if your language does not distinguish between short and long vowels, it is unnecessary to have an encoding for vowel length.
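To make the self-segregating morphology bullet above more concrete, here is a minimal Python sketch of a phoneme-classification scheme. The scheme is invented purely for illustration and is not Sekko's, Eberban's, or any other loglang's actual morphology: it assumes that every word starts with a consonant from a hypothetical "word-initial" class and that those consonants never occur anywhere else in a word.

```python
# Hypothetical toy scheme (NOT any real loglang's morphology):
# the consonants below may only appear at the start of a word.
INITIAL = set("ptkbdg")


def segment(stream):
    """Split an unbroken phoneme stream into words.

    A new word begins at every word-initial consonant; every other
    phoneme is appended to the current word.
    """
    words = []
    for ch in stream:
        if ch in INITIAL:
            words.append(ch)
        elif words:
            words[-1] += ch
        else:
            raise ValueError("stream must start with a word-initial consonant")
    return words


print(segment("tamakelopin"))  # ['tama', 'kelo', 'pin']
```

Because the word boundary is carried by the phonemes themselves, the listener needs no pauses or dictionary lookups to find it. The cost is that the classification constrains which word shapes are possible.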
Phonemic differences are those for which a language has pairs of words differing only in that aspect. For example, English distinguishes between "pat" and "bat". We may therefore conclude that English distinguishes voicing on the bilabial plosive (p and b sound). Mandarin does not -- all of its plosives are voiceless (rather, it distinguishes on aspiration, which English does not do). Another example is English "thin" and "thing" -- this represents that English distinguishes between the alveolar nasal /n/ and the velar nasal /ŋ/.
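The minimal-pair test described above can be sketched in a few lines of Python. This is only an illustration: the function name `minimal_pairs` is made up, and the transcriptions are rough broad transcriptions that I am assuming for the example, with one symbol per phoneme.

```python
def minimal_pairs(words):
    """Return (word_a, word_b, (phoneme_a, phoneme_b)) for every pair of
    equal-length words that differ in exactly one phoneme position."""
    pairs = []
    for i, a in enumerate(words):
        for b in words[i + 1:]:
            if len(a) != len(b):
                continue
            diffs = [(x, y) for x, y in zip(a, b) if x != y]
            if len(diffs) == 1:
                pairs.append((a, b, diffs[0]))
    return pairs


# Assumed broad transcriptions, one tuple element per phoneme.
pat = ("p", "æ", "t")
bat = ("b", "æ", "t")
thin = ("θ", "ɪ", "n")
thing = ("θ", "ɪ", "ŋ")

for a, b, (x, y) in minimal_pairs([pat, bat, thin, thing]):
    print("".join(a), "".join(b), "contrast:", x, "vs", y)
```

Each pair it finds is evidence that the two differing sounds are separate phonemes in the language, and so are features that an AVI script has to encode.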
Why logical languages?
Frankly, it was not the logic aspect of loglangs that attracted me to them, but the presence of parsers and exceptionless grammar. I'm interested in the "structure". I'm not a subscriber to the Sapir-Whorf hypothesis, but I did find that learning Lojban was much, much easier than learning a natural language.
Lojban, such as it is (it has many, many problems), was still much more regular than any natural language -- a common way to learn is to participate in conversations with a dictionary open in another tab, knowing only the grammar. The regular grammar means that even if you don't know what a word means, you know its syntactic role. You do not have to pay attention to little natural language quirks, such as the "come-go" difference in English (both meaning "to move/travel") or the "kureru-ageru" difference in Japanese (both meaning "to give").
There is also the syntactic ambiguity in a sentence such as "Do you want to drink tea or coffee?". The joke answer is "Yes", since it's ambiguous whether the question is a true-or-false question or a choice question. In Lojban (and other loglangs), there is no ambiguity, since the syntax of those two questions is different.
Parser: BPFK Lojban
TRUE OR FALSE.
.i xu do djica lonu do pinxe lo tcati .a lo ckafi
Is the following statement true?: You desire the event of you drinking tea OR coffee.
Possible answers:
go'i
The previous statement is true.
nago'i
The previous statement is false.
CHOICE
.i do djica lonu do pinxe lo tcati ji lo ckafi
You desire the event of you drinking tea ??? coffee. (where ??? asks for a logical connective)
Possible answers:
.e
Both.
.enai
The former, but not the latter. (tea only)
na.e
Not the former, but the latter. (coffee only)
na.enai
Neither.
.a
OR (one or the other, or both)
.o
XNOR (both, or neither)
.onai
XOR (one or the other)
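The possible answers to the choice question are all just two-place truth functions, which can be written out as a small Python table. This is a sketch that follows the glosses given above, with `a` standing for "tea" and `b` for "coffee"; the names `CONNECTIVES` and `truth_table` are made up for the example.

```python
# Each answer maps (tea, coffee) to whether that combination is what you want.
CONNECTIVES = {
    ".e": lambda a, b: a and b,                   # both
    ".enai": lambda a, b: a and not b,            # tea only
    "na.e": lambda a, b: (not a) and b,           # coffee only
    "na.enai": lambda a, b: (not a) and (not b),  # neither
    ".a": lambda a, b: a or b,                    # OR
    ".o": lambda a, b: a == b,                    # XNOR
    ".onai": lambda a, b: a != b,                 # XOR
}


def truth_table(connective):
    """Truth values for (T,T), (T,F), (F,T), (F,F), in that order."""
    f = CONNECTIVES[connective]
    return tuple(f(a, b) for a in (True, False) for b in (True, False))


for name in CONNECTIVES:
    print(f"{name:8}", truth_table(name))
```

Answering with a connective picks out one row of this table directly, which is why the choice question cannot be mistaken for the true-or-false one.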
I don't think any loglang is going to replace natlangs anytime soon -- this is just a hobby. It's very pleasing to speak a logical language, and I often wish that English had support for some of the constructs in the logical languages I speak (or, at least, know about).
Sekko, my loglang
I will be publishing documentation on my work-in-progress logical language Sekko in this sequence. Everything published so far should be treated as provisional. I will likely make sweeping changes to one or more parts of the grammar, and some of it hasn't been designed yet, so future posts will probably invalidate past posts. I plan on restructuring the documentation once I've written something on each topic. I'm planning on using mdbook and GitHub Pages, similar to Eberban.
I have tried to write and annotate the documentation posts so that you need little background to understand them. I plan to split this initial documentation into a reference grammar and a separate teaching course (the latter may be split further depending on the background you already have). I have also annotated the documentation with analogies for readers who already know Lojban or Toaq. Please feel free to make suggestions on design.
Anaphora is super complicated, and I've thought long and hard about how to express it. Each loglang has its own way of dealing with anaphors. Yes, you are correct that Lojban anaphora is poorly designed. There's the ko'V series, the vo'V series, goi, the letteral series... it's really bad.
Most people use a variant of the ko'V series. How it works is that you bind a variable to ko'a (or another member of the series), and then when you repeat "ko'a", it recalls the bound variable. The big issue with this is that it requires forethought. It's fine when you're writing, but when you're speaking, you don't necessarily know whether you'll need to refer back to something you said before. You could simply repeat the words, and context plus good faith/Grice's Maxims usually means listeners can safely assume you meant to refer to the same thing, but you haven't stated it explicitly. Very unloglangy.
Toaq anaphora is also not good. In the new Toaq anaphor system, all arguments are classified into several classes: animate entity (really, Toaq? An animacy distinction?), inanimate entity, abstract entity, adjectives, clauses, LU-clauses, genitives, personal pronouns and demonstratives. Each class has its own pronoun, and each pronoun refers to the closest argument of its class. The issue is that if you want to talk about several things which belong to the same class, this type of anaphor becomes unwieldy. The plus side is that it requires no forethought.
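The trade-off between the two approaches can be sketched in Python. This is a deliberately simplified model, not an implementation of either language's actual rules: the class `ExplicitBindings`, the function `resolve_by_class`, and the class labels are all hypothetical names chosen for the example.

```python
class ExplicitBindings:
    """ko'V-style anaphora: a referent is only reachable if the speaker
    bound it to a slot ahead of time."""

    def __init__(self):
        self.slots = {}

    def bind(self, slot, referent):
        self.slots[slot] = referent

    def recall(self, slot):
        return self.slots[slot]  # raises KeyError if nothing was bound


def resolve_by_class(pronoun_class, history):
    """Class-based anaphora in the Toaq style: return the most recently
    mentioned referent whose class matches the pronoun's class."""
    for referent, referent_class in reversed(history):
        if referent_class == pronoun_class:
            return referent
    return None


# Arguments in the order they were mentioned (labels are assumptions).
history = [("Alice", "animate"), ("the book", "inanimate"), ("Bob", "animate")]

print(resolve_by_class("animate", history))    # Bob
print(resolve_by_class("inanimate", history))  # the book

bindings = ExplicitBindings()
bindings.bind("ko'a", "Alice")  # only possible with forethought
print(bindings.recall("ko'a"))  # Alice
```

In the class-based model, "Alice" can no longer be reached with the animate pronoun once "Bob" has been mentioned, which is the unwieldiness described above. In the explicit model she can be reached, but only because she was bound in advance.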
I plan on having a variation of Toaq anaphors, which I'll discuss in a later chapter.
Creating new words is something that all loglangs encourage. It's more of an infrastructure issue -- Lojban and Toaq both have community dictionaries that anyone can add to (Jbovlaste and Toadua respectively). People can then define new words to talk about what they want to talk about, as they wish. It also saves a lot of effort on the part of the language maker(s).
I distinguish between vagueness and ambiguity. Vagueness is when a word encloses a large volume in semantic space. This is totally fine, and most root words ought to be on the vague side. Ambiguity is when a word encloses disconnected volumes in semantic space. This is unacceptable and should be removed. Consider the vagueness of the word "animal" and the ambiguity of the word "set".
Sorry, I don't understand what you mean.
Yes, this is something that upset me with Lojban and Eberban and pleased me with Toaq. Lojban usually tries to make particle families have similar forms. This is bad because single-phoneme errors can cause misunderstanding, since particles in the same family usually occupy the same syntactic positions as each other. It's best to have particles in the same family be phonetically far apart, even if it makes them harder to learn. Phonetically close words should be semantically far apart, so that even if point errors occur, context can be sufficient to correct them.
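One rough way to check a particle family for this problem is to list every pair of members that a single substitution could turn into each other. The sketch below makes two simplifying assumptions: each written character counts as one phoneme, and only substitutions (not insertions or deletions) are considered. The function names and the second, "spread-out" family are invented for illustration.

```python
from itertools import combinations


def differs_by_one(a, b):
    """True if a and b have equal length and differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def confusable_pairs(family):
    """All pairs in a particle family that a single substitution could confuse."""
    return [(a, b) for a, b in combinations(family, 2) if differs_by_one(a, b)]


# Lojban's ko'V series: every member is one vowel away from every other.
print(confusable_pairs(["ko'a", "ko'e", "ko'i"]))

# A made-up family with phonetically distant members yields no such pairs.
print(confusable_pairs(["ta", "kei", "mu"]))
```

A fuller check would weight substitutions by how acoustically similar the two phonemes are, but even this crude count shows how densely a family like the ko'V series is packed.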