(Disclaimer: I have no training in or detailed understanding of these subjects. I first heard of Tarski from the Litany of Tarski, and then I Googled him.)

In his paper "The Semantic Conception of Truth", Tarski analyzes the claim '"Snow is white" is true if and only if snow is white' as involving two different languages. The whole claim in single quotes is expressed in a metalanguage, while the quoted sentence "Snow is white" belongs to the object language being talked about.
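If I'm reading him right, the general pattern (Tarski's "Convention T") can be written as a schema, where φ ranges over sentences of the object language and the whole biconditional lives in the metalanguage. This is my paraphrase, so take it with a grain of salt:

```latex
% Convention T (schema): for each object-language sentence \varphi,
% the metalanguage asserts the biconditional
\mathrm{True}(\ulcorner \varphi \urcorner) \;\leftrightarrow\; \varphi
% Instance: True("Snow is white") <-> snow is white.
```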

For Tarski's proof to succeed, it is (if I understood him correctly) both necessary and sufficient that the metalanguage be "essentially richer" than the object language in certain logical respects. What those respects amount to is, according to Tarski, difficult to state in general terms without actually following his very involved technical construction.

If I remember correctly, this implies that the two languages cannot be identical. Tarski seems to hold that, for a given language satisfying specific conditions, concepts of truth, synonymy, meaning, etc. can be defined for it in a metalanguage richer than it in logical devices, which establishes a hierarchy of truth-defining languages.

My main question is this: since MIRI aims to mathematically prove Friendliness in recursively self-improving AI, is "essential richness" in language-handling ability something we should expect to increase in the class of AIs MIRI is interested in, or is it unnecessary for MIRI's purposes? I understand that semantically defining truth and meaning may not be important either way. My principal motive is curiosity.


I think we will see 'essential richness' in language-handling ability increase.

But I don't think that an infinite lattice of languages of increasing complexity (with or without some decreasing probability measure) is needed. I think one more layer is needed, albeit a non-sentential one. That's because humans obviously can derive/invent (meta)languages without working within a super-metalanguage (except maybe in some metaphorical sense, but surely not because they have some supernatural quality).

Whatever structure the human brain uses to create languages with meaning, it can at least in principle be applied to model this problem. The brain derives/converges on precise structures from less exact and less expressive source structures (input). It does so by inexact, probabilistic means. Such procedures are also used by machine learning algorithms. They are inexact in a symbolic sense, but they could nonetheless be employed to reason about the meaning-creating process.

The Definability of Truth paper says that Kleene's logic makes it difficult to judge which statements are undefined, because that judgment itself comes out as undefined. Does this mean the probabilistic approach adopted by MIRI can separate cases where the truth of a statement is less than certain because of purely verbal paradoxes from statements whose truth is probabilistic for other reasons? In particular, I'm interested to know whether it can discriminate between those and scientifically interesting paradoxes, but it's probably too soon to be asking questions like that.
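(For my own benefit, I tried writing down Kleene's strong three-valued connectives to see why the liar comes out undefined. This is just a toy sketch in Python, not anything from the paper, so corrections welcome.)

```python
# Toy sketch of Kleene's strong three-valued logic. Values: True, False,
# and None standing in for "undefined". Not from the MIRI paper; just an
# illustration of why the liar sentence gets no classical truth value.

def k_not(a):
    return None if a is None else (not a)

def k_and(a, b):
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def k_or(a, b):
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

print(k_or(None, True))   # True: undefinedness doesn't always propagate
print(k_and(None, True))  # None: but often it does

# The liar sentence L asserts its own falsehood: L <-> not L.
# A consistent assignment must satisfy v == k_not(v).
fixed_points = [v for v in (True, False, None) if k_not(v) == v]
print(fixed_points)  # [None]: the only consistent value is "undefined"

# The catch the paper points out, as I understand it: a sentence saying
# "L is undefined" is itself left undefined by the fixed-point construction,
# so the logic cannot affirm even that.
```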

It is possible to construct probabilistic logics that normatively characterize the behavior of ideal goal-oriented agents, but the actual human brain probably strings together all sorts of partial, ad hoc, redundant and/or multiply-realized implementations of abstract languages in a variety of ways. It is difficult to prove that an intelligence with an architecture like that will never do certain things in the future. In fact, it is probably a better idea to model a given brain physically than to describe the abstract mathematical reasoning its workings follow, because the relevant wiring actually changes over time, and the same computation can be performed in different ways.

It occurs to me that humans might learn languages with all sorts of "essential richness" by generalizing from the rules needed to achieve certain tasks. We may be born with the potential to learn some of these languages this way, but can an AI running a pure probabilistic logic learn to generalize to other abstract languages in the same way? It may not need to, mind you.

I believe that they're working on a probabilistic definition of truth, to bypass certain problems like these. Check their research papers out at intelligence.org.
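If I remember their "Definability of Truth in Probabilistic Logic" draft right, the rough idea is to assign sentences probabilities instead of bare truth values and to demand a reflection schema along these lines (my paraphrase from memory, so treat it as approximate):

```latex
% Reflection schema (paraphrase): for any sentence \varphi and rationals a, b,
a < \mathbb{P}(\varphi) < b \;\Longrightarrow\; \mathbb{P}\bigl(a < \mathbb{P}(\ulcorner \varphi \urcorner) < b\bigr) = 1
```

That way, as I understand it, a liar-style sentence such as "my probability is less than 1/2" can settle at an intermediate probability instead of forcing a contradiction.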

Also, I feel that a question of this sort might be better suited to the open thread. Admittedly, some people have been thinking of making the discussion thread for, y'know, discussions, so if a substantive discussion does come out of this, I'm probably wrong.

Wow, they have a paper addressing this very subject. Unfortunately, I lost the thread of the argument halfway through. (But I'm not giving up. I'm going to watch Paul Christiano's talk on Probabilistic Metamathematics on YouTube. Any other aids to understanding will be greatly appreciated. I found quite a few myself, actually.) I did not know about the Open Thread either. Sorry about that. Will lurk more. I have no objection to this thread being used for general discussion.

Lurking for a while before posting on a website is a good idea :)

It may help to note that there's been a lot of work on what we mean by sentences being true since Tarski. The most notable in this context would be Kripke (pdf).

The reason different "languages" are referred to is to try to prevent paradoxes like "this sentence is false," or "all sentences that do not refer to themselves" (does this sentence refer to itself? If not, then it does!). The mathematical analogue would be a type theory.
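To make the hierarchy idea concrete, here's a toy illustration (mine, not Tarski's actual construction): each level's truth predicate may only be applied to sentences of strictly lower levels, so the liar isn't true or false, it's simply ill-formed.

```python
# Toy model of a Tarski-style hierarchy of languages (an illustration,
# not Tarski's formal construction). Each sentence lives at some level;
# the truth predicate True_n may only be applied to sentences of level < n.

class Sentence:
    def __init__(self, level, evaluate):
        self.level = level        # which language in the hierarchy it lives in
        self.evaluate = evaluate  # thunk returning True or False

def truth_predicate(n, sentence):
    """True_n: defined only for sentences of level strictly below n."""
    if sentence.level >= n:
        raise ValueError("ill-formed: True_%d cannot apply to a level-%d sentence"
                         % (n, sentence.level))
    return sentence.evaluate()

snow = Sentence(level=0, evaluate=lambda: True)  # "snow is white", object language
print(truth_predicate(1, snow))                  # the metalanguage can call it true

# The liar needs True_0 applied to a level-0 sentence (itself), which the
# level restriction forbids, so it never gets evaluated at all.
liar = Sentence(level=0, evaluate=lambda: not truth_predicate(0, liar))
try:
    truth_predicate(1, liar)
except ValueError as error:
    print(error)
```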

And yet, this attempt to prevent paradoxes didn't really work - arithmetic is only allowed to talk about numbers, not about itself, but Gödel's theorem is all about using numbers to talk about the proof system that's trying to prove things about numbers. If the Gödel sentence is true but unprovable (or false but nonstandard, or whatever), why not just let things talk about themselves, and call "this sentence is false" false but nonstandard (or whatever)? We've kind of lost our motivation for having this hierarchy of languages in the first place.
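A cheap way to see how the self-reference sneaks back in: the same trick that lets a program print its own source, with no built-in "self" operator, is essentially what the diagonal lemma does with numbers. A toy Python quine for illustration (the two code lines reproduce themselves exactly; the comments aren't part of the trick):

```python
# A quine: a program whose output is its own source. There is no "self"
# primitive; the self-reference is manufactured by quoting, much as Gödel
# sentences manufacture it by arithmetizing syntax (the diagonal lemma).
src = 'src = %r\nprint(src %% src)'
print(src % src)
```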