Pentashagon comments on Open Thread, September 15-30, 2012 - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (206)
I'm trying to find the existing research on the topic Paul Graham discusses in this article (regarding the relative merits of programming languages in footnote 3 and surrounding text) and which EY touches on here (regarding Turing tarpits).
Basically, within the realm of Turing-complete languages, there is significant difference in how easy it is to write a program that implements specific functionality. That is, if you want to write a program that takes a bunch of integers and returns the sum of their squares, then it's possible to do it by writing in machine code, assembly, brainfsck, BASIC, C, Java, Python, Lisp, but it's much easier (and more concise, intuitive, etc) in some than others.
What's more, Graham speculates there's a ranking of the languages in which programmers too comfortable in one of the languages "don't get" the usefulness of the features in languages above it. So BASIC-addicts might not appreciate what they can do with recursion or structured programming (i.e. by abolishing go-to statements); C-addicts might not appreciate what they can do with functions as first class objects, and Python-addicts might not appreciate what they can do with Lisp macros.
I'm interested in research that formalizes (or deconstructs) this intuition and attempts to come up with a procedure for comparing programming languages in this respect. The concept is similar to "expressive power", but that's not exactly the same thing, because they can all express the same programs, but some can span a broader array of programs using fewer symbols (due to how they combine and accumulate meaning faster).
Also, the theory of Komogorov complexity holds that "when compressing data, choice of language doesn't matter except to the extent that it adds a data-independent constant equal to the length of a program that converts between the two languages"; however, this still allows that some programs (before the converter) are necessarily longer, and I've heard of results that some Turing-complete formalisms require exponential blow-up in program size over e.g. C. (This is the case with Wolfram's tiny Turing-complete language in A New Kind of Science.)
tl;dr: I want to know what research is out there (or how to find it) regarding how to rigorously evaluate programming languages regarding relative ease and brevity of writing programs, and in what senses python is better than e.g. assembler.
Do you mean only programming languages or programming languages plus the commonly available standard libraries? Some programming languages are very simple and powerful (LISP, Haskell), and some provide a large standard library and other tools to make it easy and straightforward to get things done in specific problem spaces (MATLAB, VB).
The most powerful and concise language I can imagine would be in the language of set theory with a magic automated solver: Define the domain of a problem space and a predicate for solutions and the result is the set of solutions. A standard library built on such a language would consist mostly of useful definitions for converting real world problems into mathematics and back again.
I think most programming languages try to approximate this to some degree, but the programmer is necessary to fill in the algorithmic steps between the problem space and the solution space. The fewer explicit steps the programmer has to specify to write a function, the more concise and powerful the language. The fewer definitions and functions necessary to convert real world input into a mathematical model and back to real world output, the more concise and powerful the standard library is.
Graham addresses the point you're making about the difference between the language being powerful vs it having a large standard library. As he says in the link, by his metric, two languages are "at the same level" if they only differ by one of them lacking a library function that the other has. His test for proving that language A is "above" B is the question: "To do a task in B, do you have to write an interpreter for A (or language on its level)?" For example, adding recursion to Basic requires going beyond writing one more library function.
After thinking about this for (a long) while, I think the most powerful languages would then be the ones that are internally extensible and allow modification of their own syntax and semantics. For instance BASIC would become a far more powerful language than C or LISP if it had a function for altering the interpreter itself. Certain versions of BASIC effectively had this by allowing functions to be written in machine language and then executed. Theoretically, the machine language snippets could extend the interpreter to add structured programming features or first class functions. The same could potentially be done with C by just including the Tiny C Compiler in the source (as both the compiled functions and C strings containing the source) and then reflexively recompiling the compiler to add features.
What is most interesting to me is that making a language that can modify itself requires a complete definition of its execution environment in order to be safely extensible. The most powerful languages would have to fully and formally define their semantics as well as their syntax and make both accessible within the language so that extension of the syntax could could safely extend the semantics to match. In other words BASIC with assembly language is not enough if you don't know everything about the hardware and the interpreter you start with. From my CS student days I seem to recall reading that few programming languages have rigorously defined semantics ("This feature is implementation specific" and "This behavior is undefined" appears far too often in most specifications). The closest thing I can find is the Gödel machine, but that's defined directly in terms of a Turing machine, afaict.