The C-style "oops, I used an object after freeing it and now anyone can execute arbitrary code" class of vulnerabilities is confined to a fairly narrow set of programming languages, most notably C and C++, which unfortunately happen to be popular and to have some interoperability advantages. One of the key requirements for a security-amenable language is that it never tempt its users into writing parts of their project in C, which happens if the language is too slow (e.g. Python) or can't otherwise interoperate with important systems (most languages, unfortunately). Some programming-language choices create performance and sysadmin problems that end users treat as dealbreakers; in fact, this was responsible for most of C's historical success.
A lot of issues crop up at the interfaces between programming languages, which tend not to fall neatly into either language's scope. SQL injection is a classic example. A language can't do a whole lot in practice to protect against that, but a library can: by designing the APIs so that doing things the safe way is easy, and doing things the dangerous way requires calling functions with "dangerous" in the name (like React's dangerouslySetInnerHTML).
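To make the "safe way is easy" point concrete, here's a minimal sketch using Python's sqlite3 module (the table and injection payload are invented for illustration): the safe path, a `?` placeholder, is also the shortest one to type, while the unsafe path requires gluing strings together by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 0)")

user_input = "alice' OR '1'='1"  # classic injection payload

# Unsafe: string interpolation lets the input rewrite the query itself.
unsafe = conn.execute(
    "SELECT name FROM users WHERE name = '%s'" % user_input
).fetchall()

# Safe: the placeholder treats the input strictly as data, never as SQL.
safe = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()

print(unsafe)  # [('alice',)] -- the payload's OR clause matched every row
print(safe)    # [] -- no user is literally named "alice' OR '1'='1"
```

A library taking this idea further would hide the string-interpolation path entirely, or expose it only through something named like `execute_raw_unsafe`.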
If I import third-party libraries from the internet, there is a lot of opportunity for mischief (both by those third parties, and by the people who might hack them). This is probably the nodejs ecosystem's weakest link at the moment. This problem is largely social; the best defense is a trustworthy group of curators providing a core set of libraries that you rarely need to step outside, plus an expectation that you depend on a small number of large, well-maintained libraries that avoid pulling in indirect dependencies. (As opposed to having a hundred different left-pad-style tiny libraries from a hundred different authors.)
Reducing security vulnerability incidence to zero is possible in some domains, but a programming language alone can't do it; I can write if(password=="backdoor") acceptLogin() in any language.
I think there's a fair amount of room left for incremental improvement, but in practice it looks less like "move everyone to Haskell and Coq" and more like "design a good core-library crypto API for X" and "reform common practices around npm".
it looks less like "move everyone to Haskell and Coq" and more like "design a good core-library crypto API for X" and "reform common practices around npm".
+1, Coq is my day job and I'm constantly telling people to be less messianic when they talk about formal verification.
by designing the APIs so that doing things the safe way is easy, and doing things the dangerous way requires calling functions with "dangerous" in the name
+1, want to extend it by paraphrasing the ancient folk wisdom that I think comes in flavors for both libraries and languages: "t...
It is already possible to build an embedded language inside of Coq which requires arbitrary security proofs for all executable code. It's theoretically possible to make those proofs guard against all known side channels, including, theoretically, hardware-based side channels such as Row Hammer.
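For a flavor of what "requiring proofs for all executable code" looks like, here is an illustrative sketch in Lean 4 rather than Coq (the predicate and function names are invented for the example): the API only accepts a query packaged with a proof of a safety predicate, so a call without such a proof simply doesn't type-check.

```lean
-- A toy "safety" predicate: the query string contains no single quotes.
def Sanitized (q : String) : Prop := q.contains '\'' = false

-- The only way to run a query is to supply the string together with a
-- proof of the predicate; callers lacking a proof are rejected at
-- compile time, not at run time.
def runQuery (q : String) (h : Sanitized q) : IO Unit :=
  IO.println s!"running: {q}"

-- For a concrete literal, the obligation is discharged by evaluation:
#eval runQuery "SELECT 1" (by decide)
```

A real system of the kind described above would replace the toy predicate with far stronger properties (memory safety, constant-time execution, and so on), but the mechanism is the same: proofs travel with the code.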
Are you asking about which kinds of attacks can't be stopped by improving software? Or are you asking about the theoretical limits of PL technology? The latter question is not so interesting without more constraints: as stated above, the theoretical limits are basically unbounded.
(If you think the question is too underspecified to answer, you probably shouldn't try to post an answer in the answers section. There is a comments section.)
(I'll try to work this into the question description)
Are you asking about which kinds of attacks can't be stopped by improving software?
That would be an interesting thing to see discussed, sure.
Or are you asking about the theoretical limits of PL technology?
No, that might be interesting from the perspective of.. what kinds of engineering robustness will exist at the limits of the singularity (this top...
Did not know about the answer/comment distinction! Thanks for pointing that out.
Before I dig deeper, I'd like to encourage you to come bring these questions to Formal Methods for the Informal Engineer ( https://fmie2021.github.io/ ), a workshop organized by myself and a few other researchers (including one or two more affiliated with LessWrong!) specifically dedicated to helping build collaborations to increase the adoption of formal methods technologies in industry. Since it's online this year, it might be a bit harder to have these deep open-ended conversations, but you sound exactly like the kind of person we want to attend. (To set expectations, I should add that registrations already exceed capacity; I'm not sure how we plan to allocate spots.)
I'd also like to share this list of formal methods in industry: https://github.com/ligurio/practical-fm . In the past decade, there's been a huge (in relative, not absolute terms) increase in the commercial use of formerly Ivory Tower tools.
You may also be interested in the readings from this course: https://6826.csail.mit.edu/2020/
BTW, I've been trying to think about whom I know that directly works in language-based security. ...
Given the most security-amenable language you know, or that you could imagine building (or given some tools and processes that change the qualities of an existing language), by what proportion could we reduce the incidence rate of major security errors? In what domains could incidence be reduced all the way to zero?
Are there situations where your more secure language can't currently be used, because it imposes higher development or runtime costs?