You might be interested in Fred Brooks' seminal essay, No Silver Bullet -- Essence and Accident in Software Engineering. In it, he distinguishes between essential complexity and accidental complexity. Essential complexity is complexity that comes from the problem domain. It cannot be factored out of the program, and any attempt to do so will likely introduce bugs. Accidental complexity is complexity that arises from details of the implementation, and which can be simplified out of the implementation.
A good example of accidental complexity is memory management. A good chunk of programmer effort in languages like C and C++ goes towards ensuring that memory is managed properly, and that the program returns memory to the operating system when it is finished using it. Memory-managed languages take that burden away from the programmer and place it either with the compiler (in the case of Rust's borrow checker) or with the runtime environment (in the case of garbage-collected languages like Java or Python). The effect of this has been a significant reduction in accidental complexity (at the cost of some performance), with a commensurate increase in programmer productivity.
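As a rough illustration of the garbage-collected case, here is a minimal Python sketch. The names Buffer and buffer_is_reclaimed are hypothetical, invented for this example; the point is only that the program never calls anything like free(), and the runtime reclaims the object once nothing refers to it.

```python
import gc
import weakref


class Buffer:
    """Hypothetical stand-in for a resource a C programmer would free by hand."""

    def __init__(self, size):
        self.data = bytearray(size)


def buffer_is_reclaimed():
    buf = Buffer(1024)
    probe = weakref.ref(buf)  # a weak reference does not keep buf alive
    del buf                   # drop the only strong reference; no explicit free
    gc.collect()              # ask for a collection pass, so this does not rely on reference counting alone
    return probe() is None    # True once the runtime has reclaimed the object
```

In C the equivalent code would need a matching free() for every malloc(), and forgetting one (or calling it twice) is exactly the kind of bug that has nothing to do with the problem domain.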
There are several sources of spaghetti code that are possible:
If I had to guess, number 2 is the largest source of spaghetti code that Less Wrong readers are likely to encounter. Number 4 may account for the largest volume of spaghetti code worldwide, because of the incredible amounts of line-of-business code churned out by major outsourcing companies. But even that is a reflection of economic realities. Therefore, one could say that spaghetti code is primarily an economic problem.
I agree with you. I have seen several times how underbudgeted software projects sacrifice overall quality for the reasons you point out, and how this is later paid for in the maintenance phase. I also think that extreme domain complexity is not the most common cause of the problems.
Another source of maintenance difficulties is laziness in writing the software documentation. Hard-to-read code can be good code, yet very difficult for another person to understand when adequate explanations are unavailable.
I realized that the documentation I initially wrote correctly explained the problem and its solution, and that the comments in the source code were useful and sufficient.
Feels to me like you may have fallen into the same trap twice, of deeming documentation "sufficient" when it explains things to the satisfaction of the version of yourself that already understands the problem and the solution.
But by the time you achieve understanding, naturally the necessary insights (those required to cross the gap from a starting point of not understanding) feel obvious, unnecessary to mention, positively insulting to the future reader's intelligence by their simplicity... and yet the you of a few hours earlier would have probably thanked yourself for spelling them out explicitly.
Without knowing all the specifics, it is of course impossible for me to say for sure if this actually applies to your case. But as a rule it seems like something to check for whenever you look back at documentation that has mysteriously come to seem more complete and sufficient without any actual edits.
“...But by the time you achieve understanding, naturally the necessary insights (those required to cross the gap from a starting point of not understanding) feel obvious, unnecessary to mention, positively insulting to the future reader's intelligence by their simplicity...”
That is not an accurate reading. When I achieved understanding, the insights felt far from obvious, to the point that I found it necessary to include a warning.
Bear in mind that this was only an illustrative example of the psychological effect of expecting immediate understanding and instead finding something very hard to understand.
Thanks for your point of view.
Why hard-to-read code can be good code. A complex criticism of the “keep it simple” universal coding advice
It is easy to find articles and decalogues in which experienced software developers set out what they consider good rules for coding. These recommendations are usually stated with strong conviction, both because they are reinforced by the similar opinions of most other authors and because they appear to follow from plausible goals such as readability, maintainability or simplicity. Who could question those goals? Probably no one, since they are desirable goals, but…are they objectively measurable?
After years of trying to think like a machine, which has surely affected my perception of complexity, I am writing this response to some of those recommendations, not out of an opposing belief but because it is necessary to reflect on how those recommendations are supposed to lead to those goals.
I would like to begin with an anecdote
Some years ago I was working on an R&D project, trying to solve a really intricate problem. I had been stuck for more than five months when I finally saw the light. One week later I had the corresponding algorithm implemented, and it worked. I felt like a victorious warrior after the hardest battle.
Nearly two years later I noticed that the algorithm did not handle a particular case correctly, so I had to review it, nothing special. I read the algorithm’s documentation, I read the source code and, to my surprise, I was unable to understand it! How was that possible? I had figured out the solution, implemented the algorithm, commented the source code and written the extended documentation, so why wasn’t I able to understand what my own program was doing?
I read everything again several times, and it took me two hours of thinking about the problem before I understood the algorithm I had implemented to solve it. The experience was so unpleasant that I promised myself it would never happen again. I would have to be more careful and exhaustive when documenting a process as complex as that one and when commenting its implementation. Then I prepared to rewrite the extended documentation and…amazingly, I realized that the documentation I initially wrote correctly explained the problem and its solution, and that the comments in the source code were useful and sufficient.
What had actually happened was very simple. After so much time working on other problems, I had almost completely forgotten the details of this one and how I had worked around it. The problem was too devious and the solution too complex to expect immediate comprehension. But although the initial documentation was adequate, I concluded it was necessary to add the following warning for the next time that code needs to be maintained: “WARNING: You will have to reflect on the problem and the proposed solution for more than an hour before working on it, in order to rebuild the mental schemes necessary to understand it.”
“Alice in Wonderland” versus “The theory of general relativity”
Between the “Hello world” application and the most sophisticated scientific or AI tools, there are many types of software projects, each with its own degree of complexity. All of them share a pretty obvious characteristic: they are all pieces of software written in one of the available programming languages. This common feature usually leads to the belief that any piece of software can be maintained by any IT professional with enough experience in the corresponding programming language. While this is true for some maintenance tasks, the maintenance job in its full extent requires something more.
For someone other than the author to carry out the complete and continuous maintenance of a piece of software, it is not enough to be an expert in the technology used to develop it. That person also needs a deep understanding of the underlying problem the software solves, which in turn reveals the real meaning of the code to be maintained. When the problem is complex and new to the person, immersion in the source code will demand effort, patience and, probably, additional experience in a specific field of knowledge.
You can read “Alice in Wonderland” at a rate of one page per minute, but you cannot do that when trying to understand “The theory of general relativity”. And you cannot ask Einstein to keep it simple, simply because it is not simple. It may take several months to feel comfortable with the existing source code when you join a new project, even when it is exhaustively documented. And if you expect to find “Alice in Wonderland” but are faced with “The theory of general relativity”, you will blame the original programmer because the code is unreadable, obfuscated and too hard to maintain, arguing that he didn’t follow the universal programming rules that everybody knows produce “good code”.
The spaghetti mind
On the other hand, it is also true that there are disastrous programmers who write unnecessarily devious, poorly documented code and who, when criticized, may argue that they are misunderstood precisely because they are like Einstein. If you are programming the “Hello world” application and nobody is able to understand it, then you should review your coding methodology. But I’m not talking about that case; I’m talking about the case where the deviousness of the code arises from the real and unavoidable complexity of the underlying problem. Is there, then, a universal methodology to “keep it simple”?
Spaghetti code, that dreadful programming concept in which too many things are chaotically interconnected, could be more natural than we usually think. Curiously, software is one of the best artificial representations of the processes carried out within our mind. Indeed, our brain’s storage system is spaghetti memory (the technical term is hypergraph). Unfortunately, most of these mental processes are intrinsically more complex than we would like, and little can be done to simplify them.
There is nothing wrong with implementing six or more nested loops when working with multidimensional objects if that is the natural way to do it. It is OK to modify the arguments of a function if you know what you are doing and why it is convenient. There is nothing bad in writing a function or a class method with 500 lines of code if the process demands it, and splitting that code into smaller functions may be of little help. Do not encapsulate a code fragment within a function if you don’t foresee using that function in several places or in a recursive algorithm. Forcing the creation of a function creates a new object that complicates the code. It breaks the natural flow of the code and hinders the reuse of variables that have already been instantiated. And bear in mind that a function call is itself a process that consumes computing resources, so avoid calling functions within a loop if you can.
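How much a function call inside a loop really costs depends on the language and the runtime, so it is better measured than assumed. The following is a minimal Python sketch of such a measurement; scale, total_with_calls, total_inlined and compare are hypothetical names made up for this example, and the timings it returns will vary from machine to machine.

```python
import timeit


def scale(value, factor):
    """Tiny helper whose only job is to be called inside a loop."""
    return value * factor


def total_with_calls(values, factor):
    total = 0
    for v in values:
        total += scale(v, factor)  # one function call per iteration
    return total


def total_inlined(values, factor):
    total = 0
    for v in values:
        total += v * factor  # same arithmetic, written inline
    return total


def compare(n=10_000, repeats=20):
    """Return (seconds with calls, seconds inlined) for the same workload."""
    values = list(range(n))
    t_calls = timeit.timeit(lambda: total_with_calls(values, 3), number=repeats)
    t_inline = timeit.timeit(lambda: total_inlined(values, 3), number=repeats)
    return t_calls, t_inline
```

Both versions compute the same result; calling compare() on your own machine shows whether the difference is large enough to matter for the loop in question. In compiled languages the compiler may inline such a small function on its own, in which case the difference can disappear.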
If you want to increase readability, it’s better to separate the code fragments corresponding to specific subprocesses with blank lines and give them a title with an uppercase comment.
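A small sketch of what that layout might look like, written in Python purely for illustration. The summarize function and its section titles are hypothetical; a real case would be a much longer routine, but the idea is the same: one function, several titled blocks, and later blocks reusing variables computed earlier.

```python
def summarize(readings):
    # VALIDATE INPUT
    if not readings:
        raise ValueError("readings must not be empty")

    # COMPUTE BASIC STATISTICS
    count = len(readings)
    mean = sum(readings) / count

    # MEASURE SPREAD AROUND THE MEAN (reuses count and mean from above)
    variance = sum((r - mean) ** 2 for r in readings) / count

    # BUILD RESULT
    return {"count": count, "mean": mean, "variance": variance}
```

For example, summarize([1.0, 2.0, 3.0]) gives a count of 3, a mean of 2.0 and a variance of about 0.667, without count or mean having to be passed between separate functions.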
Conclusion
“Sometimes we have to find the beauty in the complexity of an efficient solution to a devious problem”
Sometimes an algorithm is extremely complex because the underlying problem to solve is extremely complex, and nothing can be done to make the code easily understandable. Some methodologies and recommendations impose a limit on the size of functions, methods or classes, but you should question how this can really make the code simpler. Make your code compact, optimize it by removing unnecessary instructions and document it in detail.
Our reaction to complexity depends largely on psychology. We are naturally conditioned to perceive harmony in simplicity. But sometimes we have to find the beauty in the complexity of an efficient solution to a devious problem and in the exhaustive handling of its many special cases.
Dealing with devilish deviousness is dirty work but, from time to time, someone has to do it. Keep it simple? Yes of course…when you can.