The Art of UNIX Programming
I've just finished reading The Art of UNIX Programming, by Eric S. Raymond. It's a fascinating book, and if you haven't read it, I highly recommend it. While I was reading it, someone asked me what it was about. The analogy I came up with is that this book is to programming (on UNIX and its descendants at least) as Literature is to Language. If you've never read Tolstoy, you cannot claim to be fluent in Russian. You might have the vocabulary, but fluency requires a cultural understanding. The Art of UNIX Programming gives that cultural understanding, not just of UNIX and Linux, but of programming in general. There are lessons that can be learned across all Languages (the programming sort this time) and all Operating Systems, from the lowest-level embedded system to the highest-level application software.
The most striking point for me was the concept of optimisation. While we all know the famous quote, 'Premature optimisation is the root of all evil', what I hadn't realised was that it is usually quoted out of context. The full quote is: 'Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimisation is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.' [1]
This full quote, along with ESR's analysis of it, finally drove the point home to me: yes, I should strive to produce the best code possible, but best and fastest are not always synonymous. I used to naively think that the fastest solution to a particular task was the best, and if that created an unmaintainable mess, then so be it. The overriding point that I took away from The Art of UNIX Programming was that the best solution is the cleanest and easiest to maintain. Shaving half a millisecond off the runtime is almost inconsequential (unless you have a very quick program to start with). Adding a bug because either you or your successor didn't quite understand the complex logic is definitely not inconsequential.
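One practical way to find "that critical 3%" is to measure before touching anything. The following is a minimal sketch in Python, not a method taken from the book: the three pipeline stages (load_numbers, sum_of_squares, format_report) and the helper time_call are hypothetical names invented for illustration. It simply times each stage with the standard library's time.perf_counter so you can see where the time actually goes.

```python
import time


def time_call(func, *args):
    """Run func(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Three hypothetical stages of a small program.
def load_numbers(n):
    return list(range(n))


def sum_of_squares(numbers):
    total = 0
    for x in numbers:
        total += x * x
    return total


def format_report(total):
    return f"total={total}"


def profile_pipeline(n):
    """Run every stage once and record how long each one took."""
    timings = {}
    numbers, timings["load_numbers"] = time_call(load_numbers, n)
    total, timings["sum_of_squares"] = time_call(sum_of_squares, numbers)
    report, timings["format_report"] = time_call(format_report, total)
    return report, timings


if __name__ == "__main__":
    report, timings = profile_pipeline(200_000)
    print(report)
    # List the stages from slowest to fastest.
    for name, seconds in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:15s} {seconds * 1000:8.3f} ms")
```

Only the stage at the top of that list is a candidate for optimisation; the rest can stay as clean and readable as possible. For anything larger than a toy, the standard library's cProfile module gives a per-function breakdown without hand-written timing code.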
By the same token, making a program run in 1 second instead of 1.5 by rewriting the entire thing in C or even Assembler is probably not worth it if you could write it in Python in a tenth of the time. The bug count in software is directly proportional to the line count, and independent of the language.[2] Given this, writing in the highest-level language available will reduce your lines of code, and thus your bug count. While higher-level languages are in some cases very slightly slower, paying even one programmer to spend a month creating something is significantly more expensive than just buying a better computer. This point is also made by Jeff Atwood on his blog, most specifically in Hardware is Cheap, Programmers are Expensive. Jeff is a VB.NET/C# developer who blogs frequently about Windows-related things. The fact that he, from a totally different background, makes the same point just reinforces it: spending huge amounts of time on tiny optimisations is rarely worth it. Jeff Atwood and Eric S. Raymond are looking at this from different perspectives, one from a pure cost/benefit analysis point of view, and the other from the point of view of correctness and bug counts. This not only shows that the point is worthy of note, but also conveniently supports ESR's claim that the design philosophies of UNIX work very well, even when not used in their originally intended context.
There are, of course, exceptions to this rule. If many users are running your program, then the relative proportions change. The Linux Kernel, for example, is used globally, and therefore a minor speedup there will potentially benefit billions of people. But even there, the majority is written in C for portability reasons. The extra potential efficiency gained by rewriting in assembly would be lost as soon as the entire kernel had to be ported to a new platform. A kernel written entirely in assembly would be far buggier and less portable.[3] That said, moving to a higher-level language would not be a good idea either, because the efficiency lost would be multiplied across the number of people using this code.
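The way the proportions change with the number of users can be put into rough numbers. The sketch below is a back-of-the-envelope calculation, not something from the book or from Jeff Atwood's post: the function name break_even_users and every figure in the example (160 programmer hours, an hourly rate of 50, 100 runs per user, a user's hour valued at 20) are assumptions made up for illustration. Only the half-second saving comes from the 1.5 second versus 1 second example above.

```python
def break_even_users(dev_hours, dev_hourly_rate,
                     seconds_saved_per_run, runs_per_user,
                     value_of_user_hour):
    """Number of users at which an optimisation pays for itself.

    Cost is the programmer time spent; benefit is the user time saved,
    priced at value_of_user_hour. All inputs are rough estimates.
    """
    cost = dev_hours * dev_hourly_rate
    hours_saved_per_user = seconds_saved_per_run * runs_per_user / 3600
    benefit_per_user = hours_saved_per_user * value_of_user_hour
    return cost / benefit_per_user


if __name__ == "__main__":
    # Hypothetical figures: one programmer-month (160 hours) spent saving
    # half a second per run, for users who each run the program 100 times.
    users = break_even_users(
        dev_hours=160,
        dev_hourly_rate=50,
        seconds_saved_per_run=0.5,
        runs_per_user=100,
        value_of_user_hour=20,
    )
    print(f"Break-even at roughly {users:,.0f} users")
```

With these made-up figures the rewrite only pays off once tens of thousands of people run the program, which is why the same optimisation can be a waste for an in-house tool and worthwhile for a kernel.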