EmailMainTagsEditHistoryDiscussion (25)

  1. Introduction
  2. General coding principles
    1. Write your code for humans
    2. Don't repeat information
      1. Don't implement the same logic in multiple locations
      2. Understand the cost of maintaining copies of your data
    3. Extend your language to match your domain
    4. Procedures should be referentially transparent
    5. Prefer descriptive code over comments
    6. Code should reflect the intention
    7. Be consistent
  3. Specific coding practices
    1. Make your types explicit
      1. Example: Orc attacks Robot for 4 hit points!
      2. Sets
    2. Give names to your constants
    3. Mark your constants
    4. Name your procedures in a general manner
    5. Use the right level of quantification
    6. Procedures should receive few parameters
    7. Use enumerations rather than booleans as parameters
    8. Deal with the easy cases first
    9. Constants at the left when testing equality with a variable
    10. Variables should have the smallest possible scope
  4. Functional aspects
    1. Defaults
    2. Logging
      1. printf debugging
      2. Libraries
      3. Semantically rich information
        1. Severity level
        2. Module
      4. What to log
      5. Logging in console applications
      6. Liberal logging defaults
  5. Anti-principles
    1. Premature optimization
    2. Premature generalization
    3. Nested scopes
    4. Threads
    5. Global variables
    6. Anti-principles with exceptions
      1. Catch all exceptions
      2. Exceptions for normal conditions
      3. Catch exceptions in big blocks
  6. The process of writing code
    1. Constantly evaluate your work
    2. Split your program into modules
    3. Work in small chunks
    4. Refactor your code
    5. Don't do The Simplest Possible Change That Works
    6. Use version control
  7. Test your software
    1. Assertion checks
    2. Unit tests
    3. End-to-end tests
    4. Other types of tests
    5. A word of caution about interactive testing
    6. Static type checks are useful

Introduction

These are generic principles for writing software that I follow.

I don't expect others to abide by them. I hereby renounce any claims of:

  • Being an expert in the topic.
  • This document containing objective or universal truths.
  • This document being insightful in any way, as seasoned programmers may find most of it rather obvious.

I'm writing mostly for my own benefit as a way to organize my thoughts about the practice of programming. I decided to publish it in order to spark some conversations with others about how they perceive these practices. Most definitely, I am not telling anyone that this is how they should write their software.

Do you know the discourse of the “rules of photography” —the “rule of thirds”, “keep it simple” or “uncluttered”, considerations about the effect of straight and diagonal lines, the “golden rule”, the “theory of colors”, etc.—, which serve as generic guidelines on what and how to shoot? These principes are just like those rules: the point is not to follow them blindly but to understand the knowledge they embody and know how and when to break them.

There are exceptions to these principles and they sometimes contradict each other. As I was writing them I found my self writing things like “Of course, there are cases where this principle will actually harm you”. I've decided to just add a disclaimer here instead of everywhere. Statements like "Software should always do X”, “Never do X”, “All your procedures should do X” or “Doing X is pointless” in this writing mean that this is so in most cases, but there are plenty of exceptions. As with most generalizations, these shouldn't be taken to the extreme. They are no substitute for common sense.

I expect to update this document in the future as my points of view change or as I notice things I've left out. If you come back to this document in the future, you may want to check the History to get a list of changes I may have applied since you last read it.

I've included examples of how I've used these principles in some projects I've worked on. I've also added some links to related information.

Without further ado, here they are.

General coding principles

Write your code for humans

Programs must be written for people to read, and only incidentally for machines to execute. —Abelson & Sussman, Structure and Interpretation of Computer Programs

62

Think of your code as a document for a human. If you were to write a document explaining how a certain algorithm works, what would you write? Your code should be very close to this document, while still being written in a form that allows a computer to execute it.

In some environments I use literate programming for this. I'm experimenting with writing my programs as wiki pages from which the code is extracted: the xc calculator, my embedded-test egg and my stream-ext egg. I've found that this makes it more pleasant for me to write code. I think it makes a nice combination with my somewhat aggressive partitioning of programs into modules (a topic that I'll come back to). Each module becomes a document on its own.

However, you don't have to use literate programming for this: you can continue to use the language you're using (without any literate programming framework), but strive to write your code in the form that makes the most sense for a human to understand, not for the computer to execute.

To decide things such as whether to move a part of a procedure into a new procedure or what semantics to use for your variables, your criteria should be to do whatever makes the code easier to understand, not what makes the most sense from the point of view of the computer or what makes the resulting program run faster.

Organizing and documenting code for human consumption takes more time than just writing it for a computer to run it. However, structuring it in the way that makes the most sense for human understanding tends to raise its quality significantly, as it makes it easier to reason about it and encourages you to document design decisions that would otherwise be forgotten. I've found that this practice makes the act of writing software more enjoyable.

Making code shorter does not necessarily mean it will be easier to understand. Your goal should be to reduce the amount of complexity that a human has to grasp to see how your program works. This often coincides with reducing the amount of code —specially if you refine the statement of the problem— but not always. Avoid reducing the amount of code when it makes your program harder to follow. Long but predictable is better than short and hairy.

Don't repeat information

There should be a well-defined canonical location for information. This principle has two important ramifications that I'll explore separately in the next sections.

Don't implement the same logic in multiple locations

Firstly, it applies to source code: don't repeat yourself. Don't implement the same logic in multiple locations.

Look for problems that are solved repeatedly in the body of different procedures: in this case, capture the solution in a third procedure that the original procedures can use. For example, you'll sometimes need to build a set of procedures that provide similar services in a given program and, as a result, share some logic. In this case, create a procedure with the common logic and make the others use it.

Nobody Moves

Here is an example, which comes from a real program that we wrote at Google to process (downsample) information in a large database; I've greatly simplified the procedures shown here for the purpose of removing unrelated complexity. The program walks a directory tree with information and, for every file, creates a downsampled copy in another location. As part of this program, we had functions such as these:

def DownsampleAllFiles(path):
  """Schedule for downsampling all files in path."""
  if IGNORE_FILE_REGEXP.match(path):
    return
  if not IsDir(path):
    if not HasBeenDownsampled(path):
      ScheduleDownsamplingOfFile(path)
    return
  for file in ListDirContents(path):
    DownsampleAllFiles(os.path.join(path, file))

def CountDownsampledFiles(path):
  """Count files in path that have already been downsampled."""
  if IGNORE_FILE_REGEXP.match(path):
    return
  if not IsDir(path):
    if HasBeenDownsampled(path):
      return 1
    return 0
  count = 0
  for file in ListDirContents(path):
    count += CountDownsampledFiles(os.path.join(path, file))
  return count

def DeleteDownsampledFiles(path):
  """Delete files that have been downsampled."""
  if IGNORE_FILE_REGEXP.match(path):
    return
  if not IsDir(path):
    if HasBeenDownsampled(path):
      os.unlink(path)
    return
  for file in ListDirContents(path):
    DeleteDownsampledFiles(os.path.join(path, file))

Clearly, the logic for walking the directory recursively should be captured in a separate procedure:

def YieldFiles(path):
  """Yield (path, IsDownsampled(path)) pairs for all files in path."""
  if IGNORE_FILE_REGEXP.match(path):
    return

  if not IsDir(path):
    yield (path, IsDownsampled(path))
    return

  for file in ListDirContents(path):
    for entry in YieldFiles(path):
      yield entry

Then the previous procedures (and any other in the program that walk a path in a similar fashion) could be rewritten as:

def DownsampleAllFiles(path):
  for (downsampled, file) in YieldFiles(path):
    if not downsampled:
      DownsampleFile(file)

def CountDownsampledFiles(path):
  count = 0
  for (downsampled, file) in YieldFiles(path):
    if downsampled:
      count += 1
  return count

def DeleteDownsampledFiles(path):
  for (downsampled, file) in YieldFiles(path):
    if downsampled:
      os.unlink(file)

However, you can still apply the principle further by defining YieldDownsampledFiles and YieldNonDownsampledFiles as wrappers around YieldFiles, procedures that yield only files that have and have not been downsampled, respectively. With those you could simplify the others further to:

def DownsampleAllFiles(path):
  map(ScheduleDownsamplingOfFile, YieldNonDownsampledFiles(path))

def CountDownsampledFiles(path):
  return len(list(YieldDownsampledFiles(path)))

def DeleteDownsampledFiles(path):
  map(os.unlink, YieldDownsampledFiles(path))

This example shows some of the advantages of avoiding repeating information in your code: it tends to make the code clearer and, of course, more easily reused. Chances are that new procedures will be needed in the future that also require the shared logic (walking a directory tree) and these will be much easier to write. This is related with the next principle.

However, two additional advantages that may not be that obvious from the example are that:

  • errors are now less likely to occur, as the amount of code that needs to be audited and tested is reduced, and that
  • adjusting the program to meet future requirements will be easier. If the logic for doing a certain operation changes, application of this principle will have reduced the amount of code that will require changes. Consider, in both cases in the example above, what happens if, in addition to files matched by IGNORE_FILES_REGEXP, you need to ignore also those that are older than a certain timestamp or smaller than a certain size.

Applying this principle doesn't always require creating a new procedure. You will often encounter blocks of code like this:

if (rect->info()->has_label_name())
  rendered_image.DrawTextInRect(label_position,
                                label_bottomright,
                                text_color,
                                rect->info()->label_name());
else
  rendered_image.DrawTextInRect(label_position,
                                label_bottomright,
                                text_color,
                                default_name);

According to this principle, this code should probably be rewritten as:

string name = default_name;
if (rect->info()->has_label_name())
  name = rect->info()->label_name());

rendered_image.DrawTextInRect(label_position,
                              label_bottomright,
                              text_color,
                              name);

Understand the cost of maintaining copies of your data

This principle also applies to data. Only store the same data in multiple locations when you have a reason, such as reducing the computational complexity of your program. If a value can be obtained in reasonable time by inspecting structures on which it is implicitly available, you shouldn't keep any copies of it. If you have to repeat information, document which is the canonical source and, if you can, periodically verify the consistency of copies.

For example, if obtaining the length of a list (or an array of objects that ends with a special value) didn't require traversing the whole structure, it would be preferable not to keep its length explicitly. The only reason for doing so is that computing it by traversing the list can be very expensive.

The main concern with repeating data is not so much that it requires additional resources but rather the need for additional logic to keep the copies synchronized: all your functions that modify the original data now need to modify the copy. By introducing copies you open the door to errors such as the possibility of the copies falling out of sync, introducing temporary or permanent inconsistencies.

This has been known since 1971 in the databases domain as part of the “normalization” theory. Second normal form (2NF) and Third normal form (3NF) are nothing more than ways reduce the possibilities for data inconsistencies in your database tables simply by designing them in ways that avoid redundancy of data. This principle, however, also apply to the design of program structures.

The program for performing transformations (“downsampling”) of files in a large database at Google that I described in the previous section also serves as a real life example of this principle. At some point during its construction we decided to add a web interface to present reports such as the list of files that have been processed. When we first added this, the class for the web server kept internal variables such as the number of files processed and the sum of their sizes. This had the disadvantage that a lot of the operations in our program had to update these fields of the web server class explicitly as they churned through the data. We quickly refactored this and now the web server just receives callback functions to obtain this information directly from the canonical locations when needed.

According to this principles, the use of caches for information is justified as they can greatly reduce the time required to perform certain operations. Obvious examples of this are HTTP caching proxies, memcached and in-process memoization. One should, however, be aware of the cost, in terms of additional complexity, that they introduce.

Central Station

The Baked, not Fried model for web development makes the problem of keeping copies of data evident. According to this architecture, most accesses to a dynamic web site should be served from static files without requiring any special functionality in the web server, static files that have been previously generated dynamically based on, for instance, data from a database or, as is the case of Svnwiki, a Subversion repository. When Aaron Swartz wrote about it in 2002 (the oldest reference to this model I've found), he said:

The one problem with the “bake” philosophy is dependencies. It’s difficult to keep track of which pages depend on which others and regenerate them correctly when they change. Movable Type handles this in the obvious cases, but when you do anything other than creating or editing an entry, it makes you manually rebuild the corrector portions of the site. Tinderbox, a speedy C++ program, seems to regenerate the whole site every time. It seems that for this philosophy of database-backed static pages to take off, we’d need a really good dependency system to back it.

There you have it: in this model you have a program that can generate a page from information stored in a database (the canonical location), and you maintain a copy in the form of a pregenerated static file. Keeping the two in sync (detecting what needs to be updated as a result of changes in the canonical location) is not a trivial problem. This has also been a source of significant complexity for Svnwiki, which uses this model. Needless to say, the Baked, not Fried model brings significant advantages that justify its use in many circunstances.

Be aware of the cost introduced by each copy of your data that you maintain.

Related links:

Extend your language to match your domain

If your base language is not the best for your logic, extend it and then use the extended language for encoding your logic. Do this by creating new procedures, macros (if your language has good macro support) or specialized languages. Do this specially for the most complex parts of your program.

There are countless examples of the advantages of very popular sub-languages that extend programming languages and make it easier to represent certain types of logic:

  • The prevalence of the printf procedure in C and similar procedures in nearly every language. The Scheme community tends to abhor its Scheme-equivalent, format (see for example the background section in the fmt egg), but they are the perfect example of what you should aim for. I rely on format so much that I ended up re-implementing it (see the format-modular egg, where I authored most of the original implementation, borrowing some ideas from Alex Shinn, and which has now been extended by Kon Lovett) because the original implementation that came with Chicken was ridden with bugs and not reentrant. In any case, it's clear to me that printf-like procedures will very often be a better way to encode some logic than having to explicitly convert things to strings and writing them.
  • The use of regular expressions. They are often a very good language for expressing certain operations. As the official Perl documentation says, “regular expressions are one of the big factors behind” Perl's fame for “excellence in text processing”.
  • The use of SQL, which can very succinctly express operations that would otherwise be very verbose.
  • The use of parser generators such as Bison.

By using these and similar languages, one can very succintly express models that would otherwise be less readable.

There are two approaches to extending a language for a specific purpose. The first is simply to define new procedures or libraries that allow you to express the most complex aspects more readily. Programmers do this all the time.

Life Support

The second, less obvious, is to actually design new languages or data structures —such as those listed in the examples above— in which the complexity can be represented. Sometimes you'll find that as a program grows in complexity one may significantly improve its readability by splitting the problem into two layers: one implementing a language to talk about the problem and another using that language to implement the particular model that solves the problem.

There are several examples of occasions in which I've done this. I guess the most note-worthy are the following:

  • I decided that using strings (vector of bytes) wouldn't work for representing most of my algorithms in Svnwiki as well as using lazy streams of characters. So I created a library to perform common string operations on streams of characters. Then I created a library for easily creating parsers for this type of streams (as a set of macros). Finally, I used that library for the rules for parsing wiki documents. I ended up reusing that library for xc.
  • I created my own framework for expressing unit tests in Scheme, which has a somewhat different approach to most of the ones that Chicken had when I created it.
  • I created xc precisely because I wanted to be able to type an arithmetic expression and get the result of its evaluation in the most readable way. I've used this to, among many other things, express the logic behind some estimates for resource requirements for several data-centers resulting in resource requests on the order of magnitude of tens of thousands of servers. By expressing the logic in the xc language, others can focus exactly on the logic of what I'm doing and extend it directly to produce new results, without having to spend any time on the implementation details of the language that executes that logic. If, instead, I had used a general-purpose programming language, I would have made it slightly harder for them to follow the logic of my models and extend them.

Procedures should be referentially transparent

Make your procedures pure functions: write them in such a way that they'll always return the same value when passed the same parameters and that they are free of side-effects. The inputs and outputs that a procedure depends on should be clearly delimited.

Decreasing the number of expressions that are not referentially transparent in your programs makes them easier to understand and test.

Use command-query separation. While you'll be able to free most of your procedures from side-effects and dependencies in global state, you should clearly mark those that are not, probably by following some naming convention or language constructs. In Scheme, for example, it is customary to end the name of procedures with side-effects with an exclamation mark (though this is only done for those that modify the state, not for others such as opening a file or returning a random number).

Related links:

Prefer descriptive code over comments

Make your code describe what is happening by itself, instead of relying on comments that explain it.

Don't let this hold you from commenting your code, though. If you think a comment helps, by all means add it.

If you have a block inside a procedure that does something where the intention is not entirely obvious, instead of (or in addition to) writing a comment describing it, turn it into a separate procedure and describe its intention in its name.

Code should reflect the intention

The computing scientist's main challenge is not to get confused by the complexities of his own making. —E. W. Dijkstra

Your code should be written in a way that reflects its intention. Name your procedures descriptively.

Write your procedures in a way that the goal (and sub-goals) are evident instead of just being side effects of their body. Avoid being too smart for your own good. If side effect of operations are important, make them evident. Be explicit.

The point is to minimize the time it takes for someone that looks at the code to figure out why it was written the way it was written, how it works and how to modify it.

Museum of Communism

Be consistent

Respect and promote conventions. Do similar tasks in similar ways. Create and use idioms in your programs (for things for which widely used idioms don't exist). For example, write certain types of iteration in the same way always.

If you're extending your program, write your new code in such a way that it conforms to the current standards. If the current standards are wrong, change them (and every part of the code where they are embedded); if you don't have time to change them right now (or you're not allowed by the maintainer to change them), continue to follow them, even if you don't agree with them. If you don't do this your code will tend to become a tangled mess of different ways to do the same thing and it won't be clear to newcomers what the preferred ways are.

Most programming languages will develop idioms over the years. Use these, as they tend to briefly encode the intention of what you're doing. One sub-case of this is the use of design patterns in OOP. If you're into OOP, learn and use standard design patterns.

If your program is large enough to warrant it and you expect others to contribute to it, document the coding style.

Related links:

Specific coding practices

Make your types explicit

Other than for trivial structures, use explicit types for the different data structures that you use, types that reflect the semantics of the operations being performed. This is related with the generic principle that states that code should reflect the intention. What this principle specifies is that, in general, you should make explicit the information about how a given data structure works.

The principle has two related aspects.

  • The primitives used to inspect or modify your data structures should reflect the semantics of the data structures (and hide clutter from details of their underlying implementation).
  • The data structures themselves should be tagged corresponding to their type, making instances of two different types trivially distinguishable. This is less important than the first aspect, but nice to have.

I'll explain both with an example but first I'd like to point out that I'm not talking about static (compile-time) versus dynamic (run-time) type checking: this principle is orthogonal and applies to both —though, in practice, it seems more relevant on dynamic languages, as static languages tend to discourage not following this principle.

Example: Orc attacks Robot for 4 hit points!

Lets say you're writing a web-based game and you need to represent monsters. Each monster has a name, a path for its corresponding image, an integer encoding how many hit points it currently has and an integer encoding how many hit points of damage it takes in every turn.

A simple approach, which is quite common in Lisp environments, is to represent this as a list. In Python, you could define a group of monsters along the lines of this:

NASTY_FELLOWS = (
    ('Orc', 'orcs01.jpg', 20, 5),
    ('Orc boy', 'orcs02.jpg', 14, 3),
    ('Robot', 'robot.jpg', 40, 3),
    ('Troll', 'arhuaco.jpg', 80, 20),
    )

The problem with this approach is that your code that manipulates the monsters won't reflect its intentions very readably. Your function that makes a monster attack another monster would look like this:

def Attack(attacker, victim):
  points = int(random.random() * victim[3])
  victim[2] -= points
  Show('%s attacks %s for %d hit points!', attacker[0], victim[0], points)

Instead, this principle states that you should create a proper data structure to represent the monsters. The option to choose probably depends on your programming language. In our example in Python, you could probably use a Monster class:

class Monster(object):
  def __init__(self, name, picture, hitpoints, damage):
    self.name = name
    self.picture = picture
    self.hitpoints = hitpoints
    self.damage = damage

The code that manipulates monsters now becomes much more readable:

def Attack(attacker, victim):
  points = int(random.random() * victim.damage)
  victim.hitpoints -= points
  Show('%s attacks %s for %d hit points!', attacker.name, victim.name, points)

This makes it easier to reason about your code, as it reduces the amount of information that you need to keep in your mind while doing so (“is the second field the hitpoints or is that the path?”).

Another approach would be to define functions to access and modify the attributes of the object, but to keep on using lists. While this is already a significant improvement over the first approach, it means that instances of different types could be indistinguishable. This makes it harder to detect typing errors.

Sets

A corollary to this principle is that, when designing programming languages, the fact that a set can be trivially implement on top of a hash table does not imply that the language should not also have an explicit representation for sets. Sets of objects occur incredibly frequently in computer science and programs that use them should not have to confuse their readers by hiding their intention and confusing them with hash tables (or other structures). A set should not look as a hash table: that's implementation details.

If you're using a programming language without an explicit type for sets, you should probably define a simple library and use it. I did this for Scheme:

Give names to your constants

Introduce variables instead of hard-coding constants. For example, define “int seconds_per_day = 24 * 60 * 60” and then use “seconds_per_day” instead of the value. The names of the variables should describe what the values represent.

The main advantage from this that you're making evident what the value represents. A secondary, albeit smaller, advantage is that it makes it easier to redefine these constants.

Mark your constants

If the typing system of your language supports it, mark your constants as such, to protect them against unintended modification.

In C++, use const. In Python, use tuples (eg. (1, 2, 3)) instead of lists (eg. [1, 2, 3]).

Name your procedures in a general manner

Name your procedures after what they can do in general terms, not according to one particular use of them. The name should reflect the intention.

Your procedure names should be verbs or verbs followed by nouns. I often prefix them by the name of the library or data type that they operate on (eg. hash-table-get), specially on languages like Scheme that don't separate objects from different libraries into different name-spaces.

Use the right level of quantification

When expressing operations on collections, use the level of quantification that makes the logic easier to follow. In other words, introduce names for variables only when doing so aids comprehension.

Sometimes it makes sense to give a name to every member of the set and express your logic in terms of the operations performed on the member but sometimes it makes sense to not name individual members but rather express the logic in terms of set operations.

Cats Cafe

This should be obvious but I've seen people struggle with this. Interestingly, it not only applies to code itself but also to documents in natural language.

Consider the following code:

def FindLocalTowers(self, tower):
    towers = set()
    for t in self.towers_by_region[tower.region]:
        if not t.enabled:
            continue
        if t.protocol != tower.protocol:
            continue
        towers.add(t)
    return towers

One small improvement is that you can eliminate the use of the towers variable with a generator:

def FindLocalTowers(self, tower):
    return set(t for t in self.towers_by_region[tower.region]
               if t.enabled and t.protocol == tower.protocol)

However, the main point is that you can eliminate the use of the t variable if you don't need to refer to elements of the set individually:

def FindLocalTowers(self, tower):
    region = self.towers_by_region[tower.region]
    protocol = self.towers_by_protocol[tower.protocol]
    enabled = self.enabled_towers
    return region & protocol & enabled - set([tower])

Sometimes, however, it will make more sense to do the opposite: to explicitly name the elements on which you are operating.

The Lovers & The Wall

Procedures should receive few parameters

Procedures should not receive more than 4 parameters. Adding parameters to procedures adds overhead to their use.

There are multiple ways to deal with this:

  • You may want to create new structures (or classes or records or however your language calls them) and pass those to the procedure. I've found that this is useful specially when more than one procedure receives a lot of parameters and they just pass them around. Examples of this are “configuration” objects that encompass a lot of values.
  • You may want to rethink the way your complexity is split into procedures. For example, you may want to split a complicated procedure into multiple ones or instead of passing a lot of values to a procedure, passing it another procedure (a call-back/closure) as an argument to perform some of the operations.

Some languages (eg. Common Lisp, Python) support optional named parameters with default values, where you can call your procedures passing just the few parameters that you need. In this case, adding optional parameters doesn't add any overhead to calling them. However, if in most cases programmers will actually be passing more than 4 parameters you're still doing something wrong.

As Svnwiki was evolving, I found that the number of parameters that I passed around to a lot of procedures started to grow out of control. I would often need to start passing a certain parameter from some code way up in the stack to a procedure that would only be called indirectly by means of a long chain of calls. This required me to start passing the new parameter in a lot of procedures in the call-chain, even though only one would use it for more than just passing it around. It became obvious that a change was needed.

I decided to implement an “environment” object to pass around, which, simplifying things a great deal, acts simply as a hash table. This greatly simplified maintainance of Svnwiki, by making it trivial to add new information to be passed around by all procedures and by significantly reducing the overhead of calling most of Svnwiki's procedures. This environment object is also at the heart of Svnwiki's extension system and allows me to add to the information passed around anytime I want to, without having to modify the API that Svnwiki's extensions use.

You could argue that one needs to create an environment object before being able to call a procedure and thus the cost is not really reduced, just displaced. However, the environment object only needs to be created once and then it can be used (with or without modifications) to call these procedures countless times. The savings are quite obvious.

Also, compared to using global variables, the environment objects have the main advantage of making the code reentrant, which matters a lot in this particular program where lazy evaluation is very often used. Using global variables would create problems when a given function has to change their value only for the purposes of the procedures it calls. I'll come back to the topic of global variables.

Related links:

Use enumerations rather than booleans as parameters

Make your procedures receive values from particular enumerations instead of booleans. Consider the following procedure (which comes from Svnwiki's stream-wiki egg):

; This is the parser for text that occurs inside a given set of
; <p>, <pre>, <blockquote>, <li>, <dt> or <dd> tags.

(define (text-transform info strong em literal start newline-rep)
  ...)

If strong, em and literal are booleans, a call with some values will look like this:

(text-transform myinfo #t #t #f mystart #\newline)

If, instead, they are elements from enumerations (each from its own enumeration), a call would look like this:

(text-transform
  myinfo
  text-transform-strong-enabled
  text-transform-em-enabled
  text-transform-literal-disabled
  mystart
  #\newline)

When you see a call in the second form, you don't have to go back to the procedure's definition to look up the order of the parameters and figure out which are the ones you're enabling and disabling. Also, if you accidentally get the order wrong in a call, the code is more likely to detect the problem (by static or dynamic type checking).

Fiat

Deal with the easy cases first

When writing code to tackle some problem that can be partitioned into multiple cases of different complexity, deal with the easy cases first.

As you gradually shrink the problem by dealing with particular cases, by starting with the simple cases, at each time you're dealing with the simplest possible problem. When you get to the hardest case, this will be the only remaining case, which is when you most benefit from having already simplified the problem. This reduces the amount of context that the reader has to keep in mind to understand your code.

Constants at the left when testing equality with a variable

When testing if a variable has a given constant value, write the constant value at the left of the equality operator (ie. write 0 == variable instead of variable == 0). The main reason to do this is that it helps differentiate comparison expressions from variable assignment.

In many languages (Java, Javascript, C++, ...), if you miss one of the equal characters, the second form will be incorrectly interpreted as an assignment, even if it occurs in a position that is almost exclusively used for comparisons and very rarely for assignments such as the guard of an if or while expression. In contrast, the recommended form, of placing the constant value at the left, will turn into a syntax error, a much preferable behavior.

Variables should have the smallest possible scope

Functional aspects

Defaults

TODO: write

Logging

Software contains defects. If you're lucky, they will be easy to reproduce. If you aren't, they'll only occur very rarely and they'll be very difficult to debug. When a defect manifests itself in production, you'll want to have as many information as possible about the state of the program that lead to the problem. Availability of good logs will often be of great help, sometimes making it trivial to find the cause.

In order to do this, add as many logging statements as feasible to your program, documenting the trace of the operations it performs (or code paths it executes) and their inputs. You will find that you'll be able to add a significant amount of useful logging statements to most programs. The only limit to the amount of logging statements in your program should be the point where it starts impacting its performance in a measurable way.

printf debugging

Sometimes it will help you to add log statements in the suspect areas and re-run the program if you need to find a reproductible bug which the logs of a faulty run are not good enough to pinpoint (and also consider adding assertion checks, as described below).

This technique is called “printf debugging” in some circles and it's relatively frowned upon. Certainly, use of a proper debugger —where one can do sophisticated things like adding watchpoints, evaluating complex expressions that were not part of the original program, run the program backwards or modify the state of the program— beats adding printf statements to have to recompile your program and re-run it.

However, if you use a reasonably good API for logging, you won't need to erase your logging statements once you're done debugging the problem and they may help you in the future. In my opinion, the only problem with printf-debugging is that it's often done poorly, using the wrong frameworks; if done correctly, approaching the problem as an issue of improving the logging performed by the application until the logs are meaningful enough to find and understand the cause of a defect, it works well in many cases.

Libraries

Several libraries/APIs exists to perform logging. In this article I will consider the following:

The syslog API
One of the most used APIs may be the syslog API, used by most GNU/Linux programs. It is expected that system programs in GNU/Linux (and other Unix systems) will use it to report information to the system administrator, which may configure centrally which entries should be logged for which programs in which files.
The Google Logging Library (glog)
A Google library for C++ applications, released under a BSD license. It's rather easy to use and convenient, but it adds a dependency to your program on a non-standard library. I've used it exhaustively as part of my work at Google.

Semantically rich information

You will, of course, want to include semantically rich information in your log entries. In addition to the free text description of the information being logged, I would recommend including the following information with every entry:

Date
Every entry should include the date in which it was added to the log. When inspecting the logs, this will help you verify that different operations take a normal amount of time to run.
Process or thread ID
This is meaningful if your program uses multiple threads (but see the threads anti-principle) or multiple process, all of which log to the same file.
Source code file and line number
This is the location in the code where the logging statement that caused the entry to be logged occurs. The reason to log this is to make it easier to trace back log entries pointing out anomalies to the code where they happen. Note that if one uses syslog as the framework for logging, one will probably want to wrap the calls to syslog(3) with a C macro to include this information at the beginning of the line.
Verbosity or severity level
Since you'll be logging a lot of information, it is often desirable to be able to supress the most verbose entries and only look at relatively meaningful entries. Typically, you'll have a “debug” severity, which will be disabled by default as it will tend to be very verbose (and you'll add a command line to enable it), and more important severities, which will be enabled by default. I'll get back to this.
Module or facility
Each entry should have a tag describing it's type. This will be used to control which types of log entries to display from a given log, to let the programmer focus on certain parts of the program —the ones relevant to the problem under investigation. I'll also get back to this.
Severity level

Typically, the severity level is a hierarchy of how serious or semantically meaningful the condition is, ranging from

  • critical conditions —such as “the output file could not be updated because of permission errors, information is being lost!”— through
  • less significant but still important conditions —such as “I appended 2 records, of sizes 2056 and 8923 to the output file, with IDs 9A38 and 9A39”— to
  • simple debugging statements —such as “I'm allocating 2048 bytes of memory for an input buffer”.

In the syslog Unix standard, these levels would be:

LOG_EMERG
system is unusable
LOG_ALERT
action must be taken immediately
LOG_CRIT
critical conditions
LOG_ERR
error conditions
LOG_WARNING
warning conditions
LOG_NOTICE
normal, but significant, condition
LOG_INFO
informational message
LOG_DEBUG
debug-level message

In the glog library they are:

  • FATAL, which terminates the program execution,
  • ERROR,
  • WARNING, and
  • INFO

In addition to these, the glog library provides a VLOG statements which lets you specify the verbosity of an entry as an integer. Through a flag, you specify the desired level of verbosity that you want the application to log.

Module

The module or facility that caused the entry to be logged can be used to easily focus the attention on a specific subsystem of a large program, ignoring information about the rest. For example, one may give attention solely to the aspects related with a particular layer of the networking code.

The syslog specifies several facilities, mostly with the goal of letting the system administrator specify where the log entries for different programs should go. Unfortunately, the list of available facilities is fixed, forcing one to pick from a list:

LOG_AUTH
security/authorization messages (DEPRECATED Use LOG_AUTHPRIV instead)
LOG_AUTHPRIV
security/authorization messages (private)
LOG_CRON
clock daemon (cron and at)
LOG_DAEMON
system daemons without separate facility value
LOG_FTP
ftp daemon
LOG_KERN
kernel messages
LOG_LOCAL0 through LOG_LOCAL7
reserved for local use
LOG_LPR
line printer subsystem
LOG_MAIL
mail subsystem
LOG_NEWS
USENET news subsystem
LOG_SYSLOG
messages generated internally by syslogd
LOG_USER (default)
generic user-level messages
LOG_UUCP
UUCP subsystem

This shows that syslog was designed with the system administrator, rather than the software engineer, in mind. In other words, while very helpful to analyze the behavior of a particular machine, syslog may not be as helpful to debug programming errors as other libraries such as the Google Logging Library: no semantically meaningful field is provided for log entries other than facility, which, as shown above,

  1. has entries so coarse that they encompass multiple programs and
  2. has a fixed set of entries (rather than accepting a free-form string).

A potential way to solve this would be to standarize for your application a format for of all lines logged, that includes as its prefix a free-form string with the module name (probably in addition to the source code file and line number for the statement that caused the entry to be logged).

In the Google Logging Library, the module is the name of the source code file minus its extension (and some other suffixes). This means that a single source code file may only use a single module, a restriction which seems to work well in practice.

What to log

TODO(alejo): write

Logging in console applications

For command-line programs that are expected to be run by relatively savvy users, good logging may even serve as a very convenient way to provide information about the state of a process as it runs. Many programs in Unix indeed do this, by displaying information in their standard error output when a certain flag —typically -v or --verbose— is given. I make this the default behavior for some of my programs.

However, many Unix programs simply print these logging statements to their standard error output. In addition to that, they should store them in a log file that can be consulted in the future, should some hard to reproduce bug occur.

Liberal logging defaults

It's a common practice to run applications with conservative logging and only enable logging while debugging issues. Many applications default to not producing any logging statements whatsoever, unless a certain command-line flag is provided explicitly requesting a given verbosity level or enabling logging for their specific module or facility.

However, applications should always produce logs with all but the most verbose severities (ie. all but debugging, which may actually impact performance). It should be the task of the investigator, when reading the logs, to filter out any irrelevant entries.

Of course, the logging subsystem should be smart enough to not run out of space during a long run of the program: it should only keep the last N log entries at each given severity (or the last N MB worth of entries). This is relatively easy to do if entries of different severities are stored in different log files and the files are rotated periodically, compressing and eventually erasing old ones. Both the standard syslogd implementation for most Unix systems (with the addition of the logrotate package) and the Google Logging Library support this, to some extent.

Anti-principles

Premature optimization

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. —Donald Knuth

Only optimize your code (other than for readability) when you have a reason to do so, as optimizations increase the complexity of the code. Or, in other words, the optimization criteria should be to reduce the complexity of the code.

If you need to make your code run faster, use a profiler to identify which are the procedures where your program spends most of its time. Unless you've already optimized it, chances are that you'll find that it spends most of its time in a very small set of procedures.

Premature generalization

If you're willing to restrict the flexibility of your approach, you can almost always do something better. —John Carmack

Don't make your procedures more general than strictly needed, in order to prepare for an eventual future when this generality may be needed. Your code should be as simple as possible and it should only be generalized when there are actual existing requirements for the generalization. In other words, if you think a certain feature or extension may be needed, don't go ahead and implement it just yet: wait until you're sure it is going to be needed.

In the Extreme Programming circles this is known as the You Ain't Gonna Need It principle. As Ron Jeffries —one of the 3 founders of Extreme Programming— writes:

Always implement things when you actually need them, never when you just foresee that you need them.

This and the previous principle may be seen as the same: adding unrequired flexibility may be seen as a case of premature optimization, optimization for generality. However, I decided to analyze the two dimensions as two different principles because

  • the discourse around premature optimization usually centers around optimization for performance and
  • the discourse around the You Ain't Gonna Need It principle is not typically formulated in terms of performance.

Making your code more flexible than it currently needs to be to try to anticipate future use cases will often increase its complexity. However, the future where the additional flexibility is required may never come. You will often find that the new features or functionality that turns out to be required is different, even if slightly, than what you anticipated. In other words, there are too many dimensions in which code could be made more general and it's very difficult to anticipate in which one you'll need to improve your code. You end up with a code base more complex than is required to support use cases that are not actually needed.

Of course, poorly designed code will often be made simpler by applying generalizations to it. In this case, you should obviously simplify your code. Making your code more general is not a bad thing per-se, the additional complexity that this often brings is what makes it problematic.

If you're designing a library, you should make it as general as required by the programs that will use it, but no more. The lesson from this principle is that you should start with a simple interface and only improve it as is required by the programs that use it.

This principle is related with the principle about working in small chunks, that I explain below. Designing overly general procedures from the start would be a case of working in too large a chunk. If you're working in small chunks, you should only do those that make the most sense at any given time.

A corollary of this is that if some particular requirement in your software changed and some of the generality in a given procedure or module is not required any longer, you should simplify the code by removing the generalizations. If they are needed in the future, they can be restored from the older versions.

Related links:

Nested scopes

Procedures should not have more than 2 or 3 nested scopes, specially when they are introduced with flow-control statements like loops or conditionals. I find that nested scopes tend to make the code harder to follow. Use return and continue wisely.

For example, consider the following code:

Dancing Buildings
def ListValidRecordsByKey(db):
    records = {}
    for source in ListSources(db):
        if SourceIsEnabled(db, source):
            print 'Loading records from %s' % SourceGetName(source)
            if SourceAcceptsQueries(source):
                for record in ListRecordsInSource(source):
                    if RecordIsValid(record):
                        if RecordHasKey(record):
                            key = GetRecordKey(record)
                        else:
                            key = None
                        if key in records:
                            records[key] = []
                        records[key].append(record)
    if records:
        return records
    else:
        raise Exception('no valid records found')

This becomes clearer if written as:

def ListValidRecordsByKey(db):
    records = {}
    for source in ListSources(db):
        if not SourceIsEnabled(db, source):
            continue
        print 'Loading records from %s' % SourceGetName(source)
        if not SourceAcceptsQueries(source):
            continue
        for record in ListRecordsInSource(source):
            if not RecordIsValid(record):
                continue
            key = None
            if RecordHasKey(record):
                key = GetRecordKey(record)
            if key in records:
                records[key] = []
            records[key].append(record)
    if records:
        return records
    raise Exception('no valid records found')

As you can see, when you deal with one particular case (eg. not SourceIsEnabled(db, source) above), if you create nested scopes to handle the hard cases, you continue to carry some complexity from the easy cases —those that don't require much code to be handled. Instead, by handling the easy cases without nesting your scope for the hard cases, you make it easier to forget the easy cases when one studies the hard ones.

An exception to this is the use of “(let ...)” and similar forms in Lisp or similar languages, that simply introduce new variables but don't affect control flow.

Another exception to this would be if the body of a for-loop consists of some initialization followed by two disjoint cases of about the same complexity. In this case, using

for object in GetListOfObjects(data):
  object.PrepareForOutput(param)
  ...
  if 'xml' == output.format:
    print '<object>'
    ...
    print '</object>'
  else:
    print 'Object follows in ALEJOFORMAT:'
    ...

may reflect your logic better than a corresponding

for object in GetListOfObjects(data):
  object.PrepareForOutput(param)
  ...
  if 'xml' == output.format:
    print '<object>'
    ...
    print '</object>'
    continue
  print 'Object follows in ALEJOFORMAT:'
  ...

as, in the second case, the intention of the continue statement may not be as clear.

Basilica di San Marco a Venezia

Threads

My experience suggets [...] that multi-threading is part of the problem, not part of the solution; that essentially no application programmer understands threads well enough to avoid deadlocks and races and horrible non-repeatable bugs. –Tim Bray, On Duct Tape

Threads make the code much more difficult to reason about; any advantages they may bring are likely smaller than the cost you pay.

Programs that use threads tend to have a lot of additional complexity. This complexity comes mostly from the fact that the code may get preempted anywhere, which means that the execution may jump from any part of the code to any part of the code. The end result is that the code has a lot of additional structures that the programmer needs to track. To make even simple changes, you now have to be aware of which locks protect which data. In my experience, this makes programs very difficult to maintain.

Any program that depends on threads can be turned into one that does not simply by moving the storage of the state of the different computations from the stack to the heap and using non-blocking APIs. Even if you did long-running computations in some threads, they may be split into many small units of work and use the scheduler that best fits your domain. If you do this, your code won't be preempted but rather it will explicitly forfeit control.

Using threads is the wrong approach to making CPU-bound programs run faster in multi-core machines. Using processes —where, instead of sharing the virtual memory space, they communicate over streams— to exploit parallelism is a much better alternative: with threads, an execution of the program is constrained to a single machine but with processes it can run over multiple machines. This is vital for some programs that I often run over thousands of machines to produce a single result; if my program was constrained to a single machine, it would take years, as opposed to hours, for them to terminate.

If, for some reason, you decide that do need to share some memory among multiple threads of execution, consider using processes (which don't share their memory space) and only sharing, using APIs for sharing memory accross processes, the specific portions that need to be shared. I must confess I've never needed to do this, but I suspect it may be a better approach than threads.

Someone recently stated to me that threads and processes are the same in the end, they just differ in the amount of sharing going on: in threads, all the memory is shared, whereas none is in the case of processes. If you believe that, you'll also believe that capitalism and communism are just the same. The fact is that due to the differences in the amount of sharing going on, choosing to use threads instead of processes usually has a significant impact on the complexity of the code.

So only use threads if you have important reasons to do so. And, if you do use them, minimize the amount of data threads share as much as possible and use cleanly defined interfaces for sharing data to keep things as simple as possible.

I figured I would add an explicit disclaimer to this section: there's a lot of arguing about threading, in both sides of the fence. A lot of very intelligent people argue strongly for threads. So forgive me for repeating the disclaimer from the introduction: this is just my opinion and it may not apply to you. Some people disagree on this topic.

Global variables

Avoid the use of global variables. Put another way, try to make procedures depend only on the parameters they get passed (see also the section on referential transparency). While global variables are very useful for some things, do not abuse them: be aware of the cost that you introduce in making the code slightly harder to understand with each that you introduce (and, sometimes, be aware that you're making procedures non-reentrant).

One good practice is to clearly mark global variables. In Lisp tradition they are surrounded by stars (eg. *extensions* instead of simply extensions).

Anti-principles with exceptions

While I don't oppose prudent use of exceptions, I think careless abuse of them can very easily lead to difficult to read code and increase the likelyhood of defects being introduced or unnoticed. I decided to add an entire section about specific anti-principles affecting the use of exceptions.

Catch all exceptions

When catching exceptions, never use catch all constructs; always list the entire set of exceptions that you expect to catch.

If specifying the entire list of exceptions every time becomes too cumbersome, think about using a hierarchy of exceptions if your language supports it (such as Python, where exceptions are classes). In this case, list reasonably specific exceptions in your blocks.

The reason for this is that catch-all constructs make it easy for unrelated errors in your code to go undetected. Suppose that you have a ReadAndParseContents procedure that may raise IOError or a ParsingError exceptions. Suppose that you wrap the call in a block that catches all exceptions:

try:
  contents = ReadAndParseContents(file)
except:
  print >>sys.stderr, '%s: Error reading file' % file
  contents = None

The problem with this is that it's possible that ReadAndParseContents may contain other problems and raise exceptions other than the ones you expect. For instance, what if it sometimes uses an undefined variable? Surely you won't want to ignore that bug, blaming it on an error on the file, but rather detect the internal error. One way to handle internal errors —when the program detects a condition that should never happen— is by raising exceptions of a certain type. In this case, you want to make sure that those exceptions won't be caught and ignored in blocks such as the one in the example above. There are more details about this in the sections about testing.

There is one exception to this anti-principle: you may want to, in your outter-most procedure, catch any unexpected exceptions and provide a way to handle them that is preferable for your application than just doing the default action of crashing. Again, there are more details about this in the sections about testing.

Exceptions for normal conditions

Use exceptions only for exceptional conditions, not for normal situations. Of course, there's a gray area between «exceptional» and «normal». For instance, if an attempt to open a file finds that it doesn't exist, should that throw an exception? I think in most cases it shouldn't, but the answer depends on the particular program.

A rule of thumb that works reasonably well for this is to ask yourself if it's likely that, as a result of the condition, control should be given to a procedure in the call stack other than your parent (or even that the execution should be aborted). If control should simply be returned to the caller procedure, chances are you're better off not using exceptions but just returning a special value.

Catch exceptions in big blocks

You should, in general, minimize the amount of code where exceptions will be caught. In other words, just because one or two calls in your procedure may raise exceptions, you should not wrap your entire procedure against exceptions.

If you want to protect against open raising an IOError exception in some code that opens and processes a file, instead of:

try:
  file = open(path, 'r')
  for line in file:
    if recognize_pattern.match(line):
      RegisterLine(line)
except IOError:
  print >>sys.stderr, '%s: Error reading file' % file

Write:

try:
  file = open(path, 'r')
except IOError:
  print >>sys.stderr, '%s: Error reading file' % file
  return
for line in file:
  if recognize_pattern.match(line):
    RegisterLine(line)

The reason here is the same for the anti-principle against catch-all constructions: you should only catch the exceptions that you anticipate from the procedures that you anticipate. If some modification in the future unexpectedly makes RegisterLine start raising IOError exception, you don't want the bug concealed from you.

The process of writing code

Constantly evaluate your work

TODO: write

Split your program into modules

Break down your program into modules and treat each as a separate, self-contained, work.

A given unit should probably usually be about 500 lines of code, very rarely larger than 1000 or 2000. This number if probably smaller than what most people would consider, but is what I've found works best for me.

Of course, this number depends significantly on the expressiveness of the language you are using and your style. For example, I would roughly equate 1000 lines of my Scheme code with 2500 lines of my C code.

The public interface provided by each module should be clearly delimited. At the very least, it should be clear which are the public symbols that a module exports for other modules. Try to make this interface as small as possible.

My best example for this would be Svnwiki, which I've split into many different eggs (libraries for Chicken Scheme). Currently, to install it you need to get a total of 47 different libraries —41 of which I created. You can do this with a single command, of course. Many of these eggs have been extended now by other users and are being reused by other programs or eggs. This would hardly have happened if I hadn't clearly separated this code from Svnwiki. The act of splitting the code into different modules also increases its quality by itself, by forcing me to think more clearly about how the code should be organized and by making each unit self contained.

Work in small chunks

“Perfection is attained by slow degrees; it requires the hand of time” —Voltaire

If you have to make a large change, split it into smaller changes and try to always have a version of the software that works, even if it's full of annotations about things that should be improved. If you have to go from a state A to a state E, it's usually better to go to intermediate states B, C and D, even if some of this steps introduce code that others remove.

This makes it easier to maintain the quality and stability of the software as you modify it.

Sometimes you won't even know what the intermediate steps are and will only have a fuzzy idea of what the end result should be. This is OK. Start with some minimalistic, but functional, version of what you need and iterate on it, improving it one step at a time, constantly taking it in the right direction. If you do this right (part of which requires following other principles described here, such as refactoring your code), the end result will be impressive, far better than anything you could have hoped to achieve if you had tried to build it in just one go.

Iglesia en Praga

Note that working in small chunks often means you'll have temporary versions of the software that are still not the ideal. As you iterate on this, document the limitations in the software that you introduce. Depending on the nature of the software, you'll want to do this simply adding comments to the code or in a separate file or bugs repository.

The better a programmer gets, the larger the individual changes he'll be able to work with. By increasing the size of your changes, you usually increase the cost (for you) to analyze them and validate them.

When I was extending xc to allow the users to define procedures in a simple syntax (eg. “f(x, y) = x + y ^ 2”), I noticed that it would be desirable to support procedures with a variable number of arguments. However, that was not the problem I was solving then. Following this principle, I decided to simply implement what I had set out to implement, document the limitation and move on. Even if support for user-defined procedures is incomplete, the small change already improved the program.

As jerojasro points out, if you do dirty things temporally that you think should be improved in the future, document those with “TODO: ...” comments.

Refactor your code

Spend time fixing your code, even in ways that won't be directly visible to the users. This will pay in making it easier to work with the code.

As time goes by and software grows organically, it will start to accumulate cruft. You should really make an effort to remove any unnecessary complexity. If you find a better (ie. simpler, more clear, more concise, etc.) way to do something, implement it, even if the current mechanism works reasonably well.

Related links:

15 Minutes

Don't do The Simplest Possible Change That Works

When you need to modify some software to make it conform with a new requirement (fix a bug or add a new feature), don't take the wrong approach of applying the smallest possible change that makes the software fulfil the requirement. I've been guilty of this many times.

Instead, you should make the change that results in the software having the least possible complexity while now supporting the new requirement. Sometimes the two approaches result in the same change, but many times they don't.

This often requires changing more lines than the other approach, but that's OK: you're paying a higher up-front cost for not having anticipated the new requirement, but then fixing the problem. In the other approach you will continue to pay this cost constantly.

A corollary of this is that if a requirement that the software already fulfills is removed (and doesn't seem likely to come back in the future), you should probably take the opportunity to simplify your code by removing support for it.

Use version control

Store your code in a version-control system such as Subversion, Mercurial or GIT.

My favorite is Subversion. While I would prefer a decentralized system, I like the fact that you can very easily just check out a subdirectory of the whole repository and work on that. In most distributed version-control systems that I've seen this tends to be difficult. For example, Mercurial's tutorial is very clear:

Many SVN/CVS users expect to host related projects together in one repository. This is really not what hg was made for, so you should try a different way of working.

Subversion is what I use as the backend for Svnwiki. This means that, for example, the whole Freaks Unidos wiki is stored in a single repository. Allowing people to check out individual parts of the wiki matters a lot for large-scale repositories for collaborative authoring. Think of, for instance, things like Wikipedia: it wouldn't be feasiable to require contributors to check out the whole repository if they want to work (with a version-control system) on a single page.

However, if the requirement of having to checkout the entire repository instead of just one of its subdirectories is not a problem for you, you may be better off working with a decentralized system.

In the first versions of this document I didn't include this for I thought it was too obvious: could any programmer worth his salt really not see the need for version control? I only added it when in the discussion for the article many insisted that, sadly, they know many cases where programmers don't use version-control or use their own lame manual attempts such as making different copies of the software periodically. I won't bother listing all the advantages of using version-control systems when developing software, though.

Test your software

Having complex software in production without having proper tests for it has a bad psychological effect for me as a software engineer. I've found that having proper tests for my software makes me a happier programmer. I guess I've come to feel about software that I find it hard to trust it unless I have tests that exercise all my expectations from it. When I create a new version, for example, I can't be assured that I haven't broken anything.

I've found that there are three major types of test for software. In some cases, one of them will make a lot more sense than the others. Writing tests the wrong way can be extremely frustrating. One should pick the right set of methodologies to provide the desired result (allowing the programmer to trust that the software works) with the least effort. I'll talk about them in the next sections.

Often you'll need to generalize your code to make it testable. There are endless particular lessons to be learned about how to do this. In the end, it boils down to accepting that it is OK to slightly increase the complexity of your software to make it testable. Or, in other words, to realize that being able to automatically test it should always be one of software's requirements.

Being able to run the tests automatically is important. Back in my previous job, I was working with one of the largest Novell customers on porting their mission critical software from old Unix systems to GNU/Linux. They had detailed protocols for how to test the software manually: they would run the program and perform a series of operations to validate that it did what they expected. This had two important disadvantages:

  • It made it expensive to test the software, —we would waste about half a human hour per program every time we wanted to run the tests—, which means that tests didn't run very often, only at the end of a big batch of changes.
  • It made it expensive to increase the coverage of the tests, —as that would significantly increase the cost of running them—, which means that their coverage was very poor. Indeed, we detected a lot of errors that the test protocols did not.

So what are my favorite methodologies for doing testing?

Dessert

Assertion checks

Procedures should have statements validating its pre and post conditions and the consistency of complex structures.

For example, on reasonably complex classes I often add a method to validate its internal consistency and call that at the beginning of many procedures and at the end of those that modify its attributes. Assertions serve as documentation about procedure and structures.

Validating expectations dynamically adds to the number of computations and, therefore, makes the program slower. Don't let this tempt you into not making your tests as thorough as they should be. Where to draw the line? As long as doing so increases the stringency of your validations, add tests of the same algorithmic complexity as the procedures to which you add them. For example, if a procedure runs two nested loops in some list, —resulting in a number of operations proportional to the square of the input size (O(N) = n^2)—, it is OK to have it run tests that do the same. However, procedures that run in constant time should only perform tests that also run in constant time.

At the beginning of 2009 I worked on a library that receives a list of vertices and automatically generates graphs by adding edges until a certain set of constraints are satisfied for every vertex. Among other things, the library is used to automatically generate the configurations of the software that handles all alerts at Google and routes them to the appropriate MTAs (here instances of this software are the vertices and the edges represent backup instances). The library does interesting things such as try to balance the load and create connections between targets that are nearby in the network. Well, about half of the code in this library (and in some of the caller libraries and programs) has the sole purpose of validating the results and the internal consistency of the graph. Before actually writing any output, the code checks that all the constraints are indeed satisfied (raising an exception if they aren't). These tests have actually resulted in subtle bugs being caught when the system was in production, preventing faulty configurations from being used. The main result for me is that I know that any outputs generated by the library have been verified exhaustively, which lets me sleep a bit better at night. Of course, there's always the possibility of a bug in the tests themselves, but the odds of this are much lower.

Assertions should only be used for failure conditions from which the program is not expected to be able to recover, things that should never happen; other conditions, —those from which it's reasonable to expect the program to recover—, should not be handled this way.

How to react to failed assertions depends on the context of the program. Often the best thing to do is abort the execution of the program (for example, by calling abort(3) or raising an uncaught exception). In some cases, however, it may be possible for the program to ignore the inputs that lead to the failure but, instead of crashing, continue to work on others (eg. some query-processing daemon or a web browser). This will reduce the impact of the problem and the urgency of fixing it. It goes without saying that you should implement a proper mechanism to get a report of failures with as much meaningful information as possible for reproducing them in a test environment.

You should only disable your assertion checks if it is strictly necessary (for example, for performance reasons). I almost always run my software with assertion checks enabled. An interesting possibility is suggested by Stu Smith:

The problem I have with this is that the debug/release division is way too blunt.

  • Suppose a tester wants to test a release build, but also perform the sanity-checking?
  • Suppose a customer finds a bug in a complex calculation, and you want them to run the checks?
  • Suppose a developer wants to debug performance issues in the non-checked code?

I would suggest you move the conditionals for code like this to a (possibly hidden) configuration setting. Someone wants to run in slow-and-safe mode? No problem, just tell them the registry setting (or whatever). By all means, change the splash screen or title bar or whatever so it says “test mode”.

I like this idea: always ship your binaries with the assertion checks included, but add a configuration setting, command-line flag or environment variable that enables or disables them. This is very similar to how I handle unit tests (see the next section).

Try to have an automated way of running your software in a way that exercises many of them in a simulation of a real run (possibly by combining this with the methodologies discussed in the next two sections).

I also used this testing methodology back in 2001 when I was working in Matanza. Matanza's code has a lot of assertion checks for pre and post conditions. Whenever I made non-trivial changes, I would often just run the server, connect a few clients and just let their ships fly randomly. Since I had a lot of checks, whenever I introduced a software defect, the checks would often catch it and, very importantly, would make it very easy to identify where the culprit was.

Related links:

Unit tests

Whenever you've finished writing a procedure, write some unit tests with common and uncommon uses. Or even before writing it. This will probably pay off sooner than you think.

The unit tests should call the procedures in different ways. Try to think of corner cases and verify that you don't have errors such as off-by-ones.

If you are using a procedure in an unusual way, verify if it has tests that exercise it. If it doesn't, add them.

Pick a testing methodology where the cost of adding tests is as small as possible. When writing Scheme code, I use my embedded-test methodology. One thing I don't like of most methodologies is the disconnect, between the code being tested and the tests, that results from storing them on different files. However, if that's the best alternative for the language you're using, suck it up and write unit tests anyway.

One thing I like about embedding tests right next to the code —as I do with my embedded-test library— is that the tests serve as documentation of how procedures should behave in unusual conditions. As your programs grow in complexity, sometimes different parts of the code will expect a procedure to behave differently. Encoding these expectations as tests means that you'll get notified if you change the behavior of a function to match the expectations from one module and by doing this stop to fulfil the expectations from another.

After designing my embedded-test library, ceronman pointed out Python's doctest module, which works in a rather similar fashion. If you're using Python, check it out.

I said before that making code testable should always be one of its requirements. With unit tests, you'll often find yourself making changes to your procedures that serve no purpose other than making them testable. One example would be that if a procedure performs some operations based on an external resource like the file system or the computer's clock, you may want to find a way to override the actual resource with a mock one. A typical way to do this is to make your code receive an interface to interact with the resource for which you'll provide two implementations: one that maps the calls directly to the resource's API and another that you use for testing. So, as I've said before, making procedures testable may increase the complexity of your code. This is OK.

I've only started to add unit tests to Svnwiki somewhat recently. As of 2009-03-03 it has a total of 102 tests. I expect to continue to add tests. It has already paid off in helping me detect contradictions in the expectations that different modules had of particular procedures.

Related links:

End-to-end tests

Another type of test is simply to do a full run of your software and verify that it does what you expect.

You can think of this as a subset of unit tests: those for the main procedure. You should write tests for all your procedures, even those that call a lot of your other independently-tested procedures. Your main procedure is not different.

However, in practice you'll often find that your end-to-end tests will be represented differently than your unit tests. For instance, they may be a script that runs your program several times with different inputs and verifies its outputs.

One advantage of end-to-end tests is that they can be written reasonably decoupled from the internal aspects of the program. As such, if the implementation of the program changes significantly, your end-to-end tests won't have to. While this is true at the procedure level for a particular set of unit tests (ie. if the procedure's implementation changes, its tests won't have to), it's very clear that the entire set of unit tests depends significantly more on the implementation details.

Other types of tests

There are other types of tests that I find interesting. You may want to perform these types of tests depending on the context of what you're doing.

For example, load tests that measure how a system's performance changes as the amount of operations it performs increases or the impact that a change in the code has. This gets a bit on the realm of profiling and optimizing, which I'll talk about below.

I also find A/B tests an interesting idea to measure the impact of small changes to the user interface. I find it very hard to predict the impact that simple changes have on the way the software is used. If you care about this (and you have a reasonably large number of users), measure rather than guess.

I am not a fan of fuzz testing. Running your programs on random inputs tends to exercise their coverage very little. It may catch trivial bugs (such as, for example, hard-coded limits for the length of input lines) but it won't exercise the interesting aspects of your software. For instance, feeding random inputs to an HTTP server or a C compiler doesn't really get you very far, as far as testing is concerned. To make it work you have to assist the generation of inputs: generate interesting inputs in a random manner. However, as you start to do this, you start losing the supposed advantage of fuzz testing, lack of preconceptions about system behavior.

Žižkov TV Tower (2)

A word of caution about interactive testing

Lisp advocates often mention that one of the advantages of writing code in Lisp is that you can very easily run it interactively. The idea is that after defining a procedure, you'll run it on a few inputs to verify that it does what you expect before moving onto writing the next.

While interactive programming is fun, it does not substitute the practice of writing proper tests. In other words, if you're going to bother making a few interactive calls to your procedures to check them, don't just throw that logic away once you see that the function works: save them as unit tests. I was guilty of not doing this for years.

Static type checks are useful

While static type checks won't catch most errors (which is why you'll need the other forms of testing described above), they can catch some of them, specially for those parts of the code that rarely get executed (for instance, paths of the code that the unit tests don't cover) but that are, nevertheless, important.

Sadly, most implementations for dynamically-typed programming languages such as Python or Scheme don't really support this. I believe that doing static checking of types on languages like these, while hard, is not impossible. This is an area in which I'm currently experimenting.

Loading... Vote up! Vote down! Save to del.icio.usSubmit Story to Digg Discussion (25)

Last update: 2010-07-31 (Rev 17037)

svnwiki $Rev: 15576 $