Wednesday, July 23, 2014

The hazards of commenting code

- Gautham

It is commonly thought that good code should be thoroughly commented. In fact, this is the opposite of good practice. A coding strategy that does not allow the programmer to use coding as a crutch is good. Programs should be legible on their own.

Here are the most common scenarios:

  • Bad. The comment is understandable and it precedes an un-understandable piece of code. When the maintainer of the code goes through this, they still have to do a lot of work to figure out how to change the code, or to figure out where the bug might be.
  • Better. The comment is understandable, and the line of code is also understandable. Now you are making the reader read the same thing twice. This also dilutes code into a sea of just words.
  • Best. There is no comment. Only an understandable piece of code due to good naming, good abstractions, and a solid design. Good job!
  • Terrible. The comment is understandable. The code it describes does not do what the comment says. The bug hides in here. The maintainer has to read every piece of your un-understandable code because they have realized they can't trust your comments, which they shouldn't anyway. And so all your commenting effort was for nothing. This scenario is surprisingly common. 

When are comments acceptable?
  • Documentation. If you have a mature set of tools, you might have them to the point that the user can just read the manual, rather than read the code. This is intended for users, not maintainers, and usually takes the form of a large comment that automated documentation generation tools can interpret.
  • Surprising/odd behavior of libraries you are using. Matlab has some weird things it does, and sometimes I like to notify the maintainer that this line of code looks this way for a reason (especially if the line of code is more complex than a naive implementation would appear to require because of subtleties of the programming language or libraries/packages being used.) It can be counter-argued that rather than put in a comment you could put in a set of unit tests that explore all the edge-case behavior and encapsulate byzantine code into functions whose names describe the requirements that the code is trying to meet.
  • When your program is best explained with pictures. Programs are strings. But sometimes they represent or manipulate essentially graphical entities. For example, a program that represents a balanced binary search tree involves tree rotation manipulations. These manipulations are very difficult to describe in prose, and so they are similarly difficult to describe in code. Some ASCII art can be a real life saver in this kind of situation, because code is a poor representation of diagrams. So think of it this way: don't let yourself write text in comments but its okay to draw figures in the comments.

For more on these ideas, please just get Robert Martin's book on Clean Code. 

1 comment:

  1. This is a common scenario for me:
    There is no comment. Only an un-understandable piece of code due to bad naming, no abstractions and a bad design. Bad job!

    If the piece of software is produced by a scientist who doens't no about good naming, abstraction and software design the lack of comments makes all worse.

    Can't support your recommendation.