On Comment Headers

Summary: Modules should start with great comment headers. Too many don't.

I read a great deal of code. The vast majority is in C with some C++ and a bit of assembly sprinkled in. Some of it is brilliantly-written, clearly by developers who are craftspeople of the highest order. Some is, well, not so nice. I sometimes wonder what the developers were thinking.

One pet peeve is with the comment header at the top of each module. A great variety of styles are used, and some are better than others.

The worst comment headers are the ones that aren't there, or the ones that are completely wrong.

On one safety-critical project every module did have a header. But they were obviously block-copied. There's nothing wrong with that, except in this case dozens of files had identical headers. A module containing functions for the ADC, for instance, had a header documenting (well, with a mere line or two) some entirely different chunk of code. A huge percentage of these headers had identical dates and authors. Either that person was a code-cranker of monumental skill and speed, or he was completely careless. Incorrect comments are worse than none at all.

Oddly, each header was many tens of lines long, but most of these were blank lines. Probably the intent was to fill in the blank spaces with appropriate comments, but no one ever bothered.

I can't stand many headers used in open-source projects. Here's an example from Linux:

The first real line describes the module, which is exactly the right way to start a header. That's what someone wants to know when opening the file. A copyright notice and the author's name is appropriate. But the rest is lawyerly crap. No one reads this stuff, and you're left paging through the words to find where the real documentation starts. Though it may be important to have this for legal reasons, it gets in the way. It should be at the very end of the file where it will be read just as carefully as when it's annoyingly at the beginning.

In too many cases the small print goes on for far longer than this one. Do we really need to stumble over it all? In every single module? Is it fair to make perhaps hundreds of others waste their time with it?

Also, note that the disclaimer about the warranty is itself embedded in the GPL 2. Adding it to the header is redundant. And the GPL 2 is about 3000 words; either this header is woefully incomplete or should be replaced by a one-sentence reference to the license.

Here's another extract from Linux. There's a brief reference to the license. I like this as it provides legal cover while not cluttering the file. But it doesn't reference which version of the GPL applies (GPL 2 was released in 1991, long before the date in the file).

Now as to content. In the previous example a line does mention what's in the file. But what's a governor? Some sort of general description is needed as others, many perhaps far less familiar with the code than the authors, will be digging through this. Remember that many people want a very fast high-level view of what is in the module - just a sense of how it fits into the big picture. Perhaps something like "The governor uses a pair of weighted sheres to ensure Linux's triple-expansion steam engine maintains a constant speed."

Freescale's MQX headers all start with this line:

/**HEADER********************************************************************

And end with:

*END*********************************************************************/

Do those comments add useful information? The asterisks might be nice to block off the comments from the rest of the code, but HEADER and END are an affectation.

FreeRTOS is very well commented and is a joy to read. But each module starts with this:

This is a sales pitch. It's not documentation. In fact, there's not a word about what the module does. But, like Marco Rubio's 25-second verbal tic, did you notice that 1 tab == 4 spaces?

I have heard Ada programmers wonder if C users have an aversion to typing. While that's an unfair generalization there is a common pattern of excessive briefness, to the point of taciturnity. It's important to be expressive, to get all of the important information on the page. Doxygen provides pretty good tools for this: the header starts with a brief, one-line description, but that is fleshed out thereafter. The one-liner is the headline; the rest is the story.

Documentation can be carried to an extreme. One module I read had a header 2000 lines long - the code was merely 200. Sometimes it makes sense to point to external documentation instead.

Every header should include:

A brief description of the file

A detailed description of the file

Author's name

The date of first release

The developer's name, date and a description of each revision

Perhaps a line or two about licensing

I prefer using the /* comment rather than double slashes. This:

Comments

is better than:

// Comments

. because the former eases making free-form changes to the comments. Having to reformat double slashes when just adding a few words to a sentence makes one less inclined to maintain the comments.

A project I looked at recently had no comment headers at all. None. Modules were expressively named "b.c" and functions were little better. The entire program was written by one individual, and presumably the thinking was that, since he would maintain it, comments were superfluous. As always happens1, he moved on to another company and the poor sods left had to deal with the mess. I think a role of management is demanding high quality work from their people, and doing enough of an audit to ensure that the quality is there. It can be hard to assess the quality of software, but it's pretty easy for someone with basic software skills to look at source and notice a dearth of comments or inconsistent styles.

I've been told by smart developers that they don't want to take the time to write descriptive, careful comments. But if you can't type pretty quickly in this day then programming is probably not for you. And if you can't carefully formulate your intention in English, odds are you can't in code.

The comments are a love letter to yourself and your successors. They are every bit as important as getting that interrupt handler right.

1) Always. The only valid assumption when documenting the code is that someone else, someone who knows far less than you do about the problem, will be left trying to make sense of your work. It's up to you whether he thinks "Joe was a master craftsman!" or "that *&*&%()($# Joe! I hope he rots in hell with the Perl programmers!"

Published February 8, 2016