By Jack Ganssle

Comments

Published 12/02/2005

I was chatting informally with a group of developers last week when one used a phrase guaranteed to set me off.

"I write self-commenting code," he revealed proudly.

Uh huh. I have yet to see a program of any useful size that, stripped of comments, is self-documenting. C is not a language like English or Swedish where there's so much information conveyed that even in a noisy room, where one might catch only 70% of the words, the meaning still comes across. Computer languages are inherently dense and precise: miss a single character and the program won't run correctly. Mix up "Identifier" and "identifier" and, if you're lucky, the compiler will complain. Programmers with less good fortune will get a clean compile and spend days or weeks looking for a hard-to-find bug.
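To make the case-sensitivity hazard concrete, here is a small hypothetical sketch. The names `Total`, `total`, and `sum_readings` are invented for illustration and do not come from any real project. A file-scope `Total` and a local `total` differ only in case, so a one-character slip compiles cleanly yet quietly produces the wrong answer:

```c
#include <stddef.h>

/* Hypothetical file-scope accumulator used elsewhere in the program. */
int Total = 0;

/* Intended to return the sum of n readings. */
int sum_readings(const int *readings, size_t n)
{
    int total = 0;                 /* local running sum */

    for (size_t i = 0; i < n; ++i) {
        Total += readings[i];      /* slip: capital T updates the global */
    }
    return total;                  /* the local was never updated */
}
```

Nothing in the language flags the mistake, because both identifiers are valid and in scope. The function returns zero for any input while the global silently accumulates.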

Usually these folks will go on at length about their use of long variable names. Hey, I'm all in favor of making variables as long as they need to be to clearly express an idea. But length, in this case, isn't always an asset. I find it awfully hard to read something like:

for(next_output_buffer_sequence!=first_output_buffer_sequence;
!complete_message_assembled_by_host_process; ++final_result_queue_pointer);

The C constructs (operators et al) are lost in the morass of names. And reading a single statement split across many lines confuses the eyes.

Most compilers treat only the first 31 characters of a name as significant, so it's dangerous to become enamored of exceedingly long names.
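A minimal sketch of how such a collision could be checked, assuming the 31-character figure quoted above (the actual limit depends on the compiler and the language standard it follows). The helper `names_collide` and the macro `SIGNIFICANT_CHARS` are hypothetical, introduced only for this illustration:

```c
#include <string.h>

/* Assumed number of significant leading characters; 31 is the figure
   quoted in the text, not a property of any particular compiler. */
#define SIGNIFICANT_CHARS 31

/* Returns 1 if a translator that compares only the first
   SIGNIFICANT_CHARS characters could not tell the two names apart,
   0 otherwise. */
int names_collide(const char *a, const char *b)
{
    return strncmp(a, b, SIGNIFICANT_CHARS) == 0;
}
```

Under that assumption, `complete_message_assembled_by_host_process` and `complete_message_assembled_by_host_thread` would collide, since they first differ at the 36th character.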

Nor does commenting every line help when the comments merely restate the code:

  for (i=0; i<max; ++i){     // Start initialize loop
    ++Array_Pointer;         // point to next element in Array
    *Array_Pointer=0;        // Set Array element to zero
  }                          // end for

Much better is to eliminate those annoying and not particularly informative comments and prefix the entire snippet with:
           //  Array is a sparse matrix; empty
           // elements are denoted by zero so
           // here we initialize all elements to
           // "empty" (a zero).

The second style conveys the sense of the code while the first gives plenty of detail and no context.

Others write the code first and add the comments later. They don't want to "waste time" with documentation while endlessly fiddling with a routine to get it to work. But the comments should be the function's design. If the developer doesn't know enough about the design before cranking code, just how does he start pounding C into the editor? Is it a random walk? "Uh, hmm, I dunno, let's try:"

void main(void){

"Oh boy, now what? How about maybe initializing something... or should we set up a queue?"

The idea must be that if they type enough C, a function, a structure and a clear idea will emerge. That might indeed happen... eventually. But not efficiently.

There's a spec somewhere - perhaps only in the developer's head - which describes in English what a function should do in a human-friendly manner. The code is a translation of that spec to cryptic and unforgiving computerese. So I figure the way to write a function is to create all of the comments first. The header, and even all of the individual little snippets of English spread throughout the code. Then the function's design is done.

After that, anyone can fill in the code.
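As a rough sketch of the comments-first approach, consider the function below. The function `average_valid` and its behavior are invented here purely for illustration. The header and the three in-line comments would be written before any code; the statements under each comment are the later fill-in:

```c
#include <stddef.h>

/*
 * average_valid: return the mean of the readings that fall inside
 * [min, max]. Readings outside that range are treated as sensor
 * glitches and ignored. Returns 0 if no reading is valid.
 */
int average_valid(const int *readings, size_t n, int min, int max)
{
    long sum = 0;
    size_t valid = 0;

    /* Walk every reading, accumulating only those inside the valid range. */
    for (size_t i = 0; i < n; ++i) {
        if (readings[i] >= min && readings[i] <= max) {
            sum += readings[i];
            ++valid;
        }
    }

    /* With no valid readings there is nothing to average; report 0. */
    if (valid == 0) {
        return 0;
    }

    /* Otherwise return the integer mean, truncated toward zero. */
    return (int)(sum / (long)valid);
}
```

Read on their own, the comments already state what the function does and why; the C underneath each one is a fairly mechanical translation.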