By Jack Ganssle

The Use Of Assertions

Summary: A new study shows the power of seeding your code with assertions.

There are things we are supposed to do that we don't. Exercise. Avoid fast foods. Use the assert() macro. Exercise is a real pain that requires time and effort. Fast food can save time, and is carefully designed to seduce us. But asserts take little time or effort, and yet are as rare as the spotted owl. The Linux kernel sports just one thousand invocations of the macro in three million lines of code, an assertion density of just one per 3KLOC.
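For anyone who hasn't used the macro lately, here is a minimal sketch of what seeding a function with assertions can look like. The function name average_samples() and its parameters are hypothetical, invented purely for illustration. Only assert() itself, from <assert.h>, is standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from any real code base): returns the
 * average of n 16-bit samples. The asserts document and enforce the
 * assumptions the code silently relies on. */
uint16_t average_samples(const uint16_t *samples, size_t n)
{
    assert(samples != NULL);  /* precondition: caller passed a real buffer */
    assert(n > 0);            /* precondition: no divide by zero below */

    uint64_t sum = 0;         /* wide accumulator so the sum cannot overflow */
    for (size_t i = 0; i < n; i++) {
        sum += samples[i];
    }

    uint64_t avg = sum / n;
    assert(avg <= UINT16_MAX); /* postcondition: result fits the return type */
    return (uint16_t)avg;
}
```

Three asserts in about a dozen lines of code. Compiling with NDEBUG defined removes them entirely, so the cost in a release build can be zero if that is what you want.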

Historically, though, there's been little good evidence they are effective. Till now. In a paper from Microsoft Research (http://research.microsoft.com/pubs/70290/tr-2006-54.pdf) the authors examined the relationship between assertion density and code quality. By "quality" they meant post-release bugs, the sort of awful things customers see. It would have been interesting to see how asserts helped during development, and the effect on schedule. I'd expect earlier identification of bugs leads to faster delivery.

Figure 3 in the paper shows the fault density versus assertion density for each file in four products (actually two products with two releases each). These are not toy projects like the ones that pepper so many academic studies. They are real products comprising 100KLOC to over a third of a million LOC.

The result: those modules with a low number of assertions per KLOC had much higher post-release bug rates than those seeded with many assertions. The figure is a scatter plot and I don't have access to the raw data so can't do a curve fit. But by eyeballing the results it appears that systems with fewer than about 10 to 25 asserts per KLOC experience many more post-release bugs than those using more assertions. The benefit seems to taper off around 50 to 100 asserts per KLOC.
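The density numbers here are simple arithmetic: asserts divided by thousands of lines of code. A rough sketch of the calculation follows. The function name asserts_per_kloc() is my own invention, and how you count asserts and lines in your own code base is left to you.

```c
#include <assert.h>

/* Hypothetical helper: assertion density in asserts per thousand
 * lines of code (KLOC). */
double asserts_per_kloc(unsigned long assert_count, unsigned long lines_of_code)
{
    assert(lines_of_code > 0);  /* density is undefined for an empty code base */
    return (double)assert_count / ((double)lines_of_code / 1000.0);
}
```

Feed it the Linux kernel figures quoted earlier (1000 asserts in 3,000,000 lines) and the result is about 0.33 asserts per KLOC, far below the 10 to 25 where the scatter plot suggests the payoff begins.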

More analysis by the researchers concluded that assertions detected 6 to 14% of the post-release bugs. That's hard to reconcile with the vastly improved quality experienced by files so seeded. Perhaps, and this is merely speculation on my part, the disciplined use of assertions is indicative of better development practices overall.

Figure 4 is stunning. Frequent readers of this column know of my hopes for static analysis. Yet this chart shows assertions find at least twice the number of bugs identified by static analyzers. (Note that the analyzers are Microsoft products and may or may not be comparable to similar products offered by other companies.) Currently, commercial static analyzers are frighteningly expensive. Asserts: free, more or less.

Makes you think.

Published November 11, 2009