Debuggable Designs
Tips for making your design more debuggable. Originally in Embedded Systems Programming, December, 1998.
By Jack Ganssle
An unhappy reality of our business is that we'll surely spend a lot of time - far too much time - debugging both hardware and firmware. For better or worse, debugging consumes project-months with reckless abandon. It's usually a prime cause of schedule collapse, disgruntled team members and excess stomach acid.
Yet debugging will never go away. Practicing even the very best design techniques will never eliminate mistakes. No one is smart enough to anticipate every nuance and implication of each design decision on even a simple little 4k 8051 product; when complexity soars to hundreds of thousands of lines of code coupled to complex custom ASICs we can only be sure that bugs will multiply like rabbits.
We know, then, up front when making basic design decisions that in weeks or months our grand scheme will go from paper scribbles to hardware and software ready for testing. It behooves us to be quite careful with those initial choices we make, to be sure that the resulting design isn't an undebuggable mess.
Some day someone will compile the canonical encyclopedia of strategies for building debuggable systems. Till then, here are a few ideas from my musty files.
Add Test Points
Whether you're working on hardware or firmware problems, the oscilloscope is one of the most useful of all debugging tools. A scope gives instant insight into difficult code issues like operation of I/O ports, ISR sequencing, performance problems, and the like.
Yet it's tough to probe modern surface mount designs. Those tiny whisker-thin pins are hard enough to see, let alone probe. Drink a bit of coffee and you'll dither the scope connection across three or four pins.
The most difficult connection problem of all is getting a good ground. With speeds rocketing towards infinity the scope will show garbage without a short, well-connected ground, yet this is almost impossible when the IC's pin is finer than a spider web.
So, when laying out the PCB add lots of ground points scattered all over the board. You might configure these to accept a formal test point. Or, simply put holes on the board, holes connected to the ground plane and sized to accept a resistor lead. Before starting your tests solder resistors into each hole, and cut off the resistor itself leaving just a half-inch stub of stiff wire protruding from the board. Hook the scope's oversized ground clip lead to the nearest convenient stub.
Figure on adding test points for the firmware as well. For example, the easiest way to measure the execution time of a short routine is to toggle a bit up for the duration of the function. If possible, add a couple of parallel I/O bits just in case you need to instrument the code.
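Here's a minimal sketch of the bit-toggling trick. Everything named in it is an assumption for illustration: debug_port stands in for whatever parallel output register your hardware provides (on a real target it would be a memory-mapped address), bit 0 is presumed wired to a test point, and checksum() is just a placeholder for the routine you want to time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical debug port. On real hardware this would be a memory-mapped
 * output register; a plain variable stands in here so the sketch also
 * compiles and runs on a host machine. */
static volatile uint8_t debug_port;

/* Assumption: bit 0 of the port is routed to a scope test point. */
#define DEBUG_BIT_TIMING ((uint8_t)(1u << 0))

static inline void debug_bit_set(uint8_t mask)
{
    debug_port |= mask;
}

static inline void debug_bit_clear(uint8_t mask)
{
    debug_port &= (uint8_t)~mask;
}

/* Placeholder for the short routine being measured. */
static uint32_t checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += buf[i];
    }
    return sum;
}

/* The test point is high for the duration of checksum() and low otherwise,
 * so the pulse width seen on the scope is the execution time (plus the
 * small overhead of the two port writes). */
uint32_t timed_checksum(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    debug_bit_set(DEBUG_BIT_TIMING);
    uint32_t result = checksum(buf, len);
    debug_bit_clear(DEBUG_BIT_TIMING);
    return result;
}
```

Trigger the scope on the rising edge of the test point and read the pulse width directly. With a second spare bit you can instrument an ISR the same way and watch the two traces interleave.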
Add test points for the critical signals you know will be a problem. For example:
- Boot loads are always a problem with downloadable devices (Flash, ROM-loaded FPGAs, etc.). Put test points on the critical load signals, as you'll surely wrestle with these a bit.
- The basic system timing signals all need test points: read, write, maybe wait, clock and perhaps CPU status outputs. All system timing is referenced to these, so you'll surely leave probes connected to those signals for days on end.
- Using a watchdog timer? Always put a test point on the time-out signal. Better, use an LED on a latch. You've got to know when the watchdog goes off as this indicates a serious problem. Similarly, add a jumper to disable the watchdog as you'll surely want it off when working on the code.
- With complex power management strategies it's a good idea to put test points on the reset pin, battery signals, and the like.
When using PLDs and FPGAs remember that these devices incorporate all of the evils of embedded systems with none of the remedies we normally use: the entire design, perhaps consisting of tens of thousands of gates, is buried behind a few tens of pins. There's no good way to get "inside the box" and see what happens when.
Some of these devices do support a bit of limited debugging using a serial connection to a pseudo-debug port, like Xilinx's Xchecker cable. In such a case by all means add the standard connector to your PCB! Your design will not work right off the bat; take advantage of any opportunity to get visibility into the part.
Also plan to dedicate a pin or two in each FPGA/PLD for debugging. Bring the pins to test points. You can always change the logic inside the part to route critical signals to these test points, giving you some limited ability to view the device's operation.
If your system uses tough parts - say, fast 32 bitters that lack much tool support - or if there's a hard communication port or other I/O that you just know will be a nightmare, you'll probably live for many days chained to a logic analyzer. This is the best of tools and the worst of tools. It sure gives a lot of insight into timing problems, but there's nothing more frustrating than struggling with forty or fifty clip leads. You'll connect some to the wrong pins, and half of the others will pop off at exactly the wrong time.
Instead, add a special logic analyzer connector to the PCB layout, one that matches the plug configuration of your analyzer. Use AMP's new Mictor connectors. These high density 34 pin wonders bring out a lot of signals using minimal PCB real estate. Even better, both HP and Tektronix support the Mictor so you can plug the analyzer directly onto the board, without using a single clip lead.
Similarly, if the CPU has a BDM or JTAG debugging interface, put a BDM/JTAG connector on the PCB, even if you're using the very best emulators. For almost zero cost you may save the project when/if the ICE gives trouble.
Very small systems often just don't have room for a handful of test points. The cost of extra holes on ultra-cheap products might be prohibitive. I like to always figure on building a real, honest, prototype first, one that might be a bit bigger and more expensive than the production version. The cost of doing an extra PCB revision (typically $1000 to $2000 for 5 day turnaround) is vanishingly small compared to your salary!
When management screams about the cost of test points and extra connectors, remember that you do not have to load these components during the production run. Install them on the prototypes, leaving them off the bill of materials. Years later, when the production folks wonder about all of the extra holes, you can knowingly smile and remember how they once saved your butt.
Mechanical Connections
Smaller, faster, cheaper. Chant this over and over, eyes closed, legs lotussed, spirit uplifted. It's the silicon mantra, and it drives electronics to phenomenal success, while giving the industry's practitioners prematurely gray hair.
How are you going to connect debugging tools to that new tiny PCMCIA card you're working on? Don't just assume that the software crowd will "come up with something". If they don't, your clever design could bankrupt the company.
One company I know had a PCMCIA product whose CPU's whisker-thin TQFP leads defeated every ICE-connection attempt. Their wonderfully clever solution was to design the card with a rather large extra connector - a simple 100 pin header - with all CPU lines connected. Though the connector doubled the size of the board, it sat alone, the only component outside of the development environment. When it came time to ship the product they cut the connector off, and the board down to size, with a bandsaw. Production versions, of course, were proper-sized cards without the connector.
If your product uses a card cage no doubt the board-to-board spacing is insanely tight. Too often extender cards don't work since the CPU becomes unstable driving the extra long lines. Just debugging the hardware is hard enough - try slipping a scope probe in between boards. It's not unusual to see a card with a dozen wires hastily soldered on, snaked out to where the scope or logic analyzer can connect.
Why make life so hard? Either design a robust processor board that works properly on an extender, or come up with a mechanical strategy that lets you put the CPU near the end of the cage, with the cage's metal covers removed, so you and the software people can gain the access so essential to high productivity debugging.
One DOD system's card cage is so tightly packed into a rack of equipment that the developers could only remove the "wrong" side (i.e., the circuit side) of the card cage cover. Their solution: solder the processor socket on the circuit side of the board, and then make a pin swapping jig (using parts from Emulation Technology or EDI) for the logic analyzer. Using a ROM emulator in a similarly tight situation? Consider the same trick, inverting one or more ROM sockets.
In the good old days microprocessors came in only a few packages. DIP, PGA, or PLCC, these parts were designed for through-hole PC boards with the expectation that, at least for prototyping, designers would socket the processor. Isolating or removing the part for software development required nothing more than the industry-standard chip puller (a bent paper clip or small screwdriver).
Now tiny PQFP and TQFP packages essentially cannot be removed for the convenience of the software group. Once you reflow a 100 pin device onto the board, it's essentially there forever.
Part of the drive towards TQFP is the increasing die complexity. That tiny device is far more than a microprocessor; it's a pretty big chunk of your system. The CPU core is surrounded with a sea of peripherals - and sometimes even memory. Replace the device (somehow!) with a development system, and the tool will have to replace both the core and all of those high integration devices.
Take heart! Most semiconductor vendors are aware of the problem, and take great pains to provide work-arounds.
There's no cheap cure for the purely mechanical problem of connecting a tool to those whisker-thin pins, but at least the industry's connector folks (Emulation Technology, EDI, Pomona, HP) sell clips that snap right over the soldered-on processor. The clip translates those SMT leads to a PC board with a PGA or header array which your tools can plug into.
Most of these vendors also offer adapters that plug into SMT sockets, or that solder onto the board in place of the microprocessor. I see very few QFP applications that use sockets - it seems to defeat the whole reason for going to surface mount. As far as replacing the processor with an adapter that solders on the board, well, personally, I think this is mechanically the soundest approach. However, once the adapter is connected in this fashion, your system is a development prototype forever.
Tool Tradeoffs
Debugging tool vendors all promote the myth of "non-intrusive tools". In fact, we demand just the opposite - what could be more intrusive, after all, than hitting a breakpoint?
Other forms of intrusion are less desirable but inevitable as the hardware pushes the envelope of physical possibilities. If you don't recognize these realities and deal with them early, your system will be virtually undebuggable.
Make sure the CPU (when using an ICE or logic analyzer) or ROM sockets (ROM emulator) are positioned so it's possible to connect the tool. Be sure the chip's orientation matches that needed by the emulator or analyzer.
Don't push the timing margins! All emulators eat nanoseconds. With no margin the tool will just not work reliably. I've seen quite a few designs that consume every bit of the read cycle. Some designers convince themselves that this is fine: the timing specs are worst case scenarios met at max or min temperatures, leaving a bit of wiggle room for the tool. As speeds increase, though, IC vendors leave ever less slop in their specifications, so it's dangerous to rely on a hope and a prayer.
Before designing hardware talk to the tool vendor to learn how much margin to assign to the debugger. Typically it makes sense to leave around 5 nsec available in read and write cycle timing. Wait states are another constant source of emulator issues, so give the tool a break and ease off on the times by four or five nanoseconds there, as well.
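To make the margin check concrete, here's a rough sketch of a read cycle budget. The budget model and every field name are assumptions for illustration only; the real terms, and the numbers that go in them, come from your CPU and memory datasheets and from the tool vendor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All times in nanoseconds. This simple additive model is an assumption;
 * a real design may have more (or different) terms in the path. */
struct read_cycle_budget {
    int cycle_ns;       /* start of cycle to the point the CPU samples data */
    int addr_valid_ns;  /* CPU address-valid delay */
    int decode_ns;      /* chip-select decode delay */
    int mem_access_ns;  /* memory access time */
    int data_setup_ns;  /* CPU data setup time */
};

/* Slack left in the read cycle after all worst-case delays are summed. */
int read_margin_ns(const struct read_cycle_budget *b)
{
    assert(b != NULL);
    return b->cycle_ns - (b->addr_valid_ns + b->decode_ns +
                          b->mem_access_ns + b->data_setup_ns);
}

/* True if the slack covers what the debugging tool is expected to eat. */
bool margin_ok_for_tool(const struct read_cycle_budget *b, int tool_allowance_ns)
{
    return read_margin_ns(b) >= tool_allowance_ns;
}
```

With made-up numbers of a 100 nsec cycle, 15 nsec address delay, 10 nsec decode, 60 nsec memory and 8 nsec setup, the slack is 7 nsec, which clears a 5 nsec tool allowance. Swap in a 65 nsec memory and the slack drops to 2 nsec: the bare board may run, but the emulator probably won't.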
Be wary of pull-up resistors. CMOS's infinite input impedance lures us into using lots of ohms for the pull-ups. Remember, though, that when you connect any sort of tool to the system you'll change the signal loading. Perhaps the tool uses a pull-down to bias unused inputs to a safe value, or the signal might go to more than one gate, or to a buffer with wildly different characteristics than used on your design. I prefer to keep pull-ups to 10k or less so the system will run the same with and without an emulator installed.
If you use pull-down resistors (perhaps to bias an unused node like an interrupt input to zero, while allowing automatic test equipment to properly bias the node in production test), remember that the tool may indeed have a weak pull-up associated with that signal. Use too high of a resistance and the tool's internal pull-up may overcome your pull-down. I never exceed 220 ohms on pull-downs.
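The pull-down problem is just a voltage divider, so it's easy to check before committing to a resistor value. The sketch below assumes a simple model: your pull-down to ground against a single tool pull-up to Vcc, with the input's own leakage ignored. The function names, the 10k tool pull-up and the 0.8 volt logic-low threshold used in the example are all assumptions; get the real figures from the tool vendor and the part's datasheet.

```c
#include <assert.h>
#include <stdbool.h>

/* Voltage at a node tied to ground through r_pulldown while a tool's
 * internal resistor pulls it toward vcc through r_tool_pullup.
 * Simple two-resistor divider; input leakage is ignored (assumption). */
double node_voltage(double vcc, double r_pulldown, double r_tool_pullup)
{
    assert(r_pulldown >= 0.0 && r_tool_pullup > 0.0);
    return vcc * r_pulldown / (r_pulldown + r_tool_pullup);
}

/* True if the node sits at or below the input's maximum logic-low voltage. */
bool reads_as_low(double node_v, double vil_max)
{
    return node_v <= vil_max;
}
```

For a 5 volt system with a 220 ohm pull-down against an assumed 10k tool pull-up, the node sits at roughly 0.11 volts, a solid low. Use a 10k pull-down against the same tool and the node floats up to 2.5 volts, which is no logic level at all.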
Synchronous memory circuits defeat some emulators. These designs ignore the processor's read and write outputs, instead deriving these critical signals from status outputs and the clock phase. Vadem, for example, makes chip sets based on NEC's V30 whose synchronous timing is famously difficult for ICEs.
This sort of timing creates a dilemma for ICE vendors. What sort of signals should the emulator drive when the unit is stopped at a breakpoint? A logical choice is to drive nothing: put read, write and all other control signals to an idle, non-active state. This confuses the state machine used in the synchronous timing circuits, though; generally the state machine will not recover properly when emulation resumes, and thus generates incorrect reads and writes.
Most emulators cannot afford to completely idle the bus, anyway, as it's important to echo DMA and refresh cycles to the target system at all times.
Since the processor in the ICE usually runs a little control program when sitting still at a breakpoint, another option is to echo these read/write cycles to the bus. That keeps the state machine alive, but destroys the integrity of the user's system because internal emulator write cycles trash user memory and I/O.
Another possibility is to echo the cycles, but fake out write cycles. When the emulator's CPU issues a write, the ICE drives an artificial read to the target. Unhappily, on many chips read and write cycles have somewhat different timing, which may confuse the user's state machine.
None of these solutions will work on all CPUs and in all user systems. If you really feel compelled to use a synchronous memory design, talk to the emulator vendor and see how they handle cycle echoing at a breakpoint.
Consider adding an extra input to your state machine that the emulator can drive with its "stopped" signal, one that shuts down memory reads and writes. Talk timing details with the vendor to ensure their "stopped" output comes in time to gate off your logic.
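As a thought experiment, here's a toy software model of such a state machine. It is not any vendor's design: the three-phase sequence, the status encoding and all names are invented for illustration, and real logic would live in a PLD rather than in C. The point it shows is that the "stopped" input gates only the strobes, while the state machine itself keeps stepping so it's in a sane state when emulation resumes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Decoded CPU status outputs (hypothetical encoding). */
enum bus_status { BUS_IDLE, BUS_READ, BUS_WRITE };

/* Phases of the toy synchronous cycle (hypothetical). */
enum phase { PH_IDLE, PH_ADDR, PH_DATA };

struct mem_fsm {
    enum phase phase;
    enum bus_status latched;  /* cycle type captured when the cycle starts */
};

struct strobes {
    bool rd;
    bool wr;
};

/* Advance the model by one clock and return the strobes for that clock.
 * 'stopped' models the emulator's "stopped" output. */
struct strobes mem_fsm_clock(struct mem_fsm *f, enum bus_status status, bool stopped)
{
    struct strobes out = { false, false };

    assert(f != NULL);

    switch (f->phase) {
    case PH_IDLE:
        if (status != BUS_IDLE) {
            f->latched = status;   /* remember what kind of cycle this is */
            f->phase = PH_ADDR;
        }
        break;
    case PH_ADDR:
        f->phase = PH_DATA;
        break;
    case PH_DATA:
        out.rd = (f->latched == BUS_READ);
        out.wr = (f->latched == BUS_WRITE);
        f->phase = PH_IDLE;
        break;
    }

    /* The extra input: the state still advances above, but no read or
     * write strobe reaches memory while the emulator is stopped. */
    if (stopped) {
        out.rd = false;
        out.wr = false;
    }
    return out;
}
```

Clock a write cycle through with stopped false and the write strobe appears on the third clock. Do the same with stopped true and the machine walks through the same phases back to idle, but memory never sees the write.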
Bear in mind that at a breakpoint every emulator behaves differently. Ask the vendor deep questions, such as:
- If you're using DMA, do DMA cycles continue to run at a breakpoint? Some emulators properly service cycles generated by DMA controllers located on-board the processor but fall down on externally requested DMA. Many, when stopped, will not do DMA from emulation RAM to external memory or I/O.
- Interrupts mirror the DMA problems. Do you expect to service interrupts when stopped? If so, be sure the emulator properly supports this, and can handle vectors located both in emulation and external memory.
- Timers always cause grief. Do you want a timer internal to the CPU to stop running when the code hits a breakpoint? Some emulators don't shut them down.
Conclusion
It's amusing - or sad - to watch designers troubleshoot a board or code. Too many take shortcuts that ultimately cost hours and days. When someone repeatedly struggles to hook up a ground clip - instead of soldering a permanent lead on the board - you know they are acting foolishly.
Years ago a sailboat rounding Cape Horn was rolled over by a big wave, opening a huge hole in the side. Water was pouring in, yet the boat was saved by a sailor who first took time to sharpen the owner's rusty collection of tools. He saved time - and the boat - by getting his tools right first, and only then patching the hole.
Good designers build creations that work as planned; great designers construct systems that both work and are debuggable.