The ICE Blues
Using an emulator? Here are some gotchas to watch out for.
Published in Embedded Systems Programming, November 1990
By Jack Ganssle
It's interesting that embedded development environments really haven't changed all that much over the years. During the first half of the 70s I worked with an Intellec 8, a pseudo emulator for the 8008. A teletype with paper tape reader and punch served as both console and mass storage. Our first product used an enormous 4k of code (requiring sixteen 1702 PROMs). The edit/assemble/link cycle filled three entire days - three days to make even a trivial code change. Obviously we patched, patched and patched, reassembling only perhaps monthly.
In 1975 Intel invented the first true emulator - the MDS 800 for 8080 processors. With dual 200K eight inch floppies, full speed emulation (as I recall it ran at a blazing 2 MHz), real time trace, and breakpoints, it seemed a dream machine and formed the backbone of our organization for several years. Our programs were by now running about 30K, but assemble/link times were well under an hour.
Today we have faster host computers with more disk storage, better emulators, and much better languages and utilities. Probably the two biggest improvements in our environment (in addition to raw computer speed, which always increases) are decent high level languages and source level debuggers.
Now I work in the development tool business. It's the best of jobs and the worst of jobs. The best because nothing is stable; change is the order of the day, so we never have a chance to get bored. The worst, because our tools plug into literally thousands of target systems, and, whether the user's target is carefully designed or just hacked together, our units must work perfectly.
Where the typical engineer or programmer might work on only two or three systems a year, we get involved in hundreds. Despite the evolution in processors, tools, and applications over the years, the same handful of problems keeps cropping up.
It's unfortunate that the embedded design world is polarizing into distinct hardware and software camps. This separation creates barriers to really elegant hardware/software integration. Worse, hardware- or software-only types don't have the background to effectively use a complex tool like the emulator, which sort of straddles both groups.
Any emulator tries to be all things to all people. It must work in 100 kHz ultra-low power CMOS systems as well as the latest 16 MHz speed demon; with simple static memory circuits as well as frequently marginal DRAM designs; with compilers whose debugging files run the gamut from wonderfully complete to woefully inadequate. In fact, while most ICEs will work in most systems, no emulator will run in every target.
A number of hardware and software issues become important when using an ICE. These are not simple tools like scopes or ROM emulators! You can avoid many of the problems by considering the limitations inherent in any emulator before building hardware, writing disks of undebuggable code, or selecting your language tools.
Debuggers and Languages
Most programmers developing embedded code for 16 and 32 bit processors seem to understand the need for some sort of source level debugger. Many 8 bit users are not so well informed. In the January, 1990 issue of Embedded Systems Programming I discussed the advantages of source debugging at length, but it bears repeating: beg, borrow, or steal some sort of reasonable source debugger before starting any project.
Think about it: how will you debug embedded C code? Remember that in a sense the emulator itself just provides the low level hardware resources you'll need. Its "raw" (without a debugger) interface will be terrible, no matter what the vendor claims. Source debuggers are shells around the unfriendly native emulator environment, shells that provide an efficient windowed display, and link the debugging session to your original source code.
Emulators need addresses to access memory. Only a decent debugger can translate "line 3 of function FOO" to the emulator's "address=12AB". There is no obvious correlation between the original C source and the disassembled machine language you'll get from a naked emulator. Without the debugger shell it's all but impossible to work with your code and variables.
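What the debugger does with the compiler's debug records is easy to picture. Here's a minimal sketch in C (the table contents, names, and addresses are all invented for illustration; real debug file formats are far richer): each record pairs a source line with the address of its first instruction, and the debugger simply searches the table.

```c
#include <stddef.h>
#include <string.h>

/* One debug record: a source line and the address of its first opcode.
   A real compiler emits thousands of these into its debug file. */
struct line_rec {
    const char *function;
    int         line;
    unsigned    address;
};

/* Invented sample records for a function FOO. */
static const struct line_rec records[] = {
    { "FOO", 1, 0x12A0 },
    { "FOO", 2, 0x12A6 },
    { "FOO", 3, 0x12AB },
};

/* Translate "line 3 of function FOO" into an emulator address (0 = unknown). */
unsigned line_to_addr(const char *function, int line)
{
    size_t i;
    for (i = 0; i < sizeof records / sizeof records[0]; i++)
        if (records[i].line == line && strcmp(records[i].function, function) == 0)
            return records[i].address;
    return 0;
}
```

The debugger also runs the table the other way, so a raw address captured in the trace buffer turns back into a file and line number on your screen.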
Even if you work entirely in assembly language, by all means use a source level debugger. Even the most expensive debuggers are cheap compared to the amount of time you'll save.
Debuggers do need to know a lot about the compiler or assembler and the emulator. Be sure the tool chain is truly compatible.
RULE 1: Select a language, linker, debugger, and emulator that work well together, BEFORE writing thousands of lines of code.
We still come across programmers who eschew any form of help from tools. Once or twice a year I speak to customers confused about the emulator's use of mnemonics. When I try to convince them to buy an assembler from one of dozens of vendors they invariably refuse. "I know all of the machine opcodes", the argument usually goes, "I wrote all 5,000 instructions just by translating the codes in my head. Why should I use an assembler?" This, unfortunately, is not a joke.
In the 8 bit world we also see a lot of people using ancient CP/M-based languages. Perhaps they run a CP/M simulator on a PC, but one should always be wary of getting caught in an unsupported technology backwater.
RULE 2: Don't buy an ICE if your software house is not in order. Use quality language products. Learn about new compilers, linkers, and the like to sharpen your skills and improve your productivity.
I'm writing this in August, when the general economic outlook is grim. If a recession rears its ugly head, you want to survive the inevitable layoffs and cutbacks. Become known as a productive employee by experimenting with new concepts, culling those that work, and taking advantage of tools and techniques that save you time.
Downgradable Code
To misquote a popular bumper sticker: things happen. (After all, this is a family magazine). Plan for the worst. Sometimes your wire-wrapped prototype just won't operate at the design goal of 10 MHz. Or, if you're using leading edge components, the very fast memories or peripherals you need might not be available until long after debugging is slated to begin. (Along these lines, always be cynical about the REAL availability of new chips).
Successful generals and businesspeople always plan for disaster and develop solutions to meet these potential problems before they become real. We should do the same.
Speed, timing, and noise are the bane of new projects and emulators. When you design your code, don't assume it will run at the advertised speed; you may find that, until production boards are finally available, you'll have to slip a slower clock into the prototype. Come up with a way to slow things down and still have enough of the system functional to do useful debugging. If your code really won't run at all unless the clock is running full bore at 33 MHz, then count on spending many sleepless nights at the office.
Some processors now come with integrated cache memories, burst mode DMA controllers, and all sorts of other wonderful speed-enhancing silicon. Try to come up with a software architecture that can take advantage of these features, but one that you can debug even if you have to disable some of them.
For example, the 486 and 68030 have on-board caches that can greatly increase software throughput. Unfortunately, cache is a nightmare to design an emulator around. If your program is executing a loop in cache no external bus cycles are generated. How is the emulator to trace instruction execution? One solution is to use a "bond-out" chip, a special version of the CPU provided by the vendor that brings internal busses out to extra pins, so the emulator can monitor the actual instruction execution stream.
Bond-outs are no panacea. Some foundries, notably Intel, refuse to sell them since they are in the ICE business and don't care to help the competition. Yes, some other solutions exist. One company cuts the top off 286s and bonds extra wires into the chip to get to the needed signals. Motorola has a "cache disable" pin on the 68030. This is suggestive: presumably some ICEs run the chip with the cache disabled when you wish to breakpoint or collect trace data. "Horrors!" you scream, "the emulator should somehow work perfectly with cache on and off". Are you willing to pay tens of thousands of dollars extra for this? If you can run with cache disabled, your code is that much more debuggable.
At the other end of the processor spectrum, low performance single chip microcontrollers can be a source of emulation grief because of their on-chip ROM. Some emulators use bond-outs and so can non-intrusively monitor all CPU operations. In cases where no bond-out is available some emulators use the chip in an expanded mode and simulate the internal ROM with external RAM. A lot of microcontrollers run slower in external bus mode: NEC's K-series runs about a third as fast as with internal ROM. Hitachi's H8 adds an extra cycle to external bus accesses, reducing throughput by 33%. There are good reasons for these longer cycles, but if your emulator runs 1/3 slower, be sure the code will work properly at these reduced speeds.
RULE 3: Design your code so it can run in some degraded, slower mode. Even if you really need full speed operation, localize the fast routines so 90% of the debugging can take place in a slower environment.
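One cheap way to honor this rule in C is to derive every software timing value from a single clock-frequency constant, so dropping the prototype's clock means changing one define rather than hunting down magic numbers. A sketch (CPU_HZ and the per-loop cycle cost are invented for illustration):

```c
/* Build with -DCPU_HZ=4000000UL when the wire-wrap board won't run at speed.
   (CPU_HZ and CYCLES_PER_LOOP are invented figures for this sketch.) */
#ifndef CPU_HZ
#define CPU_HZ 10000000UL                    /* the design-goal clock */
#endif

/* Every software timing value derives from CPU_HZ, never from magic numbers. */
#define CYCLES_PER_LOOP 4UL                  /* cost of one spin of the delay loop */
#define LOOPS_PER_MS    (CPU_HZ / (CYCLES_PER_LOOP * 1000UL))

/* How many spins of the delay loop make the requested number of milliseconds.
   Rebuilding with a slower CPU_HZ rescales every delay in the program at once. */
unsigned long delay_loops_for_ms(unsigned long ms)
{
    return ms * LOOPS_PER_MS;
}
```

At the full 10 MHz this works out to 2,500 loops per millisecond; rebuild with CPU_HZ set to 4 MHz and it becomes 1,000, with no other source changes.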
Timing Shifts
If your training leans towards the software side of the business learn about hardware. Don't be confined by a lack of learning! You'll never make full use of an emulator without some grounding in hardware issues.
As processor speeds creep up, physical aspects of the emulator become more important. Signals travel through wire at nearly the speed of light. We just can't push them any faster.
It takes a lot of electronics to make an emulator. All of it, including the emulation processor (i.e., the CPU that runs your code) is isolated from your target system by bus driver chips and the emulation cable. These ICs and the cable will shift the timing of signals applied to your system.
A signal propagates along a wire at roughly 2 nsec per foot, so even an 18 inch cable will induce a 3 nsec shift in timing. If a signal must propagate up the cable and then back down (say, an interrupt request and acknowledge pair), the shift is about 6 nsec. At speeds above 10 MHz this is a lot of time.
The bus driver chips will add more delay; the fastest parts available contribute 4.8 nsec. Combined with the cable's 3 to 6 nsec, the delays add up quickly. Suppose your 386 emulator added a 10 nsec shift to the signals. At 33 MHz, 10 nsec is a third of the entire machine cycle! External cache tag RAMs are often specified at 15 nsec; if the emulator adds 10 nsec, then 5 nsec tag RAMs might be needed.
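The arithmetic deserves a quick check. This little routine reproduces the numbers (the 10 nsec shift is the hypothetical figure used above, not a measured one):

```c
/* Machine cycle length at 33 MHz, in picoseconds to stay in integer math. */
#define CLOCK_HZ  33000000ULL
#define CYCLE_PS  (1000000000000ULL / CLOCK_HZ)   /* about 30303 ps = 30.3 nsec */

/* Hypothetical emulator-induced shift from the text: 10 nsec. */
#define SHIFT_PS  10000ULL

/* Percentage of the machine cycle the emulator's shift consumes. */
unsigned shift_percent(void)
{
    return (unsigned)((SHIFT_PS * 100ULL) / CYCLE_PS);   /* about a third */
}

/* Tag RAM budget: a 15 nsec part minus the 10 nsec shift leaves only 5 nsec. */
unsigned tag_ram_budget_ns(void)
{
    return 15U - 10U;
}
```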
Some 386 emulators (for example, Intel's) get around the problem by eliminating the cable and bus drivers. This is a reasonable way of making the emulator more transparent, but it does impose severe restrictions on the mechanical and electrical design of the target system. Intel does a pretty good job of documenting these tradeoffs, but all the documentation in the world is worthless if you don't follow the advice when first starting the design.
Even at 10 MHz timing can be a problem. Zilog's Z280 uses a multiplexed address/data bus with rather tight timing requirements on the address strobe signal. Design with marginal-speed components, and the emulator's 6 to 10 nsec of added delay might eat up all of the margin and then some.
RULE 4: Expect the emulator to shift your timing a bit. Design enough margin into the system to accommodate these shifts.
Noise
If you live near an airport noise might be one of your biggest headaches. If you design with fast logic, electrical noise is certain to be one.
In the good old days digital designers knew only about quantum levels like ones and zeroes. Nothing in between existed. Now, with ever-increasing CPU speeds, we are forced to use extremely fast logic devices to minimize circuit propagation delays. The old LS logic (Low power Schottky) forgave poor PC layouts. Now designers use exotic logic like FCT and ACT. Not only do they produce little delay, their rise and fall times are blindingly fast.
These rapid edges generate noise spikes on the signals being transmitted and on power and ground lines. No longer can we view a PC trace as a perfect wire. Any trace longer than a few inches will look like a transmission line.
Minimizing noise is crucial in every fast design. By fast, I mean a design using FCT or ACT logic at any clock speed (since the edges are independent of clock), or one with a CPU speed over 10 MHz.
An emulator only makes the noise problems worse. Again, consider the ICE's design: an emulation processor, separated from the target by bus drivers and cable. Electrically, especially since we're now talking transmission lines and not wire, this configuration little resembles an NMOS or CMOS CPU's output. Multiplexed busses cause the biggest problems. The 80186, for example, outputs an address and then the data on shared wires. When the address is valid an ALE signal is asserted, so the target system logic can latch it. If the ALE line has any noise, erratic addresses will be latched, causing grief and despair!
Some power will flow between the emulator and target, even if only in the signal lines. If the target's power distribution tracks on the PC board are poorly designed, then the emulator can induce even more noise.
You would be amazed by the number of systems we see with miserable power and ground tracks. A few weeks ago we received one with power tracks only .012 inch wide. Sure, the system was CMOS. But power consumption is not the issue; noise is. Wide tracks are important to reduce switching noise. We recommend 4 layer boards for most new designs.
RULE 5: Be sure your design has plenty of noise margin, as the emulator will only make the problem worse. Consider noise at each step of the design.
The single biggest hardware problem we see is noisy or inadequate clock circuits. Microprocessors are very sensitive to the clock's voltage level, duty cycle, and noise.
Clock inputs on microprocessors are rarely TTL levels. Most other inputs are considered a ONE when the input exceeds 2.0 volts. Not so the clock. Typically the clock's minimum voltage for a ONE is about 4.5 volts.
RULE 6: Be very sure your clock circuit drives the clock input properly.
Did you know that the 8088 must have a clock with a 33% duty cycle? We recently saw a production PC motherboard running at 50%. Sure, it usually works, but why take chances?
The ICE Helps
Sometimes we get a call from a customer with, from our standpoint, the best sort of problem. His system works perfectly with the emulator but not at all with the CPU installed. We're always tempted to tell him to just install an emulator in each unit.
Making the transition from developing with an ICE to plugging in a ROM and CPU can be painful. Usually the engineers wait until the night before delivering the product to do this for the first time, only to find that nothing works.
In fact, the ICE can sometimes mask real problems. It always makes sense to try your system standalone every once in a while to ensure that unrealized problems don't lie dormant till delivery time.
Usually these problems come from a bug in the initialization code. Perhaps the emulator comes up with memory zeroed and the code relies on this. Or, a lot of developers start the program in its middle, perhaps skipping a long self test that has some unsuspected impact on the main routines.
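The memory-zeroing trap is easy to show in C. A sketch, with invented variable names: on a ROM-based target nothing guarantees RAM powers up cleared, so an explicit init routine, not luck, should establish every value the code depends on.

```c
#include <string.h>

/* State the main loop depends on. On a ROM-based target nothing guarantees
   this RAM powers up as zero; an emulator that clears emulation memory can
   hide the bug until the real CPU and ROM go in. (Names are invented.) */
static struct {
    int           motor_on;
    unsigned int  error_count;
    unsigned char rx_buf[32];
} state;

int motor_is_on(void)          { return state.motor_on;    }
unsigned int error_total(void) { return state.error_count; }

/* For illustration only: fake the garbage a real power-up might leave. */
void fake_powerup_garbage(void) { memset(&state, 0xA5, sizeof state); }

/* The cure: explicit initialization of everything, run from the reset vector
   before any code touches the state. */
void system_init(void)
{
    memset(&state, 0, sizeof state);
}
```

Under the emulator the garbage never appears, so the missing call to system_init works fine; standalone, the motor "turns itself on" at power-up.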
Occasionally the culprit is a hardware problem. Usually the emulator alters or bypasses the target system's reset signal. Marginal resets will prevent reliable self starting. Like the clock, a CPU usually requires reset to go to above 4.5 volts. On some processors reset must be synchronous, so a simple resistor-capacitor circuit is out of the question.
Finally, the emulator can usually drive far more capacitive load than the processor can. Be sure your circuit obeys the manufacturer's AC and DC loading guidelines.
RULE 7: Test your system standalone once in a while as a sort of sanity check.
Summary
This is not intended to be a gallery of ICE horror stories. The emulator, when coupled with a source level debugger, remains the most powerful debugging tool for embedded programming. To get the best performance from one with the least hassle, you must understand its limitations and design your code and hardware to be debuggable.