Longevity Planning

Do you have a backup strategy? 

Published in Embedded Systems Programming August 2004

By Jack Ganssle

 In 1066 William, the illegitimate son of the Duke of Normandy, invaded England and defeated King Harold at the Battle of Hastings. The nation's nobles weren't too thrilled with this French usurper, so William cleverly cut their power base by introducing feudalism. Though Europe had known feudalism for centuries it had never before been tried in England. Serfs (villeins) pledged loyalty to the King, not to their local lord, diminishing the lord's power. William scattered castles and manors far apart to prevent any aggregation of authority in any one lord. Over the next two decades he deposed all but two of the noblemen that had survived Hastings, replacing them with his own French followers.

 Instead of trusting the local lords to remit enough in taxes he had the Domesday Book, a record of virtually all property in the nation, compiled. By 1086 it listed over 13,000 places, detailing wealth down to the last pig. The name derived from the Medieval English spelling of "doomsday," and meant "the book of unalterable judgments." The original 900-year-old book is still in fine condition in the Public Record Office in Kew, London.

 But not so its digital rendering, created at a cost of £2.5 million in 1986. In just a couple of decades the technology needed to read the antique pair of videodisks disappeared.

 When we rely on technology to preserve important information we're betting on stasis, a stability that just does not exist in this fast-paced industry. Everything changes, all the time.

 File formats become obsolete. What will happen to our picture collections when some new holographic format replaces JPGs and GIFs? Someone will surely write a translator of debatable accuracy, but that's doomed to failure for more complex file types. Even a simple upgrade from one version of Word to another nearly always brings up page formatting or other issues in complex documents. Imagining the difficulty of writing a translator for database files with lots of interrelated links and scripting capabilities makes my head hurt.

 Hardware disappears. NASA is reputedly scrounging eBay for x86 boards needed to maintain Shuttle ground support equipment.

 Applications fade away. Do you have any files created in WordStar or less popular word processors of the late 70s? Feed them to the virtual fire. Without the application, there's little hope of reading the files.

 People quit or die, taking vast hoards of critical but undocumented knowledge with them. In the 80s my former employer called in a panic. They needed one simple change to the 8008-based instrument we had delivered a dozen years before. I managed to resurrect the original development system, rebuilt a teletype, and edited the punched paper tape source files, all of which were brittle and flaking with age. The company wasn't willing to pay to transfer those "files" to disk so I'm sure the code all disappeared within a short time.

 And media, well, media changes so fast that anyone using CDs or DVDs for long-term preservation is guaranteed to amass a Hotel California library. You can check in, but you can never check out. A few years ago I was startled to find 8" floppies in the safe, next to a stack of 5.25" disks. My current computer has a 3.5" drive, which has never been used. In the 70s we kept critical data on removable 14 inch hard drive platters, packing a whopping 5 MB per disk. I bet those would be tough to read today. Before that, well, punched cards were everywhere. Now, even if the cards weren't unusably brittle from age, how would you suck the data off of these?

 

Reliable Backups

An entire industry exists simply to warehouse data. No doubt lots of that saved material is paper, but if their customers archive disks with the services then I hope the warehouse is meant for short-term storage only. That carefully-preserved media will soon be rendered unreadable and worthless by advancing technology.

 We're building our digital heritage on the quicksands of electronic systems that intrinsically have no permanence. Librarians and preservationists are grappling for solutions that can survive for centuries. That's a problem beyond my ken, but we practicing engineers do need a way to safeguard our source files. For embedded systems last forever. I often see products with 20+ year old designs, still being manufactured, and still benefiting from occasional firmware improvements.

 The three elements to keeping systems maintainable for many years are reliable backups, a program that carefully manages the use and eventual preservation of the tools, and conserving the project's documentation.

 Microsoft claims (http://www.itsecurity.com/tecsnews/oct2003/oct168.htm) that over 40% of small businesses back up their critical data less frequently than once a month. 27% never bother with a backup. Hey, backups are sort of annoying, and we code jockeys never do anything that's not fun, cool and exciting, right?

So it seems. Poking around too many companies, I've found that every embedded outfit (so far) does have some sort of backup strategy. But an astonishing number of engineers really don't know what's preserved. Few outfits, unless there's a functioning IT group, ever bother to actually test the backups on a regular basis. They might as well be sending the data to a write-only memory (https://www.ganssle.com/misc/wom.html), because backups, like all systems, do fail.

 Only totally naive or very gutsy companies expect engineers to preserve their own data. We're very good at a lot of things, but for some reason too many of us forget or defer making backups. Bosses incentivize, browbeat, and yell at the engineers to copy everything to tape once in a while, but somehow this never becomes an integral part of our psyche. The only solution is the company's version control system.

 Nearly all of us do use a VCS. If you don't, start now. This is the most fundamental tool of all, even for a one person organization. The VCS does allow us to recreate old versions and get change information, of course. But it's also a central database that maintains the entire project's build information. It lives on a server somewhere, one that's backed up every night. The developers might (will) forget to save their data, but if they're careful to extract only the code they need from the database then nearly everything is saved daily. Sure, a code terrorist could check out vast chunks of software. But one of a project leader's most important responsibilities is a

 The VCS is a very dynamic part of the business. When the server migrates to a next generation processor the database goes with it, stored on the current hard disk technology. Backup media might change from a 3.5" floppy in years past to an optical 100 petabyte storage cube, but this media is only intended to handle short term crashes. The goodies are on the hard disk, not locked in some dusty safe somewhere. Old and obsolete projects reside in the database along with the latest development effort. Media obsolescence is not a problem.

 Do be wary of IT people who claim the backups are entirely under control. I've been to too many companies that lost big chunks of important data because there was some problem that didn't come to light until the server's disk crashed. Do they back up daily? To what? Where are the media kept? Are the tapes or disks changed every day? How often is a backup tested - actually rolled into the server and examined?
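 One way to make "tested" concrete is to restore a backup into a scratch directory and compare it, file by file, against the live tree. What follows is only a minimal sketch in Python, not a description of any particular backup product. The function names (file_digest, build_manifest, compare_trees) are hypothetical, and it assumes that both trees are reachable as ordinary directories.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_trees(original_root, restored_root):
    """Report files missing from, changed in, or extra in the restored copy."""
    original = build_manifest(original_root)
    restored = build_manifest(restored_root)
    return {
        "missing": sorted(set(original) - set(restored)),
        "changed": sorted(
            name
            for name in original.keys() & restored.keys()
            if original[name] != restored[name]
        ),
        "extra": sorted(set(restored) - set(original)),
    }
```

 An empty "missing" and "changed" list after a trial restore is modest evidence that the backup holds what you think it holds. Files edited since the backup ran will legitimately show up as changed, so the comparison is most useful against a snapshot taken at backup time.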

 While RAID is nice, the disks all live in the same room as the server. A fire destroys the original data and the redundant copies. Use a removable backup. Tapes, CDs and DVDs are the prevalent media, though large capacity FireWire disk drives are cheap and fast.

 CDs don't hold much, but have become so cheap that smaller organizations and consultancies rely heavily on them. I can just fit my critical docs onto one CD. DVDs, now only a buck or less, hold a lot more data and are just as easy to use. But both of these media suffer from a variety of ills. Though they were originally advertised to offer 100 years of service, many of us can now peer through the eroded metal in some of our music collection's disks.

 CDs have three layers: a thick polycarbonate base, the metal layer, and then a very thin bit of lacquer deposited on the top, or label, side. That final layer is quite delicate, much more sensitive to damage than the polycarbonate.

 DVDs are much hardier. The metal is sandwiched between two glued-together equally thick polycarbonate layers. But that glue joint is subject to failure. Bend the disk when extracting it from the jewel case and you run the risk of the glue pulling some of the metal apart.

 DVDs, CDs and tapes should be kept in a cool, dry space when not being used. ANSI standard IT9.13 suggests keeping them at a constant 65 to 70 F and 45 to 50% relative humidity (RH). Widely fluctuating temperature or RH severely shortens the life span of all recordings. Environmental conditions must not fluctuate more than 10 F or 10% RH over a 24-hour period. Keep recordings away from light, especially sunlight and unshielded fluorescent lights.
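 If the storage closet has a temperature and humidity logger, those limits are easy to check mechanically. Here is a rough sketch, assuming the readings for one 24-hour period have already been pulled off the logger into a list. The Reading type and check_storage_day function are made up for illustration; the thresholds are simply the figures quoted above.

```python
from dataclasses import dataclass

# Limits quoted in the text above (degrees Fahrenheit, percent RH).
TEMP_RANGE_F = (65.0, 70.0)
RH_RANGE_PCT = (45.0, 50.0)
MAX_SWING_F = 10.0
MAX_SWING_RH = 10.0


@dataclass
class Reading:
    """One logger sample (hypothetical record layout)."""
    temp_f: float
    rh_pct: float


def check_storage_day(readings):
    """Return a list of problems found in one 24-hour set of readings."""
    if not readings:
        return ["no readings"]
    temps = [r.temp_f for r in readings]
    rhs = [r.rh_pct for r in readings]
    problems = []
    if min(temps) < TEMP_RANGE_F[0] or max(temps) > TEMP_RANGE_F[1]:
        problems.append("temperature outside 65 to 70 F")
    if min(rhs) < RH_RANGE_PCT[0] or max(rhs) > RH_RANGE_PCT[1]:
        problems.append("humidity outside 45 to 50% RH")
    if max(temps) - min(temps) > MAX_SWING_F:
        problems.append("temperature swing exceeds 10 F in 24 hours")
    if max(rhs) - min(rhs) > MAX_SWING_RH:
        problems.append("humidity swing exceeds 10% RH in 24 hours")
    return problems
```

 An empty list means the day stayed within both the recommended range and the allowed swing.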

 Keep these media in their individual cases to buffer rapid environmental changes and protect them from airborne contaminants. Don't substitute paper sleeves for the cases; DVD and CD jewel boxes keep the disk from contacting the case. Remove paper materials, which tend to attract and absorb water.

 Label CDs and DVDs with a xylene-free marker like the Staedtler Lumocolor CD/DVD Markers (available from Staples and many other office supply firms). Xylene will eat the lacquer layer. Never use a fine point marker which may dent the lacquer. It's best to write on the inner hub where there's no metal, and thus no data.

 Store disks vertically, like a book. On their side they may bow. And don't use paper labels on disks stored for more than about 5 years, as the paper can absorb moisture, inducing more bowing.

 Never use rewriteable CDs or DVDs as they contain a heat-sensitive layer that decays much faster than the metal layers of cheaper write-once products.

 

Manage Your Tools

Ten years have elapsed since releasing that cool (then) Colorimeter 1994. A major customer wants a change. Can you even find the compiler's original disks? Are they readable? Where will you get a 5.25" drive?

 You call the vendor. They're out of business, or don't support that product anymore. Or maybe you can install the ancient tools, but discover that the compiler, which once ran under DOS or Windows 3.1, crashes under XP.

 At the end of the project preserve your tools. Check them into the VCS. They'll only consume disk space, which costs nothing today. Save everything - make and project files, locate maps, every scrap of binary that's associated with the product.

 Disks are cheap. Even my home machine has 400 GB.

 Don't toss old computers. When it's time to replace a PC, emulator, or ROM burner, lock the devices in a closet, intact, with users' manuals, cables, and all accessories. These devices have zero value by the time you're ready to upgrade them, so the cost to the company is nothing.

 Avoid upgrading operating systems. It's probably cheaper to replace a computer with a pre-installed OS than to wipe the disk and reload everything. You'll probably forget to move some critical utility anyway, so it'll be lost forever. Buy a new machine, put both on the network, and copy all of your goodies across. Store the old computer in the closet with the other bits of antiquity. New OS releases are a good time to replace the hardware anyway.

 A lot of tools hook themselves deeply into your computer's guts via licensing technologies like Macrovision's super-annoying Flex. Want to retire an old machine and port your tools to the latest 100 gigamegaweeniehertz Pentium 15? Forget it. I want that old box in the closet, loaded and ready to go, even if it sits idle for years. Perhaps one solution is to get many licenses, but I prefer to find open, or at least dongled, tools. It's easy to move the dongle between development machine, laptop, and the old clunker awaiting its moment of glory.

 Entropy continues to exact its inevitable toll on those old machines fading into quiet obscurity in the closet. Insulation rots. Capacitors go bad. Moisture gets into the innards. It's important to turn the beasts on once in a while. I sure hope you use some sort of electronic organizer. Time & Chaos (http://www.chaossoftware.com/), a $45 product, is my favorite. Program an action item that pops up once every year to remind you to give these machines just a bit of attention.

 When the reminder pops up, check to see if any of the files need porting. Did you split the VCS database to minimize its size? If the company migrated to a different VCS it might be time to convert the old file. Yes, this is a pain, but viable products require some amount of maintenance.

 If your debug tool of choice is an emulator, consider adding a dead-simple debug port like a BDM or ROM monitor. In ten years maybe the emulator won't come back to life, but at least there's an alternative.

 Code in boring ANSI C or C++. Don't use extensions; don't exercise all of the compiler's cool features. Remove all warnings and get a clean Lint. This means using explicit casting, too many parentheses, and the like. If you do have to change compilers for some reason, not stressing the language greatly reduces the effort required to get the code working with the new tool.

Conclusion

We're amazingly bad at documenting our work. Partly that's because our writing tools are so poor. The systems we build are so interrelated that any flat-file approach to codifying knowledge is awkward at best. I find that a Wiki (http://en.wikipedia.org/wiki/Wiki) is one of the easiest and most effective ways to document projects. The web-based Wiki offers utter simplicity, hypertext relations, and multi-user support. Like the VCS, a Wiki lives on the company's server, so it migrates from one generation of media to the next.

 When the project is finally done, ask yourself some simple questions. In 10 years, when I'm called out of retirement to change this thing, what will I wish I had saved? What equipment will I wish I had? Documents? People, and how can I codify their knowledge now?

 As the Domesday Book showed, it's pretty hard to beat paper for documentation. Maybe the best strategy is to dump the entire source tree on punched cards.