Information Age Losing Memory
Items on outdated systems might be impossible to retrieve

From USA Today 10/21/99
by Bob Waggoner
 
Ask genealogists to name the nation’s worst tragedy, and they’ll say the 1921 fire that wiped out 98% of the 1890 federal Census, a priceless snapshot of the U.S. population at the end of the 19th century. The vast amounts of data being stored today - financial records, pictures, video and e-mail - are in danger, too, but from a different threat: changes in the hardware and software used to read them.
 
As technology speeds ahead, it leaves old machines behind. Data stored on a computer today might be just as useless as music stored on a 1970s eight-track tape. And if the technology doesn’t leave you behind, the disks themselves could deteriorate, making them as unreadable as a dinner plate. Anyone who has ever had a computer’s hard drive blow up knows how ephemeral electronic data can be. And hard drives aren’t the only things that can go zap in the night. Depending on how they’re made, 3 ½-inch floppy disks can start to deteriorate in 18 months. Iomega, which makes high-capacity disks and drives, warranties its Zip drives for only five years, although they can last longer. CD-ROMs can last from five to 50 years, depending on the quality of the disk and how it’s stored. That’s a good length of time, but if you want to hand down a disk to your kids, you’d better back it up every few years - or print out the contents. Places like the National Archives, whose mission is to preserve records forever, use high-quality reel-to-reel magnetic tapes, which have a life of 10 to 40 years. They recopy their tapes every 10 years and test them every so often to make sure the data is still intact. “As long as our tapes last at least 10 years, we’re OK,” says Mike Carlson, director of electronic media. “We have never lost big chunks of stuff.”
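
The article does not say how the Archives tests its tapes, but one common way to confirm that a recopied file still matches the original is to compare checksums. The Python sketch below is a hypothetical illustration, not the Archives' procedure; the function names `fingerprint` and `copy_is_intact` are invented for this example.

```python
import hashlib
import io


def fingerprint(stream, chunk_size=64 * 1024):
    """Return the SHA-256 hex digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    # Read fixed-size chunks until the stream returns an empty bytes object.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def copy_is_intact(recorded_fingerprint, stream):
    """True if the stream's contents still match the recorded digest."""
    return fingerprint(stream) == recorded_fingerprint


# Hypothetical usage: record a digest when the data is written,
# then re-check it after each recopy.
recorded = fingerprint(io.BytesIO(b"1890 census schedule"))
print(copy_is_intact(recorded, io.BytesIO(b"1890 census schedule")))
print(copy_is_intact(recorded, io.BytesIO(b"1890 census schedulf")))
```

In practice the streams would be files opened in binary mode rather than in-memory buffers. A single changed byte is enough to make the check fail.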
 
But the biggest enemy of long-term data storage is obsolescence. “The problem is not so much whether the disk will be around but whether there will be a drive around to read it,” says Fynnette Eaton, director of technical services at the Smithsonian archives. “The simple example is the 5 ¼-inch floppy disk,” says Weston Thompson, chairman of the Electronic Records Section of the Society of American Archivists. Few computer users use 5 ¼-inch floppies anymore, and fewer computers have 5 ¼-inch drives. Still, Thompson recalls instances where archivists were presented with cases of old 5 ¼-inch floppies - and even older 8-inch floppies, the technological equivalent of 78-rpm records. All too often, there is no way to read the data. “People often call asking if we have a service that reads old disks,” says David Allison, director of the Smithsonian Institution’s division of Information Technology and Society. They don’t. Scientific observations recorded on obsolete equipment are often lost forever. “If you lose astronomical data, you can’t crank the universe back again,” he says. And unlike a piece of lint on a floppy disk, obsolete technology doesn’t just lose a piece of information - it loses all of it. “It’s not losing a block of data, it’s losing the entire database,” says Thomas Brown, manager of archival services in the National Archives’ electronic and special media records services division. For example, the standard computer character set is American Standard Code for Information Interchange, or ASCII. But many 1970s-era databases are stored in Extended Binary-Coded Decimal Interchange Code (EBCDIC), an IBM rival to ASCII that has gone the way of the Neanderthals. The Archives uses special programs to convert EBCDIC to ASCII. If you don’t have such a program, you’re out of luck.
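
To make the EBCDIC-to-ASCII conversion concrete, here is a minimal Python sketch. It is not the Archives' program. It assumes the data uses IBM code page 037, one of several EBCDIC variants, which Python ships as the `cp037` codec. The function name `ebcdic_to_ascii` is invented for this example.

```python
def ebcdic_to_ascii(raw: bytes, codec: str = "cp037") -> bytes:
    """Decode EBCDIC bytes and re-encode them as ASCII.

    Assumes the input uses the given EBCDIC code page (cp037 by default).
    Characters with no ASCII equivalent are replaced with '?'.
    """
    text = raw.decode(codec)
    return text.encode("ascii", errors="replace")


# The five bytes below spell "Hello" in EBCDIC code page 037.
ebcdic_bytes = bytes([0xC8, 0x85, 0x93, 0x93, 0x96])
print(ebcdic_to_ascii(ebcdic_bytes))  # b'Hello'
```

Without knowing which code page was used, the same bytes read as ASCII or Latin-1 would be gibberish, which is the point the archivists are making.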
 
This year’s version of the 5 ¼-inch floppy could be the CD-ROM. So far, there are several competing systems for reading and writing CD-ROMs, says Mark Giguere, records and information systems analyst at the National Archives. What might be readable today on CD-ROM might not be in a decade.
 
If DVDs replace CD-ROMs, you will have to move your critical data to the new technology. And don’t leave anything you want to save on your computer’s hard drive.
 
The digital revolution has made it possible to store text, audio, pictures and film on computer, and the amount of data being stored is growing exponentially. The National Archives is bracing for an electronic collection of 25 million cables from the State Department and another 25 million e-mail messages from the Clinton White House this year. Private business is adding even more electronic data to be stored, including e-mail, customer account information and Internet Web pages. Financial giant T. Rowe Price, for example, now stores about 10 terabytes of corporate and customer data. Each terabyte is 1,000 gigabytes, or enough data to fill a 30-mile-high stack of paper. In 1996, Price stored only 800 gigabytes.
 
“We’re increasing disk capacity 50% to 60% every year,” says Michael Goff, managing director of Price’s technology group.
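
The figures quoted here (800 gigabytes in 1996, about 10 terabytes now, capacity growing 50% to 60% a year) lend themselves to a quick compound-growth check. The Python sketch below is illustrative only. The function names are invented for this example, the three-year span assumes "now" means 1999, and the projection assumes the quoted growth rate simply stays constant.

```python
def project_storage(start_tb, annual_growth, years):
    """Storage (in terabytes) at the end of each year, compounding annually."""
    return [start_tb * (1 + annual_growth) ** n for n in range(years + 1)]


def implied_annual_growth(start, end, years):
    """Constant yearly growth rate that turns `start` into `end` over `years`."""
    return (end / start) ** (1 / years) - 1


# 10 terabytes growing 50% a year for two more years (assumed constant rate).
print(project_storage(10, 0.5, 2))  # [10.0, 15.0, 22.5]

# Growth implied by going from 800 GB to 10,000 GB, assuming a 3-year span.
print(round(implied_annual_growth(800, 10_000, 3), 2))
```

At a steady 50% a year, 10 terabytes becomes 22.5 terabytes in two years. The jump from 800 gigabytes to 10 terabytes implies a faster rate than that over the assumed span.
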
The Vanguard Group of mutual funds is now storing 25 terabytes of data, much of which it’s required to keep for at least six years.
 
“In practice, we never purge transaction data,” says Jeff Dowds, Vanguard’s head of technology data.
 
One estimate: The world has one quintillion bytes of data, the equivalent of 10,000 Libraries of Congress, now in electronic storage. How much of that will last to 2099 is a big question.
 
During the 1996 presidential elections, the Smithsonian decided to preserve shots of some of the candidates’ Web sites. After all, it was the first time the Internet had been a force in an election. But historians faced several obstacles. First, there’s the changeable quality of Web sites, which are updated frequently. In many cases, items simply vanish - on average, in about 75 days. If you were to keep a Web site, at which point do you preserve it? Other questions involve electronic documents. Should early versions be preserved? Ernest Hemingway revised the last page of A Farewell to Arms 39 times. Had he used a word processor, 38 of those revisions would have been lost.
 
In the longer term, there’s the thornier issue of how to preserve any electronic document that includes formatted text - boldface, italics, pictures, drawings, moving objects. There is now no universally agreed way to save it. Scientists at the San Diego Supercomputer Center are working on a universal computer language for archiving multimedia documents. Saving texts in plain-text ASCII - which doesn’t have formatting, such as bold or italics - also won’t give you the look or feel of a Web page. And in some types of documents, such as those from the Supreme Court, use of italics and underlining can add significantly to meaning. It’s easy to underestimate how much electronic storage has helped scholars. If you don’t understand, spend a few hours hunting through a crumbling pile of papers. But if you want to pass on more than a blank disk to posterity, you’d better remember to move important data to new disks every few years. If you have photos, save some prints - preferably black-and-white, which last longer than color, archivists say.
 
And remember that in the long run, printouts, on good paper, might provide your best record.



 

 
