Information Age
Losing Memory
Items on outdated systems might be impossible to retrieve
From USA Today 10/21/99
by Bob Waggoner
Ask genealogists to name the nation’s worst tragedy, and they’ll say the 1921
fire that wiped out 98% of the 1890 federal Census, a priceless snapshot of the
U.S. population at the end of the 19th century. The vast amounts of data being
stored today - financial records, pictures, video and e-mail - are in danger,
too, but from a different threat: changes in the hardware and software used to
read them.
As technology speeds ahead, it leaves old machines behind. Data stored on a
computer today might be just as useless as music stored on a 1970s eight-track
tape. And if the technology doesn’t leave you behind, the disks themselves could
deteriorate, making them as unreadable as a dinner plate. Anyone who has ever
had a computer’s hard drive blow up knows how ephemeral electronic data can be.
And hard drives aren’t the only things that can go zap in the night. Depending
on how they’re made, 3 ½-inch floppy disks can start to deteriorate in 18
months. Iomega, which makes high-capacity disks and drives, warrants its Zip
drives for only five years, although they can last longer. CD-ROMs can last from
five to 50 years, depending on the quality of the disk and how it’s stored.
That’s a good length of time, but if you want to hand down a disk to your kids,
you’d better back it up every few years - or print out the contents. Places like
the National Archives, whose mission is to preserve records forever, use
high-quality reel-to-reel magnetic tapes, which have a life of 10 to 40 years.
They recopy their tapes every 10 years and test them every so often to make sure
the data is still intact. “As long as our tapes last at least 10 years, we’re
OK,” says Mike Carlson, director of electronic media. “We have never lost big
chunks of stuff.”
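One common way to check that a recopied file is still intact is to compare checksums of the original and the copy. The sketch below illustrates that general idea in Python; it is not a description of the National Archives' own procedure, and the function names `sha256_of` and `copy_is_intact` are made up for this example.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_is_intact(original_path, copy_path):
    """True if both files have identical contents (hypothetical helper)."""
    return sha256_of(original_path) == sha256_of(copy_path)
```

Recording the checksum when a file is first stored, then recomputing it after each recopy, would flag silent corruption before the older copy is discarded.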
But the biggest enemy of long-term data storage is obsolescence. “The problem is
not so much whether the disk will be around but whether there will be a drive
around to read it,” says Fynnette Eaton, director of technical services at the
Smithsonian archives. “The simple example is the 5 ¼-inch floppy disk,” says
Weston Thompson, chairman of the Electronic Records Section of the Society of
American Archivists. Few computer users use 5 ¼-inch floppies any more, and
fewer computers have 5 ¼-inch drives. Still, Thompson recalls cases where
archivists would be presented with cases of old 5 ¼-inch floppies - and even
older 8-inch floppies, the technological equivalent of 78-rpm records. All too
often, there is no way to read the data. “People often call asking if we have a
service that reads old disks,” says David Allison, director of the Smithsonian
Institution’s division of Information Technology and Society. They don’t.
Scientific observations recorded on obsolete equipment are often lost forever.
“If you lose astronomical data, you can’t crank the universe back again,” he
says. And unlike a piece of lint on a floppy disk, obsolete technology doesn’t
just lose a piece of information - it loses all of it. “It’s not losing a block
of data, it’s losing the entire database,” says Thomas Brown, manager of
archival services in the National Archives’ electronic and special media records
services division. For example, the standard computer character set is American
Standard Code for Information Interchange, or ASCII. But many 1970s-era
databases are stored in Extended Binary-Coded Decimal Interchange Code (EBCDIC),
an IBM rival to ASCII that has gone the way of the Neanderthals. The Archives
uses special programs to convert EBCDIC to ASCII. If you don’t have that
program, you’re out of luck.
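To make the conversion concrete, here is a minimal sketch in Python, which ships with codecs for several EBCDIC code pages. It assumes the data uses code page 037 (`cp037`, a common U.S. EBCDIC variant); real archival files may use a different code page or mix text with binary fields. The function name `ebcdic_to_ascii` is invented for this example and is not the Archives' program.

```python
def ebcdic_to_ascii(data: bytes, codec: str = "cp037") -> str:
    """Decode EBCDIC bytes and return text restricted to ASCII.

    Assumption: the input is plain text in the given EBCDIC code page.
    Characters with no ASCII equivalent are replaced with '?'.
    """
    text = data.decode(codec)
    return text.encode("ascii", errors="replace").decode("ascii")


# The bytes C8 C5 D3 D3 D6 spell "HELLO" in EBCDIC code page 037.
sample = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])
print(ebcdic_to_ascii(sample))
```

Read as ASCII, those same five bytes would be meaningless, which is why a file cannot be recovered without knowing its encoding.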
This year’s version of the 5 ¼-inch floppy could be the CD-ROM. So far, there
are several competing systems for reading and writing CD-ROMs, says Mark Giguere,
records and information systems analyst at the National Archives. What might be
readable today on CD-ROM might not be in a decade.
If DVD replaces CD-ROMs, you’ll have to move your critical data to the new
technology. And don’t leave anything you want to save on your computer’s hard
drive.
The digital revolution has made it possible to store text, audio, pictures and
film on computer, and the amount of data being stored is growing exponentially.
The National Archives is bracing for an electronic collection of 25 million
cables from the State Department and another 25 million e-mail messages from the
Clinton White House this year. Private business is adding even more electronic
data to be stored, including e-mail, customer account information and Internet
Web pages. Financial giant T. Rowe Price, for example, now stores about 10
terabytes of corporate and customer data. Each terabyte is 1,000 gigabytes, or
enough data to fill a 30-mile-high stack of paper. In 1996, Price stored only
800 gigabytes.
“We’re increasing disk capacity 50% to 60% every year,” says Michael Goff,
managing director of Price’s technology group.
The Vanguard Group of mutual funds is now storing 25 terabytes of data, much of
which it’s required to keep for at least six years.
“In practice, we never purge transaction data,” says Jeff Dowds, Vanguard’s head
of technology data.
One estimate: The world has one quintillion bytes of data, the equivalent of
10,000 Libraries of Congress, now in electronic storage. How much of that will
last to 2099 is a big question.
During the 1996 presidential election, the Smithsonian decided to preserve
shots of some of the candidates’ Web sites. After all, it was the first time the
Internet had been a force in an election. But historians faced several obstacles.
First, there’s the changeable quality of Web sites, which are updated
frequently. In many cases, items simply vanish - on average, in about 75 days.
If you were to keep a Web site, at which point do you preserve it? Other
questions involve electronic documents. Should early versions be preserved?
Ernest Hemingway revised the last page of A Farewell to Arms 39 times. Had he
used a word processor, 38 of those revisions would have been lost.
In the longer term, there’s the thornier issue of how to preserve any electronic
document that includes formatted text - boldface, italics, pictures, drawings,
moving objects. There is currently no universally agreed-upon way to save it. Scientists at
the San Diego Supercomputer Center are working on a universal computer language
for archiving multimedia documents. Saving texts in plain-text ASCII, which
doesn’t have formatting such as bold or italics, also won’t give you the look
or feel of a Web page. And in some types of documents, such as those from the
Supreme Court, use of italics and underlining can add significantly to meaning.
It’s easy to underestimate how much electronic storage has helped scholars. If
you don’t understand, spend a few hours hunting through a crumbling pile of
papers. But if you want to pass on more than a blank disk to posterity, you’d
better remember to move important data to new disks every few years. If you have
photos, save some prints - preferably black-and-white, which last longer than
color, archivists say.
And remember that in the long run, printouts, on good paper, might provide your
best record.