Wednesday, May 16, 2007

Save Your Data On Bacteria

I worry about stupid things, sometimes. Not practical things, like how I’m going to potty train my toddlers or where I can buy an affordable steam cleaner, but about the big, giant things I could probably never do anything about. Those “oh my god” sorts of problems that most people shake their heads about but don’t think on any further because they’re sensible enough to know they couldn’t possibly do anything about them, like forging peace in the Middle East, controlling AIDS in central Africa, and stopping the polar ice caps from melting. You know, the “little things.”

For instance, I worry about what’s going to happen to all the monumental loads of data and information the world generates every day. Is it safe? Where does it all go? How can we ensure it never gets lost or corrupted? I absolutely loathe the idea that someone would spend valuable time collecting information and recording it, only to lose it. I guess that’s what makes me a good lab rat. And, yes, I have multiple backups for all my data, right down to a spreadsheet for my VCR and DVD movies at home. What? You don’t have such a spreadsheet? How else am I to remember that I have a VCR tape copy of Highlander? (“There can be only one!”)

Previously I posted about how a state official in Alaska had accidentally deleted 800,000 refund payment files, then accidentally deleted the backup disc. A second backup had been corrupted. This is exactly the sort of thing that worries me. Luckily they had the original paper documents, and after months of overtime by employees they re-entered all the data.

Once upon a time all we had was papyrus and vellum, then we invented paper. There are still a few of these ancient documents around, preserved by desert conditions and now tucked away in museums, but think of all the documents that were lost over thousands of years! Where would civilization be, now, if they had been safeguarded better? Essentially, most of our modern information is still on paper, paper that for the most part is made with cheaper and less durable ingredients, I might add. Now we have digitized storage media, but the accident in Alaska shows how unreliable that is, even in the short term. Hundreds of years from now, do you think we’ll be able to retrieve that information? Do you think we’ll even have the same technology lying around to do it?

Well, now some Japanese researchers may have found a way to help alleviate my worry. They have found a reliable method to store data on the DNA code of living bacteria, which could protect that data for hundreds or even thousands of years!


Scholarly article:

That’s right, save data in the DNA of living bacteria in a manner similar to storing data on computer discs. And you thought bacteria were only good for making beer and cheese!

For you non-science types, DNA is made up of four components, called nucleotides, which are strung together in specific sequences. Stretches of those sequences, called genes, code for the production of all the proteins that make up cells, organs, and, eventually, YOU, and every other living thing on earth. For decades, molecular biologists have found increasingly clever ways of identifying those codes, manipulating them, synthesizing them, and inserting them into DNA sequences. Left alone, these genetic sequences take thousands, or even millions, of years to change due to random mutations as they are inherited from generation to generation.

Dr. Masaru Tomita and his colleagues have found a way to store data by synthesizing their own genetic sequences. Each combination of the nucleotides in these sequences corresponds to specific binary codes. These binary codes can then be matched with specific letters or numbers. Those sequences were then inserted into the DNA of living bacteria (of the species Bacillus subtilis).
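For the programmers in the audience, here is a toy Python sketch of the general idea. To be clear, this is not the researchers' actual coding scheme. I'm simply assuming that each nucleotide stands for two bits (A=00, C=01, G=10, T=11) and that each character is one 8-bit ASCII byte. The function names are made up for illustration.

```python
# Toy mapping, assumed for illustration only (not the published scheme):
# each nucleotide stands for two bits.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(text: str) -> str:
    """Turn ASCII text into a string of nucleotides, four per character."""
    bits = "".join(f"{byte:08b}" for byte in text.encode("ascii"))
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(dna: str) -> str:
    """Reverse encode(): nucleotides back to bits, bits back to ASCII text."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("ascii")


message = "E=mc^2 1905!"
dna = encode(message)
print(dna)                     # the message as a string of A, C, G and T
print(decode(dna) == message)  # round trip check
```

Under this made-up mapping, the 12-character message becomes 48 nucleotides, which gives a feel for how little room a short phrase needs.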

They successfully inserted, then later retrieved, the codes for the phrase "E=mc^2 1905!", referring to Albert Einstein’s famous mass-energy equation and the year he published it. Because they inserted the code in four different locations in the DNA, a mutation in one copy can be corrected using the other three copies. Computer simulations, based on the predicted rate of mutation, suggest the code is secure for hundreds to thousands of years.
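The redundancy trick can be sketched in a few lines of Python. This is only my guess at the simplest version of the idea: a position-by-position majority vote across the copies, assuming mutations swap one letter for another and never add or delete any. The real recovery method may well be more sophisticated, and the sequence and the `recover` function below are invented for the example.

```python
from collections import Counter


def recover(copies: list[str]) -> str:
    """Rebuild a sequence by majority vote at each position.

    Assumes all copies have the same length (substitutions only).
    """
    if len({len(copy) for copy in copies}) != 1:
        raise ValueError("copies must all be the same length")
    # zip(*copies) yields one column per position: the letter each copy has there.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*copies))


original = "CACCATTCCGTC"    # made-up stored sequence
copies = [original] * 4
copies[2] = "CACCATGCCGTC"   # one copy picks up a single substitution

print(recover(copies) == original)
```

With four copies, one bad letter at a given position is outvoted three to one. If two copies mutated at the same position the vote could tie, and this simple sketch would just pick whichever letter it saw first.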

Though the amount that can be stored in the bacteria is limited by the genome size, and the person who eventually reads the data would need to know the code that deciphers the nucleotide combinations and matches them with numbers and letters, Tomita’s technique essentially safeguards the information far, far into the unseen future. According to the CNN story: "Many people never even thought about storing data for thousands of years," Tomita said. "This may sound like a dream. But we're thinking hundreds of millions of years."

Now THAT would solve my insane worries. All we have to do is figure out how to make a "bacterial disc drive" to store my data for the next thousand years. Then where would I store it? The fridge? -- “Wait, Honey! Don’t throw out that rancid milk! Those bacteria have my movie database saved on them!”

1 comment:

Anonymous said...

This is not new. see: