National Archives sign at Kew Gardens Station


Science historians and digital curators grapple with digital data deluge

On August 28, 02009 the Wall Street Journal (online edition) published an article by Robert Lee Hotz titled "A Data Deluge Swamps Science Historians." Here are some notes I made from a printout of this article:

Dr. Jeremy Leighton John is the first curator of eManuscripts at the British Library and has assembled his own museum of dead media in order to access obsolete digital data storage media. At the time of the Wall Street Journal article, he was also on the tail end of a research program at the British Library called Digital Lives.

Update for February 21, 02011: I searched the British Library Web pages (10,000+) for "eManuscripts" and only came up with 4 hits. I'm not sure what this means, but it is rather surprising to me.

Digital curator Sayeed Choudhury at Johns Hopkins University is "principal investigator for a national consortium of data preservationists called the Data Conservancy." (hyperlink added)

Update for February 21, 02011:
On the News and Events page of the Data Conservancy site, the only news items between the awarding of a huge ($20 million!) grant in October 02009 for the Data Conservancy's work and today are: a link to the online article "Rethinking scientific data management" (October 27, 02010); two video interviews, of a student and a professor; an article from the Chronicle of Higher Education, "A Digital Library Guru Discusses New Rules on Sharing Scientific Data" (January 28, 02011); and a February 14, 02011 report of an award to Professor Christine L. Borgman for her academic research, although the report makes no mention of the Data Conservancy. She is also the professor whose video interview is among the news and events. So where has the $20 million gone over the past 18 months or so? On the Objectives page, the answer is given:

The first 18 months of DC were focused on prototyping, which have created the foundation for full-fledged preservation, improved conduct of science, and developed greater insights into current science and frameworks for new forms of science. In the next three years, DC will:

* Augment the open and flexible architecture for data curation and data synthesis.
* Extend the current data model or define new data models.
* Develop additional pilots and proofs of concept.
* Research the full problem space of CI development and cross-disciplinary science.
* Strengthen connection points between DC socio-technical research and infrastructure.
* Create a DC operational environment that provides data management support.
* Build capacity through continued community engagement of various stakeholders.
* Expand upon initial sustainability planning through case studies and further market analysis.

The University of New Mexico has a data-preservation network it calls DataONE. DataONE also received a $20 million grant from the U.S. National Science Foundation, the same amount the NSF gave the Data Conservancy. Ironically or not, DataONE Director William (Bill) Michener is quoted as saying "We lose an awful lot of data that is collected with public funds."

Update for February 21, 02011: According to its first-year report (August 23, 02010), DataONE appears to have accomplished a lot more than the Data Conservancy.

The article concludes with mentions of Japanese researchers who early in 02009 revealed a "memory chip designed to last for centuries" and California research physicists who in April 02009 "published the design of a digital device that could store data for a billion years, at least in theory."

Update for February 21, 02011: Here's an article from SEED Magazine about the billion-year data storage device.
