Ten Thousand Year Blog (August 02010-): ArchivePress, a UK blog-archiving project

2010-03-08

ArchivePress, a UK blog-archiving project

According to its front page, ArchivePress

is a blog-archiving project being undertaken by the University of London Computer Centre and the British Library Digital Preservation department, funded by the JISC Information Environment Programme under its Rapid Innovation Grants Call (03/09).

The project will explore practical issues around the archiving of weblog content, focusing on blogs as records of institutional activity and corporate memory. As an alternative to the web crawling/harvesting approach of the Internet Archive and the UK Web Archive, ArchivePress will test the viability of using RSS feeds and blog APIs to harvest blog content (including comments, embedded content and metadata). The archived content will be stored and managed using instances of WordPress, thereby maintaining the blogs’ native data structures, formats and relationships.

We hope to develop tools and methodology that will enable organisations to use simple, free, open source blogging software to manage a central archive of designated institutional blog outputs, even if they are spread over different blog hosts and platforms. The benefits of this approach will include:

targeted gathering of selected weblogs

improved reliability and authenticity of records

citable blog content with persistent identifiers

automated, ongoing harvesting, via newsfeeds

accessibility of content, using native blog interfaces

use of native web and database file formats, compatible with registry-based preservation activities.

Source: ARCHIVES mailing list, 02010 03 08
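
To make the feed-based approach described above more concrete, here is a minimal sketch of what harvesting posts from an RSS 2.0 feed might look like. It is not ArchivePress code: the sample feed, the blog.example.org URLs, and the harvest_items function are all hypothetical, and it assumes the feed carries full post bodies in the common content:encoded element. It uses only the Python standard library.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Namespace of the RSS "content" module, often used for full post bodies.
NS = {"content": "http://purl.org/rss/1.0/modules/content/"}

# Hypothetical feed, inlined so the sketch runs without network access.
SAMPLE_FEED = """<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Example Institutional Blog</title>
    <link>http://blog.example.org/</link>
    <item>
      <title>First post</title>
      <link>http://blog.example.org/2010/03/first-post</link>
      <guid isPermaLink="true">http://blog.example.org/2010/03/first-post</guid>
      <pubDate>Mon, 08 Mar 2010 09:30:00 +0000</pubDate>
      <content:encoded><![CDATA[<p>Hello, archive.</p>]]></content:encoded>
    </item>
  </channel>
</rss>
"""


def harvest_items(feed_xml):
    """Return one record (a dict) per <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    records = []
    for item in root.iterfind("./channel/item"):
        pub_date = item.findtext("pubDate")
        records.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # The guid can serve as a stable key to avoid re-archiving a post.
            "guid": item.findtext("guid", default=""),
            # Normalise the RFC 822 date to ISO 8601 when one is present.
            "published": (
                parsedate_to_datetime(pub_date).isoformat() if pub_date else None
            ),
            "content": item.findtext("content:encoded", default="", namespaces=NS),
        })
    return records


if __name__ == "__main__":
    for record in harvest_items(SAMPLE_FEED):
        print(record["published"], record["title"], record["link"])
```

A real harvester would fetch the feed on a schedule and store each record in the archive, skipping any guid it has already seen. Comments and embedded media would need separate handling, since many feeds expose them through other feeds or APIs.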
