Ten Thousand Year Blog (August 02010-): NetarchiveSuite Web harvesting software from Denmark

2010-08-26

NetarchiveSuite Web harvesting software from Denmark

According to the NetarchiveSuite Web site, the software was "developed by the two national deposit libraries in Denmark, The Royal Library and The State and University Library, and has been running in production, harvesting the Danish world wide web for three years. The Danish netarchive currently contains over 120 TB of data that are mirrored on two different geographical locations." It is open source software based on the Heritrix web crawler from the Internet Archive. More information is available on the netarchive.dk project site. I first took note of it on my Ten Thousand Year Blog on July 20, 02004.
