NetarchiveSuite Web harvesting software from Denmark
According to the Web site for the NetarchiveSuite software, it was "developed by the two national deposit libraries in Denmark, The Royal Library and The State and University Library, and has been running in production, harvesting the Danish world wide web for three years. The Danish netarchive currently contains over 120 TB of data that are mirrored on two different geographical locations." It's open source software based on the Heritrix web crawler from the Internet Archive. You can read more about it on the netarchive.dk project site. I first took note of it on my Ten Thousand Year Blog on July 20, 02004.