The open source JHOVE characterization tool has proven to be an important component of many digital repository and preservation workflows. However, its widespread use over the past four years has revealed a number of limitations imposed by idiosyncrasies of design and implementation. The California Digital Library (CDL), Portico, and Stanford University have received funding from the Library of Congress, under its National Digital Information Infrastructure and Preservation Program (NDIIPP) initiative, to collaborate on a two-year project to develop a next-generation JHOVE2 architecture for format-aware characterization.
Among the enhancements planned for JHOVE2 are:
* Support for four specific aspects of characterization: signature-based identification, feature extraction, validation, and rules-based assessment
* A more sophisticated data model supporting complex multi-file objects and arbitrarily-nested container objects
* Streamlined APIs to facilitate the integration of JHOVE2 technology in systems, services, and workflows
* Increased performance
* Standardized error handling
* A generic plug-in mechanism supporting stateful multi-module processing
* Availability under the BSD open source license
To help focus project activities we have recruited a distinguished advisory board to represent the interests of the larger stakeholder community. The board includes participants from the following international memory institutions, projects, and vendors:
* Deutsche Nationalbibliothek (DNB)
* Ex Libris
* Fedora Commons
* Florida Center for Library Automation (FCLA)
* Harvard University / GDFR
* Koninklijke Bibliotheek (KB)
* MIT / DSpace
* National Archives (TNA)
* National Archives and Records Administration (NARA)
* National Library of Australia (NLA)
* National Library of New Zealand (NLNZ)
* Planets project
The project partners are currently engaged in a public needs assessment and requirements gathering phase. A provisional set of use cases and functional requirements has already been reviewed by the JHOVE2 advisory board.
The JHOVE2 team welcomes input from the preservation community, and would appreciate feedback on the functional requirements and any interesting test data that have emerged from experience with the current JHOVE tool.
The functional requirements, along with other project information, are available on the JHOVE2 project wiki at http://confluence.ucop.edu/display/JHOVE2Info/Home. Feedback on project goals and deliverables can be submitted through the JHOVE2 public mailing lists.
To subscribe to the JHOVE2-TechTalk-L mailing list, intended for in-depth discussion of substantive issues, please send an email to (listserv at ucop dot edu) with an empty subject line and a message stating:
SUB JHOVE2-TECHTALK-L Your Name
Likewise, to subscribe to the JHOVE2-Announce-L mailing list, intended for announcements of general interest to the JHOVE2 community, please send an email to with an empty subject line and a message stating:
SUB JHOVE2-ANNOUNCE-L Your Name
To begin our public outreach, team members recently presented a summary of project activities at the iPRES 2008 conference in London, entitled "What? So What? The Next-Generation JHOVE2 Architecture for Format-Aware Characterization," reflecting our view of characterization as encompassing both intrinsic properties and extrinsic assessments of digital objects.
Through the sponsorship of the Koninklijke Bibliotheek and the British Library, we also held an invitational meeting on JHOVE2 following the iPRES conference as an opportunity for a substantive discussion of the project with European stakeholders.
A similar event, focused on a North American audience, will be held as a Birds-of-a-Feather session at the upcoming DLF Fall Forum in Providence, Rhode Island, on November 13. Participants at this event are asked to review closely the functional requirements and other relevant materials available on the project wiki at http://confluence.ucop.edu/display/JHOVE2Info/Home prior to the session.
Future project progress will be documented periodically on the wiki.
JHOVE2 project underway
Quoting from the announcement on DIGLIB (02008 11 06):