"The Email Parser ... migrates an email account and its messages into a single XML file using the Email Account XML Schema developed in collaboration with the North Carolina State Archives and the EMCAP project.
The CERP Email Parser migrates an email account in MBOX format into XML, using the schema to preserve the full body of messages, together with their attachments, and keeps intact the account’s internal organization (e.g., an Inbox containing subfolders labeled Policies, Special Events, and Projects). The CERP team successfully preserved email accounts from a variety of applications including Microsoft Outlook, AppleMail, LotusNotes, and Netscape. All email messages retain their full header content, in contrast to some tools produced in earlier research efforts.
The parser runs on a workstation in a virtual machine environment compatible with Windows, Macintosh, Linux, and some Unix platforms. CERP testing was limited to the Windows XP environment. The CERP Email Parser is licensed as open source software so that it may be used, supported, and enhanced by all organizations that adopt it.
The Email Parser is designed to address the task of preserving bodies of email, such as an account, without requiring access to the original email systems. Still, email accounts from active email systems may also be preserved using this tool. The CERP Email Parser will be featured in the pre-conference workshop “Achieving Email Account Preservation With XML” at the Society of American Archivists 2009 Annual Meeting this August."
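To make the MBOX-to-XML migration described in the announcement more concrete, here is a minimal Python sketch of the general idea. It is not the CERP Email Parser and does not use the Email Account XML Schema: the function name `mbox_to_xml` and the element names (`Account`, `Folder`, `Message`, `Header`, `Body`, `Attachment`) are hypothetical and chosen only for illustration. It assumes a single MBOX file standing in for one folder, and relies only on Python's standard `mailbox`, `base64`, and `xml.etree.ElementTree` modules.

```python
import base64
import mailbox
import xml.etree.ElementTree as ET


def mbox_to_xml(mbox_path, folder_name="Inbox"):
    """Convert one MBOX file into an illustrative XML tree.

    The element names are hypothetical; this is NOT the CERP schema.
    """
    account = ET.Element("Account")
    folder = ET.SubElement(account, "Folder", name=folder_name)

    box = mailbox.mbox(mbox_path)
    try:
        for message in box:
            msg_el = ET.SubElement(folder, "Message")

            # Keep every header in its original order, including repeats.
            for name, value in message.items():
                header_el = ET.SubElement(msg_el, "Header", name=name)
                header_el.text = str(value)

            body_el = ET.SubElement(msg_el, "Body")
            body_parts = []

            # walk() yields the message itself, then any nested MIME parts.
            for part in message.walk():
                if part.is_multipart():
                    continue  # container parts carry no payload of their own
                payload = part.get_payload(decode=True) or b""
                filename = part.get_filename()
                if filename:
                    # Store attachments as base64 so binary data survives in XML.
                    att_el = ET.SubElement(msg_el, "Attachment", name=filename)
                    att_el.text = base64.b64encode(payload).decode("ascii")
                else:
                    charset = part.get_content_charset() or "utf-8"
                    try:
                        body_parts.append(payload.decode(charset, errors="replace"))
                    except LookupError:
                        # Unknown charset label: fall back to UTF-8.
                        body_parts.append(payload.decode("utf-8", errors="replace"))

            body_el.text = "".join(body_parts)
    finally:
        box.close()

    return account
```

Calling `ET.tostring(mbox_to_xml("inbox.mbox"), encoding="unicode")` would serialize the result to a string. A real preservation tool would also need to handle nested subfolders, characters that are not legal in XML, and validation against the target schema, none of which this sketch attempts.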
Collaborative Electronic Records Project (CERP) email preservation parser available
The Collaborative Electronic Records Project (CERP), a partnership of the Smithsonian Institution Archives and the Rockefeller Archive Center that wrapped up in December 02008, has released its CERP Email Parser as an open source application. Here's more information from an announcement that has been circulating on various mailing lists since July 6, 02009: