RECMGMT-L Archives

Records Management

RECMGMT-L@LISTSERV.IGGURU.US

Subject:
From:
Mark Conrad <[log in to unmask]>
Reply To:
Records Management Program <[log in to unmask]>
Date:
Tue, 19 Oct 2010 14:20:33 -0400
The National Center for Supercomputing Applications (NCSA) has recently released new information about its research on the Data Format Description Language (DFDL). DFDL is a language for describing existing data formats, both binary and text, so that the content of a file can be viewed without the software that created it or an existing viewer. It is a draft standard specification from the Open Grid Forum (OGF) (http://forge.gridforum.org/projects/dfdl-wg). The NCSA researchers have worked for a number of years with the OGF working group that is developing the DFDL standard. The National Archives and Records Administration's (NARA) Center for Advanced Systems and Technologies (NCAST) has supported this work.
 
As part of their work, the NCSA researchers are developing a framework and prototype tools for accessing data in arbitrary file formats and presenting the interpreted information from those files in XML and RDF representations, which supports discovery and long-term preservation of the content.
 
"Preservation can be thought of as communication with the future. The records we preserve today need to be accessible and displayable by future technology. Beyond maintaining the accessibility of the raw bits of the digital data, preservation requires maintaining an ability to interpret the data as meaningful structures, relationships, and visual representations.
 
We are contributing to the development of a preservation system that would dramatically lower the per-file-format effort required for preservation. In particular, we are contributing to the development of a format description language (the Data Format Description Language) and format-independent parser (Defuddle) to support interpretation of arbitrary binary or ASCII formatted files in terms of well-defined logical models." (See http://cet.ncsa.uiuc.edu/projects/naraDefuddle.html)
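
To make the idea of a format-independent parser concrete, here is a minimal sketch in Python. It is only an illustration of the general approach: the description dictionary below is a hypothetical, simplified stand-in and is not DFDL syntax, and the function parse_to_xml is an invented name that does not reflect Defuddle's actual interface. The point is that one generic routine can turn binary data into XML when it is driven by a separate description of the format.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical, simplified format description (NOT real DFDL syntax).
# Each field is a (name, struct format code) pair.
DESCRIPTION = {
    "root": "reading",
    "byte_order": "<",             # little-endian, no padding
    "fields": [
        ("station_id", "H"),       # unsigned 16-bit integer
        ("temperature", "f"),      # 32-bit float
        ("label", "4s"),           # 4 ASCII bytes
    ],
}


def parse_to_xml(data: bytes, description: dict) -> ET.Element:
    """Interpret raw bytes using the description and return an XML element."""
    order = description["byte_order"]
    root = ET.Element(description["root"])
    offset = 0
    for name, code in description["fields"]:
        fmt = order + code
        (value,) = struct.unpack_from(fmt, data, offset)
        offset += struct.calcsize(fmt)
        if isinstance(value, bytes):
            value = value.decode("ascii")
        ET.SubElement(root, name).text = str(value)
    return root


# Build a sample binary record, then interpret it without any
# format-specific parsing code.
sample = struct.pack("<Hf4s", 42, 21.5, b"WV01")
element = parse_to_xml(sample, DESCRIPTION)
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```

Supporting a different file layout in this sketch means writing a new description rather than a new parser, which is the sense in which the per-file-format effort is lowered.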
 

The NCSA researchers have developed the first parsers to implement DFDL. Defuddle is a free, open-source DFDL parser. You can find it here:
 
http://sourceforge.net/projects/defuddle/
http://defuddle.bzr.sourceforge.net/bzr/defuddle/files.
 
You can find a review of Defuddle here:
 
http://cet.ncsa.uiuc.edu/publications/Review_of_Defuddle.pdf
 

The researchers are currently developing the second-generation DFDL parser, Daffodil. They plan to release it as open source in the near future.
 

You can find a report on the development of Daffodil here:
 
http://cet.ncsa.illinois.edu/publications/Daffodil-ANewDFDLParser.pdf
 
 
 
 
Mark Conrad
NARA Center for Advanced Systems and Technologies
NHA 
The National Archives and Records Administration
Erma Ora Byrd Conference and Learning Center
Building 494 Second Floor
610 State Route 956
Rocket Center, WV  26726

Phone: 304-726-7820
Fax: 304-726-7802
Email: [log in to unmask] 
