RECMGMT-L Archives

Records Management

RECMGMT-L@LISTSERV.IGGURU.US

Subject:
From:
Larry Medina <[log in to unmask]>
Reply To:
Records Management Program <[log in to unmask]>
Date:
Thu, 3 Apr 2008 08:13:39 -0700
> The fact that they have not been approved YET
> is not necessarily surprising, particularly given that OOXML just became an
> ISO standard, but what is better about TIFF or even BMP than XML
> particularly for Office-type documents?


There are two competing issues here.  One is preservation of an
"image of the content" and the other is preservation of a native/working
file copy.  Depending on the intent/purpose of the preservation, the choice
for format can be a mixed bag.  Most ERMS allow you to store in both native
and image format; the image (PDF, typically) is for viewing by those with
permissions, the native is for the originator or others provided access to
the native file to make further revisions.

There is a lot said about the need to consider non-proprietary formats for
long term (a much discussed term recently) preservation.  Given that much of
the content with administrative value created in an office environment has a
relatively short retention period (less than 10, and frequently less than 5
years), this is less of an issue.  The open source concept is being pushed
by many as a means of allowing ANY USERS to have access to the raw content
and presumably be able to modify/re-use/re-purpose it, but for the original
file, this is completely contrary to the concept of "record".

Statistics have shown that less than 3% of what is created has "permanent
value", and what DOES is not intended to be modified.  Even NARA's work with
the electronic records archive (ERA) project is set to ingest non-modifiable
content rather than native file formats as the preferred option.

When paper source documents are scanned for imaging purposes, most systems
scan to a TIFF format originally, which is in many cases subsequently
converted to PDF for storage in an ERMS. Some organizations continue to
store the "intermediate" TIFF images as part of a backup strategy, because
for the most part, if the scan was accurate, the TIFF image is
incontrovertible and can (if necessary) be called up and converted to PDF
again.  There are documented incidents where early PDF images are not able
to be properly rendered in later versions of Adobe Acrobat, so fortunately,
'long live the TIFF'.

And before someone throws the dart to burst the bubble, that's right... TIFF
and PDF are also proprietary formats, belonging to Adobe; however, Adobe has
done much to keep the code available and (sort of) open for use/access by
others.

I've seen a lot done to use an "XML wrapper" for native format files, even
some consideration of it for e-mail and attachments, but the problem I have
with this latest fiasco and the M$ XML is that it isn't plain vanilla XML.  What
was the problem with M$ simply adopting and using XML in its original form?

> And for the archivists out there, some of whom have weighed in on this, What
> IS your digital preservation format of choice? I don't want this to devolve
> into a "film vs. bits" discussion.


While you may not want to, if you speak to those in the scientific and
engineering communities, especially those who have requirements to maintain
complex files and images for "duration of active research plus 75 years", or
even (gasp!) permanently, you'll have to cross this line.

The options are to maintain TIFF or CALS image files, along with native
files, periodically migrate and convert to more current versions, and yes,
FILM as the preferred deep backup.  Although film holds only a raster image,
for complex designs decisions are even made to portray layers independently,
generate raster files, and film them.  In this case, the individual layers
can be vector converted to create a full working drawing when they are
re-assembled.  BUT THE FACT REMAINS, the film is truly incontrovertible.

> Instead, for born-digital documents, if
> it isn't an XML-based markup language, what do you recommend be used
> instead? Choose any type of documents - but include at least two. In other
> words, "TIFF" is a cop-out as so many file formats are not images; moreover,
> using TIFF for e.g. CAD drawings causes a loss of more than 90% of the
> functionality of the original (layers, views, external references, etc.) and
> so the archivist would I presume eschew such an approach.


I think it's important to keep in mind that "the archivist" has one primary
concern: that they can reproduce something that represents the "record" as
it was originally created.  And there's no question that archival theory
goes far beyond that, especially when it comes to historic documents,
records and artifacts.  But the majority of what is being created digitally
at this point doesn't have archival value yet, and that which is thought to
have it is being converted to a flat file of some sort before it's
considered for preservation.

Not an Archivist, didn't even sleep at a Holiday Inn Express last night, but
responsible for tons of content that is more than 50 years old and still
active, and even more that has been created during the 'digits to dust' era
that is in legacy applications and proprietary formats that are no longer
supported simply because the "best and brightest minds" of the time made
decisions that the digital was going to be around forever.  Good thing we
decided to film a lot of that when it was created, and an even better thing
that we decided to properly process and store the film.

Larry
-- 
Larry Medina
Danville, CA
RIM Professional since 1972

List archives at http://lists.ufl.edu/archives/recmgmt-l.html