RECMGMT-L Archives

Records Management

RECMGMT-L@LISTSERV.IGGURU.US

From: Brian Tuemmler <[log in to unmask]>
Reply To: Records Management Program <[log in to unmask]>
Date: Wed, 6 Jun 2012 23:03:52 +0000
Carol, 

I like your question.  Most of the work I do is helping clients get a handle on content stored on shared network drives before migration to an ECRM, and cleaning up duplicates is part of that. Hopefully some of these factoids will still be useful for what you are seeking. 

I have come across a few companies that do have a category in their retention schedule for "drafts and duplicates" - usually with a one-year retention.  Sometimes it is in the context of general admin stuff. 

On shared drives, when you find duplicates that span departments or workgroups, that often identifies an opportunity for a collaborative site between those areas (e.g., if Dept A and Dept B hold many duplicates of a similar nature, they are probably compensating for a general lack of sharing between the two departments). Consider a SharePoint site to address this need. 

To find duplicates on a network share, you can either compare names, dates, and sizes, or you can run a utility that computes each file's "MD5 hash."  (If the hash values of two files match, the contents are identical for all practical purposes, even if the names are different; a short script sketch follows the numbers below.)  I have found:
*	31% of the time that a duplicate file name occurs, it is not a duplicate file
*	97% of the time, files with identical dates and sizes will also be duplicates
*	10% of the time, a duplicate will say "Copy of" in the file name or folder
*	95% of the files that say "Copy of" are duplicates
*	An average duplicate set will have 3.14 surplus copies

But be careful:
*	85% of duplicates are created by humans, 15% by machine (temp files, backups, log files)
*	14% of duplicates cannot be deleted because they are part of an application, database, or web-file collection (like navigation buttons), or they represent a "complete set" of delivered files whose deletion would ruin the larger set. 

The numbers come from a single large, yet typical, shared drive.
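
If you want to try the hash approach before buying a tool, the logic is simple enough to script. What follows is only a minimal Python sketch of the idea, not any particular product: it walks a share, prefilters by file size (only same-size files can be identical), then confirms duplicate sets by MD5 hash. The share path is a placeholder you would replace with your own.

import hashlib
import os
from collections import defaultdict

def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not exhaust memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def find_duplicates(root):
    # Pass 1: group by size -- only files of equal size can be duplicates.
    by_size = defaultdict(list)
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # skip unreadable files
    # Pass 2: within each size group, confirm with an MD5 hash.
    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[md5_of(path)].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]

if __name__ == "__main__":
    # Placeholder share path -- substitute your own network drive.
    for dup_set in find_duplicates(r"\\fileserver\shared"):
        print(dup_set)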

You can configure tools to delete all but the most current copy, all but the oldest, or all but the one with the shortest file path - or apply a combination of rules, or deal with them manually.  
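
To illustrate what one of those rules looks like in practice (again, only a sketch of the idea, not how any specific product behaves - the rule names are mine), here is a small function that builds on the find_duplicates() sketch above and returns the surplus copies in a set, keeping one file per rule. Nothing is deleted; you still review the list manually.

import os

def surplus_copies(dup_set, keep="newest"):
    """Return the copies in one duplicate set that could be removed.

    keep="newest"        keeps the most recently modified file
    keep="oldest"        keeps the earliest modified file
    keep="shortest_path" keeps the copy with the shortest full path
    """
    if keep == "newest":
        keeper = max(dup_set, key=os.path.getmtime)
    elif keep == "oldest":
        keeper = min(dup_set, key=os.path.getmtime)
    elif keep == "shortest_path":
        keeper = min(dup_set, key=len)
    else:
        raise ValueError(f"unknown rule: {keep}")
    return [path for path in dup_set if path != keeper]

# Example use with the earlier sketch:
#   for dup_set in find_duplicates(root):
#       print(surplus_copies(dup_set, keep="shortest_path"))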

Brian Tuemmler
Gimmal Group
[log in to unmask] 

