A friend brought to my attention that the URL I cited in the attached posting
wasn't working. I contacted the Society of American Archivists, and they
fixed the link, but in the process they changed the URL - which is now
http://www2.archivists.org/sites/all/files/Case13Final.pdf
Elizabeth W. Adkins, CRM, CA
On Tue, 12 Apr 2011 10:56:42 -0400, Elizabeth W Adkins
<[log in to unmask]> wrote:
>Capturing websites is tricky, particularly if the website uses Flash
>technology, or if the information on the site is fed from a database. It
>may be that the foundation's website contains static content that is not
>changed very often, which should help. But then there is also the issue
>of the crawling technology that is deployed to conduct the captures;
>typically organizations have IT defenses against unauthorized crawling,
>and you will have to deal with those programs before you can capture
>everything in a crawl.
>
>It so happens that the Society of American Archivists has just posted a
>case study from the University of Michigan regarding website capture:
>http://www2.archivists.org/sites/all/files/FinalCase13.pdf
>
>The University of Michigan contracted with the University of California
>for their website capture services, but recently a couple of companies
>have surfaced that offer website capture services. The two that I know of
>are both based out of the U.K.; I'm sure that there are others starting to
>surface in other parts of the world, including the U.S. One company is
>called Cloud Testing Limited; here is their website:
>http://www.website-archive.com/. Another is called Hanzo Archives; here
>is their website: http://www.hanzoarchives.com/.
>
>Nobody that I know of has completely overcome all the technological
>barriers to accurate website captures, but there is some good progress
>being made.
>
>Elizabeth W. Adkins, CRM, CA
>
>The opinions expressed above are my own, and do not reflect those of my
>employer.
>
>
>
>From: Christine Martin <[log in to unmask]>
>To: [log in to unmask]
>Date: 04/12/2011 09:33 AM
>Subject: archiving web sites
>
>
>
>
>
>One of the organizations I work for (a private foundation in Chicago) is
>trying to decide whether (and how) to digitally preserve (or "archive") its
>web site.
>
>
>
>The foundation does not take money or handle financial transactions over its
>web site. The web site contains primarily publications, e.g., news
>releases, annual reports, newsletters, and the like.
>
>
>
>My question is: What software or procedures do you use to preserve your
>organization's web site as it changes over time? Our web developer has
>suggested that we use Wget (a web crawler) to capture our web site as it
>appears to the public and then use Subversion (another software product,
>used for version control?) to catalog any aspects of the web site that have
>changed. In this way, we store only the base web site plus incremental
>changes, as opposed to storing multiple copies of portions of the web site
>that have not changed.
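
A minimal sketch of the Wget-plus-Subversion approach described above, written in Python. It only builds and prints the two sets of commands rather than running them. The URL http://www.example.org/, the directory name site-archive, and the function names are hypothetical placeholders, and the directory is assumed to already be a Subversion working copy created with svn checkout.

```python
from __future__ import annotations

import shlex
from pathlib import Path


def wget_mirror_command(site_url: str, dest_dir: Path) -> list[str]:
    """Build a wget command that captures the public view of a site."""
    return [
        "wget",
        "--mirror",            # recursive download with timestamp checks
        "--page-requisites",   # also fetch images, CSS, and scripts
        "--convert-links",     # rewrite links so the copy browses offline
        "--adjust-extension",  # save HTML pages with an .html suffix
        "--no-parent",         # stay at or below the starting URL
        "--directory-prefix", str(dest_dir),
        site_url,
    ]


def svn_snapshot_commands(working_copy: Path, message: str) -> list[list[str]]:
    """Build the svn commands that record a new capture as one revision."""
    return [
        # Schedule any newly captured files for addition.
        ["svn", "add", "--force", str(working_copy)],
        # Commit; Subversion stores only what changed since the last revision.
        ["svn", "commit", "-m", message, str(working_copy)],
    ]


if __name__ == "__main__":
    site = "http://www.example.org/"   # hypothetical site URL
    working_copy = Path("site-archive")  # hypothetical working copy
    commands = [wget_mirror_command(site, working_copy)]
    commands += svn_snapshot_commands(working_copy, "Web site capture")
    for command in commands:
        # Print instead of executing; subprocess.run(command, check=True)
        # would run each step for real.
        print(shlex.join(command))
```

One caveat with this sketch: Wget does not delete local copies of pages that have disappeared from the live site, so removed pages would need to be handled separately before each commit.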
>
>
>
>Have any of you done (or attempted) anything similar? If so, I would love
>to hear what you did and how it went. This is fairly new territory to me,
>and any words of advice, warning, or encouragement would be most welcome.
>
>
>
>Thank you.
>
>
>
>Sincerely,
>
>
>
>Christine Martin
>
>Contract records manager
>
>Des Plaines, IL
>
>224-636-2457 (cellular)
>
>
>
>
>
>
>List archives at http://lists.ufl.edu/archives/recmgmt-l.html
>Contact [log in to unmask] for assistance
>To unsubscribe from this list, click the below link. If not already
>present, place UNSUBSCRIBE RECMGMT-L or UNSUB RECMGMT-L in the body of the
>message.
>mailto:[log in to unmask]