Executive Summary

The following questions framed the Manuscript Digitization Demonstration Project: What type of image is best suited for the digitization of large manuscript collections, especially collections consisting mostly of twentieth-century typescripts? What level of quality strikes the best balance between production economics and the requirements set by future uses of the images? Will the same type of image that offers high-quality reformatting also provide efficient online access for researchers?

A project steering committee drawn from several Library of Congress units determined that the Federal Theatre Project documents selected for this activity were typical of large twentieth-century manuscript collections at the Library of Congress. The normal preservation actions that would be applied to collections of routine manuscript documents include (1) rehousing the original documents to conserve them and (2) creating a preservation microfilm copy.

After discussion, the committee reached consensus that the importance of these manuscripts lies in their information value. A preservation microfilm would be judged successful if the documents were legible and if a researcher could gain a reasonable sense of what the documents looked like. But there would be no expectation that the microfilm image would offer a fully realized facsimile of the original.

During the project's Phase I, the steering committee viewed a number of sample digital images produced by Picture Elements, the project consultants, and selected two image types for testing. Grayscale and color images were selected for the highest quality reproduction, called preservation-quality in this project. The committee agreed to tolerate some aesthetic degradation of the images so long as legibility was not impaired and agreed that “lossy” JPEG compression could be applied to the preservation-quality images. For the access-quality images, the model of microfilming projects influenced the outcome, and bitonal images were deemed useful as a supplement to the preservation-quality images. Bitonal images resemble the high-contrast images familiar to microfilm users, offer small file sizes (for ease of use in a computer network environment), and print efficiently and cleanly on a laser printer.

During Phase II, Picture Elements produced a test bed of 20,000 images representing two versions of each of 10,000 pages. The image specifications were:

The Library found that the grayscale and color preservation-quality images were generally very satisfactory, that some bitonal access images were satisfactory, and that other bitonal access images were unsatisfactory. There are two reasons for dissatisfaction with some of the bitonal access images. First, some had lost legibility. This was largely related to the original documents themselves, many of which consisted of typed carbon copies on onionskin paper, marked up in lead pencil. Such documents are nearly impossible to reduce to a clean bitonal image in which all marks are retained.

The second reason for dissatisfaction with the bitonal images applied to the entire set and reflects the exigencies of access in the World Wide Web environment. World Wide Web browsers do not natively display TIFF-format images with ITU-T T.6 (Group 4) compression, so the Library found that it needed to produce an additional set of access images for the World Wide Web. These were reduced-scale tonal images in the GIF format, produced by running a batch conversion process on the previously scanned preservation-quality images.

The project's findings are summarized in Section 14.
