SECTION C

DESCRIPTION/SPECIFICATION/WORK STATEMENT

C.1

BACKGROUND

The central mission of the Library of Congress (Library/LC) is to assemble, preserve, and provide access to a universal collection representing human knowledge, in order to serve the United States Congress and the American people. During the next several years, the provision of access to this collection will increasingly be accomplished via online networks and the Library of Congress will work cooperatively with other libraries and archives to establish a national digital library. To support its growing role in online access, the Library has established the National Digital Library Program (NDLP), which has as its primary focus the conversion of historical collections to digital form. During the next five years, the Library plans to convert as many as five million of its more than 100 million items. The material to be converted includes books and pamphlets, manuscripts, prints and photographs, motion pictures, microfilm and sound recordings. As the national library of the United States, the Library of Congress is committed to establishing and maintaining standards and practices that will support the development of the national digital library.

C.2

SCOPE OF WORK

The Library of Congress requires the conversion of a variety of LC archival textual collections, books, and book-like printed matter to electronic form. This will include converting printed, handwritten, or illustrated pages into (1) raster-scanned digital images of the original pages and (2) machine-readable texts encoded with Standard Generalized Markup Language (SGML). Generally, manuscripts and similar collections shall be converted to digital image-only sets, while books and other longer narrative works may require conversion both to an image set and to an SGML-encoded text file. The resulting digital image and text deliverables will be incorporated into computerized presentations that are part of the Library's National Digital Library Program.

Work shall be performed under either LOT 1 or LOT 2 requirements, which may be awarded as one (1) or two (2) separate contracts. Work shall proceed as small- or medium-scale projects of coherent groups of related materials under task orders.

LOTS 1 and 2 are differentiated by the types of materials to be digitized. LOT 1 includes the majority of the work to be completed: unbound materials (i.e., items made up of separate sheets of paper) and a variety of small bound materials (i.e., books and pamphlets that are in good physical condition (robust), exclude color, and have smaller pages). In addition, LOT 1 includes the conversion of digital images to SGML-encoded, machine-readable texts. LOT 2 consists of large and color bound materials (i.e., books, pamphlets, and bound manuscripts that may be cumbersome, fragile, rare, include significant color elements, or have larger pages).

C.2.1

Types of Original Materials

During the next few years, the focus for the NDL Program will be American historical collections. Plans are being developed to digitize some of the following materials:

LOT 1--

Uncopyrighted sheet music from the 19th century
Theater playbills
Documents from the first 14 Congresses
Journals of the Continental Congress
Elliot's Debates on the Ratification of the Constitution
Farrand's Records of the Federal Convention
Native American legal materials
Reports of slavery trials
Selected books and periodicals

LOT 2--

Letters in bound volumes in the Presidential Papers
Bound holograph music manuscripts
Oversize biographical compendia
Books illustrated with color lithographs

Items in the collections described above are often unique and valuable, and most originals shall not be removed from the Library of Congress. Thus, the initial capture of most materials shall take place onsite at the Library. Many items are bound and cannot be disbound for scanning. Post-scanning processing and text conversion shall take place off-site at the contractor's facility.

C.2.2

Image Capture and Delivery - LOT 1 and LOT 2

For both LOTs 1 and 2, the digitization of Library collections demands more than the production of high-quality images in the specified file formats. When delivered by the contractor, the sets of images and texts must also be coherently and logically named and/or numbered, placed in delivery directories with prescribed characteristics (see C.10), and accompanied by a carefully maintained scanning log and printed directory list. After the images are loaded into the Library's retrieval system, these filenames and directories will link the images to bibliographic records (computer catalog "cards") or to finding aids (not unlike the yellow pages in a telephone directory).

C.2.3

Conversion and SGML-Encoding of Texts - LOT 1

In 1992 the Library, as part of its American Memory pilot digital conversion project, developed a Document Type Definition (DTD) for the conversion of historical texts to machine-readable form. This DTD follows the Text Encoding Initiative (TEI) approach for the use of Standard Generalized Markup Language (SGML) in the conversion of historical materials. Text conversion shall be performed in accordance with the existing DTD. Text conversion is only required under LOT 1.

C.2.4

Other Activities

C.2.4.1

Contract Startup/Testing Activity

Due to the complex, interrelated technical elements associated with this contract, work will begin with an eight-week (LOT 1) or five-week (LOT 2) startup and testing phase that provides a forum to address and finalize the definition of technical elements (see C.13 and CLIN B.5.08).

C.2.4.2

Related Services

During performance of the tasks, a variety of incidental and related services shall be required (see C.12) for both LOT 1 and LOT 2. These may include photocopying materials prior to scanning, printing quantities of digital images, and carrying out miscellaneous custom programming or computer processing functions related to the unique characteristics of the task at hand.

C.2.4.3

Workflow Tracking System

The Library will develop a workflow tracking system that tracks the flow of materials through the digitization process. The system, which will connect the Library and several contractors via computer networks, will track the progress of each batch of scanned materials from the time of scanning to the payment for services rendered for that batch. This tracking system will be used for both LOT 1 and LOT 2. (See also Section J, Attachment 9.)

C.3

LIBRARY FURNISHED MATERIALS AND TRAINING

Unless otherwise noted, Library-furnished materials shall apply to both LOT 1 and LOT 2.

C.3.1

Multiple Scanning Facilities

The Library will provide space, electricity, local telephone service, and other items determined necessary and as agreed to prior to award. The work may entail the use of multiple crews at multiple locations at the Library. Facilities will be provided at appropriate locations (see also C.4.1).

C.3.2

Identifying Targets

For each item to be scanned, the Library will provide a paper target to help identify the materials to be scanned. The target shall be the first image scanned for each item and shall be given a control number of zero, padded with leading zeros to the number of digits required by the naming scheme to be used. The type, format, and resolution of the identifying target shall be the same as those specified for the document or volume the target identifies. For items not to be converted to SGML-encoded, machine-readable texts, the target will contain the item name to be used for naming the digital files, the filenaming pattern to be used, the title of the document, and the title of the collection to which the item belongs. For materials to be converted to SGML-encoded, machine-readable texts, the target will contain all of the information needed to create the document's TEI header for the SGML text. Examples of typical targets are provided in Section J, Attachment 3. Note that, for this contract, a file folder containing multiple documents may be considered to be a single item.
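For illustration only, the zero control number for an identifying target might be formed as sketched below. The function name and the four-digit width are assumptions made for this example; the actual number of digits is set by the naming scheme used for the task (see C.10).

```python
def control_number(sequence: int, digits: int) -> str:
    """Return a sequence number padded with leading zeros to the given width."""
    if sequence < 0 or digits < 1:
        raise ValueError("sequence must be >= 0 and digits must be >= 1")
    text = str(sequence).zfill(digits)
    if len(text) > digits:
        raise ValueError("sequence does not fit in the requested number of digits")
    return text


# The identifying target always carries control number zero.
# digits=4 is an assumed width, not a value taken from the naming scheme.
target_number = control_number(0, digits=4)
```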

C.3.3

DTD, Tag Library, and Keying Instructions - LOT 1 only

The Library will provide the SGML DTD for the encoding of texts designated for conversion. This will be accompanied by documentation of tag definitions and usage, and specific keying instructions.

C.3.4

Training in Safe Handling

Prior to onsite scanning, staff members from the Library's Conservation and Binding Offices will provide a two-hour training session to contractor scanning personnel on procedures to be used in the physical handling of original items in all phases of the capture workflow. After this initial training session, all replacement scanning personnel shall be fully trained prior to beginning work.

C.4

SAFE HANDLING AND SCANNING OF SOURCE MATERIALS - LOT 1 AND LOT 2

C.4.1

Scanning Locations and Equipment

The contractor shall provide and maintain all equipment and associated software for the conversion work. The equipment used for initial image capture must be determined to be non-damaging by the Library's Conservation Office prior to contract award. The majority of the items to be converted cannot be removed from the Library of Congress; thus, most scanning shall take place at the Library, in work space furnished by the Library. Scanning facilities at the Library will vary according to the source, condition, and value of the collection materials. Facilities will be provided in the following locations, as determined prior to issuance of each task order:

The contractor shall deliver and set up suitable equipment and supplies at the Library during the period(s) when scanning is to take place. The contractor is responsible for all equipment left onsite at the Library. When contractor equipment is to be idle for an extended period (more than three (3) weeks if located in the NDLP office or a curatorial processing section and more than one (1) week if located in a reading room) or when the scanning projects have been concluded, the equipment shall be removed from the Library by the contractor. All such activities shall occur at times pre-approved by the Library. The Library cannot assure the provision of a locked facility solely for the use of the contractor equipment.

C.4.2

Off-site Scanning

Under special circumstances and only when indicated and included in a task order, the Library may permit materials to be taken off-site for scanning. The contractor shall provide a facility and equipment that will secure, protect, and not damage the materials.

C.4.3

Onsite Scanning Hours

Most of the scanning work shall be done on site at the Library during regular business hours Monday through Friday. The contractor may be required to furnish scanning services during evening, early morning or weekend hours. Scanning at the Library shall generally be performed during public hours, between 8:30 a.m. and 5:00 p.m. or 9:00 p.m. (depending on the day of the week). In some cases, depending on the scanning location, the Library may permit additional hours between 6:30 a.m. and 9:30 p.m. in which scanning crews may be scheduled to work in shifts. The Library is closed on Sundays.

C.4.4

Scanning Personnel Requirements

A two-person crew shall be required when scanning all bound materials and when scanning unbound materials that must be located, carefully removed from, and returned to file folders. One operator working alone shall be permitted only for scanning of specified unbound materials (to be defined prior to issuance of task order). It is estimated that, with hard-to-scan materials such as may be encountered under this contract, capture rates ranging from about 200 to 600 images per day can be achieved. In order to capture quantities of images in the ranges outlined in Schedule B, multiple crews will be required. The contractor shall provide sufficient equipment and personnel to achieve such levels of image capture.
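The arithmetic behind the need for multiple crews can be sketched as follows. The task size and schedule in the example are hypothetical and are not taken from Schedule B; only the 200 and 600 images-per-day figures come from the estimate above.

```python
import math


def crews_needed(total_images: int, working_days: int, images_per_crew_day: int) -> int:
    """Smallest number of crews able to capture total_images within working_days."""
    if total_images < 0 or working_days < 1 or images_per_crew_day < 1:
        raise ValueError("invalid planning inputs")
    return math.ceil(total_images / (working_days * images_per_crew_day))


# Hypothetical task: 50,000 images to be captured in 60 working days.
crews_at_low_rate = crews_needed(50_000, 60, 200)   # about 200 images per crew per day
crews_at_high_rate = crews_needed(50_000, 60, 600)  # about 600 images per crew per day
```

Under these assumed figures, the estimate ranges from two crews at the high capture rate to five crews at the low capture rate.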

C.4.5

Scanning Equipment and Safe Handling

The equipment (including lights) used for all image capture shall not damage original materials, nor shall the manner of its use cause damage. All scanning equipment is subject to the approval of the Library's Conservation Office prior to contract award. Rough handling or the placement of stress on the original, especially the binding of a book, is unacceptable. Avoiding damage from handling or equipment shall take priority over other requirements, including the capture of subtleties of printing or writing on originals.

C.4.6

Book Cradle Design and Other Support Structures - LOT 1 AND 2

A book cradle is required for all materials in LOT 2 and for bound materials in handling category H3 of LOT 1. Support is required for unbound materials in handling categories H8 and H9 in LOT 1.

C.4.6.1

Support for Bound Materials

For image types which require the use of a book cradle or support of some kind to prevent damage to the material while scanning, the following requirements apply:

For further information about the areas of a book which require special support, refer to Section J, Attachment 10.

C.4.6.2

Support for Unbound Materials

Certain unbound materials, such as folded sheets of music, may require other types of support. For example, fragile sheet music which has been folded for long periods of time has a tendency to tear at the fold. These types of folded sheets shall not be scanned with the crease pressed flat against the scanning bed. While these sheets can normally be inverted and scanned page-by-page on a book-edged scanner and sometimes on a typical flatbed scanner, the area or page which is not being scanned must be supported to prevent damage or undue stress to the crease or to the pages themselves. The contractor shall provide a support mechanism which will accommodate these requirements. This support structure need not be elaborate, but must be functionally adequate to meet the requirements.

C.4.7

Operation of Equipment

Contractor personnel shall perform all handling and scanning labor which includes removing items from storage containers (usually in the case of unbound materials) one at a time, performing the scanning and associated record keeping, and replacing the items in the containers from which they were removed in the same order in which they were found. Library of Congress staff members may be present only to monitor that materials are properly handled.

C.4.8

Resolution Targets

In order to verify the calibration of the scanning equipment and to ensure the best possible images, the Library requires that a resolution target be scanned. The Library requires the use of the IEEE Std 167A-1987 resolution target. The target shall be scanned for each machine at the start of every fifth batch or every five (5) working days, whichever comes first. An image of the target shall be contained in the delivery batch with which it was scanned. The target images shall be named according to the specifications in C.10. The target shall be scanned at the same resolution and pixel depth as the images in that batch. If images of different resolutions and pixel depth are to be contained in that batch, then the target shall be scanned at each resolution and pixel depth. For example, if the entire batch consists of 300 dpi bitonal images, then the target need only be scanned as a 300 dpi bitonal image. If the batch contains a mixture of 300 dpi bitonal images and 300 dpi grayscale images, then the target shall be scanned as both a 300 dpi bitonal image and a 300 dpi grayscale image.
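The two rules above (when a target scan is due, and at which settings the target must be scanned) can be sketched as shown below. This is one possible reading, not a prescribed implementation: the function names, the counters, and the representation of each image as a (dpi, bits per pixel) pair are assumptions made for the example.

```python
def target_scan_due(batches_since_last_target: int, working_days_since_last_target: int) -> bool:
    """True when five batches or five working days have passed, whichever comes first."""
    return batches_since_last_target >= 5 or working_days_since_last_target >= 5


def target_scans_required(image_settings):
    """Return the distinct (dpi, bits_per_pixel) pairs at which the target must be scanned.

    image_settings holds one (dpi, bits_per_pixel) pair per image in the batch,
    with 1 bit for bitonal, 8 bits for grayscale, and 24 bits for color.
    """
    return sorted(set(image_settings))


# A batch of only 300 dpi bitonal images needs a single target scan.
bitonal_batch = target_scans_required([(300, 1), (300, 1), (300, 1)])

# A batch mixing 300 dpi bitonal and 300 dpi grayscale images needs two.
mixed_batch = target_scans_required([(300, 1), (300, 8), (300, 1)])
```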

C.4.9

Fragile Materials

Some materials, both bound and unbound, may be designated by the Library's Conservation Office as fragile and require special handling. The Library will provide instruction for special handling techniques for all instances of fragile items.

C.4.10

Damage to Original Documents/Materials

Preventing damage to original documents shall be the primary concern during scanning. While most of the documents to be scanned shall be sturdy enough to be scanned when handled according to the directions of the Library's Conservation Office, there may be times when it is not possible to determine, in advance, potential damage to the original source document during the scanning process. In the event that any damage to an original occurs during the initial capture, the scanning technician shall cease scanning that original and shall seek assistance from a Library representative. At a minimum, such damage shall include the breaking of the book spine, pages coming out of the original binding, and the cracking of brittle pages. Instructions on how to recognize damage will be included in the Library's training on the safe handling of originals and shall be followed at all times.

C.5

GENERAL IMAGE TYPES, CHARACTERISTICS, AND REQUIREMENTS - LOT 1 AND LOT 2

A raster-scanned image of each page or sheet and an image of each identifying target of various bound and unbound materials shall be produced and delivered in a separate file in accordance with specifications.

For manuscript materials, some of the grayscale or color images to be produced will serve as provisional preservation-quality images, for use by the Library in its continued investigation of digital reformatting for preservation. When this is required, the contractor shall produce both a grayscale or color preservation-quality image and a derivative bitonal access image. The requirement for preservation quality images shall be stated in the task order.

The following chart provides a summary of the image specifications for both LOT 1 and LOT 2. Image types and resolutions will be specified prior to the issuance of a task order and, whenever possible, like materials will be grouped together so that production efficiencies may be achieved.

Image Type | Description | Format/Compression | Comment | Resolution (DPI) | Abbreviation for Image Spec
Bitonal | 1 bit per pixel, without special treatment of halftones | TIFF files, ITU Group 4 compression | Produced by direct scanning of bound and unbound materials | 200, 300, 400 | 2B, 3B, 4B
Bitonal | 1 bit per pixel, with special treatment of halftones | TIFF files, ITU Group 4 compression | Produced by direct scanning of bound and unbound materials | 200, 300, 400 | 2BH, 3BH, 4BH
Bitonal | 1 bit per pixel, images derived from grayscale or color images | TIFF files, ITU Group 4 compression | Produced by post-processing grayscale or color images | 300 | 3DB
Bitonal | 1 bit per pixel, images of segments of larger images, w/o special treatment of halftones | TIFF files, ITU Group 4 compression | Produced by post-processing larger images | 200, 300 | 2BS, 3BS
Grayscale | 8 bits per pixel | JFIF files, JPEG compression | Produced by direct scanning of bound and unbound materials | 200, 300 | 2G, 3G
Grayscale | 8 bits per pixel, segments of larger images | JFIF files, JPEG compression | Produced by post-processing larger images | 200, 300 | 2GS, 3GS
Color | 24 bits per pixel | JFIF files, JPEG compression | Produced by direct scanning of bound and unbound materials | 200, 300 | 2C, 3C
Color | 24 bits per pixel, segments of larger images | JFIF files, JPEG compression | Produced by post-processing larger images | 200, 300 | 2CS, 3CS

Refer to C.6 - C.9 for specific image requirements for each category of material to be digitized.
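As an informal aid to reading the abbreviations in the chart, the sketch below expands each abbreviation into its resolution, image type, pixel depth, and file format. The table contents are taken from the chart; the names `ImageSpec`, `IMAGE_SPECS`, and `lookup_image_spec` are hypothetical and exist only for this example.

```python
from typing import NamedTuple


class ImageSpec(NamedTuple):
    image_type: str
    bits_per_pixel: int
    file_format: str
    dpi: int


TIFF_G4 = "TIFF, ITU Group 4"
JFIF_JPEG = "JFIF, JPEG"

# (suffix, image type, bits per pixel, format, resolutions listed in the chart)
_CHART_ROWS = [
    ("B", "bitonal", 1, TIFF_G4, (200, 300, 400)),
    ("BH", "bitonal with halftone treatment", 1, TIFF_G4, (200, 300, 400)),
    ("DB", "derivative bitonal", 1, TIFF_G4, (300,)),
    ("BS", "bitonal segment", 1, TIFF_G4, (200, 300)),
    ("G", "grayscale", 8, JFIF_JPEG, (200, 300)),
    ("GS", "grayscale segment", 8, JFIF_JPEG, (200, 300)),
    ("C", "color", 24, JFIF_JPEG, (200, 300)),
    ("CS", "color segment", 24, JFIF_JPEG, (200, 300)),
]

# The leading digit of each abbreviation is the resolution in hundreds of dpi.
IMAGE_SPECS = {
    f"{dpi // 100}{suffix}": ImageSpec(image_type, bits, file_format, dpi)
    for suffix, image_type, bits, file_format, resolutions in _CHART_ROWS
    for dpi in resolutions
}


def lookup_image_spec(abbreviation: str) -> ImageSpec:
    """Return the chart entry for an abbreviation such as '3BH'."""
    try:
        return IMAGE_SPECS[abbreviation.upper()]
    except KeyError:
        raise ValueError(f"unknown image spec abbreviation: {abbreviation!r}") from None
```

For example, `lookup_image_spec("3BH")` describes a 300 dpi, 1-bit TIFF image with halftone treatment, while an abbreviation that does not appear in the chart, such as "4G", raises an error.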

C.5.1

Bitonal Images

This section refers specifically to image types 2B, 3B, and 4B from the chart above.

The contractor shall have the capability to produce bitonal images of all bound and unbound materials. Bitonal images shall have a pixel depth of 1 bit-per-pixel and shall generally be scanned at resolutions of 200, 300, or 400 dots per inch (dpi) depending on the size of the original and the scanner type. For example, certain large pages in bound items shall generally be scanned at the lowest resolution. The images shall be stored as an "Intel" TIFF (Tagged Image File Format) file, with the header content specified. The compression algorithm shall be ITU (formerly CCITT) Group 4.

The initial-capture system shall include dynamic thresholding or a similar feature in order to capture both dark-imprint typing and light-imprint pencilled handwriting on a manuscript page or similar item. The most challenging types of dark- and light-imprint pages, typically found in unbound manuscript collections, shall be captured as grayscale or color images as described below.

C.5.1.1

TIFF Version

TIFF version 5.0 has been determined to be satisfactory and shall be acceptable; however, subject to testing, version 6.0 (or later) may be acceptable.

C.5.1.2

TIFF File Header Requirements

The Library requires that "typical" or "expected" data be provided for most TIFF tags (normally, the data supplied by software default settings). In addition, the contractor shall include additional information in the 269, 315, and 306 tags. The requirements for the TIFF headers are described in Section J, Attachment 5. The Library has used varying approaches for the use of the TIFF header tags 282, 283, and 296.
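A contractor might spot-check delivered files with something like the sketch below, which reads the byte-order mark and the tag numbers in the first image file directory (IFD) of a TIFF file. It is a minimal illustration under stated assumptions: the function names are hypothetical, only tag presence is checked, and the required tag contents remain those defined in Section J, Attachment 5. The tag names in the comments are the standard TIFF names for those tag numbers.

```python
import struct

# Tag numbers named in this section, with their standard TIFF names.
ADDITIONAL_INFO_TAGS = {269: "DocumentName", 306: "DateTime", 315: "Artist"}
RESOLUTION_TAGS = {282: "XResolution", 283: "YResolution", 296: "ResolutionUnit"}


def inspect_tiff_header(data: bytes):
    """Return (is_intel_byte_order, tag numbers found in the first IFD)."""
    if len(data) < 8 or data[:2] not in (b"II", b"MM"):
        raise ValueError("not a TIFF file")
    # "II" marks Intel (little-endian) byte order, "MM" marks Motorola (big-endian).
    endian = "<" if data[:2] == b"II" else ">"
    magic, ifd_offset = struct.unpack_from(endian + "HI", data, 2)
    if magic != 42:
        raise ValueError("not a TIFF file")
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    tags = set()
    for index in range(entry_count):
        # Each IFD entry is 12 bytes long; its first 2 bytes hold the tag number.
        (tag,) = struct.unpack_from(endian + "H", data, ifd_offset + 2 + 12 * index)
        tags.add(tag)
    return endian == "<", tags


def missing_additional_info_tags(data: bytes):
    """Return the tag numbers among 269, 306, and 315 that are absent from the file."""
    _, tags = inspect_tiff_header(data)
    return sorted(set(ADDITIONAL_INFO_TAGS) - tags)
```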

C.5.2

Bitonal Images--Book Pages with Halftone Illustrations and/or Finely Inscribed Line Art Drawings

This section refers specifically to image types 2BH, 3BH, and 4BH from the chart above. Compression, resolution, TIFF version and header information requirements are stated in C.5.1.

Illustrations in printed matter often consist of printed halftones or finely inscribed line drawings. The contractor shall capture printed halftones and finely inscribed line art using a technique to suppress or reduce moiré patterns.

The treatment of moiré patterns shall be accomplished in one of the following ways, or by an equally effective method proposed by the contractor and approved by the Library:

Option 1 (preferred): Descreening and rescreening approach or an equivalent process.

If descreening and rescreening or its equivalent is used to suppress moiré patterns, the contractor shall deliver a single image which reproduces the text without dithering and applies special treatment to the illustration.

Option 2 (acceptable): Capture of the image using diffused patterning (often called dithering).

If dithering or its equivalent is used to suppress moiré patterns, the contractor shall deliver a pair of images:

one (1) image of the full page reproduced as a non-dithered bitonal image
one (1) image of the full page reproduced as a dithered bitonal image

Book pages with simple or "coarse" line art (not finely inscribed) shall be scanned without treatment.

Examples of printed halftones, finely inscribed line art, and simple line art in addition to Library findings on this subject can be found in Section J, Attachment 3.

C.5.3

Derivative Bitonal Images - LOT 1 and LOT 2

This section refers specifically to image type 3DB from the chart above. Derivative bitonal images are required for both LOT 1 and LOT 2.

For certain materials, typically manuscript items, the Library may require the capture of a grayscale or color image for archival purposes and a bitonal image for access purposes. The contractor shall derive this bitonal image from the grayscale or color image by computerized post-processing. Regardless of the resolution of the original capture, derivative bitonal images shall be 300 dpi. Compression, resolution, TIFF version and header information requirements are stated in C.5.1. Sophisticated thresholding algorithms shall be applied in the post-processing to assure minimum loss of information.

Because computer processing time is dependent on the decompressed size of the source image (the larger the image, the longer the processing time), four (4) categories of source images from which bitonal images shall be derived are defined as follows:

C.5.4

Derived Bitonal Image Segments of Images of Large Pages - LOT 1 only

This section defines requirements for image types 2BS, 3BS, 2GS, 3GS, 2CS, and 3CS from the chart above. Compression, resolution, TIFF version and header information requirements are stated in C.5.1. Derived image segments of images of large pages are required for LOT 1 only. When required in a task order, the contractor shall segment large images and deliver a set of smaller images in addition to the large image that was initially captured. The general concept is analogous to that described in C.7.4 for the segmented capture of large pages.

The images shall include the percentage of overlap stated in Section C.7.4.1 and be delivered with filenames that represent the naming sequence stated in Section C.10.3.5. The contractor shall produce the set of smaller segment images of large pages in a post-process.

The segment images shall retain the native resolution and image type of the large image. For example, a 24x36-inch foldout map captured as a 300 dpi bitonal image shall be segmented into six images, each of which will fit on 8.5x11 inch paper when printed without rescaling. Thus each segment shall also be a 300 dpi bitonal image which provides the required 20 percent overlap with its neighbors.

Four (4) categories of source images from which segmented images shall be derived are defined as follows:

C.5.5

Grayscale and Color Images

This section refers specifically to image types 2G, 3G, 2C, and 3C from the chart above. The contractor shall have the capability to produce grayscale images for all bound and unbound materials for both LOT 1 and LOT 2. Color images are required for all categories of unbound materials in LOT 1 and all categories in LOT 2. Grayscale images shall be produced for originals that have significant tonal variation (e.g., manuscripts) and for printed matter with illustrations that cannot be treated by dithering, de-screening, etc. Color images shall be created for originals that have significant amounts of color. Specific instructions regarding grayscale and color capture will be provided at the time a task order is issued.

C.5.5.1

Grayscale and Color Resolution, Format, and Compression

Grayscale and color images shall be required at resolutions of 200 and 300 dots per inch (dpi). Image resolution for particular jobs will be finalized with the execution of the task order. Grayscale images shall have a pixel depth of 8 bits-per-pixel while color images shall have a pixel depth of 24 bits-per-pixel. The compression algorithm used shall be JPEG. Legibility and ease of transmission via computer network, rather than a perfect facsimile, are required. Therefore, the JPEG compression quality factor chosen by the contractor shall yield an average compression ratio between 20:1 and 30:1. These images shall be stored as JFIF (JPEG File Interchange Format) files, with .jpg as the file extension.
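The compression-ratio requirement can be checked with simple arithmetic, sketched below. The function names are hypothetical, and the average is computed here as total uncompressed bytes divided by total compressed bytes across a set of images; that is one reasonable reading of "average compression ratio," not a definition given in this section.

```python
def uncompressed_bytes(width_in: float, height_in: float, dpi: int, bits_per_pixel: int) -> int:
    """Size in bytes of the raw pixel data for a page of the given dimensions."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


def average_compression_ratio(images) -> float:
    """images holds (uncompressed_bytes, compressed_bytes) pairs, one per image."""
    pairs = list(images)
    total_raw = sum(raw for raw, _ in pairs)
    total_compressed = sum(compressed for _, compressed in pairs)
    if total_compressed <= 0:
        raise ValueError("compressed sizes must be positive")
    return total_raw / total_compressed


def ratio_in_range(ratio: float, low: float = 20.0, high: float = 30.0) -> bool:
    """True when the ratio lies between 20:1 and 30:1 inclusive."""
    return low <= ratio <= high


# An 8 1/2x11-inch page at 300 dpi holds 2,550 x 3,300 pixels.
gray_page_raw = uncompressed_bytes(8.5, 11, 300, 8)    # 8-bit grayscale
color_page_raw = uncompressed_bytes(8.5, 11, 300, 24)  # 24-bit color
```

At 25:1, for instance, the grayscale page above would compress to roughly 337 KB.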

C.5.5.2

Gamma Correction/Contrast Stretching

Images may appear gray even when the original sheet is white or off-white. In order to brighten the appearance of the paper in images of documents, the contractor shall be capable of applying gamma correction or contrast stretching to grayscale and color images. Enhancement algorithms shall be applied at a sufficiently modest level to preclude the loss of information from the original. Gamma correction or contrast stretching may be applied automatically at scan time or during post-scan processing. Gamma correction or contrast stretching algorithms shall be applied before images are compressed and before any derivative bitonal images have been produced.
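For orientation, the two enhancement techniques named above can be sketched for 8-bit grayscale values as follows. The gamma value and stretch limits are parameters of the example, not values specified by the Library, and the function names are hypothetical; a production system would apply equivalent operations to whole images before compression.

```python
def gamma_lookup_table(gamma: float):
    """Build a 256-entry table mapping 8-bit values to gamma-corrected values.

    With gamma greater than 1, midtones are brightened while black (0) and
    white (255) are left unchanged.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return [round(255 * (value / 255) ** (1 / gamma)) for value in range(256)]


def contrast_stretch(pixels, low: int, high: int):
    """Linearly map the range [low, high] onto [0, 255], clipping values outside it."""
    if high <= low:
        raise ValueError("high must be greater than low")
    stretched = []
    for value in pixels:
        clipped = min(max(value, low), high)
        stretched.append(round((clipped - low) * 255 / (high - low)))
    return stretched


# Apply a modest, assumed gamma of 1.5 to a few sample pixel values.
table = gamma_lookup_table(1.5)
corrected = [table[value] for value in (0, 64, 200, 255)]
```

A modest setting matters here: an aggressive stretch clips paper tone and faint pencil marks to pure white, which is the loss of information the requirement above is meant to preclude.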

C.5.6

Image Orientation - LOT 1 and LOT 2

In the delivered digital image, the top of the original document or page shall appear at the top of the display screen. Note that "right side up" for printed matter is defined as "the top of the book or magazine page" (portrait mode). An illustration or table in a book or magazine may be printed "sideways" (landscape) to fit the page, thus aligning the top of the page with the side of the illustration or table. In these cases, the top of the image shall be the top of the page and not the top of the illustration.

C.5.7

Cropping - LOT 1 and LOT 2

The Library requires presentation of the entire original sheet or page. In no event shall the actual document be cropped. Researchers using Library of Congress digital documents often wish to be reassured that the entire document has been captured. This is especially desirable for unbound manuscript documents. A "border zone" of approximately 1/4 inch or less of the surface behind the scanned document shall be provided whenever possible. For some combinations of document sizes and scanning equipment, capturing such a margin may not be possible for all four edges of the page. Therefore, the Library desires a 1/4-inch margin wherever possible, and requires at least that the entire original sheet or page is captured.
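As a quick illustration of what the border zone implies for capture size, the sketch below computes the pixel dimensions of a page plus a full 1/4-inch border on all four edges. The function name is hypothetical, and the full border is the upper bound; a narrower border or none on some edges may be all that the equipment allows.

```python
def capture_size_pixels(page_width_in: float, page_height_in: float, dpi: int,
                        border_in: float = 0.25):
    """Return (width, height) in pixels for the page plus a border on every edge."""
    width_px = round((page_width_in + 2 * border_in) * dpi)
    height_px = round((page_height_in + 2 * border_in) * dpi)
    return width_px, height_px


# An 8 1/2x11-inch page at 300 dpi with a 1/4-inch border on each edge.
letter_capture = capture_size_pixels(8.5, 11, 300)
```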

C.5.8

Skewing - LOT 1 and LOT 2

The Library requires that images created from unbound materials shall not be skewed. For bound materials, the Library requires that images shall not be skewed; however, the tightness of the bindings may result in slightly skewed images. In these cases, the contractor shall note in the scanning logs that the image was scanned using best efforts, and shall note the reason for and extent of the skewing.

C.6

IMAGING SMALL BOUND MATERIALS - LOT 1

The types of materials to be digitized fall into two separate categories and are defined as lots. LOT 1 includes the majority of the work to be performed: unbound materials (i.e., items made up of separate sheets of paper) and a variety of small bound materials (i.e., books and pamphlets that are robust, exclude color, and have smaller pages). A single scanned image shall be made for each page of an original item. Exceptions to this rule may arise in the case of large sheets, foldout book pages, and some illustrated printed matter. These exceptions are discussed in C.5.2 and C.7.3.

C.6.1

Sizes

Bound materials typically consist of printed matter and are defined and described in two (2) size categories:

C.6.2

Handling

The manner in which bound materials shall be handled depends on their physical condition, including variances in the type or tightness of the binding, closeness of text to the binding, and brittleness of the pages. Note that the typical flatbed scanner is not acceptable for bound materials at the Library.

In terms of handling, LOT 1 includes three (3) classes of bound materials:

C.6.3

Image Specifications for Small Bound Materials

It is required that the contractor have the capability to scan small bound materials in the combinations of handling categories, sizes, and image types listed in the table that follows. (Image requirements are provided in section C.5.)

SMALL BOUND MATERIALS

Handling | Size | Image Types | Image Spec Abbreviation
H1: Bound, open NTE 130 degrees, can invert | S2: Pages NTE 8 1/2x14 inches | 300 dpi bitonal images | 3B
H1 | S2 | 400 dpi bitonal images | 4B
H1 | S2 | 300 dpi bitonal halftone treatment | 3BH
H1 | S2 | 400 dpi bitonal halftone treatment | 4BH
H1 | S2 | 200 dpi grayscale images | 2G
H1 | S2 | 300 dpi grayscale images | 3G
H2: Bound, open NTE 130 degrees, no invert | S1: Pages NTE 8 1/2x11 inches | 300 dpi bitonal images | 3B
H2 | S1 | 400 dpi bitonal images | 4B
H2 | S1 | 300 dpi bitonal halftone treatment | 3BH
H2 | S1 | 400 dpi bitonal halftone treatment | 4BH
H2 | S1 | 200 dpi grayscale images | 2G
H2 | S1 | 300 dpi grayscale images | 3G
H3: Bound, open NTE 130 degrees, no invert, use wedge or cradle | S1: Pages NTE 8 1/2x11 inches | 300 dpi bitonal images | 3B
H3 | S1 | 400 dpi bitonal images | 4B
H3 | S1 | 300 dpi bitonal halftone treatment | 3BH
H3 | S1 | 400 dpi bitonal halftone treatment | 4BH
H3 | S1 | 200 dpi grayscale images | 2G
H3 | S1 | 300 dpi grayscale images | 3G

C.7

LOT 1 - UNBOUND MATERIALS

Unbound, separate-sheet materials include typed manuscripts, letters and other handwritten documents. Some documents include a mixture of typed and handwritten text. The degree of legibility varies greatly among documents.

C.7.1

Sizes

The majority of pages range from about 6x9 inches to about 8 1/2x11 inches. Because many are from periods before paper sizes were standardized, and because many pieces of personal correspondence are included, document sizes vary considerably, often from one page to the next. In addition, manuscript collections may include extensive quantities of slips of paper or cards on the order of 3x5 inches. These collections may also include folded posters, newspaper pages, or other sheets on the order of 11x17 inches. Collections also contain documents (like sheet music) that consist of folded sheets (creating "pages") and sheets that exceed 11x17 inches in size. All of these highly variable materials can appear in historical archival collections and all shall be scanned.

Unbound materials are defined in three (3) categories:

In some cases, when the task order requires, sheets larger than 11x17 shall not be captured in scanned segments but shall be handled as custom work, requiring pre-photocopying or other techniques.

C.7.2

Handling

The Library's unbound, separate-sheet materials are often fragile, unique and valuable. The Library will not permit them to be scanned using a high volume, automatic document feeder. These materials shall be placed by hand and scanned one page at a time, using a flatbed or book-edge scanning device. Devices shall be capable of scanning materials with original sizes up to 11x17 inches. Materials protected by mylar shall be scanned through the mylar, except when special exceptions are made and approved in advance. There are four (4) classes of unbound materials defined in terms of handling:

C.7.3

Folded Sheets

This section refers to handling category H8, above. The archetypal example of an unbound folded sheet is sheet music. For example, an eight-page piece of sheet music consists of two large pieces of paper that have been folded, one inside the other. The front and back covers are printed on one side of the first sheet of paper; on the other side, the first and sixth pages of the music. On the other (inside) sheet of paper are printed music pages two through five. (See Section J, Attachment 3 for an example.) When the overall sheet size for unbound folded sheets (comprising both "pages") is 8 1/2x11 inches or less, the sheet (pair of pages) shall be scanned and delivered as a unit. When the overall sheet size exceeds 8 1/2x11 inches, each page shall be presented as a separate image. Support for the page not being scanned is required (see C.9). The initial capture may be as a single unit with page-image separation occurring in a post-process; the delivery, however, shall be of the separate pages. The delivered images for the separate pages (for sheets greater than 8 1/2x11 inches) shall be numbered to reflect the actual cover and page sequence of the original item. (Detailed requirements on numbering are provided in Section J, Attachment 4.)

C.7.4

Requirements for Capture in Segments

This section refers to handling category H9, above. Large sheets are encountered in books (where they are folded to fit within the binding), in manuscript file folders, and in other collections. Foldout pages often present maps, charts, or illustrations. These pages shall be removed from the binding by Library personnel and scanned by the contractor as unbound pages. The foldouts must be integrated into the delivery sequence of the bound volume. The contractor shall capture large sheets or foldouts in segments (rather than as one large image) when sheet sizes are greater than 11x17 and less than 36x24 inches. No individual segment shall be greater than 8 1/2x11 inches. The contractor shall also capture sheets smaller than 11x17 in segments when (1) extremely fine print or fine detail or (2) special conditions of fragility and propensity to tear are present. The Library will, whenever possible, identify such materials (or the likelihood of encountering such materials) at the time a task order is issued. Fine-print pages shall be segmented to ensure legibility. Fine print on pages smaller than 11x17 may occur from time to time in a collection; when it is encountered, the contractor shall confer with the COTR about the proper action to take. Capture-by-segment shall be similar to that for folded sheets: a portion of the sheet shall be placed on a scanner or under a camera for each exposure. The difference is that for large sheets, as many as six or eight exposures may be made. Special care in handling is required in order to avoid damage to the original, and additional support of the page is required (see C.4). Segment images shall be named according to the specifications in Section J, Attachment 4.7.

C.7.4.1

Overlap of Segment Images

In order to ensure that no part of a large sheet or page is lost and to permit end users to orient themselves as they move between segments, each segment image shall overlap with all adjacent images. This overlap shall be 20 percent whenever possible but shall not fall below 10 percent, except with the approval of the Library.
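The segment and overlap requirements in C.7.4 and C.7.4.1 can be illustrated with a short calculation. The following is a minimal sketch, not a prescribed method: it assumes that overlap is measured as a fraction of the segment dimension and that segments are laid out on a regular grid. The function names are hypothetical.

```python
from fractions import Fraction
from math import ceil


def segments_along(length, segment, overlap):
    """Exposures needed to cover `length` inches with segments of `segment`
    inches, each overlapping its neighbor by `overlap` (assumed to be a
    fraction of the segment dimension)."""
    length, segment, overlap = Fraction(length), Fraction(segment), Fraction(overlap)
    if length <= segment:
        return 1
    # Each additional exposure adds only the non-overlapping part of a segment.
    step = segment * (1 - overlap)
    return ceil((length - segment) / step) + 1


def segment_grid(sheet_w, sheet_h, seg_w="8.5", seg_h="11", overlap="0.2"):
    """Return (columns, rows) of segments for a sheet, using the 8 1/2x11
    inch segment limit and the preferred 20 percent overlap as defaults."""
    return (segments_along(sheet_w, seg_w, overlap),
            segments_along(sheet_h, seg_h, overlap))


if __name__ == "__main__":
    cols, rows = segment_grid(17, 22)
    print(f"17x22 inch sheet: {cols} x {rows} = {cols * rows} segments")
```

Exact fractions are used instead of floating point so that sheet sizes falling exactly on a segment boundary are not rounded up by accident.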

C.7.5

Image Specifications for Unbound Materials

The Library requires the contractor to have the capability to scan unbound materials in the combinations of handling categories, sizes, and image types listed in the table that follows.

UNBOUND MATERIALS

| Handling | Size | Image Types | Image Spec Abbreviation |
|---|---|---|---|
| H6: Unbound materials | S4: Pages NTE 8 1/2x14 inches | 300 dpi bitonal images | 3B |
| | | 400 dpi bitonal images | 4B |
| | | 300 dpi bitonal halftone treatment | 3BH |
| | | 400 dpi bitonal halftone treatment | 4BH |
| | | 200 dpi grayscale images | 2G |
| | | 300 dpi grayscale images | 3G |
| | | 200 dpi color images | 2C |
| | | 300 dpi color images | 3C |
| H6: Unbound materials | S5: Pages NTE 11x17 inches | 300 dpi bitonal images | 3B |
| | | 400 dpi bitonal images | 4B |
| | | 300 dpi bitonal halftone treatment | 3BH |
| | | 400 dpi bitonal halftone treatment | 4BH |
| | | 200 dpi grayscale images | 2G |
| | | 300 dpi grayscale images | 3G |
| | | 200 dpi color images | 2C |
| | | 300 dpi color images | 3C |
| H7: Fragile unbound materials | S4: Pages NTE 8 1/2x14 inches | 300 dpi bitonal images | 3B |
| | | 400 dpi bitonal images | 4B |
| | | 300 dpi bitonal halftone treatment | 3BH |
| | | 400 dpi bitonal halftone treatment | 4BH |
| | | 200 dpi grayscale images | 2G |
| | | 300 dpi grayscale images | 3G |
| | | 200 dpi color images | 2C |
| | | 300 dpi color images | 3C |
| H7: Fragile unbound materials | S5: Pages NTE 11x17 inches | 300 dpi bitonal images | 3B |
| | | 400 dpi bitonal images | 4B |
| | | 300 dpi bitonal halftone treatment | 3BH |
| | | 400 dpi bitonal halftone treatment | 4BH |
| | | 200 dpi grayscale images | 2G |
| | | 300 dpi grayscale images | 3G |
| | | 200 dpi color images | 2C |
| | | 300 dpi color images | 3C |
| H8: Folded sheets (capture of one half at a time), support required | S5: Single page NTE 11x17 inches (full sheet NTE 22x17 inches) | 300 dpi bitonal halftone treatment | 3B |
| | | 400 dpi bitonal halftone treatment | 4B |
| | | 200 dpi grayscale images | 2G |
| | | 300 dpi grayscale images | 3G |
| | | 200 dpi color images | 2C |
| | | 300 dpi color images | 3C |
| H9: Large sheets or foldouts captured in segments, special handling and support required (cost per segment) | S6: Full sheet NTE 36x24 inches; segments NTE 8 1/2x11 inches | 300 dpi bitonal images | 3B |
| | | 300 dpi bitonal halftone treatment | 3BH |
| | | 300 dpi grayscale images | 3G |
| | | 300 dpi color images | 3C |

C.8

LOT 2 - LARGE AND COLOR BOUND MATERIALS

LOT 2 consists of large and color bound materials (i.e., books, pamphlets, and bound manuscripts that may be cumbersome, fragile, rare, include significant color elements, or have larger pages). Typical page samples are reproduced in Section J, Attachment 3.

C.8.1

Sizes

Bound materials typically consist of printed matter. Bound materials for LOT 2 work are described and defined in two (2) size categories:

C.8.2

Handling

Materials in LOT 2 are especially challenging because they are often very large, cumbersome, and fragile. These volumes require the use of a cradle during capture. The handling category for LOT 2 is defined as follows:

C.8.3

Image Specifications for Large and Color Bound Materials

The Library requires the contractor to have the capability to scan large and color bound materials in the combinations of handling categories, sizes, and image types listed in the table that follows.

LARGE AND COLOR BOUND MATERIALS

| Handling | Size | Image Types | Image Spec Abbreviation |
|---|---|---|---|
| H4: Cumbersome volumes, fragile, and/or color materials; cradle required | S2: Pages NTE 8 1/2x14 inches | 200 dpi bitonal | 2B |
| | | 300 dpi bitonal | 3B |
| | | 200 dpi bitonal halftone treatment | 2BH |
| | | 300 dpi bitonal halftone treatment | 3BH |
| | | 200 dpi grayscale image | 2G |
| | | 300 dpi grayscale image | 3G |
| | | 200 dpi color image | 2C |
| | | 300 dpi color image | 3C |
| H4: Cumbersome volumes, fragile, and/or color materials; cradle required | S3: Pages NTE 11x17 inches | 200 dpi bitonal | 2B |
| | | 200 dpi bitonal halftone treatment | 2BH |
| | | 200 dpi grayscale image | 2G |
| | | 200 dpi color image | 2C |

C.9

SCANNING COMPONENTS - LOT 1 AND LOT 2

C.9.1

Bound Materials - LOT 1 and LOT 2

This section describes the component parts of a book that shall be scanned for both LOT 1 and LOT 2. The delivered set of images shall reflect the following order regardless of the order in which the contractor may actually scan the component parts of the original. In the delivered set of images, each filename shall include a control number. (See C.10 and Section J, Attachment 4, for a description of delivery directory and filenaming requirements.) These control numbers increment in a sequential manner and, thereby, shall establish the sequence of images.

C.9.1.1

Identifying Target

The identifying target shall be the first image for each bound item (see C.3.2).

C.9.1.2

Book Covers

Covers shall be scanned for certain books. When covers are required to be scanned, an instruction will be provided in a note included on the target. If both front and back covers are to be scanned, the front cover image shall be numbered to precede the images for the inside pages and the back cover shall be numbered to follow them. The general rules for cover scanning are as follows:

C.9.1.3

Inside Pages

The images of the inside pages shall come after the images for the target and the front cover (if any). The first page of the book to be scanned shall be the first page containing significant information. Examples include a page containing a copyright stamp that precedes the title page, the title page itself, or end papers containing significant information, such as a map. Scanning of the remainder of the book shall continue in sequence, omitting blank pages. However, pages that contain no printed information but that contain handwritten inscriptions, notes, marginalia or other written ephemera shall be scanned. End papers shall only be scanned if they contain significant information, such as a map. End papers that are merely decorative shall not be scanned. Blank pages or blank pages with stray pen or pencil marks shall not be scanned.

C.9.1.4

Foldout Pages

Foldout pages present special problems in capture and, if images are segmented, in numbering. These pages shall be removed from book bindings by the Library and scanned by the contractor as unbound pages. They shall be integrated in the delivery sequence of the rest of the bound volume.
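The ordering rules in C.9.1.1 through C.9.1.4 can be sketched as follows. This is an illustration only: the function name is hypothetical, control numbers are assumed to start at 1, and the actual control-number format is governed by Section J, Attachment 4.

```python
def delivery_sequence(inside_pages, front_cover=None, back_cover=None,
                      target="identifying target"):
    """Return (control_number, component) pairs in delivery order.

    Order follows C.9.1: target first, then the front cover (if scanned),
    the inside pages in sequence (with any foldouts already integrated at
    their proper positions), and the back cover (if scanned) last.
    Numbering from 1 is an assumption made for this sketch.
    """
    ordered = [target]
    if front_cover is not None:
        ordered.append(front_cover)
    ordered.extend(inside_pages)
    if back_cover is not None:
        ordered.append(back_cover)
    return list(enumerate(ordered, start=1))


if __name__ == "__main__":
    pages = ["title page", "page 1", "foldout map", "page 2"]
    for number, component in delivery_sequence(pages, "front cover", "back cover"):
        print(number, component)
```

The delivered order is independent of the order in which the components were actually scanned; only the assigned control numbers establish the sequence.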

C.9.2

Unbound Materials - LOT 1

The identifying target is always the first image scanned for each item identified by the target. If the target represents a folder of an archival collection, the second item to be scanned will be the first sheet in the folder, continuing to the last sheet of the folder, and moving on to the next target and its associated folder. The images shall be named according to the requirements in C.10 and Section J, Attachment 4.

C.10

FILENAMES AND DELIVERY DIRECTORIES - LOT 1 AND LOT 2

The contractor shall assign a digital-image filename to each image captured as part of the initial image-capture process, and deliver these files to the Library in a certain arrangement of directories and subdirectories, following the specifications outlined in Section J, Attachment 4. These are called delivery directories. The filename and directory structure is essential as it will facilitate future access to the images and texts. The contractor shall deliver the images and texts in delivery directories which the Library will archive in repository directories that parallel those created for delivery. These directories and the names of the files they contain provide the structure for the Library's digital repository, the institution's archive of digital information. The directory names and filenames link the images and texts to elements in the Library's collection-retrieval system. The content of the digital repository is stored in UNIX-based servers at the Library of Congress. The Library, however, anticipates that content will be produced and delivered using equipment that employs the MS-DOS operating system. In addition, sets of images may be delivered to third parties who use IBM-compatible, DOS-based computers. For this reason, the directory names and filenames shall conform to DOS naming conventions. In order to accommodate UNIX needs, any alphabet letters in the file or directory names shall be lower case. Since filename extensions will be assigned according to file type (e.g., .tif, .jif or .jpg), the first eight characters--the file name proper--become very important.
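A simple check of the naming constraints described above (a name proper of at most eight characters, an optional extension of at most three characters, and lower-case letters only) might look like the following sketch. The function name is hypothetical, and the restriction to letters and digits is an assumption that is stricter than DOS itself; the authoritative rules are in Section J, Attachment 4.

```python
import re

# Assumption: only lower-case ASCII letters and digits are permitted.
# DOS allows some additional characters; this sketch does not.
_DOS_LOWERCASE_NAME = re.compile(r"[a-z0-9]{1,8}(\.[a-z0-9]{1,3})?")


def is_valid_delivery_name(name):
    """True if `name` fits the 8.3 pattern with no upper-case letters."""
    return _DOS_LOWERCASE_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["bj060001.tif", "BJ060001.TIF", "bj0600012.tif", "bj06001"]:
        print(candidate, is_valid_delivery_name(candidate))
```

Directory names pass through the same check because the extension is optional in the pattern.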

C.10.1

Identifiers Used to Name Directories

Specifications for file- and directory-naming are outlined in Section J, Attachment 4. The particular file- and directory names shall be assigned from interpretation of these general specifications. The Library will specify an identifier for a delivery directory. Identifiers are unique names which distinguish one item from another. Under this contract, an item may be any of the following: a book or pamphlet, a folder in a manuscript collection, or a document within a folder of a manuscript collection. An identifier is the prefix or left-side (right-truncated) portion of a name that may contain as many as eight characters. For example, the identifier bj06 might be used as the basis for assigning the directory names bj06001 (for the first 300 files), bj06002 (for the second 300 files), bj06003 (for the third 300 files), and bj06004 (for the fourth 300 files). Other, similar patterns may also be specified for other identifiers, as outlined in Section J, Attachment 4. The identifiers used to name directories also appear in the cataloging or finding-aid data the Library employs in its retrieval systems. When a researcher has found an item of interest in a catalog or finding aid and executes a fetch command, the retrieval system uses the identifier to locate the appropriate repository directory in the Library's digital archive and proceeds to retrieve the appropriate set of image or text files.
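The bj06 example above can be expressed as a small calculation. This sketch assumes a three-digit sequence suffix and 300 files per directory, both inferred from that example only; other identifiers may follow different patterns under Section J, Attachment 4. The function name is hypothetical.

```python
def delivery_directory(identifier, file_number, files_per_dir=300):
    """Directory name for the Nth file (counting from 1) of an item.

    Assumes the pattern of the bj06 example: identifier plus a
    three-digit sequence number, 300 files per directory.
    """
    if file_number < 1:
        raise ValueError("file_number counts from 1")
    sequence = (file_number - 1) // files_per_dir + 1
    name = f"{identifier}{sequence:03d}"
    if len(name) > 8:
        raise ValueError(f"{name!r} exceeds the eight-character limit")
    return name


if __name__ == "__main__":
    for n in (1, 300, 301, 1200):
        print(n, delivery_directory("bj06", n))
```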

C.10.2

File and Directory Structures

Assigning filenames and naming directories for the collections shall be performed according to the four structures identified below and in accordance with the detailed specifications found in Section J, Attachment 4.

  1. Unnumbered documents in folder structure
  2. Bibliographic record/print-page number structure
    1. When printed page numbers are tracked
    2. When printed page numbers are not tracked
  3. Serials structure
    1. When printed page numbers are tracked
    2. When printed page numbers are not tracked
  4. Copyright-registration-number and technical-document structure

Identifying targets for each item will identify the naming scheme to be applied to that item.

C.10.3

Feature Recognition

In order to properly assign filenames and enter data into the scanning log, appropriate actions as specified for the various document and collection features listed below shall be taken.

C.10.3.1

New Folders in Manuscript Collections

The start of each new file folder shall be identified during the scanning of folders within a manuscript collection. If not already marked on the folder, a number shall be assigned to the folder. These numbers or names shall be used to properly assign names to delivery directories.

C.10.3.2

New Documents in Manuscript Folders

The beginning of new documents (reports, letters, etc.) shall be identified and "new document" shall be indicated as a feature.

C.10.3.3

Features and Page Numbers in Printed Matter

For some books or other printed matter, the presence of at least four types of features shall be identified: title pages, tables of contents, lists of illustrations, indexes, and/or cumulative tables of contents or indexes (for serials). In addition, the actual printed page numbers (when present) for certain books or magazines shall also be identified. Special codes shall be used to embed feature-identifiers in the filenames.

C.10.3.4

Derivative Images or Multiple Versions of Entire Page

When derivative images or multiple versions of page images are required, e.g., to successfully reproduce printed halftones and finely inscribed line art in a second image, filenames shall be assigned to the images in accordance with the specifications.

C.10.3.5

Images of Segments of Pages

When multiple segments of a single sheet or page are created, filenames shall be assigned to the images in accordance with the specifications in Section J, Attachment 4.7.

C.10.3.6

Resolution Targets

Resolution targets shall be scanned according to the specifications in C.4.8 and shall be named in accordance with the specifications in Section J, Attachment 4.

C.11

TEXT CONVERSION AND SGML-ENCODING - LOT 1

The Library of Congress encodes converted texts with Standard Generalized Markup Language (SGML) following the Text Encoding Initiative (TEI). Encoding permits the retention of certain elements and features that would be lost in simple word-based ASCII conversion. These include structural elements (front matter, chapters, illustration location) and such features as highlighted text (for example, bold or italics). The retention of these elements and features permits the interchange of texts with reduced loss of meaning, the loading of texts into access software that interprets and displays materials as encoded, and the use of retrieval software that can, for example, give added weight to certain portions of a text. In 1992, the Library developed a Document Type Definition (DTD) for the SGML-encoding of its historical materials. The contractor shall comply with the DTD, Tag Library, and supplementary keying instructions provided by the Library for the conversion of textual materials. The contractor shall synthesize these Library-furnished materials in order to create a single instruction set. The historical documents (American Memory) DTD and associated keying and encoding instructions are provided in Section J, Attachment 6 and Attachment 7.

C.11.1

Character Set

The texts shall be delivered in ISO 646. The IBM-extended characters (letters modified with diacritics), however, shall be coded according to the standard publicly declared entity reference sets in ISO 8879.
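As an illustration of the character-set requirement, the sketch below passes 7-bit characters through unchanged and replaces accented Latin letters with named entity references. It relies on Python's html.entities table, whose Latin-1 entity names are understood to derive from the ISO 8879 "Added Latin 1" set; that correspondence, the function name, and the decision to reject all other characters are assumptions of this example. The entity sets actually required are those named in the Library's DTD and keying instructions.

```python
from html.entities import codepoint2name


def to_iso646_with_entities(text):
    """Return `text` as 7-bit characters plus named entity references.

    Only accented letters in the Latin-1 range (U+00C0 to U+00FF) are
    converted; anything else outside 7-bit ASCII raises an error so that
    it can be resolved by hand rather than guessed at.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)
        elif 0xC0 <= cp <= 0xFF and ch.isalpha() and cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")
        else:
            raise ValueError(f"no entity mapping assumed for U+{cp:04X}")
    return "".join(out)


if __name__ == "__main__":
    print(to_iso646_with_entities("café"))
```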

C.11.2

Text Conversion and SGML-encoding from Image Sets Scanned by the Contractor

The contractor shall create SGML-encoded, machine-readable texts from image sets that have been scanned under this contract. The contractor shall be responsible for preparing and delivering image sets for conversion; this will include activities such as checking that all required rework has been completed, integrating rework into image sets, copying image sets onto suitable delivery medium, etc. The contractor shall also be responsible for tracking the progress of conversion work.

C.11.3

Text Conversion and SGML-encoding from Existing Image Sets

The contractor shall create SGML-encoded, machine-readable texts from image sets provided by the Library and not produced under this contract when those image sets meet the image-type and image-quality requirements set forth in this document. These image sets will be provided by the Library on write-once CD-ROM disks in directories and with filenames that meet the requirements set forth in this document. Paper targets to accompany each document represented in the image set will be provided by the Library; these targets will contain all the information needed for the document's text header, as described in the Keying Instructions, Section J, Attachment 7.

The delivered SGML-encoded texts from image sets provided by the Library shall meet the requirements for text provided in preceding sections. If needed for the conversion process, the contractor shall reprocess the image sets or copy them onto other media for that purpose.

Prior to assigning a task order for the conversion of image sets not produced under this contract, the contractor will be given an opportunity to examine the images and media and to determine if the minimum image requirements of this contract can be met.

C.11.4

Texts and Associated Files

All SGML-encoded texts shall be named according to specifications and delivered as ASCII files on the media indicated. Each delivery shall be accompanied by a printed memo and printed directory list. Several types of associated files shall also be provided for each SGML-encoded text: page information group files, reference files, omission report files and ENTITY files. A description of each file follows. (See Section J, Attachment 8, for examples of each.)

C.11.4.1

Page Information Group Files

For every SGML-encoded document, the Library will confirm that: control page numbers match document filenames, control page numbers and print page numbers progress in the proper sequence, and there are no unaccountable variations in the relationship between control page numbers and print page numbers. Therefore, each delivery of SGML-encoded text shall include a set of machine-readable ASCII files, one file for each full-text document, that contains a list of all the page information group tags and their contents, in the order in which they appear in the document.

C.11.4.2

Reference Files

Pointers to internal and external references in SGML-encoded documents (such as all occurrences of illustration, table, and note reference tags with attributes) shall be reported in a set of machine-readable ASCII files, one file for each full-text document.

C.11.4.3

Omission Report Files

Text that is illegible is marked with an SGML tag to show its placement. A report of the line-number location of any omitted text shall be delivered in a set of machine-readable ASCII files, one file for each full-text document.

C.11.4.4

Entity Files

ENTITY references are used to link to external files. ENTITY references in the SGML file point to an external entity file that lists each ENTITY value, its corresponding filename and MIME type. Therefore, for all converted text a set of machine-readable ASCII files, one file for each full-text document, shall be provided that contains a list of the ENTITY values found in the document plus the filename associated with the ENTITY value.

C.11.5

Customized Error Diagnostic Software and Other Quality Review Tools

Currently no commercially-available off-the-shelf (COTS) software has been determined to be adequate for the full range of quality review activity required to ensure that all conversion requirements are met. Some customization of error diagnostic software may be required to meet the quality review requirements. Any customized error diagnostic software or other automated tools used by the contractor in quality review shall be furnished to the Library for the contract's period of performance at no additional cost. This software (or combination of software) must be able to produce user-friendly output stating errors found and their locations. The contractor shall also provide the Library with instructions for using this software. This software must be provided to the Library before the delivery of the test-batch SGML-encoded texts. Customized additions and modifications to the diagnostic software shall be provided to the Library as they are put into active use by the contractor.

C.12

RELATED SERVICES AND ACTIVITIES - LOT 1 AND LOT 2

C.12.1

Photocopying of Source Material

During the scanning process, certain source materials shall require photocopying. When color pages are scanned as bitonal or grayscale, it will be necessary to produce a photocopy of the original in order to compensate for the scanner's color-blindness. The Library will provide the photocopier when scanning takes place onsite and will permit copying at no cost to the contractor.

C.12.2

Printed Copies of Scanned Images

Printed copies of scanned images are required for the following images.

The need for printed copies of scanned images will also be specified for the task order (when applicable).

C.12.3

Programming and Processing Activities

The capability to provide different levels of technical expertise is required. It is anticipated that additional programming or processing steps associated with scanning or conversion, modifications to the SGML DTD and related files, and adjustments to the workflow tracking system may be required. These tasks may require different levels of technical expertise, including a processing technician or a computer programmer, and will be specified for task orders as applicable.

C.13

CONTRACT STARTUP AND TESTING ACTIVITY - LOT 1 AND 2

Because of the complexity of the Library's requirements and the variation in the Library's original materials, the first task to be performed under this contract for both LOT 1 and LOT 2 shall entail the study of a representative cross section of items and the production of a set of test images. For LOT 1, the task will also include the production of a set of SGML-encoded texts. The startup and testing phase shall provide a time during which the contractor and NDLP staff shall work together to address and finalize a mutually agreed upon definition of particular matters related to these technical requirements, such as: the handling of printed halftone illustrations and, for LOT 1 only, the clarification of keying and coding instructions for text conversion and SGML encoding. The startup activity for LOT 1 shall be eight weeks, while the activity for LOT 2 shall be five weeks. The outcome of the startup and testing activity shall 1) establish the specifications for the first task order and 2) provide for the provisional establishment of specifications for other materials likely to be encountered in later tasks under the contract. The startup and testing activities for LOT 1 and LOT 2 will also provide an opportunity for the Library and the contractor to finalize the details of data entry in the Library's workflow tracking system (see C.14).

C.13.1

Representative Types of Materials

During the contract's startup phase, the Library will furnish the following representative examples of the types of materials from which digital images and texts shall be produced. The sample materials will be accompanied by instructions regarding the filenaming and directory structure to be employed.

Furnished for LOT 1

The books will be selected to represent the most frequently encountered types for scanning, e.g., books that must be scanned face up with and without a cradle.

Furnished for LOT 2

C.13.2

Startup/Testing Phase

The startup phase for both LOT 1 and LOT 2 shall include the following actions:

Week 1

The Library will make the items listed above available to the contractor at the Library of Congress, along with guidelines for handling, filenaming, etc.

Week 2

The contractor project manager and other contractor designated staff shall meet with the Library project manager (COTR) and other Library staff to discuss the sample materials and delineate the various options for scanning, the SGML markup of text, and the delivery directory structure(s).

The Library's Conservation Office will conduct an orientation session on the safe handling of originals, including use of contractor's book cradle, if applicable.

Week 3-4

Week 5

Week 6-7

Week 8

C.14

PRODUCTION WORKFLOW, PROCEDURES AND PROJECT MANAGEMENT - LOT 1 AND LOT 2

C.14.1

Workflow Tracking System

By the time of contract award, the Library will have established an electronic tracking system in order to manage interrelated production activities. This system shall consist of a mutually accessible, networked database that is DOS- or Windows-compatible and shall be used by the Library and the contractor via the Internet (World Wide Web). The types of data to be entered into the system are provided in Section J, Attachment 9. As production work proceeds, the contractor shall enter the data elements indicated in Attachment 9 into the tracking system.

C.14.2

Pace of Work: Imaging - LOT 1 and LOT 2

C.14.2.1

Images - LOT 1

The contractor shall be capable of producing images at a minimum rate of 5,000 images per week, and a maximum rate of 15,000 images per week. The contractor shall be able to accommodate varying production requirements within this minimum/maximum range. The Library will develop a work plan at the beginning of each task order stating the anticipated pace of scanning for that task or for subtasks within the task.

C.14.2.2

Images - LOT 2

The contractor shall be capable of producing images at a minimum rate of 100 images per week and a maximum rate of 500 images per week. The contents of each task order will include the production rate and the requirements for LOT 2 scanning.

C.14.3

Pace of Work: Text Conversion and SGML-encoding - LOT 1 only

The contractor shall be capable of text conversion and SGML-encoding at a minimum rate of 1,000 pages per week and a maximum rate of 10,000 pages per week. The contractor shall be able to accommodate varying production requirements within this minimum/maximum range. The specific pace of conversion work shall be determined for each task order.

C.14.4

Production Batches and Item Directories - LOT 1 and LOT 2

Groups of items shall be scanned in production batches of a size determined for each task order, not to exceed the required maximum quantities. Each batch shall consist of items in a single collection. (The collection name is stated on the identifying target; see C.3.2.) For example, a set of ten items or 1,000 scanned page-images might be grouped in a single batch. Within a batch, each item's files shall be contained in directories as in C.10.

C.14.5

Scanning Log - LOT 1 and LOT 2

A scanning log shall be kept that includes the elements listed in this section. At a minimum, this log shall indicate the date and general description of the material scanned, as well as noting exceptions, problems, irregularities, and anomalies of the types described in other sections of this document. The log shall also include the identification of the scanning operator and shall identify the particular scanning equipment used. The scanning log may be in machine-readable or paper form. If a machine-readable log is proposed, it shall be in commonly used software (e.g., WordPerfect, Paradox, etc.) and/or delivered as a delimited ASCII or generic word processing file. This log will actively be consulted during the quality review of the materials delivered by the contractor. The Library will also use the log to guide the modification of its cataloging or finding aids by incorporating the log's reports of missing pages, impossible-to-scan documents, and other anomalies.
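If a machine-readable log is proposed, a delimited ASCII file could be produced along the following lines. This is a sketch only: the field names, the pipe delimiter, and the sample entry are hypothetical, chosen to cover the minimum elements named in this section.

```python
import csv
import io

# Hypothetical field names covering the minimum elements listed in C.14.5.
LOG_FIELDS = ["date", "description", "operator", "equipment", "exceptions"]


def format_scanning_log(entries, delimiter="|"):
    """Render log entries (dicts keyed by LOG_FIELDS) as delimited ASCII."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=LOG_FIELDS,
                            delimiter=delimiter, lineterminator="\r\n")
    writer.writeheader()
    for entry in entries:
        writer.writerow(entry)
    text = buffer.getvalue()
    text.encode("ascii")  # raises UnicodeEncodeError if any entry is not ASCII
    return text


if __name__ == "__main__":
    sample = [{
        "date": "1996-01-15",            # placeholder date
        "description": "folder 12, letters",
        "operator": "op01",
        "equipment": "flatbed-2",
        "exceptions": "page 3 torn",
    }]
    print(format_scanning_log(sample), end="")
```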

C.14.6

Periodic Reports - LOT 1 and LOT 2

In addition to the scanning log, the contractor shall submit periodic reports of progress during the course of each task. The timing of such reports will be determined when the specific task is planned; for longer tasks monthly reports may be required. The reports shall provide a narrative that summarizes key events or activities, noting special problems or difficulties encountered, and addressing proposed methods for corrections of such problems as the work continues. The reports shall indicate the agreed-upon project schedule and progress relative to the schedule, and shall clearly state any deviations from the schedule with accompanying explanations. The reports shall note changes in equipment or procedure and provide statistics that indicate the accomplishments of the period described.

C.14.7

Accuracy Requirements: Imaging - LOT 1 and LOT 2

The standard of accuracy for images shall be 99.5%, except for those specifications or image attributes requiring 100% accuracy. For example, a batch of images will be rejected if, in a random sample lot size of 200 images, more than one image is found to be missing, duplicated, illegible, or otherwise defective. Examples of items required to be 99.5% accurate include level of compression, image size, and image quality. Examples of items required to be 100% accurate include content of file headers, file format, and resolution. Images shall be inspected and evaluated by the Library in accordance with the American National Standard, general inspection level II (ANSI/ASQC Z1.4-1993 and ANSI/ASQC S2-1995). Except for those specifications requiring 100% accuracy as indicated, images will be inspected in randomly selected samples; the sampling lot size will be determined by the task order production rate, and will be in accordance with the sampling procedures described in the ANSI standards.
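The 200-image example above follows from simple arithmetic, sketched below. This is a simplified proportional rule for the 99.5% attributes only; it is not a substitute for the acceptance numbers in the ANSI/ASQC sampling tables, and the function names are hypothetical.

```python
def allowed_defects(sample_size, accuracy_per_mille=995):
    """Largest number of defective images consistent with the accuracy
    standard, using integer arithmetic (995 per mille = 99.5%)."""
    return sample_size * (1000 - accuracy_per_mille) // 1000


def batch_accepted(sample_size, defects_found):
    """True if the sampled batch meets the 99.5% standard under this
    simplified rule."""
    return defects_found <= allowed_defects(sample_size)


if __name__ == "__main__":
    print(allowed_defects(200))    # one defective image is tolerated
    print(batch_accepted(200, 1))  # accepted
    print(batch_accepted(200, 2))  # rejected: more than one defect
```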

C.14.8

Accuracy Requirements: Text Conversion and SGML-encoding -- LOT 1 only

The original materials that the Library intends to digitize vary in legibility. Only materials that are highly legible (primarily printed or typescript materials) shall be converted to machine-readable, SGML-encoded texts. The standard of accuracy for all SGML-encoded texts shall be determined by the Library at contract award, and shall be either 99.95% or 99.995%. Accuracy is based on a character count, including tags, after encoding. For example, an accuracy level of 99.995% means that no more than one (1) wrong character is permitted for any 20,000 characters keyed, roughly one (1) wrong character per ten (10) pages.
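The two candidate accuracy levels translate into character counts as sketched below. The function names are hypothetical, and the calculation simply restates the example above for both levels.

```python
from fractions import Fraction


def chars_per_error(accuracy_percent):
    """Number of characters keyed for each single permitted wrong character."""
    error_rate = 1 - Fraction(accuracy_percent) / 100
    return int(1 / error_rate)


def permitted_errors(char_count, accuracy_percent):
    """Maximum wrong characters allowed in a text of char_count characters."""
    error_rate = 1 - Fraction(accuracy_percent) / 100
    return int(char_count * error_rate)


print(chars_per_error("99.995"))          # 20000
print(chars_per_error("99.95"))           # 2000
print(permitted_errors(20000, "99.995"))  # 1
print(permitted_errors(20000, "99.95"))   # 10
```

At 99.95%, the same 20,000 characters may therefore contain up to ten wrong characters, a tenfold difference from the stricter level.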

C.14.9

Filenames and Delivery Directories - LOT 1 and LOT 2

The Library's requirements for filenaming and delivery directories are described in Section C.10. All directory names and filenames shall be 99.95% accurate, with the following exceptions: (1) new documents in manuscript folders must be identified and embedded into the filenames with 80 percent accuracy; and (2) features and page numbers in printed matter must be identified and embedded into the filenames with 80 percent accuracy.
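A minimal sketch of how the general standard and its two exceptions might be applied to inspection counts follows. The category keys and the function name are hypothetical; only the percentages come from this section.

```python
from fractions import Fraction

# Hypothetical category keys; the thresholds are those stated in this section.
THRESHOLDS = {
    "general": Fraction("99.95"),
    "manuscript_new_document": Fraction(80),
    "printed_feature_or_page_number": Fraction(80),
}


def meets_threshold(category, correct, checked):
    """True if correct/checked, as a percentage, meets the category threshold."""
    if checked <= 0:
        raise ValueError("checked must be a positive count")
    return Fraction(correct, checked) * 100 >= THRESHOLDS[category]


print(meets_threshold("general", 1999, 2000))                 # True  (99.95%)
print(meets_threshold("general", 1998, 2000))                 # False (99.9%)
print(meets_threshold("manuscript_new_document", 80, 100))    # True
print(meets_threshold("manuscript_new_document", 79, 100))    # False
```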

C.14.10

Contractor Quality Control Program

A quality control program in accordance with the requirements for accuracy and delivery shall be initiated, documented, and maintained throughout the life of this contract. The Library expects that the contractor shall perform quality control for 100 percent of deliverables. A specific quality control plan shall be implemented for each phase of contract performance beginning with capture of document images through text conversion and ultimate acceptance by the Library of all deliverables. In addition, the contractor shall be responsible for inspecting the accuracy of filenames and directories for all digital images, texts and associated files produced under this contract. Inspection hardware and software shall be of appropriate quality, accuracy, and quantity to ensure that all requirements of this contract are met. The contractor shall document all quality control procedures, including actions taken to correct any problems, and submit a quality control report along with (or as a part of) the scanning log with each delivery to the Library. This quality control report must enumerate and describe actions taken.

C.14.11

Contractor Quality Review: Imaging - LOT 1 and LOT 2

The contractor shall perform sufficient image inspection to ensure that deliveries of images to the Library meet the acceptance criteria. Contractor quality review shall include, but is not limited to, the following types of activities:

C.14.11.1

Image Quality.

The contractor shall ensure that image quality meets the acceptance criteria. For example:

C.14.11.2

Document Integrity.

In addition to ensuring that the complete page content has been captured, the contractor shall ensure that the complete source document has been scanned according to instructions provided, and that special instructions relating to specific collections or materials have been followed. This includes ensuring that:

C.14.11.3

Completeness of Scanning Logs.

The contractor shall ensure that scanning logs are completed and maintained in accordance with instructions provided.

C.14.12

Contractor Quality Review: Text Conversion and SGML-encoding - LOT 1 only

The contractor shall review the quality of all SGML-encoded texts before delivery to the Library. The contractor shall correct any errors that can be resolved by reference to the Library's DTD, Tag Library, or supplementary keying instructions before delivering SGML-encoded texts to the Library. However, it will be acceptable for the contractor to deliver to the Library SGML-encoded texts with errors or anomalies that cannot be resolved through the normal process if 1) those errors or anomalies are documented in a memo from the contractor to the Library; and 2) the contractor then applies whatever action is necessary to resolve those errors or anomalies after consultation with Library staff.

C.14.12.1

DTD Implementation.

The contractor shall ensure that the implementation of the American Memory DTD conforms to all the requirements described in the Library's instructions. This shall include confirming that document features have been tagged properly, that attributes are correctly assigned for physical text features such as italics or bold, and that special text such as handwritten or stamped is correctly identified.

C.14.12.2

Pre-parsing review.

Before parsing, the contractor shall review several features of the encoded documents. Any errors or anomalies found in this review shall either be corrected before delivery to the Library or reported to the Library in a memo, as described above. Document features that shall be reviewed before parsing and delivery include: header contents, document filenames and identity, the order of control page numbers and print page numbers, the relationship between control page numbers and print page numbers, the padding of control page numbers to the correct number of digits, and whether the ENTITY values for page images, tables, and illustrations match the appropriate filenames.
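Two of the checks listed above, padding and order of control page numbers, lend themselves to a simple automated pass. The sketch below assumes a width of four digits and strictly ascending numbers; neither assumption is stated in this section, and the function name is hypothetical.

```python
def check_control_page_numbers(numbers, width=4):
    """Return a list of problems found in a sequence of control page numbers.

    numbers is a list of strings in document order. The default width of
    four digits and the ascending-order rule are assumptions of this sketch.
    """
    problems = []
    previous = None
    for position, number in enumerate(numbers):
        # Accept only plain ASCII digits padded to the expected width.
        if not (number.isascii() and number.isdigit()) or len(number) != width:
            problems.append(
                f"item {position}: {number!r} is not padded to {width} digits"
            )
            continue
        value = int(number)
        if previous is not None and value <= previous:
            problems.append(f"item {position}: {number!r} is out of order")
        previous = value
    return problems


print(check_control_page_numbers(["0001", "0002", "0003"]))  # []
print(check_control_page_numbers(["0001", "3", "0002"]))
print(check_control_page_numbers(["0002", "0001"]))
```

Any problems returned would either be corrected before delivery or reported to the Library in a memo, as this section requires.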

C.14.12.3

Parsing.

The contractor shall ensure that all SGML-encoded texts parse in three (3) different validating parsers which have been approved by the Library prior to award.

C.14.13

Project Management

At the Library of Congress, the Contracting Officer's Technical Representative (COTR) will manage and coordinate this effort, while the contractor's project leader shall perform a similar function for the contractor. The COTR and the contractor's project leader will serve as the principal points of technical communication between the two organizations. The objectives of the project management approach are:

The contractor shall provide project management resources sufficient to ensure consistent production flow levels of images and text, and to ensure that all employees, including new employees, are fully trained in safe handling and project-related procedures.

The contractor shall provide schedules for project implementation at the start of each task order, including project startup. These project schedules shall indicate milestones against which progress shall be monitored and evaluated, and shall be updated on a regular basis to be determined at the start of each task order.

The contractor shall ensure that sufficient corporate resources exist within the contractor's organization to provide technical and management support and backup to the proposed project team as required.

C.14.14

Key Personnel

For purposes of this contract, key contractor personnel are defined as follows:

LOT 1 and LOT 2

project manager and designated alternate
digital scanning personnel
quality assurance inspector(s)
imaging engineer(s) or scientist(s)

LOT 1 only

SGML expert-specialist(s)

Project Manager -

At least five years of proven, applicable project management experience on large, complex project management efforts involving ten or more team members. Must have proven experience working with digital imaging and SGML. The contractor's project manager or designated alternate shall have full authority to represent the contractor in all matters regarding this contract.

Digital scanning personnel -

At least two of the proposed digital scanning personnel must have at least three years of proven experience in using scanning equipment and software relevant to this project. All other digital scanning personnel must have at least one year of applicable experience in using scanning equipment and software relevant to this project. All digital scanning personnel proposed by the offeror are considered to be key personnel.

Quality assurance inspector(s) -

At least one of the proposed quality assurance inspectors must have at least two years of proven experience performing quality review of images (LOT 1 and LOT 2). At least one of the proposed quality assurance inspectors must have at least two years of proven experience performing quality review of SGML-encoded text (LOT 1 only). All quality assurance inspectors proposed by the offeror are considered to be key personnel.

Imaging engineer(s) or scientist(s) -

At least one of the individuals proposed must have at least five years of proven technical experience in imaging technology, including strong knowledge and experience with the imaging requirements described in Section C, and with hardware and software relevant to this procurement.

SGML expert-specialist(s) -

(LOT 1 only) At least one of the individuals proposed must have a minimum of five years of proven experience with the conversion of texts to machine-readable form; a minimum of five years of proven experience using SGML and working with DTDs; and demonstrated knowledge of SGML editing packages and related software. At least one of the proposed individuals must have a minimum of three years of experience writing/creating DTDs. At least one of the proposed individuals must have a minimum of one year of proven experience working with the SGML Text Encoding Initiative (TEI).

C.15

DELIVERABLES AND DELIVERY - LOT 1 AND LOT 2

C.15.1

Workflow Tracking System

The Library will install the workflow tracking system, and the contractor will enter data concerning the progress of batches of materials through the workflow process (see C.14.1).

C.15.2

SGML Error Diagnostic Software - LOT 1 Only

The error diagnostic software used to perform quality review of SGML encoded texts described in C.11.5 shall be delivered before the delivery of the first batch of SGML-encoded texts (which are part of the startup activity).

C.15.3

Delivery Sequence

The work for each job executed under the terms of this contract shall be presented in three major deliveries:

C.15.3.1

Test Samples (when applicable)

Those digital images (LOT 1 and LOT 2) or machine-readable, SGML-encoded texts and associated files (LOT 1 only) related to the job analysis and proposal prior to issuance of task order. If the group of samples will fit on 10 or fewer floppy disks, 3.5-inch, IBM-compatible floppy disks may be used. Alternatively, and as a requirement if the sample data exceeds the capacity of 10 floppy disks, the samples shall be furnished on a write-once CD-ROM.

C.15.3.2

Main delivery

This will consist of one or more write-once CD-ROMs. The job may entail the scanning, conversion and SGML encoding of texts and delivery of batches [as discussed in Section D.] These first-delivery CD-ROMs are referred to as alpha disks, meaning that they are the first delivery of the image sets. They will be retained by the Library. LOT 1 Only: Images and their SGML texts shall be delivered separately. Images must be approved by the Library before they are sent for conversion and coding.

C.15.3.3

Rework

Unacceptable digital images or machine-readable, SGML-encoded texts and associated files (LOT 1 only) shall be delivered as rework. The rework may be delivered on floppy disks, on a separate, write-once CD-ROM, or if multi-session format is used, on the disk containing the original delivery. These are referred to as rework disks, meaning that they contain reworked versions of images or texts that failed in the first delivery. Rework for each batch shall be delivered separately and labeled "rwxxx," where xxx is the number of the original delivery batch. For example, rework for original batch "125" shall be labeled "rw125."
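The rework labeling rule can be expressed in a few lines. This sketch assumes that batch numbers consist of digits only and are used exactly as given, without added padding; the function names are hypothetical.

```python
import re

# "rw" followed by the digits of the original delivery batch number.
_REWORK_LABEL = re.compile(r"rw([0-9]+)")


def rework_label(batch_number):
    """Build the rework label for an original delivery batch number."""
    text = str(batch_number)
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"batch number must be digits only: {batch_number!r}")
    return "rw" + text


def original_batch(label):
    """Recover the original batch number from a rework label."""
    match = _REWORK_LABEL.fullmatch(label)
    if match is None:
        raise ValueError(f"not a rework label: {label!r}")
    return match.group(1)


print(rework_label("125"))      # rw125
print(original_batch("rw125"))  # 125
```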

C.15.4

Write-once CD-ROM Disks

As outlined in the subsections that follow, deliveries other than those of small sample batches shall be made on write-once CD-ROM compatible with ISO 9660 specifications and containing DOS files in DOS directories. The disks may be in a single-session or multi-session format. Each CD-ROM and accompanying jewel case shall be labeled with the collection or job names, disk (volume) name (within the job series), date completed, and the indicator "Library of Congress/NDLP."

C.15.5

Alternate Delivery Media

Alternate delivery media may be acceptable as negotiated and determined prior to contract award. Examples include 8mm TAR tapes readable on IBM RS-6000 computers running the AIX version 3.2.5 operating system, or delivery of images by file transfer protocol (ftp), which would permit images to be loaded into directories and subdirectories on servers at the Library. Consideration of the alternatives will take into account compatibility with the Library's existing systems.

C.15.6

Shipping/Packing List Form

Each shipment of digital images delivered to the Library shall include an itemized packing list. Each shipment of digital files on CD-ROMs shall be accompanied by the scanning log covering that shipment, together with directory and filename lists for the disk.

C.15.7

Return of Government Furnished Materials (If Applicable)

All products developed under this contract shall belong to the U.S. Government, including the proprietary right therein. (See H.1, Release, Publication, and Use of Government Furnished Data, Page H-1) The contractor shall return to the Library any original materials supplied. Although the contractor may retain copies of the digital scanned files created as working backups, at the end of the contract period, the contractor shall erase or destroy all backups or duplicate files and materials. Any intermediate materials produced in the course of preparing the required images shall be delivered to the Library. This may include intermediate film copies, or other output. These intermediate materials shall be labeled in a systematic way. Documentation in the form of logs or inventory sheets shall be supplied.
