Database Preparation Services

Database Clean-Up

Filing Indicator Fixes

Filing indicators in title fields specify the number of initial characters to be ignored during computer indexing. Presently, there are eleven title fields in which either the first or second indicator is used in indexing: 130, 240, 243, 245, 440, 630, 730, 740, 830, 840, and 873. Other title fields (e.g., 246) are entered without leading articles and hence have no need for filing indicators.

Member-input records derived from bibliographic utilities often either lack or have incorrectly assigned filing indicators. If the library's online public access catalog indexing does not make use of filing indicators, they can be ignored. In systems that do use filing indicator information, absent or inaccurate indicators will adversely affect retrieval.

LTI's filing indicator fix program is one of the most comprehensive available. It recognizes over 40 languages. For the 245 field, the articles associated with the record's fixed-field language code are compared against the initial text of the title. Based on this comparison, the filing indicator is set to 0 if no match is made, or to the proper value for the matched article. The program also takes into account diacritical marks and special characters that are associated with an article but precede the first actual filing character. For example, the 245 field title "The eve that never sleeps ..." would be assigned the filing indicator value 5, not 4, if a special character such as an opening quotation mark preceded the word "eve".

If the fixed-field language code in bytes 35-37 of the 008 field is either blank or does not match one of LTI's language codes, our algorithm compares the 245 field initial text against a table of common articles and sets the filing indicator to its proper value or to 0 as appropriate.

Because title fields other than 245 (e.g., X30, 240) do not necessarily correspond with the fixed field language code, LTI's program compares non-245 field initial text against the table of common articles and sets the filing indicator to its proper value or to 0 as appropriate. As with the 245 field, LTI also checks for initial diacritical and special characters associated with an article, but preceding the first actual filing characters.
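The comparison described above can be sketched roughly as follows. This is a minimal illustration, not LTI's program: the function name `filing_indicator`, the tiny article tables, and the rule for skipping special characters are all assumptions made for the example, and a production table would cover far more languages and articles.

```python
# Hypothetical, heavily abbreviated article tables keyed by MARC language code.
ARTICLES_BY_LANGUAGE = {
    "eng": ("a", "an", "the"),
    "fre": ("l'", "la", "le", "les", "un", "une"),
    "ger": ("das", "der", "die", "ein", "eine"),
}
# Fallback table of "common articles", used when the language code is
# blank or unknown (as for non-245 title fields).
COMMON_ARTICLES = tuple(
    {a for articles in ARTICLES_BY_LANGUAGE.values() for a in articles}
)


def filing_indicator(title, language_code=None):
    """Return the number of initial characters to ignore in filing (0-9)."""
    articles = ARTICLES_BY_LANGUAGE.get(language_code, COMMON_ARTICLES)
    # Try longer articles first so that "eine" is tested before "ein".
    for article in sorted(articles, key=len, reverse=True):
        n = len(article)
        if title[:n].lower() != article:
            continue
        if article.endswith("'"):
            skip = n                  # elided article: no space follows
        elif title[n:n + 1] == " ":
            skip = n + 1              # article plus the following space
        else:
            continue                  # e.g. "Theory" merely begins with "the"
        # Count special characters that precede the first filing character.
        while skip < len(title) and not title[skip].isalnum():
            skip += 1
        if skip >= len(title):
            return 0                  # nothing left to file on
        return min(skip, 9)           # the indicator is a single digit
    return 0


print(filing_indicator('The "other" side', "eng"))  # 5
print(filing_indicator("Theory of games", "eng"))   # 0
print(filing_indicator("Der Zauberberg"))           # 4, via the common table
```

Note that a sketch like this returns 2 for the title "A is for apple." even though the "A" is not an article; a plain table lookup cannot tell the difference.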

In 245 or 740 fields, there is no way for LTI's program to distinguish the pronoun "one" from some foreign-language articles. As a result, a correctly coded value will occasionally, though rarely, be reset incorrectly. The same problem can also occur in an English-language title, e.g., the 245 field title "A is for apple," where the initial "A" is not an article.

A no-charge option in LTI's filing indicator fix program sets the first indicator in the 245 field based on the presence or absence of a 1XX field. In other words, the first indicator is checked and if necessary changed to 0 when there is no 1XX field present or to 1 if there is a 1XX field in the record.
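As a rough sketch of this option, assume a record is held as a list of field dictionaries with `tag`, `ind1`, and `ind2` keys. That representation and the function name `fix_245_first_indicator` are hypothetical, chosen only for illustration.

```python
def fix_245_first_indicator(fields):
    """Set the 245 first indicator to 1 if any 1XX field exists, else 0.

    The field dictionaries are modified in place and the list is returned.
    """
    has_main_entry = any(
        len(f["tag"]) == 3 and f["tag"].startswith("1") for f in fields
    )
    for field in fields:
        if field["tag"] == "245":
            field["ind1"] = "1" if has_main_entry else "0"
    return fields


record = [
    {"tag": "100", "ind1": "1", "ind2": " "},
    {"tag": "245", "ind1": "0", "ind2": "4"},
]
fix_245_first_indicator(record)
print(record[1]["ind1"])  # 1
```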

MARC Update Service

Over the years, the MARC formats and their implementation by the bibliographic utilities have continued to evolve to meet changing conditions. OCLC Technical Bulletins No. 172 (June 1987) and No. 194 (July 1991) describe some of these changes, including deletion of obsolete tags and subfields; conversion of obsolete tags, indicators, and subfields to current usage; addition of default values for fill characters in fixed-field elements when a default value is supplied in the workform; and elimination of invalid fixed-field codes.

At various times OCLC has scanned its database and made the changes described in the above Technical Bulletins. To achieve consistency with up-to-date MARC standards, a library can request that its records undergo these same changes. Specific examples of changes include the conversion of subfield codes $d and $e in the 245, 246, and 247 fields of the serials format to the currently defined subfield codes $n and $p; conversion of the obsolete 705 and 715 fields in the sound recordings format to 700 and 710; deletion of indicator values in the 260 field; and deletion of the second indicator in 1XX fields, with creation of the appropriate 6XX field, when the second indicator in the 1XX field was initially set to 1.
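A few of these conversions can be sketched as follows. The record representation (a list of field dictionaries), the `record_format` labels, and the function name `apply_marc_updates` are assumptions for illustration only. The sketch also assumes that $d maps to $n and $e to $p, respectively, and that "deleting" the 260 indicator values means resetting them to blanks.

```python
SERIAL_TITLE_TAGS = {"245", "246", "247"}
SERIAL_SUBFIELD_MAP = {"d": "n", "e": "p"}            # assumed pairing
SOUND_RECORDING_TAG_MAP = {"705": "700", "715": "710"}


def apply_marc_updates(fields, record_format):
    """Return a new field list with a few obsolete codings converted."""
    updated = []
    for field in fields:
        tag, ind1, ind2 = field["tag"], field["ind1"], field["ind2"]
        subfields = list(field["subfields"])
        # Serials: obsolete $d/$e in title fields become $n/$p.
        if record_format == "serials" and tag in SERIAL_TITLE_TAGS:
            subfields = [
                (SERIAL_SUBFIELD_MAP.get(code, code), value)
                for code, value in subfields
            ]
        # Sound recordings: obsolete 705/715 become 700/710.
        if record_format == "sound_recordings":
            tag = SOUND_RECORDING_TAG_MAP.get(tag, tag)
        # 260: indicator values are blanked.
        if tag == "260":
            ind1 = ind2 = " "
        updated.append(
            {"tag": tag, "ind1": ind1, "ind2": ind2, "subfields": subfields}
        )
    return updated


serial = [{"tag": "245", "ind1": "0", "ind2": "0",
           "subfields": [("a", "Journal."), ("d", "Part 1,"), ("e", "Physics")]}]
print(apply_marc_updates(serial, "serials")[0]["subfields"])
# [('a', 'Journal.'), ('n', 'Part 1,'), ('p', 'Physics')]
```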

As part of its MARC Update service, LTI offers several options, including conversion of 69X fields to 6XX in serials format records, deletion of NLM (MeSH) subject headings, and deletion of 87X fields. Deletion of NLM subject heading fields and variant form of name fields (87X) is used mainly to reduce the size of the library's database. LTI's MARC Update service is always run as a preliminary authority control clean-up procedure. It incorporates changes based on all updates to LC's USMARC Format for Bibliographic Data. Format integration changes are also made.

In addition to general purpose MARC record upgrades, database vendors must also be able to effect "global" changes to catalog records. LTI's global edit software is flexible, allowing any field or character string to be added, deleted, or replaced with a different character string. Examples of global edits include:

  • adding a 590 note to records sharing a certain characteristic, e.g., holding library
  • removing "stamp" information from call number fields, e.g., deleting the abbreviation "Ref" before call number
  • deleting dashes after title, edition, and physical description fields input mistakenly as prescribed ISBD punctuation
  • deleting 6XX fields with a second indicator other than 0
  • creating and inserting an 040 field if none is present in the record
  • deleting acquisitions information fields from RLIN records
  • expanding holding library symbols in 049 fields from 3 to 4 characters
  • updating OCLC control number preface from "ocl7" to "ocm0"
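Three of the edits listed above can be sketched as a single pass over a record. This is an illustrative sketch only, not LTI's global edit software: the record representation (a list of field dictionaries, with control fields carrying a `data` key), the function name `global_edit`, and the placeholder cataloging source symbol "XXX" are all hypothetical.

```python
def global_edit(fields, cataloging_source="XXX"):
    """Apply three sample global edits and return a new field list."""
    edited = []
    for field in fields:
        field = dict(field)  # copy so the caller's record is left untouched
        tag = field["tag"]
        # Delete 6XX fields whose second indicator is other than 0.
        if tag.startswith("6") and field.get("ind2") != "0":
            continue
        # Update the OCLC control number preface from "ocl7" to "ocm0".
        if tag == "001" and field["data"].startswith("ocl7"):
            field["data"] = "ocm0" + field["data"][len("ocl7"):]
        edited.append(field)
    # Create and insert an 040 field if none is present.
    if not any(f["tag"] == "040" for f in edited):
        new_field = {"tag": "040", "ind1": " ", "ind2": " ",
                     "subfields": [("a", cataloging_source)]}
        position = next(
            (i for i, f in enumerate(edited) if f["tag"] > "040"),
            len(edited),
        )
        edited.insert(position, new_field)
    return edited


record = [
    {"tag": "001", "data": "ocl7123456"},
    {"tag": "245", "ind1": "1", "ind2": "0", "subfields": [("a", "Title")]},
    {"tag": "650", "ind1": " ", "ind2": "2", "subfields": [("a", "Heading")]},
]
print([f["tag"] for f in global_edit(record)])  # ['001', '040', '245']
```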
