[Metadatalibrarians] Initial articles in titles

Keith Jenkins kgj2 at cornell.edu
Thu Aug 7 11:28:14 PDT 2008


Trying to automatically parse initial articles is bound to fail in
some small number of cases, for the reasons that Joe describes.  If
you know the language of the title, then checking for a set of strings
based upon that language is probably the best approach, even though
that still won't be perfect.
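
A minimal sketch of that language-based check, in Python. The article
lists, the language codes, and the name sort_key are all assumptions made
for illustration, and the lists are deliberately incomplete.

```python
# Hypothetical, incomplete article lists keyed by a language code.
ARTICLES = {
    "en": ("a", "an", "the"),
    "de": ("der", "die", "das", "ein", "eine"),
    "fr": ("le", "la", "les", "un", "une", "l'"),
}

def sort_key(title, lang):
    """Return the title with one leading article removed, if any matches."""
    lowered = title.lower()
    for article in ARTICLES.get(lang, ()):
        # Elided forms such as "l'" attach directly to the next word;
        # every other article must be followed by a space.
        prefix = article if article.endswith("'") else article + " "
        if lowered.startswith(prefix) and len(title) > len(prefix):
            return title[len(prefix):]
    return title  # unknown language or no leading article

for title, lang in [("The Beatles", "en"), ("An Stalin", "de"),
                    ("Thé français", "fr"), ("A is for American", "en")]:
    print(lang, repr(title), "->", repr(sort_key(title, lang)))
```

Knowing the language keeps "An Stalin" and "Thé français" intact, but
"A is for American" still loses its "A", which is the kind of case that
no string list will catch.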

For titles derived from MARC records, I've used the non-filing
indicator to remove initial articles.  But even then, I've seen some
errors that were caused by a mistake in the original MARC record.
Seeing a list of titles with the non-filing characters removed can
really help to recognize (and fix) such errors.
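
A rough sketch of that approach, in Python. It assumes each title arrives
paired with its non-filing indicator value (in MARC 21 field 245 this is
the second indicator, a count of characters to skip). The function name
filing_title and the sample records are invented for illustration.

```python
def filing_title(title, nonfiling):
    """Drop the first `nonfiling` characters from a title."""
    try:
        n = int(nonfiling)
    except (TypeError, ValueError):
        n = 0  # blank or invalid indicator: leave the title unchanged
    return title[n:]

# Invented sample data: (title, non-filing indicator as found in the record).
records = [
    ("The Beatles", "4"),
    ("A is for American", "0"),
    ("The Yardbirds", "0"),  # indicator mistakenly left at 0
]

# Print the filing forms in sorted order so bad indicators stand out.
for title, ind in sorted(records, key=lambda r: filing_title(*r).lower()):
    print(f"{filing_title(title, ind):<20} <- {title!r} (indicator={ind})")
```

In the printed list, "The Yardbirds" still files under T, which makes the
faulty indicator easy to spot.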

Even though perfection is elusive, I still think it's worth trying to
remove initial articles if you will be sorting titles alphabetically.
I haven't read any user studies (anyone know of any?), but I suspect
that most users would still expect to see The Beatles and The
Yardbirds under B and Y, rather than T.

Keith


On Thu, Aug 7, 2008 at 1:29 PM, Joe Altimus <jaltimus at gmail.com> wrote:
> Using a program to identify initial articles in titles (whether in data
> processing, indexing, or displays) will fail a small percentage of the time.
> No matter how sophisticated one makes the program to deal with cases (e.g.,
> titles such as "A is for American", "An Stalin" [a German language title],
> "Thé français"), it's likely that some cases are not covered.

