Selected documents

The library partners selected collections of documents relevant to library work. The following table lists the final selection of documents together with their descriptions.

Test cases

From the list above we selected a broad range of document classes against which to test the software, as follows:

Records from the 'Title index' of the above catalogue:
Aquella noche en Varsovia : 1.759

Records from the "Catálogo de Discos de 78 rpm en la Biblioteca Nacional". BNE:
1 Addio a Napoli. Cottrau. -- Canta pe'me. De Curtis.
Int.: Enrico Caruso, con acompañamiento de or-
Barcelona, Compañía del Gramófono-Odeón, 1924.
RCA Manufacturing Company. Camden, New Jer-
sey. A 23140, CA 11306.
La Voz de su Amo DA 104.
6 min. 25 cm.
D. Cª. 13/1

Records from the name index of the "Catálogo..." above:
MAGYARI, Marika: 2.456, 2.541, 2.604, 2.633. 2.662, 2.743.

Entries from the "Diccionario-Glosario de términos económicos". In press:
Asignación de los recursos
Samuelson 1126
Resource allocation
Samuelson 981

Records from the "Indice Español de Ciencias Sociales". CINDOC:
ACCION SOCIAL 0346, 0354, 0450, 0451, 2492, 2813, 2884 2885, 2896

ISBD bibliographic cards:
59.9
ADLER, Alfredo

Conocimiento del hombre / Alfredo Adler.--
5ª ed.-- Madrid : Espasa-Calpe, 1968
229 p. ; 18 cm.-- (Colección austral ; 775.
Ciencia y técnica)
D.L. M-23.498-1968
1. HOMBRES - Psicología
R. 8271

Samples from the table of contents of 'Review of Aesthetics':
General Theories of Art versus Music STEPHEN DAVIES 315
Art and the Transcendent T. J. DIFPEY 326
The Possibility of Aesthetics PATRICK COLM HOGAN 337
Shaftesbury: Father or Critic of Modern Aesthetics? JORGE V. ARREGUI and PABLO ARNAU 350
Authenticity in Musical Performance: Personal or Historical? JANE W. O. DEA 361
Symposium: Truth, Meaning and Literature BERNARD HARRISON and RICHARD GASKIN 376
On Analysing Analytic Aesthetics RICHARD SHUSTERMAN 389
Book Reviews 395
Books Received 415
Journals Received 417

Entries from the thesaurus in the "Indice Español de Ciencias Sociales":
Galeno
U.p. Galen
U.p. Galeno, Claudio
U.p. Galenus, Claudius

Records from the "Catálogo de Villancicos", BNE:
555
VILLANCICOS que se han de cantar en los Maytines de los Reyes, en la... Iglesia Metropolitana Cesar-Augustana, en su... Templo de el Pilar, este año de 1702 / puestos en musica por D. Miguel Ambiela, Racionero y Maestro de Capilla...--En Zaragoza: por Domingo Gascon..., 1702.--[8] p.; 4º
Sign.: A*-4.--Datos de pie de imp. tomados del colofón.--Texto a dos col.--Port. con viñeta xil. que representa a la Virgen del Pilar
1. Primero Nocturno. Villancico Primero: "Tres Purpuras del Oriente..." [Int., Estr. y Coplas]
2. Villancico Segundo: "Ay ¡qué tierno Infante!..." [Estr. y Coplas]
3. Villanciaco Tercero: "Al Portal vnas Gitanas..." [Int., Estr. y Coplas]
4. Segundo Nocturno. Villancico Qvarto: "Tres Monarcas, y una Estrella..." [Int., Estr. y Coplas]
5. Virlancico Qvinto: "Moradores del Orbe, atended,..." [Estr. y Coplas]
6. Villancico Sexto: "Seis Poetas de repente,..." [Int., Estr. y Xacara]
7. Tercer Nocturno. Villancico Septimo: "Ay, que se duerme mi afecto,..." [Estr. y Coplas]
8. Villancico Octavo: "Vnos Rusticos Villanos..." [Int., Estr. y Coplas (Zacara y Gayta)]
Jiménez Catalán, 21; Palau, 367797
· VE/ 1303-19(1) (Barbieri)

For almost every document class, several grammars were defined in order to assess the system's ability to perform analyses at different levels of granularity.
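
As an illustration of what analysis at different granularities can mean, the following minimal sketch analyses the name index entry from the test cases at two levels. It is written with Python regular expressions and is not the AFCA grammar formalism; the names COARSE, FINE_HEADING, RECORD_NUMBER, parse_coarse and parse_fine are assumptions introduced only for this example.

```python
import re

ENTRY = "MAGYARI, Marika: 2.456, 2.541, 2.604, 2.633. 2.662, 2.743."

# Coarse grammar (hypothetical): a heading followed by an unanalysed
# reference string.
COARSE = re.compile(r"^(?P<heading>[^:]+):\s*(?P<references>.+)$")

# Fine grammar (hypothetical): the heading splits into surname and forename,
# and the reference string splits into individual record numbers.
FINE_HEADING = re.compile(r"^(?P<surname>[^,]+),\s*(?P<forename>.+)$")
RECORD_NUMBER = re.compile(r"\d+\.\d+")


def parse_coarse(entry):
    """Return heading and raw references, or None if the entry does not fit."""
    match = COARSE.match(entry)
    if match is None:
        return None
    return {
        "heading": match.group("heading").strip(),
        "references": match.group("references").strip(),
    }


def parse_fine(entry):
    """Refine the coarse analysis into surname, forename and record numbers."""
    coarse = parse_coarse(entry)
    if coarse is None:
        return None
    heading = FINE_HEADING.match(coarse["heading"])
    if heading is None:
        return None
    return {
        "surname": heading.group("surname").strip(),
        "forename": heading.group("forename").strip(),
        "records": RECORD_NUMBER.findall(coarse["references"]),
    }
```

The coarse analysis only requires a colon in the entry, while the fine analysis also requires a comma in the heading, so the finer grammar extracts more structure but will reject more entries.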

The quality of the analysis was in every case more than satisfactory, particularly with regard to correctness and completeness. We nevertheless found some analysis errors in every collection. This was not unexpected: it is a well-known feature of text analysis that, when working on actual documents, the document codification is never wholly stable. This instability rules out defining a non-trivial, totally flawless grammar for a representative document collection, because, given two subclasses in the same collection, the best grammar for the first subclass increases the number of errors in the analysis of the other.

It is important to note that an unparsable text segment does not usually halt the parsing process. Instead, the intractable text subsegment is labelled 'ind' and parsing continues. In the same way, when a mandatory area is not found in a document, the fact is recorded as an empty field labelled 'ign' and the analysis continues. For many other possible errors, the parser itself detects the flaw and marks the area accordingly. This, together with the capabilities of the DFA component for selecting wrong analyses, simplifies error correction at the area analysis level.
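
The error-tolerant behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Parser Engine itself: the area definitions in AREAS, the '.--' segment separator, the Field structure and the analyse function are all hypothetical. Only the labels 'ind' and 'ign' come from the text.

```python
import re
from dataclasses import dataclass


@dataclass
class Field:
    label: str          # area name, or 'ind' / 'ign'
    text: str           # matched text (empty for 'ign')
    expected: str = ""  # for 'ign': the mandatory area that was not found


# Hypothetical area definitions, in expected order: (name, pattern, mandatory).
AREAS = [
    ("title", re.compile(r"^.+ / .+$"), True),
    ("edition", re.compile(r"^\d+ª ed$"), True),
    ("imprint", re.compile(r"^.+ : .+, \d{4}$"), True),
]

# Sample record with a damaged second segment (invented for this sketch).
RECORD = "Conocimiento del hombre / Alfredo Adler.-- ??? illegible ???.-- Madrid : Espasa-Calpe, 1968"


def analyse(record):
    """Label each '.--'-separated segment without aborting on a bad one."""
    fields = []
    next_area = 0  # index of the first area not yet matched
    for segment in (s.strip() for s in record.split(".--")):
        for index in range(next_area, len(AREAS)):
            name, pattern, _mandatory = AREAS[index]
            if pattern.match(segment):
                # Mandatory areas skipped over become empty 'ign' fields.
                for skipped, _p, mandatory in AREAS[next_area:index]:
                    if mandatory:
                        fields.append(Field("ign", "", skipped))
                fields.append(Field(name, segment))
                next_area = index + 1
                break
        else:
            # No remaining area accepts this segment: label it and carry on.
            fields.append(Field("ind", segment))
    # Mandatory areas never reached are also recorded as 'ign'.
    for skipped, _p, mandatory in AREAS[next_area:]:
        if mandatory:
            fields.append(Field("ign", "", skipped))
    return fields
```

Calling analyse(RECORD) yields a title field, an 'ind' field holding the damaged segment, an empty 'ign' field for the missing edition area, and an imprint field, so a reviewer can locate both kinds of problem without the analysis of the rest of the record being lost.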

The Parser Engine was also tested intensively from November 1995 onwards, when it was used to perform the retrospective conversion of a university library card catalogue (35000 records). In this case, the information trees obtained for the documents were translated into UKMARC coding.

The processing time of the AFCA module depends heavily on the length of the documents. The following figures summarise typical values for the analysis. The tests ran on a 486-compatible PC (66 MHz) with 16 Mbytes of RAM.

Type of record                  Time
ISBD cards                      3.2 sec / item (item = 1 record)
Discos 78 rpm                   1.4 sec / item (item = 1 record)
Index discos                    0.34 sec / item (item = 1 entry)
Summary Review of Aesthetics    4 sec / item (item = 1 page summary)
CINDOC thesaurus                0.3 sec / item (item = 1 entry)
Villancicos catalogue           4.5 sec / item (item = 1 record)
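
To give a feel for what these per-item figures imply for a batch job, the short calculation below estimates total processing time for a collection. It is only a sketch: the function name estimated_hours is hypothetical, and applying the ISBD card rate to a 35000-record catalogue is an assumption made for illustration, since the rate actually achieved in the retrospective conversion is not reported here.

```python
# Typical per-item analysis times in seconds, taken from the table above.
SECONDS_PER_ITEM = {
    "ISBD cards": 3.2,
    "Discos 78 rpm": 1.4,
    "Index discos": 0.34,
    "Summary Review of Aesthetics": 4.0,
    "CINDOC thesaurus": 0.3,
    "Villancicos catalogue": 4.5,
}


def estimated_hours(record_type, items):
    """Estimate total analysis time in hours for a number of items."""
    return SECONDS_PER_ITEM[record_type] * items / 3600.0


# Assumption: 35000 cards processed at the ISBD card rate.
# 35000 * 3.2 s = 112000 s, i.e. roughly 31 hours.
print(round(estimated_hours("ISBD cards", 35000), 1))
```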

It is worth mentioning that some limitations on document length were detected, due to peculiarities of the underlying software. These, as well as other limitations (related, for instance, to the DFA selection capabilities), should disappear in future versions.

Other application fields

During software development, in order to assess the general applicability of the techniques developed, the AFCA software, and the Parser Engine in particular, was tested against other kinds of documents. These application fields lie outside the BiblioTECA application field and will be the object of further exploration.

