John Choi, Don Hindle, Julia Hirschberg, Ivan Magrin-Chagnolleau,
Christine Nakatani, Fernando Pereira, Amit Singhal, Steve Whittaker.
An Overview of the AT&T Spoken Document Retrieval System.
Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop, USA, 1998.

Abstract: We present an overview of a spoken document retrieval system
developed at AT&T Labs-Research for the HUB4 Broadcast News corpus.  This
overview includes a description of the intonational phrase boundary detection,
classification, speech recognition, information retrieval and user interface
components of the system, along with updated system assessments based on the
49-query task defined for the TREC-6 SDR track.  Results from a comparative
ranking study, based on queries taken from AP Newswire headlines from the same
time period in which the Broadcast News corpus was recorded, are presented.  For
the AP task, retrieval accuracy is assessed by comparing the documents
retrieved from ASR-generated transcriptions with those from human-generated
transcriptions.

Contact: ivan@ieee.org