Development of Spoken Language Corpora for Travel Information
(from LIMSI 1995 Scientific Report, April 1995)
L. Lamel, S. Rosset, S. Bennacef, H. Bonneau-Maynard,
L. Devillers, J.L. Gauvain
Object
Collecting spoken language corpora is an important aspect of our
research, and represents a significant portion of the work in
developing a spoken language system. We report on our collection of
spontaneous speech data from naive subjects in the context of two
tasks: an air travel information task (L'ATIS) and a train
travel information task (MASK).
Content
L'ATIS is a French version of the ARPA
ATIS task which has been used as a common task for data collection
and evaluation within the ARPA Speech and Natural Language
program. L'ATIS allows users to acquire information about fares and
flights available between a restricted set of cities within the United
States and Canada as well as some ancillary information. In the MASK
task users can ask for rail travel information such as timetables,
tickets and reservations for train travel among 500 cities in France.
This data collection is being carried out in the context of the ESPRIT
project MASK (Multimodal-Multimedia Automated Service Kiosk) in which
we are developing a spoken language system interface for an automated
service kiosk.
We started our data collection for the L'ATIS task using a Wizard of Oz (WOZ)
setup, in which a wizard typed a paraphrased version of the spoken query
to the system. (For this initial setup, the NL understanding
component was developed in collaboration with colleagues at MIT;
cf. Eurospeech'93.) In September 1994 we greatly expanded our
L'ATIS data collection efforts, and since January 1995 we have been recording
subjects on a regular basis for both tasks. The recordings are made
in an office environment, simultaneously with a close-talking, noise-cancelling
Shure SM10 microphone and a table-top Crown PCC160
microphone, using up-to-date versions of our spoken language systems.
Situation
The cumulative number of subjects and the number of queries recorded thus
far for the L'ATIS task are given in Table 1. We are collecting
speech at a rate of over 1000 queries per month from at least 20
speakers. Queries average 10 words. The total number
of words and the number of distinct words are also shown. The new
word rate is seen to decrease from September through December. In mid-December
a new version of the L'ATIS data collection system was
installed. This version corrected some problems in the maintenance of
the dialog history and integrated a new version of the speech
recognizer. The combined improvements significantly changed the
user's interaction with the system. With a better-performing speech
recognizer, speakers speak more easily and use longer and more varied
sentences. These new recordings will in turn be used to improve the
system, leading once again to a more flexible, better-performing system.
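The corpus statistics mentioned above (queries, total words, distinct words, words per query, new word rate) can be illustrated with a minimal sketch. It is not the tooling used for the collection. The function names are hypothetical, tokenization is plain whitespace splitting (a simplification for French text), and the new word rate is assumed here to be the fraction of word tokens in a later batch that did not occur in earlier recordings.

```python
def corpus_stats(queries):
    """Return query count, total words, distinct words and mean words per query."""
    # Assumption: lower-cased whitespace tokenization stands in for real transcription rules.
    tokens_per_query = [q.lower().split() for q in queries]
    total_words = sum(len(tokens) for tokens in tokens_per_query)
    vocabulary = {word for tokens in tokens_per_query for word in tokens}
    return {
        "queries": len(queries),
        "total_words": total_words,
        "distinct_words": len(vocabulary),
        "words_per_query": total_words / len(queries) if queries else 0.0,
    }


def new_word_rate(earlier_queries, later_queries):
    """Fraction of word tokens in later_queries never seen in earlier_queries.

    This definition is an assumption made for illustration.
    """
    seen = {word for q in earlier_queries for word in q.lower().split()}
    later_tokens = [word for q in later_queries for word in q.lower().split()]
    if not later_tokens:
        return 0.0
    unseen = sum(1 for word in later_tokens if word not in seen)
    return unseen / len(later_tokens)
```

Computing `new_word_rate` month by month, with all previous months as the earlier set, would give a curve of the kind described for September through December.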
For MASK we have recorded 89 speakers and a total of 4889
queries. There are 931 distinct words in the MASK queries,
about one-third of which are not in the L'ATIS word list. The MASK
queries are slightly shorter (8 words per query on average) than the
L'ATIS queries. These shorter sentences are probably linked to
the performance of the speech recognizer, which for now has a higher
word error rate than for L'ATIS owing to the limited amount of training
data. Table 2 shows recent progress in speech recognition and query
understanding for the MASK task. As more data is recorded we
will be able to improve the performance of the MASK data
collection system, which in turn should enable us to record spontaneous
speech representative of how a user interacts with a fully automated
system.
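The comparison between the MASK and L'ATIS word lists amounts to a set difference over the two vocabularies. The sketch below is illustrative only: the function name is hypothetical, and the example words are invented, not taken from the corpora.

```python
def out_of_list_fraction(task_vocab, reference_vocab):
    """Fraction of distinct words in task_vocab that are absent from reference_vocab."""
    task = set(task_vocab)
    if not task:
        return 0.0
    return len(task - set(reference_vocab)) / len(task)


# Invented example vocabularies, for illustration only.
mask_words = {"train", "billet", "vol"}
latis_words = {"vol", "tarif"}
fraction = out_of_list_fraction(mask_words, latis_words)
```

Applied to the 931 distinct MASK words and the L'ATIS word list, this computation would yield the roughly one-third figure reported above.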
References
[1] S.K. Bennacef, H. Bonneau-Maynard, J.L. Gauvain, L. Lamel, W.
Minker, "A Spoken Language System For Information Retrieval,"
ICSLP'94.
[2] L. Lamel, S. Rosset, S.K. Bennacef, H. Bonneau-Maynard, L. Devillers, J.L. Gauvain, "Development of Spoken
Language Corpora for Travel Information," to appear in
Eurospeech'95.