The man-machine telephone corpus
For this paper, we consider only the small subset of human-machine telephone recordings that forms part of the telephone corpus. This subset was collected by means of an automatic telephone call system specifically designed and developed for the C-ORAL-ROM project by ITC-Irst (Falavigna & Gretter 2003). The system handles interactions with human callers asking for train timetable information in three languages: Italian, French and Spanish.
The collection was recorded through a toll-free telephone number during March and April 2002. All the telephone calls were automatically transcribed by ITC-Irst and manually checked by teams at the Universities of Florence, Madrid and Aix.
Before using the system, each caller was given only general information on how to proceed. Callers were thus able to interact freely with the system, and many of the linguistic expressions they used were not covered by the speech recognition grammars. This caused many recognition errors, and many callers had difficulty completing the dialog successfully.
The dialog system developed for the C-ORAL-ROM corpus is based on a mixed-initiative dialog strategy. Such systems allow the user to take the dominant role at any point in the interaction, in contrast to menu-based systems, which only let the user interact through a sequence of predefined steps. Most commercially available spoken dialog systems use the latter strategy.
In mixed-initiative systems the task can be seen as a “form filling” problem, where each field of the form corresponds to a basic piece of information. These systems must be able to perform some kind of natural language processing; in particular, it is sufficient to provide a basic semantic interpretation of the user's input utterance. Some procedures for recognising errors and recovering from possible misunderstandings are implemented. However, we will see in the paper that performance is still very low, mainly due to the weakness of the language models, that is, the recognition grammar and the dialog strategies. Problems with the acoustic model will not be addressed in this paper.
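The “form filling” view can be illustrated with a minimal sketch. It is not the ITC-Irst implementation: the slot names (`origin`, `destination`, `time`), the regular-expression patterns standing in for a recognition grammar, and the prompt wording are all assumptions made for illustration. The point is only that a mixed-initiative system accepts any field in any order and prompts for whatever is still missing.

```python
import re
from typing import Dict, Optional

# Hypothetical slot patterns standing in for a recognition grammar.
# A real grammar would cover far more ways of expressing each field.
SLOT_PATTERNS = {
    "origin": re.compile(r"\bfrom (\w+)"),
    "destination": re.compile(r"\bto (\w+)"),
    "time": re.compile(r"\bat (\d{1,2}(?::\d{2})?)"),
}


def interpret(utterance: str) -> Dict[str, str]:
    """Basic semantic interpretation: extract every slot found in the utterance."""
    text = utterance.lower()
    found = {}
    for slot, pattern in SLOT_PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[slot] = match.group(1)
    return found


def update_form(form: Dict[str, str], utterance: str) -> Dict[str, str]:
    """Mixed initiative: the caller may fill any fields, in any order, per turn."""
    form.update(interpret(utterance))
    return form


def next_prompt(form: Dict[str, str]) -> Optional[str]:
    """Ask for the first field still empty, or return None when the form is complete."""
    for slot in SLOT_PATTERNS:
        if slot not in form:
            return f"Please tell me the {slot}."
    return None


if __name__ == "__main__":
    form: Dict[str, str] = {}
    update_form(form, "to Madrid")
    print(form, "|", next_prompt(form))
    update_form(form, "from Florence at 9:30")
    print(form, "|", next_prompt(form))
```

A menu-based system would instead ask for each field in a fixed order and ignore anything else the caller says. The sketch also hints at the coverage problem described above: with these toy patterns, an utterance such as "I want to go from Florence to Madrid" is misread, because "to go" matches the destination pattern first.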