Do the Urgent Things first! – Detecting Urgency in Spoken Utterances based on Acoustic Features

Jakob Landesberger²³, Ute Ehrlich², Wolfgang Minker³

² Mercedes-Benz AG, Sindelfingen, Germany. ³ University Ulm, Ulm, Germany.

 

  • In the future, spoken dialogue systems will have to deal with more complex user utterances and should react in an intuitive, comprehensible way by adapting to the user, the situation and the context.
  • In rapidly changing situations, such as talking to a highly automated car, the system must react adequately to quick, urgent interjections, whether they occur within a single utterance or interrupt an ongoing action or dialogue.
  • A first step is the detection of urgency in user utterances.

Urgency Corpus “What is it?”

Gamification Experiment. The player had to find a symbol by asking questions to a system. Urgent time-limited tasks (right figure) alternated with non-urgent tasks (left figure).

U: “Is the symbol I am looking for red?” – S: “no”

U: “Does the symbol have an orange background?” – S: “yes”

U: “Is the symbol an animal?” –  S:

 

Results

42 participants (average age 27; 15 ♀, 27 ♂); 3210 urgent and 7527 non-urgent utterances.

Extracted Features 

In order to detect urgency we extracted 108 features directly from the audio signal:

 

Classifier Results

The analysis of the extracted data revealed that different features are better suited for detecting urgency in different phases of the interaction.

Transition: The switch from a non-urgent utterance to an urgent one.

Decline: The switch from an urgent utterance to a non-urgent one.
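
The two phase definitions can be made concrete with a small labelling routine. The following is a minimal sketch, assuming each utterance in a session already carries a ground-truth urgency flag and that a phase is assigned to the utterance directly after the switch. The function name `label_phases` and the label strings are hypothetical and do not come from the poster.

```python
from typing import List, Optional


def label_phases(urgent_flags: List[bool]) -> List[Optional[str]]:
    """Label each utterance by how its urgency differs from the previous one.

    Assumption: the phase is attached to the first utterance after the switch.
    Utterances without a switch (including the first one) get None.
    """
    phases: List[Optional[str]] = []
    previous: Optional[bool] = None
    for flag in urgent_flags:
        if previous is None or flag == previous:
            phases.append(None)            # no switch in urgency
        elif flag:
            phases.append("transition")    # non-urgent -> urgent
        else:
            phases.append("decline")       # urgent -> non-urgent
        previous = flag
    return phases


# Example session: two non-urgent, two urgent, one non-urgent utterance
print(label_phases([False, False, True, True, False]))
# [None, None, 'transition', None, 'decline']
```

Labelling utterances this way would allow features and classifiers to be evaluated separately on the Transition subset, the Decline subset, and on all utterances.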

We used three different estimators to reduce the total number of features and six different classifiers to automatically detect urgency. The results for each combination of estimator and classifier are shown in the figure (top left table: Transition phase; top right table: Decline phase; bottom table: all utterances).
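
The evaluation scheme described above, every feature-reduction method paired with every classifier, can be sketched as a simple grid loop. The poster does not name its estimators or classifiers, so the selectors and classifiers below (`select_by_mean_gap`, `select_first_k`, `fit_nearest_centroid`, `fit_majority`) are deliberately simple, hypothetical stand-ins, and the data is made up purely for illustration. Only the pairing logic is meant to carry over.

```python
from itertools import product
from statistics import mean


def select_by_mean_gap(X, y, k):
    """Stand-in selector: keep the k features whose class means differ most."""
    def gap(j):
        urgent = [row[j] for row, label in zip(X, y) if label == 1]
        calm = [row[j] for row, label in zip(X, y) if label == 0]
        return abs(mean(urgent) - mean(calm))
    ranked = sorted(range(len(X[0])), key=gap, reverse=True)
    return sorted(ranked[:k])


def select_first_k(X, y, k):
    """Stand-in baseline selector: keep the first k features."""
    return list(range(k))


def fit_nearest_centroid(X, y):
    """Stand-in classifier: predict the class with the closest feature centroid."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [row for row, l in zip(X, y) if l == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def predict(row):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])),
        )
    return predict


def fit_majority(X, y):
    """Stand-in baseline classifier: always predict the most frequent label."""
    majority = max(sorted(set(y)), key=list(y).count)
    return lambda row: majority


def project(X, columns):
    """Reduce every feature vector to the selected columns."""
    return [[row[j] for j in columns] for row in X]


def evaluate_grid(selectors, classifiers, train, test, k):
    """Return the test accuracy for every (selector, classifier) pair."""
    X_train, y_train = train
    X_test, y_test = test
    scores = {}
    for (s_name, select), (c_name, fit) in product(selectors.items(), classifiers.items()):
        columns = select(X_train, y_train, k)
        predict = fit(project(X_train, columns), y_train)
        reduced_test = project(X_test, columns)
        hits = sum(predict(row) == label for row, label in zip(reduced_test, y_test))
        scores[(s_name, c_name)] = hits / len(y_test)
    return scores


# Made-up toy data (not from the corpus): two features, label 1 = urgent
train = ([[0.5, 0.1], [0.4, 0.2], [0.6, 0.9], [0.5, 0.8]], [0, 0, 1, 1])
test = ([[0.9, 0.15], [0.1, 0.85]], [0, 1])
selectors = {"mean_gap": select_by_mean_gap, "first_k": select_first_k}
classifiers = {"nearest_centroid": fit_nearest_centroid, "majority": fit_majority}

scores = evaluate_grid(selectors, classifiers, train, test, k=1)
for pair, accuracy in scores.items():
    print(pair, accuracy)
```

Running the same loop once per subset (Transition, Decline, all utterances) would yield one results table per phase, matching the layout described above.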

 

 

Conclusion & Future Work:

  • Depending on the phase, different features, feature-selection techniques, and classifiers showed varying degrees of success.
  • Especially in the Decline phase, representing the shift from urgent to non-urgent requests, urgency can be identified with a high probability.
  • Accordingly, combining different classifiers, or switching between them dynamically depending on the phase, could prove successful for real-world problems.
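
One possible reading of the last point is a dispatcher that picks a classifier based on the urgency predicted for the previous utterance. The following is a minimal sketch under that assumption, not the authors' design: after a non-urgent utterance only a Transition can occur, so a Transition-tuned classifier is consulted, and after an urgent one a Decline-tuned classifier is used. The function `classify_dialogue` and the threshold classifiers in the usage example are hypothetical.

```python
from typing import Callable, List, Sequence

Features = Sequence[float]
# A classifier maps a feature vector to True when the utterance is urgent.
Classifier = Callable[[Features], bool]


def classify_dialogue(
    utterances: List[Features],
    transition_clf: Classifier,
    decline_clf: Classifier,
    initially_urgent: bool = False,
) -> List[bool]:
    """Classify utterances in order, switching classifiers by the previous result."""
    predictions: List[bool] = []
    previous_urgent = initially_urgent
    for features in utterances:
        # After an urgent utterance only a decline is possible, and vice versa.
        clf = decline_clf if previous_urgent else transition_clf
        previous_urgent = clf(features)
        predictions.append(previous_urgent)
    return predictions


# Hypothetical threshold classifiers on a single made-up feature
strict = lambda f: f[0] > 0.8    # used to detect a switch to urgent
lenient = lambda f: f[0] > 0.3   # used to detect a switch back to non-urgent

print(classify_dialogue([[0.5], [0.9], [0.5], [0.2], [0.5]], strict, lenient))
# [False, True, True, False, False]
```

In this sketch the same feature value (0.5) is classified differently depending on the preceding utterance, which is the kind of phase-dependent behaviour the conclusion suggests exploring.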