Speech Recognition : reco module

Speech recognition abilities come from the SAMOVA team at the IRIT laboratory. For more information, see their web pages.

A French speech recognition system has been developed to allow visitors to interact with the robot through basic vocal commands. It is based on the open-source recognition engine Julius (http://julius.sourceforge.jp/en/julius.html), or more precisely on Julian, the version of Julius that performs grammar-based recognition. This automatic speech recognition system is intended to recognize two types of vocal commands: (1) basic control and movement commands and (2) guidance commands towards one of the available destinations.

Three linguistic resources are necessary for the recognition process: acoustic models (phones), a word lexicon including word pronunciations, and a grammar describing the possible syntax of word sequences for the specific task. A French word lexicon (150 words / 250 pronunciation variants) and a grammar have been specially designed for Rackham and its dedicated tasks at the Cité de l'Espace (City of Space). Word pronunciations were mainly obtained from BDLEX and its lexical and phonological resources, which are available at IRIT. The acoustic models were taken directly from previous work carried out in the SAMOVA team (IRIT).
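
To make the roles of the lexicon and the grammar concrete, here is a minimal Python sketch. It is not the actual Rackham resources and does not use the Julius file formats: the words, the phone strings, the word classes and the rules are all invented for illustration, and the acoustic models are left out entirely.

```python
# Toy lexicon: word -> pronunciation variants.
# The phone strings are invented placeholders, not BDLEX entries.
LEXICON = {
    "avance": ["a v an s", "a v an s @"],
    "recule": ["r @ k y l", "r k y l"],
    "stop": ["s t o p"],
    "va": ["v a"],
    "au": ["o"],
    "planetarium": ["p l a n e t a r j o m"],
}

# Toy grammar: word classes, and rules given as sequences of classes.
# Class names and rules are hypothetical.
CLASSES = {
    "MOVE": {"avance", "recule", "stop"},
    "GO": {"va"},
    "PREP": {"au"},
    "DEST": {"planetarium"},
}
RULES = [
    ("MOVE",),               # basic control and movement commands
    ("GO", "PREP", "DEST"),  # guidance commands towards a destination
]


def accepts(words):
    """Return True if the word sequence matches one of the grammar rules."""
    for rule in RULES:
        if len(rule) == len(words) and all(
            word in CLASSES[cls] for word, cls in zip(words, rule)
        ):
            return True
    return False


def count_variants(lexicon):
    """Total number of pronunciation variants in the lexicon."""
    return sum(len(variants) for variants in lexicon.values())


def missing_pronunciations():
    """Words used by the grammar that have no lexicon entry."""
    grammar_words = set().union(*CLASSES.values())
    return sorted(grammar_words - set(LEXICON))
```

In this sketch `accepts(["va", "au", "planetarium"])` is true while `accepts(["au", "va"])` is false, which is the kind of syntactic constraint the grammar imposes on word sequences. `count_variants` mirrors the "words / pronunciation variants" figures quoted for the real lexicon.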

Once a speaker's utterance has been recorded, the engine searches for the most likely word sequence under the constraints given by the grammar. The best sentence is then converted into a specific command, which is subsequently processed by the supervisor.
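
The last two steps, keeping the best grammatical sentence and turning it into a command, can be sketched as follows. This is an assumption-laden illustration: the command table, the field names and the scored hypothesis list are hypothetical, and the sketch filters hypotheses after the fact, whereas a grammar-based decoder applies the grammar during the search itself.

```python
# Hypothetical table: sentence accepted by the grammar -> command
# passed on to the supervisor. Field names are invented.
COMMANDS = {
    "avance": {"type": "move", "action": "forward"},
    "recule": {"type": "move", "action": "backward"},
    "stop": {"type": "move", "action": "stop"},
    "va au planetarium": {"type": "guide", "destination": "planetarium"},
}


def best_sentence(hypotheses):
    """Pick the highest-scoring sentence that the grammar allows.

    hypotheses: list of (sentence, log_score) pairs, higher is better.
    Returns None when no hypothesis is grammatical.
    """
    allowed = [(s, score) for s, score in hypotheses if s in COMMANDS]
    if not allowed:
        return None
    return max(allowed, key=lambda pair: pair[1])[0]


def to_command(sentence):
    """Convert a recognized sentence into a command, or None if unknown."""
    if sentence is None:
        return None
    return COMMANDS.get(sentence)
```

For example, given the hypotheses `[("bonjour robot", -5.0), ("va au planetarium", -12.5), ("avance", -20.0)]`, the out-of-grammar sentence is discarded despite its better score, and `to_command(best_sentence(...))` yields the guidance command for the planetarium.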

Multi-modality : fusion module

Speech recognition is performed by a GenoM module based on Julius, an off-the-shelf free software package. The GenoM module FUSION aims to complete these verbal utterances with the output of the GEST module, using a late, hierarchical fusion strategy. The head location makes it possible to extract the human's position, while the head and hand positions give a pointing direction, enabling the robot to obey oro-gestural commands, that is, commands combining speech and gesture.
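
A late, hierarchical fusion of this kind might look like the Python sketch below. It does not reproduce the FUSION or GEST interfaces: the `Gesture` type, the `deictic` flag and the command fields are hypothetical, and taking the pointing direction as the unit vector from the head to the hand is an assumption made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Gesture:
    """Hypothetical gesture-tracker output: 3D head and hand positions."""
    head: Vec3
    hand: Vec3


def pointing_direction(gesture: Gesture) -> Vec3:
    """Unit vector from head to hand (assumed pointing model)."""
    delta = tuple(h - k for h, k in zip(gesture.hand, gesture.head))
    norm = math.sqrt(sum(c * c for c in delta))
    if norm == 0.0:
        raise ValueError("head and hand coincide, no direction defined")
    return tuple(c / norm for c in delta)


def fuse(speech: dict, gesture: Optional[Gesture]) -> Optional[dict]:
    """Late, hierarchical fusion: speech is interpreted first, and the
    gesture is consulted only when the utterance is incomplete."""
    if not speech.get("deictic"):
        return speech  # the verbal command is self-sufficient
    if gesture is None:
        return None  # deictic utterance that cannot be resolved
    completed = dict(speech)
    completed["human_position"] = gesture.head
    completed["direction"] = pointing_direction(gesture)
    return completed
```

The fusion is "late" because each modality is interpreted separately before being combined, and "hierarchical" because speech takes priority: a complete utterance such as a stop command passes through unchanged, while a deictic one (for instance "go there") is only executable once the gesture supplies the missing direction.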

