Resolving ambiguities of a gaze and speech interface
Qiaohui Zhang, Atsumi Imamiya, Kentaro Go, Xiaoyang Mao
Proceedings of the 2004 symposium on Eye tracking research & applications, 2004, pp. 85–92.
Abstract: Recognition ambiguity is inevitable in recognition-based user interfaces. A multimodal architecture should be an effective means of reducing this ambiguity and contributing to error avoidance and recovery, compared with a unimodal one. But does a multimodal architecture always outperform a unimodal one? If not, when does it perform better, and when is it optimal? Furthermore, how can modalities best be combined to gain the advantage of synergy? Little is known about these issues in the available literature. In this paper we address these questions by analyzing integration strategies for the gaze and speech modalities, together with an evaluation experiment verifying the analyses. The approach involves studying cases of mutual correction and investigating when mutual correction occurs. The goal of this study is to gain insight into integration strategies and to develop an optimal system that makes error-prone recognition technologies perform at a more stable and robust level within a multimodal architecture.
Article URL: http://doi.acm.org/10.1145/968363.968383
BibTeX format:
@inproceedings{10.1145-968363.968383,
  author = {Qiaohui Zhang and Atsumi Imamiya and Kentaro Go and Xiaoyang Mao},
  title = {Resolving ambiguities of a gaze and speech interface},
  booktitle = {Proceedings of the 2004 symposium on Eye tracking research \& applications},
  pages = {85--92},
  year = {2004},
}


