Eye Gaze in Salience Modeling for Spoken Language Processing
Supported by the National Science Foundation (11/1/2005 - 10/31/2009)
Previous psycholinguistic work has shown that eye gaze is tightly
linked to human language processing. Almost immediately after hearing
a word, the eyes move to the corresponding real-world referent. And
right before speaking a word, the eyes also move to the mentioned
object. Eye gaze is not only highly reliable but also an implicit,
subconscious reflex that accompanies speech: the user does not make a
conscious decision; the eyes automatically move toward the relevant
object without the user even being aware of it. Motivated by these
psycholinguistic findings, our hypothesis is that during human-machine
conversation, user eye gaze information coupled with conversation
context can signal a part of the physical world (related to the domain
and the graphic interface) that is most salient at each point of
communication. This salience in the physical world will in turn prime
what users communicate to the system, and thus can be used to tailor
the interpretation of speech input. Based on this hypothesis, this
project examines the role of eye gaze in human language production
during human-machine conversation and develops algorithms and systems
that incorporate gaze-based salience modeling into robust spoken
language understanding. Supported by NSF (Co-PI: Fernanda Ferreira,
University of Edinburgh).
(Picture: Smoothed eye gaze fixations on the graphic display, recorded while a user talks to the system)
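As a concrete illustration of this idea, the sketch below shows one way gaze fixations could be turned into per-object salience scores and combined with a speech-based score when resolving a referent. It is a minimal sketch under assumed names and parameters (Fixation, salience_scores, resolve_referent, the 1.5-second window, the decay rate), not the model developed in this project.

```python
# Minimal sketch (illustrative only): objects fixated longer and closer in
# time to a spoken word receive higher gaze salience, which can then bias
# reference resolution alongside a speech-derived score.

from dataclasses import dataclass
from collections import defaultdict
import math

@dataclass
class Fixation:
    object_id: str   # id of the on-screen object under the gaze point
    start: float     # fixation onset (seconds)
    duration: float  # fixation length (seconds)

def salience_scores(fixations, word_onset, window=1.5, decay=0.7):
    """Score each object by gaze in the `window` seconds before the word.

    Longer fixations count more; fixations whose end lies further from the
    word onset are discounted exponentially. Returns normalized scores.
    """
    scores = defaultdict(float)
    for f in fixations:
        gap = word_onset - (f.start + f.duration)  # fixation end -> word onset
        if 0.0 <= gap <= window:
            scores[f.object_id] += f.duration * math.exp(-decay * gap)
    total = sum(scores.values())
    return {obj: s / total for obj, s in scores.items()} if total else {}

def resolve_referent(candidates, gaze_salience, speech_scores, alpha=0.5):
    """Pick the candidate with the best mix of speech score and gaze salience."""
    return max(candidates,
               key=lambda c: alpha * speech_scores.get(c, 0.0)
                             + (1 - alpha) * gaze_salience.get(c, 0.0))

# Example: the user says "this one" at t = 5.2 s after fixating two objects.
fixations = [Fixation("lamp_3", 3.8, 0.6), Fixation("vase_1", 4.6, 0.4)]
salience = salience_scores(fixations, word_onset=5.2)
print(resolve_referent(["lamp_3", "vase_1"], salience,
                       {"lamp_3": 0.4, "vase_1": 0.5}))
```

In this simplified formulation, a fixation contributes more salience the longer it lasts and the closer its end is to the onset of the spoken word, mirroring the gaze-speech timing described above.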
Selected Papers:
- Eye Gaze with Speech Recognition Hypotheses to Resolve Exophoric References in Situated Dialogue. Z. Prasov and J. Y. Chai. Conference on Empirical Methods in Natural Language Processing (EMNLP). MIT, MA. October 2010.
- Context-based Word Acquisition for Situated Dialogue in a Virtual World. S. Qu and J. Y. Chai. Journal of Artificial Intelligence Research, Volume 37, pp.347-377, March 2010.
- Between Linguistic Attention and Gaze Fixations in Multimodal Conversational Interfaces. R. Fang, J. Y. Chai, and F. Ferreira. The 11th International Conference on Multimodal Interfaces (ICMI). Cambridge, MA, USA, November 2-6, 2009.
- The Role of Interactivity in Human Machine Conversation for Automated Word Acquisition. S. Qu and J. Y. Chai. The 10th Annual SIGDIAL Meeting on Discourse and Dialogue, London, UK, September, 2009.
- Incorporating Temporal and Semantic Information with Eye Gaze for Automatic Word Acquisition in Multimodal Conversational Systems. S. Qu and J. Y. Chai. Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing (EMNLP). Honolulu, October 2008.
- What's in a Gaze? The Role of Eye-Gaze in Reference Resolution in Multimodal Conversational Interfaces. Z. Prasov and J. Y. Chai. ACM 12th International Conference on Intelligent User Interfaces (IUI). Canary Islands, January 13-17, 2008.
- Automated Vocabulary Acquisition and Interpretation in Multimodal Conversational Systems. Y. Liu, J. Y. Chai, and R. Jin. The 45th Annual Meeting of the Association for Computational Linguistics (ACL). Prague, Czech Republic, June 23-30, 2007.
- An Exploration of Eye Gaze in Spoken Language Processing for Multimodal Conversational Interfaces. S. Qu and J. Y. Chai. 2007 Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-07). Rochester, NY, April 2007.
- Eye Gaze for Attention Prediction in Multimodal Human Machine Conversation. Z. Prasov, J. Y. Chai, and H. Jeong. The AAAI 2007 Spring Symposium on Interaction Challenges for Artificial Assistants. Palo Alto, CA, March 2007.