Collaborative Work Systems (CWS), Inc - Multimodal Communications

Human Factors research has demonstrated the potential benefits of supporting multiple modalities and provides results that can help guide technology developers. For instance:

Images and natural language in communications

The dual coding theory developed by Paivio (1986) proposes that there are fundamentally different types of information stored in working memory; he calls them imagens and logogens. Imagens denote the mental representations of visual information in analogue codes, while logogens denote the mental representations of language information in symbolic codes. Research with PET scans and fMRI supports this theory by showing that different parts of the brain are activated by the respective forms of stimuli. This much-referenced theory is important because it suggests that information should be presented with consideration for how it will be processed, stored, and retrieved. If, for instance, information is presented as text but will be converted and stored as a visual representation, it would have been more efficient to present it visually.

The architecture of Paivio's Dual-Coding Theory (Ware, 2000)

In his chapter Visual and Spoken Language, Ware (2000) summarizes the research on which kinds of information are processed and retained more efficiently when represented with images and which when represented with words. He states that images are better for spatial structures, location, and detail, whereas words are better for representing logical conditions and abstract verbal concepts.

The value of coordinated visual and non-visual stimuli

Paivio's theory describes links between the nonverbal and verbal systems, suggesting that they complement each other. Indeed, research shows that verbal information presented with relevant images can be remembered better than verbal information alone (Anderson and Bower, 1973). Synchronized voice and animation over images also focus attention and improve retention of information in multimedia presentation systems better than voice or animation alone (Faraday and Sutcliffe, 1997).

Chapman (2002) studied the effect of supporting messaging centered on deictic gesturing (pointing while saying "this" object, the area "here", etc.) over a collection of related images summarizing performance. In a simulation, airline dispatchers were asked to create messages for FAA traffic managers, and after these messages were transmitted, the traffic managers were asked to respond. Chapman found that synchronized voice and pointing over images was 55% more efficient in message creation time than text annotation over the same images. He also found that dispatchers described the situation more thoroughly with synchronized voice and pointing. Comparing the same communication modes in a follow-up study, Bower (2004) found that ROTC cadets were able to recall more mission-crucial information from company Operations Orders in the voice and pointing mode.

The effect of communication mode on collaborative problem solving

Research shows that the mode of communication used by those responding asynchronously to a communication can affect the efficiency and content of that feedback. For example, voice-based annotations exchanged between writers working collaboratively and asynchronously support more suggestions over the same period of time (Neuwirth et al., 1994). In another study of collaborating authors, written annotations led to more comments on local problems in the text, while speech led the authors to comment on higher-level concerns (Chalfonte et al., 1991). Voice synchronized with pointing over images in asynchronous annotation systems has been shown to be more efficient in collaborative scheduling tasks than voice-only or text-only communication (Daly-Jones et al., 1997). In the study by Chapman (2002), mentioned in the last section, the synchronized voice and pointing mode was 50% more efficient than the text mode in the combined time for dispatchers' message creation and traffic managers' message response, and traffic managers described more constraints on potential solutions in the voice and pointing mode. Participants also rated the system based on the voice and pointing mode as more useful and usable than the text mode system.

Tactile displays to alleviate auditory and visual channel bottlenecks

Tactile displays can enable their users to receive and interpret valuable information without compromising the simultaneous utilization of other modalities (Merlo, Stafford, Gilson, & Hancock, 2006). It has been shown that localized vibration on a belt around the torso can effectively provide direction information to the wearer (Elliott, van Erp, Redden, & Duistermaat, 2010), and support their completion of a navigation task better than other communication modalities.

Arm and hand signals, such as those described in US Army Field Manual 21-60, are frequently used for soldier-to-soldier communications, but sometimes visual communications are difficult. For instance, if a team leader of a squad on patrol visually signals a "Halt" command, the soldiers in front of the team leader in particular may not see it. In a field experiment, soldiers performing an obstacle course were able to receive, interpret, and accurately respond to tactile commands for "Halt", "Rally", "Move-Out", or "NBC" faster than when the information was passed by a leader in the front or back of a wedge formation using conventional arm and hand signals (Merlo, Stafford, Gilson, & Hancock, 2006). Soldiers also commented that they were better able to focus on negotiating obstacles and the local area when receiving tactile signals than when maintaining visual contact with their leaders.

Studies have also shown the value of using tactile displays to cue users to information presented in another form indoors (Ferris & Sarter, 2008) and in the air (Rupert, Braithwaite, McGrath, Estrada, & Raj, 2004), and that value may well translate to outdoor activities.

In another study, participants demonstrated that they could accurately recognize and distinguish between three types of message (navigation, person-to-person commands, and entity descriptions), each carrying between one and four pieces of information, and respond with appropriate physical behaviors within a 10-foot hollow sphere capable of rotating upon wheels placed beneath it (Chapman, Nemec, and Ness, 2013).

A lab experiment conducted by CWS at USMA (Chapman, Nemec, and Ness, 2013)

References

Anderson, J. R. & Bower, G. H. (1973). Human associative memory. Washington, DC: Winston.

Bower, J. I. (2004). The Impact of Asynchronous Multimedia Communications on Understanding and Recall. Proceedings of the Human Factors and Ergonomics Society 48th Annual Meeting. New Orleans, Louisiana.

Chalfonte, B. L., Fish, R. S. & Kraut, R. E. (1991). Expressive richness: A comparison of speech and text as media for revision. In Robertson, S. P., Olson, G. M., & Olson, J. S. (Eds.). Reaching through technology: Proceedings of the Conference on Human Factors in Computing Systems. New Orleans, Louisiana, 21-26.

Chapman, R. J. (2002). Multimodal, Asynchronously Shared Slide Shows as a Design Strategy for Engineering Distributed Work in the National Airspace System. Doctoral Dissertation, The Ohio State University, Columbus, OH.

Chapman, R. J., Nemec, L., & Ness, J. (2013). The Evaluation of a Tactile Display for Dismounted Soldiers in a Virtusphere Environment. Proceedings of the Human Factors and Ergonomics Society Annual Meeting. San Diego, CA.

Daly-Jones, O., Monk, A., Frohlich, D. M., Geelhoed, E., & Loughran, S. (1997). Multimodal messages: The pen and voice opportunity. Interacting with Computers 9, 1-25.

Elliott, L. R., van Erp, J. B. F., Redden, E. S., & Duistermaat (2010). Field-Based Validation of a Tactile Navigation Device. IEEE Transactions on Haptics. Vol. 3, No. 2. April-June 2010.

Faraday, P. M. & Sutcliffe, A. G. (1997). Designing effective multimedia presentations. Proceedings of CHI '97, 272-279.

Ferris, T. K., & Sarter, N. (2008). Cross-modal links among vision, audition, and touch in complex environments. Human Factors. Vol 50(1) 17-26.

Merlo, J. L., Stafford, S. C., Gilson, R. D., & Hancock, P. A. (2006). The effects of physiological stress on tactile communications. Paper presented at the Human Factors and Ergonomics Society 50th Annual Meeting, San Francisco, CA.

Neuwirth, C. M., Chandhok, R., Charney, D., Wojahn, P., & Kim, L. (1994). Distributed collaborative writing: A comparison of spoken and written modalities for reviewing and revising documents. Human Factors in Computing Systems. April 24-28, 51-57.

Paivio, A. (1986). Mental representations: A dual coding approach. Oxford, England: Oxford University Press.

Rupert, A., Braithwaite, M., McGrath, B., Estrada, A., & Raj, A. (2004). Tactile Situation Awareness System Flight Demonstration. DTIC Report A891224, Army Aeromedical Research Lab.

Wickens, C. D. (1992). Engineering Psychology and Human Performance, 2nd ed. Harper Collins, New York.
