Using Speech Related Gestures to Aid
Referential Communication in Face-to-face and Computer-Supported Collaborative
Work.
Alison Newlands, Anthony Anderson, Avril Thomson, Bill
Ion and Neil Dickson.
University of Strathclyde.
Department of Psychology;
Department of Design, Manufacture and Engineering Management.
Advances in computer and telecommunications technology have permitted the development
of computer-based tools that enable workers in geographically distributed locations
to communicate and work together. These tools can take a variety of forms, but
one generic pattern is a shared workspace in the form of a shared virtual whiteboard
accompanied by communication media permitting textual, graphical, audio and
visual communication links. These systems are frequently referred to as computer-supported
collaborative work (CSCW) tools. The system used in the present study is typical
of these in that it permitted real-time shared (i.e. alternating) use
of the drawing tools, plus a view of one's interlocutor in a small window
within the screen. Given such computer mediation of communication, questions
naturally arise as to its efficacy: can designers successfully collaborate using
such mediation? Does such mediation constrain designers' dialogues in any
way, rendering them less effective?
Some previous research has suggested that computer mediation of interaction
might indeed constrain dialogues. For example, O'Conaill and Whittaker
(1997) found that video-mediation results in the increased use of formal methods
of turn-taking, such as using first names to indicate next speaker. The overall
effect is to increase the degree of formality of the participants' interactions.
This could adversely affect the processes of grounding (i.e. achieving mutual
understanding and establishing what is commonly known between participants),
since the latter has been shown (see for example, Clark, 1996; Clark and Wilkes-Gibbs,
1986; Isaacs and Clark, 1987) to be a collaborative and highly interactive process.
Increased formalisation of conversational contributions would potentially interfere
with the natural interactivity of this process.
We therefore examined the use of these CSCW tools by pairs of participants who
were either sitting side by side (face-to-face), or working remotely over a
network using a PictureTel 550 video conferencing system (see Figure
1 for an illustration of the CSCW set-up). Half of the design student participants
were relative novices who had had only 2 months' previous experience of
AutoCAD, whilst the other half were more experienced (with two years' training
in various CAD tools). Both groups were further subdivided into pairs who worked
in the video-mediated condition and pairs who worked side-by-side in a face-to-face
or copresent condition. We examined both task progress and the participants'
dialogues to ascertain whether there were any effects of video-mediation or
expertise.
Method: Participants worked together to transform a 2D diagram of a trolley
wheel bracket into a 3D diagram, using a standard, computer-aided design (CAD)
tool. Two groups of University undergraduates took part in the study: novices
(2nd year engineering students, after 2 months of training with AutoCAD) and
experts (4th year engineering students with 2 years of training
in a range of CAD tools). All participants were naïve users of the video
conferencing system. A small group of industrial expert users of CAD also completed
the task, using a think-aloud protocol, and their views on the usefulness
of this way of collaborating were sought via semi-structured interviews. In
the CSCW condition participants could converse as if face-to-face: they could
see each other via a small video window, and were provided with a duplex audio
channel. The AutoCAD diagram was displayed in a second window, which could be manipulated
or changed by either participant, but only when they had control of the mouse.
Results: Analysis of the communication and joint problem-solving activities
has been undertaken, including analysis of task performance, turn-taking management,
content analysis and the strategies employed during referential communication.
Overall the results indicate that the different communicative contexts (face-to-face
versus CSCW) were associated with similar levels of task outcome. There were
some differences in the process of communication as a function of expertise,
and we are currently exploring these further. The apparent lack of an effect
of communication medium on task outcomes is perhaps surprising in view of previous
literature, which would indicate that turn-taking procedures could be disrupted
in the CSCW context (e.g. Newlands et al., 1996; O'Conaill and Whittaker,
1997). A more in-depth analysis of the pragmatic functions of the utterances in
the dialogues is being undertaken to examine the process of communication in
greater detail. This analysis has highlighted an important point: participants
made extensive use of gesture during the task, especially during acts of reference,
and these gestures needed to be incorporated into the transcriptions before the
dialogues became fully comprehensible.
Acts of Reference: Acts of reference play an important role in the design task,
as participants need to be able to discuss and refer to different parts of a
complex diagram in order to complete the task. One effective strategy is to
use gesture as a way of pointing to referents, or to illustrate what is being
said verbally. In face-to-face communication hand gestures are frequently used
in this deictic manner, to point to objects or places. In video-mediated and
CSCW contexts participants can rarely see each other's hand movements, but they
can use the on-screen cursor in a deictic manner, to point to parts of a diagram
or draw their addressee's attention to part of the visual display. These
types of mouse gestures appeared to occur frequently during the
design task, but more so in some conditions than others. To determine which
factors influenced this gestural behaviour, the transcriptions of the dialogues
were annotated to show where hand and computer mouse gestures were used. The
majority of gestures were deictic; however, a couple of iconic gestures were
observed in the face-to-face and CSCW conditions. The data for the deictic gestures
are given in Table 1, which shows the group mean frequency of each type of gesture
(standard deviations are given in brackets).
Table 1. Mean number of Hand and Mouse Gestures in CSCW and Face-to-face interactions.
| | Hand Gestures | | Mouse Gestures | |
| | CSCW | Face-to-Face | CSCW | Face-to-Face |
| Novices | 10.57 (5.94) | 49.29 (13.50) | 31.57 (11.99) | 23.29 (12.92) |
| Experts | 9.43 (6.29) | 52.86 (18.20) | 10.71 (3.73) | 19.00 (9.68) |
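As a quick arithmetic check on Table 1, the ratio of face-to-face to CSCW gesture frequency can be computed directly from the group means. The sketch below is illustrative only: the function names are hypothetical, and the pooled figure assumes that the novice and expert groups were the same size, which the table does not state.

```python
# Group mean gesture frequencies copied from Table 1.
HAND = {
    "Novices": {"CSCW": 10.57, "Face-to-face": 49.29},
    "Experts": {"CSCW": 9.43, "Face-to-face": 52.86},
}
MOUSE = {
    "Novices": {"CSCW": 31.57, "Face-to-face": 23.29},
    "Experts": {"CSCW": 10.71, "Face-to-face": 19.00},
}


def ftf_to_cscw_ratio(means):
    """Ratio of the face-to-face mean to the CSCW mean for each group."""
    return {group: m["Face-to-face"] / m["CSCW"] for group, m in means.items()}


def pooled_ratio(means):
    """Ratio of summed group means. ASSUMPTION: the groups are equal in size."""
    ftf = sum(m["Face-to-face"] for m in means.values())
    cscw = sum(m["CSCW"] for m in means.values())
    return ftf / cscw


hand_ratios = ftf_to_cscw_ratio(HAND)
hand_pooled = pooled_ratio(HAND)
mouse_ratios = ftf_to_cscw_ratio(MOUSE)

for group, r in hand_ratios.items():
    print(f"{group}: {r:.2f} times as many hand gestures face-to-face")
print(f"Pooled hand-gesture ratio: {hand_pooled:.2f}")
for group, r in mouse_ratios.items():
    print(f"{group}: mouse gesture ratio (face-to-face / CSCW) = {r:.2f}")
```

Under these assumptions the hand-gesture ratio is roughly 4.7 for novices and 5.6 for experts (about 5.1 pooled), in line with a roughly fivefold difference. For mouse gestures the novice ratio falls below 1 (more gestures in CSCW), whereas the expert ratio is above 1.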
All participants used deictic hand gestures for the purpose of referential communication,
but as expected these occurred more frequently in the face-to-face condition: on
average, about five times as many hand gestures were used face-to-face as in
the CSCW context. Additionally, the frequency of use of mouse gestures varied
between the novice and experienced participants. Novice users made greater use
of mouse gestures than more experienced users regardless of communicative context,
but this behaviour occurred even more frequently for novice users in the CSCW
context. Examination of the annotated transcripts indicates that novice users
of CAD may benefit from working in a CSCW environment. Some example
segments of dialogue are considered below, to illustrate the variety of uses
to which gestures are put. (Pauses are indicated with three dots, duration of
gestures by vertical lines (to mark start and end) and underlined text).
The first example, from two second-year students, illustrates the
use of mouse gestures in which particular referents on the shared diagram are
pointed to using a circular motion of the cursor. In this particular
case, mouse gestures are used no fewer than seven times within one utterance
to deictically indicate particular referents within the diagram.
Person A: See that hole there, see that?
Person B: Uhm yeah right
Person A: |that's that bit|
|that's a top view|
right so that |hole there
is that| and
|that
hole there is that|
so we've got a thing, |that's this|
you |take that|
and that bit |there is the third view,
taken from the side|.
The second utterance by speaker A is so heavily indexical that
it makes little sense when considered in isolation from the gestures and the
referents. Such speech does, however, have the virtue that whilst it relies
heavily on the extra-linguistic context for interpretation, it minimises the
need for speakers to produce and listeners to interpret technical terminology
within the design field; given that these speakers are relative beginners, this
facet of mouse gestures is one sensible way of reducing collaborative effort.
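The vertical-line annotation convention lends itself to a simple automated count of gestures per utterance. The following is a minimal sketch, assuming that every gesture is marked by exactly one opening and one closing bar and that bars are never nested; the function name `gesture_spans` is hypothetical and is not part of the study's own analysis procedure.

```python
import re


def gesture_spans(utterance):
    """Return the speech enclosed by each |...| gesture annotation.

    ASSUMPTION: bars always come in matched, non-nested pairs.
    """
    # Join a multi-line transcript utterance into a single line first.
    flat = " ".join(utterance.split())
    return [span.strip() for span in re.findall(r"\|([^|]*)\|", flat)]


# Person A's second utterance from the first example.
utterance = """
|that's that bit|
|that's a top view|
right so that |hole there
is that| and
|that
hole there is that|
so we've got a thing, |that's this|
you |take that|
and that bit |there is the third view,
taken from the side|.
"""

spans = gesture_spans(utterance)
print(len(spans))
for span in spans:
    print("-", span)
```

Applied to this utterance, the sketch finds seven annotated spans, matching the count given in the text.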
In the second example, one of the more expert students uses a mouse gesture
to get the attention of his interlocutor:
Person A: what's happened to that part at the bottom, with
the blue bits?
Person B: I've highlighted it, you can do that if you choose different
views, |see
these buttons at the top, they do that|
This gesture obviously serves an indexical function, but it also
serves to draw the partner's attention to some of the icons at the top
of the screen which allow the user to choose different views of the drawing.
The third example involves two novices using mouse gestures first to draw attention
to the cursor position and subsequently to refer indexically to a part on the
jointly-visible drawing:
Person A: |You see where my cursor is?|
Person B: Yes
Person A: right, we'll have to extrude |that part first|
Person B: Mhmm
Person A: and then add |on this part on top|
Person B: Uh huh, yeah.
Again this dialogue relies heavily on the extra-linguistic context
for interpretability and makes little sense in isolation.
Summary: Our findings suggest that the users, particularly the
less experienced second-year students, adapt to the situation in a sensible manner
and use mouse gestures strategically to assist the grounding of acts
of reference, rather than having to rely on their inexpert knowledge of the
technical jargon involved in the task.
References:
Clark, H. H. (1996). Using language. Cambridge: Cambridge University Press.
Clark, H.H., and Wilkes-Gibbs, D. (1986). Referring as a collaborative process.
Cognition, 22: 1-39.
Isaacs, E.A., and Clark, H.H. (1987). References in conversation between experts
and novices. Journal of Experimental Psychology: General, 116: 26-37.
Newlands, A., Anderson, A.H., and Mullin, J. (1996). Dialog structure and cooperative
task performance in two CSCW environments. In J.H. Connolly and L. Pemberton
(Eds.) Linguistic Concepts and Methods in CSCW (pp. 41-60). London:
Springer-Verlag.
O'Conaill, B., and Whittaker, S. (1997). Characterizing, predicting, and
measuring video-mediated communication: a conversational approach. In K.E. Finn,
A.J. Sellen and S.B. Wilbur (Eds.) Video-Mediated Communication (pp. 107-131).
NJ: Lawrence Erlbaum Associates.