ACM Multimedia 97 - Electronic Proceedings

November 8-14, 1997

Crowne Plaza Hotel, Seattle, USA


An Open Architecture for Comic Actor Animation

Knut Manske
Telecooperation Department
Altenberger Str. 69
University of Linz, 4040 Linz, Austria
+43 732 2468 9262
knut@tk.uni-linz.ac.at
http://www.tk.uni-linz.ac.at/~knut

Max Mühlhäuser
Telecooperation Department
Altenberger Str. 69
University of Linz, 4040 Linz, Austria
+43 732 2468 9239
max@tk.uni-linz.ac.at
http://www.tk.uni-linz.ac.at/~max


ACM Copyright Notice


Abstract

The multimedia boom has raised hopes for advances in user-centric computing. At the same time, however, there is a move towards autonomous software (cf. intelligent filters, mobile and distributed agents, etc.), leaving users with an uncomfortable lack of knowledge about and control over 'what these components are doing behind their backs'. Visualization of both autonomous agent action and user-agent interaction becomes a crucial issue if these conflicting trends are to be harmonized. We present a system service for comic actor animation, which can be used as a representation of agents of all kinds. A second use case is the rapid authoring of animations which augment multimedia presentations or off-the-shelf software. Our focus is on the reuse of the necessary artwork, using a modular and flexible building-block approach.

As a preliminary step, this approach requires a set of elementary animation sequences to be created by a professional graphic artist, once per character. These sequences can be repeatedly combined in custom animated cartoons by easy-to-use commands at runtime. Our Comic Actor Editor Engine CAeditEngine uses a sophisticated approach for combining the elementary building blocks to form complete animations.

Our Comic Actor Playing Engine CAplayEngine uses a digital chroma keying technique in combination with layering to display the animations on top of any graphical user interface and any interactive software.

The system runs under MS Windows NT; a first version was used in a public interactive exhibit of multimedia and animation techniques and showed excellent performance.


Keywords

multimedia authoring, animation, graphical user interfaces, human-computer interaction, system service, intelligent agents.


Table of Contents

1 Introduction
1.1 Structure of this Paper
2 Related Work
3 Basic Concept
4 CAeditor and CAeditEngine
4.1 System Overview
4.2 Transparent Sequences
4.3 State Transition Graph Editor (CAgraphEditor)
4.4 Sequence Editor (CAeditor)
5 System Service with Open Architecture (CAservice)
5.1 System Service Architecture
5.2 Standard Vocabulary
5.3 System Service Command Interface
6 Playing Engine (CAplayEngine)
6.1 Onscreen Chroma Keying
6.2 Synchronization and Performance
6.3 Audio Sequences
7 Applications of Comic Actors
8 Summary and Future Work
End Notes
Bibliography


1 Introduction

Using multimedia components in graphical user interfaces requires many experts to cooperate: not only application programmers are needed, but also graphic artists (for the creation of images and animations) and subject matter experts (who contribute their knowledge about end users' needs).

We put the focus of our work on the use of animations as an addition to state-of-the-art graphical user interfaces (typically, an addition to window-based UIs). Animations can be used to greatly improve the user's understanding of application and system operations, adding substantial benefit to graphical user interfaces. For instance, system operations like file copying or remote access to databases are often visualized already today via simple animations. Animated characters represent a logical next step since they can, e.g., communicate with the user in order to exploit service improvements, give advice, etc.

We present a system service, which provides such animated characters that are displayed directly on top of the graphical user interface. This system service can be used in any application that runs in the system. We call our animated characters "comic actors". They are not only able to "walk over" any application windows but also to trigger system actions by sending events. E.g., comic actors can push buttons or drag windows over the screen.

[IMG]
Figure 1: Two-phase concept. Use cases: agent visualization (left branch) and rapid animation authoring (right)

In an attempt to foster wide-spread use of sophisticated visualizations as described, there are two key issues: the effort required to create animations and the flexibility with which they can reflect system state and action. To this end, we developed a two-phase concept and implemented it with a focus on animated cartoon characters which overlay other presentations (software user interfaces or documents):

  1. A graphic artist designs the character and the basic building blocks of the corresponding animations.

  2. The basic building blocks of the animations are forwarded to our state transition graph editor CAgraphEditor which builds a state graph from the given data and creates a corresponding graph file. This graph file is an abstract representation of the comic actor, emphasizing the possible matches between elementary building blocks of animations. It can be used repeatedly in two use cases: agent visualization (more generally speaking, system-triggered visualization of software) and rapid animation authoring (human-authored animated cartoons, linked to multimedia presentations and off-the-shelf software of all kinds). For the second use case, the author has to perform, in essence, just a number of simple point-and-click actions. Note that for didactic reasons, we will describe the second use case before the emphasized first one below (Sections CAeditor and CAeditEngine and System Service).

  3. The Comic Actor Playing Engine CAplayEngine maps a complete animation sequence onto the screen at runtime. It synchronizes the animation with sound and provides a comfortable layering mechanism. The CAplayEngine is based on a digital "chroma keying" approach and is designed to interoperate with arbitrary software tools and documents. The comic actors can be used to represent autonomous ("intelligent", "mobile", "adaptive") agents, i.e. programs which actively support the users in using the system or dedicated software. Note that the comic actors do not themselves provide the "intelligence" - this is left to the system or application program, context-sensitive help system, etc. Instead, comic actors are intended as mediators between agents and users.



1.1 Structure of this Paper

In the following section we briefly discuss the state of the art. The basic concept of building blocks and matches is explained in Section Basic Concept. Section CAeditor and CAeditEngine presents the CAeditEngine, which provides the basic functionality for the system's software components. The system service architecture is described in Section System Service, followed by an explanation of the transparent mapping of animation sequences onto the graphical user interface in Section Playing Engine. A summary concludes the article.



2 Related Work

Magnenat Thalmann and Thalmann have made considerable contributions to animation techniques for 'digital actors'. In [Thal95] they describe the construction and integration of such digital actors, using facial animation in their approach. The simulation of clothes, with the aim of supporting virtual actors who can dress and undress, is presented by Volino et al. [VMJT96]. These approaches focus on specialized closed-shop presentations and usually require powerful graphics tools.

Perlin and Goldberg [Perl95, PeGo96] put their focus on interactive worlds, inhabited by lifelike, responsively animated characters. They use two engines in their approach, one for the animation and the other one for the behavior of the characters. Both engines are controlled through scripting mechanisms. Blumberg and Galyean [BlGa97] state that a director has to be able to control actors in virtual worlds at a number of different levels. They suggest four levels: the motor skill level, the behavioral level, the motivational level, and the environmental level. A behavior system and a motor controller are described there.

Both approaches just mentioned aim at providing characters with a characteristic behavior or personality and natural movements. Emphasizing very detailed models of movement and action and devoting considerable effort to the artwork, these approaches reach an impressive level of sophistication in appearance and performance. Many of these stunning results are, however, bound to the full control given over a 'closed-shop' application through rather heavy-weight interfaces which require programming (of scripts, rules, etc.) and, in particular, to the non-deterministic behavior one expects from human-like agents. We did not intend to compete with these efforts, but tried to find concepts for deterministic characters that could reliably visualize software action and interaction rather than mimic human-like action. We also emphasized easy-to-use interfaces for agents and authors, supporting rapid authoring and even on-the-fly animation construction based on API calls by agents. Given this difference in focus, we restricted ourselves to 2D characters for the time being. Our experience is that they find much better acceptance than unsophisticated 3D characters, which tend to raise expectations they do not fulfil. Despite the different foci, it appears tempting for future work to integrate the flexibility, ease of integration, and deterministic behavior of our approach with the sophistication in artwork and movement/action modeling which the approaches mentioned here provide.

To our knowledge, the use of video widgets in the form of video actors was first discussed by Gibbs and Breiteneder [BrGi94, GBMP93]. They used a hybrid analog/digital system and a layering technique which would be very expensive to implement in digital-only form. Their straightforward approach was provided as a C++ class "VideoActor". We preferred an affordable digital-only technique, since we target MS Windows PCs as a standard deployment platform, and concentrated instead on a more sophisticated, automated building-block approach.

As to help systems, the Apple Guide [Appl95] is a system for onscreen instructions. In addition to textual help, four styles of "coachmarks" can be used to mark portions of the graphical user interface: circle, underline, X, and arrow. All types of marks are static and cannot be animated at all. Yet the representation of help agents as cartoon characters is becoming more and more popular (cf. MS Office 97 and Lotus Notes). Such characters, however, are custom-designed today at great effort and nevertheless usually "sit in a corner of the screen" instead of, e.g., explaining user interface objects by walking up to them and talking while pointing at them.



3 Basic Concept

The section below briefly revisits some core concepts of our system which have been presented elsewhere with a different focus [MaMu97]. The brief description given here is essential for understanding the remainder of this article.

We will now emphasize phase one within the two-phase process depicted in Figure 1. One of the first tasks of this phase (although not the first one, see below) is the design of basic pieces of animation by a designer (artist). These pieces are used as building blocks and composed into larger so-called (animation) sequences.

[IMG]
Figure 2: Building block "walk" (six of approx. 30 frames) [1].

In order to join different building blocks, or several copies of the same one, their ends have to match: the end of one sequence has to fit the beginning of the following one in order to provide a smooth transition (in several respects, see below). Figure 3 shows building blocks and matches as they are depicted in this paper for didactic reasons. Note that building blocks can usually be multiplied and joined, transformed (e.g., mirrored), and parametrized (e.g., regarding the gradient of the walking path) when sequences are constructed.

[IMG]
Figure 3: Building blocks with matches.

In order to illustrate part of the process, let us consider an agent or an application programmer (cf. our two use cases) who requests the generation of, e.g., a walking sequence and provides the corresponding starting and end points on a window. In this example, the system takes the following parameters into account: i) the absolute distance and the horizontal distance (and thus the angle) between starting point and end point; ii) the minimal and maximal distance and height by which the actor can advance within one building block (possibly considering several alternative building blocks associated with walking); iii) the direction of motion. The building blocks to be used and the number of necessary repetitions can then be calculated from these values, as sketched below.

[IMG]
Figure 4: Example animation from building blocks.
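
As an illustration of this calculation, the following C++ sketch selects a walking building block by its valid angular range and derives the number of repetitions; the type and function names are hypothetical and not part of the actual implementation.

// Sketch (not the original implementation): choosing a "walk" building block
// and its number of repetitions from given start and end points.
#include <cmath>
#include <cstdio>

struct Point { double x, y; };

struct WalkBlock {
    const char* name;
    double advance;            // distance covered by one repetition
    double minAngle, maxAngle; // valid angular range in degrees
};

// Pick the first block whose angular range covers the requested direction
// and compute how often it has to be repeated (rounded up).
bool planWalk(const Point& from, const Point& to,
              const WalkBlock* blocks, int nBlocks,
              const WalkBlock*& chosen, int& repetitions, bool& mirrored)
{
    double dx = to.x - from.x;
    double dy = to.y - from.y;
    double dist = std::sqrt(dx * dx + dy * dy);
    double angle = std::atan2(dy, std::fabs(dx)) * 180.0 / 3.14159265358979;
    mirrored = dx < 0.0;   // e.g. "walk right" can be mirrored into "walk left"

    for (int i = 0; i < nBlocks; ++i) {
        if (angle >= blocks[i].minAngle && angle <= blocks[i].maxAngle) {
            chosen = &blocks[i];
            repetitions = (int)std::ceil(dist / blocks[i].advance);
            return true;
        }
    }
    return false;   // no building block fits this direction
}

int main() {
    WalkBlock blocks[] = { { "walk", 40.0, -60.0, 60.0 } };
    const WalkBlock* chosen = 0; int reps = 0; bool mirrored = false;
    Point a = { 100, 300 }, b = { 420, 280 };
    if (planWalk(a, b, blocks, 1, chosen, reps, mirrored))
        std::printf("%s x%d%s\n", chosen->name, reps, mirrored ? " (mirrored)" : "");
    return 0;
}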

Given this example, we can now describe in more detail the steps that must be performed for each animated character to be enabled for and controlled by the system.

With a (possibly very large) application domain in mind, the design team has to determine the actions which the character should be able to perform. This also applies to 'partial actions' such as facial movements, which may correspond to expressions and feelings assigned to the character. Such partial actions lead to building blocks which have to be matched in the spatial domain (e.g., feet, body, head, etc.), an aspect that we will not elaborate further at this point. For each of the planned actions, one or more building blocks must be designed and registered with the system. By associating graphical operations such as mirroring with a building block, reverse, opposite, or otherwise complementary building blocks can be generated automatically.

The simple comic actor used here as an example was designed to walk, to stand, and to point somewhere. It can talk to the user while standing or pointing, using either balloon text or recorded voice output.

Once a set of building blocks has been entered into the system for a new character, their computer-aided composition has to be prepared. To this end, the building blocks and matches are mapped onto transitions and states of a state transition graph that describes how they fit together.

[IMG]
Figure 5: State transition graph, example.

Transitions, i.e. edges in the graph, represent the building blocks. Each state represents a match-type, with outbound transitions representing building blocks that start with the state-related match- type and inbound ones representing building blocks that end with the corresponding match-type. At this point, the system provides a first important validation by checking if a strongly connected directed graph [Manb89] is given. If this is the case, any feasible animation can be represented as a path in this graph (cf. numbers in Figure 4 and Figure 5 to see how the seven matching building blocks from Figure 4 form a path through the graph).
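
The following C++ sketch illustrates this validation step; it checks strong connectivity with two breadth-first searches, one on the graph and one on the graph with all transitions reversed. The data types are illustrative and not those of the CAgraphEditor.

// Sketch of the "syntax check": the state transition graph is usable only if
// every match-type state can be reached from every other one.
#include <vector>
#include <queue>

typedef std::vector<std::vector<int> > Graph;   // adjacency lists, states 0..n-1

static int reachableFrom(const Graph& g, int start) {
    std::vector<bool> seen(g.size(), false);
    std::queue<int> q;
    q.push(start); seen[start] = true;
    int count = 0;
    while (!q.empty()) {
        int s = q.front(); q.pop(); ++count;
        for (size_t i = 0; i < g[s].size(); ++i)
            if (!seen[g[s][i]]) { seen[g[s][i]] = true; q.push(g[s][i]); }
    }
    return count;
}

// Strongly connected iff every state is reachable from state 0 both in the
// original graph and in the reversed graph.
bool stronglyConnected(const Graph& g) {
    if (g.empty()) return true;
    Graph rev(g.size());
    for (size_t u = 0; u < g.size(); ++u)
        for (size_t i = 0; i < g[u].size(); ++i)
            rev[g[u][i]].push_back((int)u);
    int n = (int)g.size();
    return reachableFrom(g, 0) == n && reachableFrom(rev, 0) == n;
}

int main() {
    // Tiny example: three match-type states with transitions 0->1, 1->2, 2->0, 1->0.
    Graph g(3);
    g[0].push_back(1); g[1].push_back(2); g[2].push_back(0); g[1].push_back(0);
    return stronglyConnected(g) ? 0 : 1;
}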

Each comic actor is characterized by its particular state transition graph and corresponding animation sequences. For example, the comic actor "bird" shown in Figure 6 can grab objects, an action that the "dumpling" cannot perform.

[IMG]
Figure 6: Another character: "bird".



4 CAeditor and CAeditEngine

4.1 System Overview

Two engines are used to provide the basic functionality of the architecture:

  1. The Comic Actor Editor Engine CAeditEngine. This engine provides the basic functionality of the system and implements the state transition diagram mechanisms mentioned above, cf. Section Basic Concept. It produces commands for the playing engine CAplayEngine. The CAeditEngine is used by the CAeditor and by the CAservice (cf. Figure 7).

  2. The Comic Actor Playing Engine CAplayEngine. This part maps the animations created by the CAeditEngine directly onto the graphical user interface. The CAplayEngine is used by the CAeditEngine and by the CAservice.

[IMG]
Figure 7: Architecture: comic actor related engines.

In addition, there are three basic tools related to the CAeditEngine:

  1. The CAconverter, which is used to specify the transparent portions of the animation sequences and to convert sequences into a custom-designed file format.

  2. The state transition diagram editor CAgraphEditor, used to introduce the building blocks into the system and to edit the additional data stored for every building block. The CAeditEngine is used to create the data files representing the comic actors. The CAservice uses these files to operate with the comic actors.

  3. The sequence editor CAeditor, an animation authoring tool based on the CAeditEngine. It is used to build complete animations from basic building blocks interactively and to store these animations in files.

These three tools are described in more detail in the remainder of this section.



4.2 Transparent Sequences

The CAconverter, a tool that builds on Apple's QuickTime technology [Appl94], is used to specify the transparent portions of the sequences, cf. Figure 8. Rectangular regions in any frame of the sequence can be marked as representatives of the keying colors. Every color found inside these regions is then treated as transparent in every frame of the sequence. The sequences are saved in a custom-designed compressed internal file format for performance reasons.

[IMG]
Figure 8: CAconverter: specification of keying colors.
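
The following C++ sketch illustrates the keying step just described: all colors occurring inside the marked rectangles are collected, and a per-frame mask records which pixels remain opaque. The data types are illustrative and do not reflect the internal file format.

// Illustrative sketch of the keying step, not the CAconverter code.
#include <set>
#include <vector>

struct Rect { int left, top, right, bottom; };   // marked key-color region
typedef unsigned long Pixel;                      // e.g. 0x00RRGGBB
typedef std::vector<Pixel> Frame;                 // width*height pixels

// Collect every colour that occurs inside the marked rectangles.
std::set<Pixel> collectKeyColors(const Frame& frame, int width,
                                 const std::vector<Rect>& regions) {
    std::set<Pixel> keys;
    for (size_t r = 0; r < regions.size(); ++r)
        for (int y = regions[r].top; y < regions[r].bottom; ++y)
            for (int x = regions[r].left; x < regions[r].right; ++x)
                keys.insert(frame[y * width + x]);
    return keys;
}

// One flag per pixel: true = opaque, false = keyed out (transparent).
std::vector<bool> buildMask(const Frame& frame, const std::set<Pixel>& keys) {
    std::vector<bool> opaque(frame.size());
    for (size_t i = 0; i < frame.size(); ++i)
        opaque[i] = keys.find(frame[i]) == keys.end();
    return opaque;
}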

The sequences are cropped to the size of the largest bounding box of the opaque frame content for storage and performance reasons. Also, some of the sequence properties are stored within these files: the length of the sequence (frame rate and tolerance), the original position in the sequence (before cropping), and the position of the hot spot (e.g., for pointing) if applicable.



4.3 State Transition Graph Editor (CAgraphEditor)

The CAgraphEditor is used to define new comic actors by associating a set of building blocks with a state transition graph, cf. Section Basic Concept. The states represent matches between building blocks, the transitions represent the animation sequences. Figure 9 depicts the specification process for the simple "dumpling" actor used in this paper.

[IMG]
Figure 9: State transition graph editor.

The current version of the editor supports the following properties for each building block: a unique name; a standard name (cf. Section System Service); the directory path of the animation data files; the attribute movable (if true, the center of gravity of the movie can be moved while the animation sequence is played); the default distance for which the sequence was designed (for "movable" sequences); and the angular range within which it can be moved (e.g., walk right [-60°, 60°]). In addition, two optional flags can be set: reverse and mirrored. These flags can be used to play an animation sequence in reverse order or horizontally mirrored, as mentioned earlier (e.g., allowing an automatic transformation of "walk left" into "walk right"). Additional layers of building blocks in the spatial domain (e.g., additional equipment for the actor like a hat or a suitcase, or special facial animations like grinning) and designated sounds or effects for the sequences can be specified as well.
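
A possible record for these per-building-block properties is sketched below in C++; the field names are illustrative, since the actual data layout is not part of this paper.

// Sketch of a per-building-block record; names and layout are illustrative.
#include <string>
#include <vector>

struct BuildingBlock {
    std::string uniqueName;          // e.g. "walk_right_1"
    std::string standardName;        // from the standard vocabulary, cf. Section 5.2
    std::string dataPath;            // directory of the animation data files
    bool        movable;             // center of gravity moves while playing
    double      defaultDistance;     // distance the sequence was designed for
    double      minAngle, maxAngle;  // valid angular range, e.g. -60..60 degrees
    bool        reverse;             // play frames in reverse order
    bool        mirrored;            // horizontally mirrored ("walk left" -> "walk right")
    std::vector<std::string> layers; // optional spatial layers (hat, suitcase, ...)
    std::vector<std::string> sounds; // designated sounds or effects
};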

Using the state transition graph approach, graph algorithms [Manb89] can be used for validation and automation: the above-mentioned validation of strong connectivity is a first important "syntax check". In addition, the CAeditEngine uses graph algorithms to insert missing animation sequences between two selected actions for the comic actor. This can be done by searching the shortest path from the state at the end of the first sequence to the state at the beginning of the second one, cf. Section Sequence Editor.
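
A C++ sketch of this fill-in mechanism is given below: a breadth-first search yields a shortest path from the end state of the preceding action to the start state of the following one, and the building blocks along that path are inserted. Again, the data types are illustrative.

// Sketch of the automatic fill-in between two selected actions.
#include <vector>
#include <queue>

struct Transition { int from, to, block; };   // one edge = one building block

std::vector<int> fillIn(int numStates, const std::vector<Transition>& edges,
                        int fromState, int toState) {
    std::vector<int> prevEdge(numStates, -1), prevState(numStates, -1);
    std::vector<bool> seen(numStates, false);
    std::queue<int> q;
    q.push(fromState); seen[fromState] = true;
    while (!q.empty() && !seen[toState]) {
        int s = q.front(); q.pop();
        for (size_t i = 0; i < edges.size(); ++i)
            if (edges[i].from == s && !seen[edges[i].to]) {
                seen[edges[i].to] = true;
                prevEdge[edges[i].to] = (int)i;
                prevState[edges[i].to] = s;
                q.push(edges[i].to);
            }
    }
    std::vector<int> blocks;              // building blocks to insert, in order
    if (!seen[toState]) return blocks;    // cannot happen if the graph is strongly connected
    for (int s = toState; s != fromState; s = prevState[s])
        blocks.insert(blocks.begin(), edges[prevEdge[s]].block);
    return blocks;
}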

The state graph is stored in a state transition graph file together with related properties. In addition, the state transition diagram editor provides support for packing up all data related to a given comic actor, creating a single archive for transport or transmission purposes. This feature is useful, e.g., if an agent roams through the network and wants to use the actor for communicating to different users on different network nodes.



4.4 Sequence Editor (CAeditor)

This section is devoted to the second use case of our system: rapid animation authoring. It is included for better understanding and for comprehensiveness; the emphasis, however, is on the first use case (cf. Section System Service). Rapid authoring is based on the CAeditor; this tool in turn uses the two engines one after the other: the CAeditEngine and the CAplayEngine.

When an animation is authored, the author first has to select a comic actor from the given set. He or she can then create a complete animation interactively by carrying out the following two steps (once or, more likely, a number of times): i) selection of an action (a composite one, like "walk" or "explain"); ii) specification of parameters (positions, selection of audio files, layering information, etc.).

[IMG]
Figure 10: Onscreen editor, the animation follows cross marks.

Positions are selected by pointing and clicking directly on the graphical user interface. During this onscreen selection phase, the editor window is kept small and simple so as to provide maximum accessibility of the underlying (target) user interface. Popup windows are used when parameter entries have to be made. The mouse events are "caught" by the editor. This way, the author can specify the positions for the actor actions in direct-manipulation mode, referring directly to his or her application or document, cf. Figure 10.

The direction of the motion over the screen is calculated from the given coordinates. The applicable building blocks are selected, e.g., based on their valid angular range. Using hot spot information that is stored as part of a sequence, actor positions can be calculated in relation to action foci, as sketched below.
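
As a simple illustration, the following C++ fragment shows how the animation window could be positioned so that the hot spot of a sequence (e.g., the tip of a pointing hand) coincides with the selected action focus; the names are hypothetical.

// Sketch: align the sequence's hot spot with the action focus on the screen.
struct Pt { int x, y; };

// hotSpot is given relative to the upper-left corner of the animation frame.
Pt windowOriginFor(const Pt& actionFocus, const Pt& hotSpot) {
    Pt origin;
    origin.x = actionFocus.x - hotSpot.x;
    origin.y = actionFocus.y - hotSpot.y;
    return origin;
}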

After entering parameters and related information for one action, the next action can be selected. Using graph algorithms as described in Section State Transition Graph Editor, smooth and logically consistent sequences between consecutive actions are inserted automatically. As an example, the comic actor may be walking, and the user may select "stand and talk" as the follow-on action. The CAeditEngine can then detect that the sequence "stop walk" is missing and insert it into the animation.

The use case described here can be efficiently applied even if animation authors do not have access to the code of off-the-shelf tools but want to combine tool action or output with animations.

Buttons to be "pressed" by the comic actor can be selected interactively; even mouse messages can be sent by the comic actor. This feature is, however, restricted in systems which assign the positions and IDs of buttons at runtime, which prevents such a fixed relationship from being determined from outside the application. For such cases, the involvement of the application or system is required, as described in the next section.
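
The following Win32/C++ sketch indicates how such a button "press" could be triggered from outside an application when the window title and button caption are known; it is an illustration of the idea, not the interface actually used by the CAeditor.

// Sketch: trigger a button "press" by sending mouse messages to another
// application's button, assuming its window title and caption are known.
#include <windows.h>

bool pressButton(const char* windowTitle, const char* buttonCaption) {
    HWND app = FindWindowA(NULL, windowTitle);                    // top-level window
    if (app == NULL) return false;
    HWND button = FindWindowExA(app, NULL, "Button", buttonCaption);
    if (button == NULL) return false;
    PostMessageA(button, WM_LBUTTONDOWN, MK_LBUTTON, MAKELPARAM(5, 5));
    PostMessageA(button, WM_LBUTTONUP, 0, MAKELPARAM(5, 5));      // click at (5,5) inside the button
    return true;
}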

The sequence editor creates an animation script file. This animation can be played by the standalone application CAplayer, which can be used to integrate the comic actors with standard applications, cf. Section Playing Engine.



5 System Service with Open Architecture (CAservice)

In this section, we want to further elaborate on use case one: visualization of agents. As mentioned above, the animations created with the sequence editor (cf. Section Sequence Editor) are compiled into rather fixed "movies". The building blocks can be used repeatedly, but after combining them into a complete animation, every replay has to conform to the user-defined course of action. In contrast to this easy-to-use, user-determined alternative, the CAservice was designed to provide highly flexible comic actor functionality at runtime. Provided that an application has sufficient knowledge about the location of the user interface objects (which is usually not an obstacle), animations can be created as needed with respect to the state of the application or screen.



5.1 System Service Architecture

The CAservice coordinates the comic actors that run in a system and guarantees a proper mapping on the screen. Designed for MS Windows NT, the CAservice provides comic actor functionality to any application running in the system, cf. Figure 11.

Applications using the CAservice do not have to provide comic actors of their own; rather, they are free to use the comic actors registered to the system service - this approach provides for the option to have user-specific, application-independent actors. On the other hand, an application can introduce its own comic actors into the system.

[IMG]
Figure 11: System architecture, CAservice.

The CAservice uses both the CAeditEngine and the CAplayEngine to provide its functionality. In a simplified view, the CAservice can be regarded as a replacement for the CAeditor, which in turn is a kind of graphical front end to the CAeditEngine. Instead of a human editor, agents post requests for comic actor functionality, in this case even at runtime. The user interface commands and the interactive selection of positions are replaced by elements of a command interface.



5.2 Standard Vocabulary

In order to facilitate the first use case, a feature list is introduced. For every comic actor, this list shows its type (human-like, animal, etc.), character (serious, funny, etc.), and abilities (can walk, talk, point, jump, talk while walking, etc.).

Type             Character
male (human)     serious
female (human)   funny
child (human)    cool
animal           userdef
extraterrestic
other
userdef

Table 1: Standard vocabulary for characteristics.

We use a standard vocabulary for the specification of these features. The feature list of each comic actor has to include some essential information and for each entry in the feature list, the standard vocabulary has to be used, cf. Table 1 and Table 2.

For every animation sequence inserted into the state transition diagram (cf. Section State Transition Graph Editor), a standard name created from the standard vocabulary has to be given in the following way (we use EBNF, for the non-terminals cf. Table 2):

standard name = {additional ["<"direction">"] "_"} basic ["<"direction">"].

If two or more sequences have the same standard name (e.g., more than one kind of walking, laughing, etc.), consecutive numbers are used as postfix; one sequence has to be marked with a "default" flag.

Example: For a sequence that shows a standing comic actor which points left and explains something, the standard name is

point<left>_talk_stand.
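
The composition rule can be illustrated by the following C++ sketch, which assembles a standard name from a basic movement and a list of additional actions according to the EBNF above; the helper types are purely illustrative.

// Sketch of composing a standard name according to the EBNF above.
#include <string>
#include <vector>
#include <iostream>

struct Action { std::string name, direction; };   // direction may be empty

std::string standardName(const std::vector<Action>& additional, const Action& basic) {
    std::string s;
    for (size_t i = 0; i < additional.size(); ++i) {
        s += additional[i].name;
        if (!additional[i].direction.empty()) s += "<" + additional[i].direction + ">";
        s += "_";
    }
    s += basic.name;
    if (!basic.direction.empty()) s += "<" + basic.direction + ">";
    return s;
}

int main() {
    std::vector<Action> add;
    Action point = { "point", "left" }, talk = { "talk", "" }, stand = { "stand", "" };
    add.push_back(point); add.push_back(talk);
    std::cout << standardName(add, stand) << std::endl;   // prints point<left>_talk_stand
    return 0;
}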

Layers cannot be selected by standard names because of their huge variety (think of hats, accompanying dogs, all kinds of objects to handle etc.). Rather, the layers can be selected by a unique name. In the current implementation, application programmers have to know about the existence and name of layers. The same applies to designated sounds.

basic movements    additional actions    direction
stand              turnto                none
walk               appear                left
fly                disappear             right
run                jump                  up
climb              start                 down
slide              stop                  backwards
lay                talk                  reverse
crawl              whisper
userdef            shout
                   sing
                   whistle
                   laugh
                   point
                   pointto
                   look
                   turnhead
                   nod
                   shakehead
                   welcome
                   wave
                   take
                   put
                   push
                   pull
                   throw
                   catch
                   userdef

Table 2: Standard vocabulary for sequences.



5.3 System Service Command Interface

For interaction with the CAservice, we use a scripting mechanism. There are two types of requests: CA_REQUESTs and ANIM_REQUESTs. A CA_REQUEST is used to introduce new comic actors, to query information about comic actors, and to receive IDs of actors for further reference. More than one ID may exist for one comic actor character. Several actors based on the same character can be displayed on the screen concurrently. IDs are unique and are stored along with request information until the service is terminated, normally at system shutdown. To avoid synchronization problems in current versions of the operating system used, the active use of comic actors is currently restricted to the active foreground application.

Example: An application wants a male human comic actor with a serious character to appear somewhere on the screen, walk to a button, point at it, explain its functionality, and disappear. The application first has to send a CA_REQUEST to the CAservice with a specification of the character and the minimal set of features needed:

CA_REQUEST
CHARACTER male_serious
MIN_FEATURES
  appear_stand
  walk
  point_talk_stand
  disappear_stand
END;

The request yields an actor ID if it can be satisfied. If the application receives such a positive reply to its request, the system also assumes responsibility for inserting the intermediate building blocks that are needed for smooth matches - this feature is realised based on the state transition graph as described earlier. Next, the application can start the animation by sending the following sequence of commands to the CAservice:

ANIM_REQUEST
ACTOR <ID>
ANIMATION
  ACTION("appear_stand", <position>);
  ACTION("walk", POS_NEXT_ACTION);
  REPEAT
    ACTION("talk_pointto_stand", <position>)
    WITH
      LAYER(<layer-filename>, <rel_pos>);
    END;
  UNTIL(<wav_filename>);
  ACTION("disappear_stand", ACT_POS);
END;

The "pointto" action lets the comic actor point to a specific position on the graphical user interface while the simpler "point" action can be used to point at a certain direction. The "REPEAT-UNTIL" construct repeats the content of its body until, e.g., the sound in its condition ends (both the sound and the animation in the body are started synchronously in this example). The directive POS_NEXT_ACTION indicates that the position has to be calculated backwards by evaluating the next action. In this case, the coordinates of the "pointto" action are used for the calculation of the end-position of the walk-cycle. "ACT_POS" is a placeholder for the current comic actor position.

A construct not used in the example above, "DO-WHILE", loops a sound until the animation in its body ends. For example, this can be used for looped background music or sound effects.

After the processing of one ANIM_REQUEST, a message is sent to the requesting application for synchronization. Potential error messages (e.g., non-existing action, application not foreground application, etc.) are sent to the requesting application as system messages.

The CAservice uses the CAeditEngine to extract the feature lists and to trigger the animations. The CAeditEngine inserts the fill-in sequences (cf. Section Sequence Editor) and produces commands for the playing engine to display the animation onscreen.



6 Playing Engine (CAplayEngine)

6.1 Onscreen Chroma Keying

Our comic actor playing engine CAplayEngine maps the animations directly onto the graphical user interface. The transparent images are displayed in windows without any decoration, handles, or borders. The content of the desktop "below" the window is used as background for the animation. The animation movie window is shifted over the screen according to the animation contents. The result is the impression that the comic actor moves directly over the desktop, cf. Figure 12.

Note that in the beginning, we experimented with direct mapping onto the graphics context of the desktop. This sometimes resulted in improper coordination with the underlying applications when the latter carried out screen updates while the comic actor walked over their windows. In the meantime, a somewhat more expensive (yet still very well-performing) and much more stable approach is used, as described above.

[IMG]
Figure 12: Comic actors on the screen.
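
The compositing itself can be illustrated by the following C++ sketch: the desktop content captured below the borderless animation window serves as background, and only the opaque (non-keyed) pixels of the current frame are written on top of it. This is a simplified illustration of the principle, not the engine's code.

// Sketch of the digital chroma keying: compose the current frame over the
// captured desktop background using the opacity mask from the CAconverter step.
#include <vector>

typedef unsigned long Pixel;

void composite(const std::vector<Pixel>& background,   // captured desktop area
               const std::vector<Pixel>& frame,        // current animation frame
               const std::vector<bool>&  opaque,       // true = keep frame pixel
               std::vector<Pixel>&       out)
{
    out.resize(background.size());
    for (size_t i = 0; i < background.size(); ++i)
        out[i] = opaque[i] ? frame[i] : background[i];
}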

In addition to the capabilities of the former version of the playing engine (cf. [MaMu97]), the current version is able to display layered animations. It is even possible to use layers of different sizes, cf. Figure 13. To this end, the relative positions of the layers to one another have to be determined during sequence preprocessing. It is possible to use animation layers (e.g., hats, facial animation, an accompanying dog) or static bitmap layers (e.g., for balloons).

[IMG] [IMG] [IMG]

Figure 13: Animation layering.

The CAplayEngine is used in both use cases, i.e. by the CAeditEngine and the CAservice. A standalone player application CAplayer exists, based on the CAplayEngine, for use case two. Animation scripts as created using the CAeditor can be played interactively. The CAplayer can be used to integrate the comic actor functionality with standard applications and standard multimedia authoring tools, e.g., MS Windows, MS Excel, Macromedia Director, etc.



6.2 Synchronization and Performance

The playing engine controls all visible comic actors on the screen; several threads are used to manage the synchronization. However, this does not completely relieve the application of the burden of multi-actor synchronization within a graphical user interface: if, e.g., two actors trigger actions in the underlying graphical user interface which both modify the state of the desktop or switch to different foreground applications, then the synchronization can only be carried out in a useful way by the (actor-aware) application software.

As to the video performance, we obtain 20 frames per second using sequences of size 320x240 on a standard Pentium PC (133 MHz).



6.3 Audio Sequences

Audio sequences cannot be adjusted in speed or length. The corresponding animation sequence, however, can be adjusted in length or frame rate, since there is some tolerance in the playing time of an animation sequence, cf. Figure 14.

[IMG]
Figure 14: Tolerance in frame rate: adjustable length of sequence.

The following considerations hold: for a relatively long audio part, the mapping onto an animation sequence is simpler and more accurate because many (of the short) building blocks can and must be used. Given n repetitions and a tolerance t for each building block, the sequence can be adjusted by n*t. If the required adjustment exceeds this range, the animation sequence has to be cut off or may be displayed without audio for a short period of time. The sketch below illustrates the calculation.

[IMG]
Figure 15: Audio and video building blocks.
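
The timing calculation can be illustrated by the following C++ sketch, which determines a number of repetitions and a per-block playing time for a given audio duration and checks whether the deviation stays within the tolerance; names and structure are illustrative.

// Sketch: fit n repetitions of a building block (nominal duration blockLen,
// tolerance per block) to an audio part of length audioLen; the total
// adjustment must stay within n*t.
#include <cmath>
#include <cstdio>

bool fitAnimationToAudio(double audioLen, double blockLen, double tolerance,
                         int& repetitions, double& playLen)
{
    repetitions = (int)std::floor(audioLen / blockLen + 0.5);   // nearest repetition count
    if (repetitions < 1) repetitions = 1;
    playLen = audioLen / repetitions;                           // stretched/compressed block duration
    return std::fabs(playLen - blockLen) <= tolerance;          // per block, i.e. n*t overall
}

int main() {
    int n; double len;
    if (fitAnimationToAudio(4.3, 1.0, 0.1, n, len))
        std::printf("%d repetitions, %.2f s each\n", n, len);
    else
        std::printf("audio cannot be matched; cut the sequence or play without audio\n");
    return 0;
}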



7 Applications of Comic Actors

Originally, the CAeditor was designed with a focus on the integration of comic actors with instructional material, i.e., emphasizing computer-aided learning (CAL). Furthermore, comic actors have also been used by explanatory systems.

In the meantime, the focus has shifted to the first use case mentioned in this paper. The more recently developed CAservice allows for the creation of animations at runtime. The comic actors can interact with the user at any position on the screen, right in the context of the interaction. As described in the beginning, these extensions broaden the spectrum of application beyond CAL material and desktop help systems: agents become the primary target, where the term denotes a range from autonomous desktop assistants to highly distributed, nomadic components in the net. Such mobile agents can use comic actors both for interaction with the users they "encounter" and for interaction with their "creator".

In the context of computer supported cooperative work (CSCW), we intend to use comic actors to visualize remote human collaborators; as a replacement for or complement to videoconferencing, actors may often visualize partners and/or their actions more effectively than video pictures (this may relate to bandwidth needs, the common visualization of a large number of conference participants, "getting across" important actions, etc.). All in all, comic actors may serve as a common representation of both human and artificial (agent) participants in a distributed computer-assisted cooperative task; the notion of an "avatar" has often been used in this context.

Finally, it must be noted that the system described above has emphasized comic actors but is not by design restricted to these. Any kind of animation could be envisioned based on the framework presented, in an attempt to further exploit the reuse approach taken.



8 Summary and Future Work

The Ars Electronica Center in Linz, Austria, is a mix between media lab and "technology museum of the future". One floor in this center is called 'knowledge net' and was entirely conceived and coordinated and largely implemented by the second author's department and group, including the first author. A version of the sequence editor is shown there as an exhibit. In addition, comic actors are used in the center's "Conference/Classroom of the Future" (CCF) [MBFM96], as a tool for courseware authoring. Many users were pleased by the intuitive handling of the editor. The playing engine performs the onscreen chroma keying with imperceptible cuts and with excellent performance.

Speech generation could be useful for interaction with the user. The current version relies on pre-recorded speech and sound.

We are currently augmenting the comic actor base model in order to better accommodate relevant agent actions. Actions to be visualized with the next version include the ability of actors to carry visualizations of documents over the desktop, to pull windows over the screen (e.g., in order to visualize the existence of new windows with further information), and many more.

In addition, we are working on an extension of the model to represent interaction between actors, such as shake-hands, document passing, mutual guidance, etc.

A further (higher) level of the command interface is currently under investigation. A first step is the replacement of absolute positions by graphical user interface object identifications (e.g., window IDs). The possibility of passing commands like "go to window <ID>" or "fetch document <ID>" makes the actor-application interface much easier to use. In addition to such relative positioning, an actor should be able to perform the requested action even if graphical object positions change dynamically.

We are currently implementing a set of example applications for the CAservice. We are planning field tests with users in order to study the acceptance and the emotional, qualitative, and quantitative effects that comic actors have on the interaction between humans and computers.



End Notes

[1] This comic actor represents an Upper-Austrian dumpling.



Bibliography

[BlGa97]
B. Blumberg and T. Galyean (1997). Multi-level Control for Animated Autonomous Agents: Do the Right Thing ... Oh Not That ... In: Trappl R., Petta P. (eds.), Creating Personalities for Synthetic Actors, Springer, 74-82.
[BrGi94]
C. Breiteneder and S. Gibbs (1994). Interactive Video Actors. Proceedings of ACM CHI'94 Conference on Human Factors in Computing Systems, V 2, 447-448.
[GBMP93]
S. Gibbs, C. Breiteneder, V. de Mey, and M. Papathomas (1993). Video Widgets and Video Actors. Proceedings of the ACM SIGGRAPH Symposium on User Interface Software and Technology, Video, Graphics, and Speech, 1993, 179-185.
[Thal95]
N. Magnenat Thalmann and D. Thalmann (1995). Digital Actors for Interactive Television. Proceedings of the IEEE, Vol. 83, No. 7, July 1995, 1022-1031.
[Manb89]
U. Manber (1989). Introduction to Algorithms: A Creative Approach. Addison-Wesley Publishing Company Inc.
[MaMu97]
K. Manske and M. Mühlhäuser (1997). Point-and-Click Construction of Comic Actor Animation Sequences. Proceedings of the ED-MEDIA 97 Conference on Educational Multimedia and Hypermedia, Calgary, Canada, June 14-19, 1997, AACE, Vol. 1, 683-688.
[MBFM96]
M. Mühlhäuser, J. Borchers, C. Falkowski, and K. Manske (1996). The Conference/Classroom of the Future: An interdisciplinary approach. Proceedings of the IFIP Conference "The International Office of the Future: Design Options and Solution Strategies", University of Arizona, Tucson, Arizona, USA, April 9-11, Chapman and Hall, 1996, 233- 250.
[Perl95]
K. Perlin (1995). Real Time Responsive Animation with Personality. IEEE Transactions on Visualization and Computer Graphics, Vol. 1, No. 1, March 1995, 5-15.
[PeGo96]
K. Perlin and A. Goldberg (1996). Improv: A System for Scripting Interactive Actors in Virtual Worlds. Proceedings of the ACM SIGGRAPH 96 Conference, Addison Wesley, 205-216.
[VMJT96]
P. Volino, N. Magnenat Thalmann, S. Jianhua, and D. Thalmann (1996). An Evolving System for Simulating Clothes on Virtual Actors. IEEE Computer Graphics and Applications, Vol. 16, No. 5, September 1996, 42-51.
[Appl94]
Apple Computer Inc. (1994). QuickTime for Windows 2.0 Developer's Manual.
[Appl95]
Apple Computer Inc. (1995). Apple Guide Complete: Designing and Developing Onscreen Assistance. Addison-Wesley Publishing Company.