



Human Centered

T01 – Processing Web-Scale Multimedia Data

Malcolm Slaney, Edward Chang
Abstract: In the last few years we have gained access to multimedia databases with billions of objects. This massive change in the amount of data available to researchers is changing the face of multimedia. In many domains, most notably speech recognition, people have observed that the best way to improve their algorithm’s performance is to add more data. Starting with hidden Markov models (HMMs) and support vector machines, people have applied ever greater amounts of data to their problems and been rewarded with new levels of performance.

Details: T01 – Processing Web-Scale Multimedia Data

Duration: half day

T02 – Advances in Multimedia Retrieval

Alan Hanjalic, Martha Larson, Cees Snoek, Arnold Smeulders
Abstract: Multimedia that cannot be found is, in a certain sense, useless. It is lost in a huge collection, or worse, in a back alley of the Internet, never viewed and impossible to reuse. Research in multimedia retrieval is directed at developing techniques that bring video together with users – matching multimedia content and user needs. The aim of this tutorial is to provide insights into the most recent developments in the field of multimedia retrieval and to identify the issues and bottlenecks that could determine the directions of research focus for the coming years. This tutorial targets new scientists in the field of multimedia retrieval, providing instruction on how to best approach the multimedia retrieval problem and examples of promising research directions to work on. It is also designed to benefit active multimedia retrieval scientists — those who are searching for new challenges or re-orientation. The material covered is relevant for participants from both academia and industry. It covers issues pertaining to the development of modern multimedia retrieval systems and highlights emerging challenges and techniques anticipated to be important for the future of multimedia retrieval.

Part 1: Frontiers in Multimedia Search (Alan Hanjalic, Martha Larson)

In this part of the tutorial we present a whirlwind tour of hot new multimedia search techniques, covering strategies that exploit users, the collection as a whole, and analysis of individual content items. We focus on improving the overall usefulness of search systems. Discussion of information retrieval and speech/language-based techniques is included.

Part 2: Video Search Engines (Cees Snoek, Arnold Smeulders)

In this part of the tutorial we focus on the challenges in video search, present methods for achieving state-of-the-art performance, and indicate how improvements can be obtained in the near future. Moreover, we give an overview of the latest developments and future trends in the field on the basis of the TRECVID competition – the leading competition for video search engines, run by NIST – where we have achieved consistent top performance over the years, including the 2008 and 2009 editions.

Details: T02 – Advances in Multimedia Retrieval

Duration: full day

T03 – Understanding Multimedia Content Using Web Scale Social Media Data

Dong Xu, Lei Zhang, Jiebo Luo
Abstract: Nowadays, increasingly rich and massive social media data (such as text, images, audio, videos, blogs, and so on) are being posted to the web, including social networking websites (e.g., MySpace, Facebook), photo and video sharing websites (e.g., Flickr, YouTube), and photo forums. Recently, researchers from multidisciplinary areas have proposed using data-driven approaches for multimedia content understanding by leveraging such unlimited web images and videos as well as their associated rich contextual information (e.g., tags, comments, categories, titles, and metadata). In our three-hour tutorial, we plan to introduce the important general concepts and themes of this timely topic. We will also review and summarize recent multimedia content analysis methods that use web-scale social media data, and present insight into the challenges and future directions in this area. Moreover, we will show extensive demos of image annotation and retrieval using rich social media data.

Details: T03 – Understanding Multimedia Content Using Web Scale Social Media Data

Duration: half day

T05 – Mobile Video Streaming in Modern Wireless Networks

Mohamed Hefeeda, Cheng-Hsin Hsu
Abstract: Modern mobile devices, such as laptops, PDAs (Personal Digital Assistants), smart phones, and PMPs (Portable Media Players), have evolved into powerful mobile computers and can render rich multimedia content. More and more users use mobile devices to watch videos streamed over wireless networks, and they demand more content at better quality. For example, market forecasts indicate that mobile video streaming, such as mobile TV, will catch up with gaming and music and become the most popular application on mobile devices, with more than 140 million subscribers worldwide by 2011. In this tutorial, we will present different approaches to delivering multimedia content over various wireless networks to many mobile users. We will study and analyze the main research problems in modern wireless networks that need to be addressed in order to enable efficient mobile multimedia services. The tutorial will cover common research problems in wireless networks such as HSDPA (High-Speed Downlink Packet Access), the MBMS (Multimedia Broadcast Multicast Services) extension of cellular networks, WiMAX (Worldwide Interoperability for Microwave Access), LTE (Long Term Evolution), DVB-H (Digital Video Broadcasting – Handheld), and ATSC M/H (Advanced Television Systems Committee – Mobile/Handheld). After giving the preliminaries of the considered wireless network standards, we will focus on several important research problems and present their solutions in detail. These research problems include: (i) maximizing energy saving of mobile receivers, (ii) maximizing bandwidth utilization of wireless networks, (iii) minimizing stream switching time, and (iv) supporting heterogeneous mobile receivers. Finally, we will discuss open problems and future research directions in mobile multimedia.

Details: T05 – Mobile Video Streaming in Modern Wireless Networks

Duration: half day

T08 – Immersive Future Media Technologies

Christian Timmerer, Karsten Müller
Abstract: The past decade has witnessed a significant increase in research efforts around Quality of Experience (QoE), which is generally regarded as a human-centric paradigm for Quality of Service (QoS) as perceived by the (end) user. As it puts the end user at center stage, it may have various dimensions, and one dimension that has recently gained momentum is 3D video. Another dimension aims at going beyond 3D and promises an advanced user experience through sensory effects. Both are introduced briefly in the following.
3D Video: Stereo and Multi-View Video Technology: 3D-related media technologies have recently developed from pure research-oriented work towards applications and products. 3D content is now being produced on a wider scale, and the first 3D applications have been standardized, such as multi-view video coding for 3D Blu-ray discs. This part of the tutorial starts with an overview of 3D in the form of stereo video based systems, which are currently being commercialized. Here, stereo formats and associated coding are introduced. This technology is used for 3D cinema applications and mobile 3DTV environments. For the latter, user requirements and profiling will be introduced as a way to assess user quality of experience. For 3D home entertainment, glasses-free multi-view displays are required, as more than one user will watch 3D content. For such displays, the current stereo solutions need to be extended. Therefore, new activities in 3D video are introduced. These 3D solutions will develop a generic 3D video format with color and supplementary geometry data, e.g. depth maps, and associated coding and rendering technology for any multi-view display, independent of the number of views. Because such technology is also developed in international consortia, the most prominent ones are introduced, including the 3D@HOME consortium, the EU 3D, Immersive, Interactive Media Cluster, and the 3D video activities in ISO-MPEG.
Advanced User Experience through Sensory Effects: This part of the tutorial addresses a novel approach for increasing the user experience – beyond 3D – through sensory effects. The motivation behind this work is that the consumption of multimedia assets may also stimulate senses other than vision or audition, e.g., olfaction, mechanoreception, equilibrioception, or thermoception, which should lead to an enhanced, unique user experience. This could be achieved by annotating the media resources with metadata (currently defined by ISO/MPEG as part of the MPEG-V standard) providing so-called sensory effects that steer appropriate devices capable of rendering these effects (e.g., fans, vibration chairs, ambient lights, perfumers, water sprayers, fog machines, etc.). In particular, we will review the concepts and details of the forthcoming MPEG-V standard and present our prototype architecture for the generation, transport, decoding, and use of sensory effects. Furthermore, we will present details and results of a series of formal subjective quality assessments which confirm that the concept of sensory effects is a vital tool for enhancing the user experience.

Details: T08 – Immersive Future Media Technologies

Duration: half day

T09 – Modeling Human Behavior with Mobile Phones

Daniel Gatica-Perez
Abstract: In just a few years, mobile phones have emerged as the ultimate multimedia device. Current smartphones allow us to take pictures, listen to music, watch videos, interact with the physical world through GPS, communicate via calls, SMS, or MMS, and browse the web. Given their ubiquity, mobile phones have become the most natural device for multimedia consumption, production, and interaction, but there is much more to it. Mobile phones can constantly sense people’s location via GPS or cell tower connectivity, motion through accelerometers, proximity via Bluetooth, and communication through call and SMS logs, and thus represent the most accurate and nonintrusive current means of tracing real-life human activities. Furthermore, all this information is being generated at massive scales as never before. It is therefore not surprising that the understanding of personal and social behavior from mobile sensor data at large scale, where populations of hundreds or thousands of cell phone users are analyzed both as individuals and as groups over possibly long periods of time, has emerged as a frontier domain in computing and in social science. This domain has also attracted attention from the media. The concept of Reality Mining, coined at MIT, was identified as one of the 10 technologies “most likely to change the way we live” by Technology Review Magazine in 2008, and has been featured in mainstream media like Newsweek and The Economist and scientific media like Nature.

Details: T09 – Modeling Human Behavior with Mobile Phones

Duration: half day

T10 – Human-Centered Multimedia Systems

Nicu Sebe, Alejandro (Alex) Jaimes, Hamid Aghajan
Abstract: This tutorial will focus on technical analysis and interaction techniques formulated from the perspective of key human factors in a user-centered approach to developing multimedia systems. The tutorial will take a holistic view of the research issues and applications of Human-Centered Systems, focusing on three main areas: (1) multimodal interaction: visual (body, gaze, gesture) and audio (emotion) analysis; (2) image indexing and retrieval: user behavior, context modeling, cultural issues, and machine learning for user-centric approaches; (3) multimedia data: conceptual analysis at different levels (feature, cognitive, and affective). This full-day tutorial will consist of two parts: the first part will consist of presentations by the instructors, and the second part will consist of practical workgroup activities.

Details: T10 – Human-Centered Multimedia Systems Tutorial

Duration: full day

  • ACM Multimedia 2010 Twitter