Practitioners Day Program
Registration - If you intend to attend only the Practitioners Day, you can pre-register by sending an email to firstname.lastname@example.org. The registration fee of 200 Euros can be paid on site or via bank transfer.
Abstracts of the industrial session talks
Attribute-based Object Retrievals (Rogerio Feris, IBM T. J. Watson Research Center) - Searching for objects of interest in surveillance videos, in particular suspicious vehicles and pedestrians, is a common and important task in criminal investigation processes. Current solutions to automate this task are usually based on face recognition or license plate recognition, which are known to be sensitive to typical surveillance conditions such as lighting changes, pose variation, and low-resolution imagery. In this talk, I will describe a novel and complementary object search framework based on parsing of object parts and fine-grained attributes. At the interface, the user can specify queries such as "show me the bald people who entered a given building last Saturday wearing a red shirt and sunglasses" or "show me all blue trucks larger than 7ft length traveling at high speed northbound yesterday from 2pm to 5pm" and the system will retrieve events that match the provided description. I will cover our attribute detection methods which utilize large amounts of data as well as a novel attribute ranking and image retrieval approach based on pairwise modeling of attributes.
Multimedia Research at FXPAL: Combining Analysis with Innovative User Interfaces (Andreas Girgensohn, FX Palo Alto Laboratory) - FXPAL has been active in multimedia research for the past 15 years. An overarching theme of our work is to combine analysis with innovative user interfaces. Our systems consistently pair flexible and powerful user interfaces with established content analysis methods. We balance automation and user control to simplify tasks where possible while encouraging user creativity. We highlight this in a tour through several projects we have worked on over the years.
One of our early systems, MBase, supported the browsing of video collections by displaying video summaries similar to comic books. That collection was also accessible through a 3D interface that represented the video collection as a cityscape with summaries pasted onto the buildings. Another of our video summarization techniques produced collages reminiscent of stained glass windows. We worked on making video editing easy and introduced a form of hypervideo that affords simple navigation. In the area of video surveillance, we created a realistic testbed with 20 cameras in our office building and developed innovative user interfaces for quickly accessing events in video and for tracking people across cameras. More recently, we have focused on search, both by supporting a publicly accessible lecture webcast search engine and by participating in the TRECVID interactive search evaluation. A recurring research focus has been the management of personal photo collections. We created an application that automatically divides photos into meaningful events such as a birthday party. Another application presented a visual workspace that uses multiple similarity criteria to lay out photos. We also worked on paper user interfaces where a photograph taken of printed content leads to associated multimedia content.
Multimedia mining for real world applications (Adrian Popescu, CEA LIST) - Multimedia research has flourished in recent years, but the integration of recent developments into working applications is still far from satisfactory. In this context, the talk describes some of the challenges derived from CEA LIST's positioning as a facilitator between academic research and industry. This positioning is illustrated by a number of techniques developed in our lab, which cover a broad range of multimedia-related subjects and are oriented towards easy integration into real-world applications. The first part of the talk introduces a shared boosting algorithm devised to perform large-scale visual concept detection for still images. The performance of the algorithm is comparable to that of state-of-the-art methods, while its computational complexity is significantly reduced. The second part presents a movie character recognition and tracking algorithm which integrates active appearance models and improved feature extraction. The third part introduces our work on multimedia fusion, with a focus on real-time query modeling and visual reranking of text retrieval results. Before pointing out current and future challenges, the talk discusses some geotagged content mining results, such as POI extraction, geographic image retrieval, and trip characterization.
Recent research activities in KDDI R&D Laboratories (Yusuke Uchida, KDDI R&D Laboratories) - We introduce recent research activities in KDDI R&D Laboratories.
Music-driven animation on smartphones: to provide a new music experience for users, we have developed an Android application for smartphones that automatically generates animations in real time in which your favorite character dances to your favorite music.
Hybrid no-reference video quality assessment: an accurate no-reference (NR) quality assessment system based on analysis of the video bitstream provides a fully automated quality check after file-based content authoring for VoD, Blu-ray Disc, DVD, and so on.
Free-viewpoint 3D video: Free Viewpoint Video (FVV) technology lets viewers feel as if they were moving freely through the scene being viewed. We realized absolute Free Viewpoint Video with 3D from a limited number of cameras.
Content-based music analysis and applications: we have focused on content-based analysis of music and on the development of various applications that expand the user's music experience. Major achievements include content-based music search and visualization, automatic DJ mixing, and generation of music slideshows using Web images.
Videntifier Forensic - Automatic Video Identification for Police Authorities (Björn Þór Jónsson, Videntifier) - Videntifier™ Forensic is a service that assists police authorities by automatically identifying video material on seized storage devices. When police learn of a person suspected of downloading or distributing illegal video content (e.g. child abuse material, terrorist and hate propaganda, copyrighted material) and seize the suspect's computer and storage devices, looking through the material to find the evidence is very stressful and time-consuming work.
Videntifier™ Forensic performs this task automatically by watching the visual content of videos, working like human vision. To do so, it combines fine-grained visual fingerprints with fast and scalable data mining technology.
ICMR2011 - ACM International Conference on Multimedia Retrieval