MIR 2010


11th ACM SIGMM International Conference on

Multimedia Information Retrieval

March 29-31, 2010, National Constitution Center, Philadelphia, Pennsylvania

Keynote and Invited Talks

The program will feature several keynote and invited talks from academia, industry, and government.

Academic Keynote (on Day 1)

Bio-Image Informatics: Advances and Challenges
B. S. Manjunath, Professor, University of California, Santa Barbara

I will talk about information processing challenges in the context of microscopy images, and issues/challenges that the multimedia/information retrieval communities can help address. Recent advances in microscopy imaging have resulted in large volumes of image and video data, with most of the analysis still done manually and in a qualitative manner. Manual analysis is not only time intensive but often is not reproducible as well. Further, there is little, if any, database support to manage these image/video collections, to store, search and retrieve image related information within an integrated framework. I will illustrate the challenges with some recent problems that we are trying to tackle at UCSB's Center for Bio-Image Informatics.

Biography: Professor Manjunath received the B. E. in Electronics (with distinction) from Bangalore University in 1985, the M. E. (with distinction) in Systems Science and Automation from the Indian Institute of Science in 1987, and the Ph. D. degree in Electrical Engineering from the University of Southern California in 1991. He joined the University of California in 1991 where he is currently a Full Professor of Electrical and Computer Engineering and the director of the NSF/IGERT program at UCSB on Interactive Digital Multimedia. In addition, he is the director of the NSF-funded Center for Bio-Image Informatics at UCSB where the research thrust is in developing new imaging and information processing technologies for large bio-image databases. His research interests include image processing, computer vision, multimedia databases, bio-informatics, learning algorithms and data mining. He has supervised 20 PhD theses and published over 200 papers in refereed journals and conferences. He is a co-inventor of 23 US/International patents and co-edited the first book on the MPEG-7 standard. He was an Associate Editor for the IEEE Transactions on Image Processing, IEEE-Tr. PAMI, IEEE-Tr. Multimedia, and IEEE Signal Processing Magazine. He is a fellow of the IEEE.


Industrial Leadership Invited Speaker Session (on Day 2)

In this session, we will invite a few active scientists from the leading industrial players in the area of multimedia information retrieval. They will illustrate their latest research findings. Each talk will take about 15-18 minutes.

Multimedia Processing for Advanced Content Services
Behzad Shahraray, Executive Director, Video and Multimedia Technologies Research, AT&T Labs

The proliferation of network connected media-enabled devices has given users access to large volumes of information and entertainment in video form. Taking advantage of these vast video resources involves the creation of effective mechanisms for searching, navigating, personalizing, and repurposing video to support alternative consumption modes. Automated content analysis algorithms that utilize media processing techniques are the key to the creation of such mechanisms. Media processing also serves to facilitate retrieval and navigation of content by enabling multimodal user interfaces.

In this talk I will discuss some of the media processing research at AT&T Labs, and describe several prototype systems aimed at giving users easy access to video and multimedia information on a wide range of media-enabled devices.

Biography: Behzad Shahraray is the Executive Director of Video and Multimedia Technologies Research at AT&T Labs. In this role, he leads an effort aimed at creating advanced media processing technologies and novel multimedia communications service concepts. He received the M.S. degree in Electrical Engineering, M.S. degree in Computer, Information, and Control Engineering, and Ph.D. degree in Electrical Engineering from the University of Michigan, Ann Arbor. He joined AT&T Bell Laboratories in 1985 and AT&T Labs Research in 1996. His research in multimedia processing has been in the areas of multimedia indexing, multimedia data mining, content-based sampling of video, content personalization and automated repurposing and authoring of searchable and browsable multimedia content.

Behzad is the recipient of the AT&T Medal of Science and Technology for his leadership and technical contributions in content-based multimedia searching and browsing. His work has been the subject of numerous technical publications. Behzad holds sixteen US patents in the areas of image, video, and multimedia processing. He is a Senior Member of IEEE, a member of the Association for Computing Machinery (ACM), and is on the editorial board of the International Journal of Multimedia Tools and Applications.

Semantic Understanding of Geotagged Pictures
Dhiraj Joshi, Research Scientist, Intelligent Systems Group, Eastman Kodak Research Labs

Semantic understanding based only on vision cues has been a challenging problem. This problem is particularly acute when the application domain is unconstrained photos available on the Internet or in personal repositories. In recent years, it has been shown that metadata captured with pictures can provide valuable contextual cues complementary to the image content and can be used to improve classification performance. With the recent geotagging phenomenon, an important piece of metadata available with many pictures is GPS information. In this talk, I will describe novel research in the area of mining geographic information for boosting semantic understanding. I will discuss the association of image content, tags, and location meta-data with image semantics within a contextual inference framework. With integrated GPS-capable cameras on the horizon and geotagging on the rise, this line of research will revolutionize event recognition and media annotation.

Biography: Dhiraj Joshi is a research scientist in the Intelligent Systems Group at the Eastman Kodak Research Labs, Rochester, NY. At Kodak, Dhiraj's primary focus is associating image content, tags, and location meta-data with image semantics. He is also interested in building intelligent systems which use semantics across multiple modalities of media for enriching user experience. Dhiraj graduated with an M.Sc in Mathematics and Scientific Computing from the Indian Institute of Technology, Kanpur. He completed his Ph.D. in Computer Science from the Pennsylvania State University. His broad research interests include contextual inference-based image understanding, large-scale image retrieval, content analysis in multimedia, aesthetics modeling in images, and statistical learning.

Dhiraj has been a research intern at the I.B.M. T.J. Watson Research Labs and the Idiap Research Institute (Switzerland). In 2006, he was selected as an emerging leader in multimedia research to present at the Watson Emerging Leaders in Multimedia Workshop. He co-organized a special session on Image Aesthetics, Moods, and Emotions at the IEEE International Conference on Image Processing, 2008. Dhiraj has also participated in the Kodak Visiting Scientist programme to promote science and mathematics education in Rochester area schools. He is a member of IEEE and currently serves as a Rochester chapter vice-chair of the IEEE Signal Processing Society.

Next Generation Map Making: Automation from Mobile Data Collection
Alwar Narayanan, Director of Research and Emerging Technologies, NAVTEQ Corporation

NAVTEQ is a leading global provider of digital map data. NAVTEQ maps drive most in-vehicle navigation systems, the top routing web sites, and the leading brands of wireless navigation devices. NAVTEQ continues to enhance the technologies used for collecting, analyzing, and delivering new content to a wide range of users and devices.

This presentation will address NAVTEQ's perspective on automatic creation and update of a navigable map through the use of high-end mobile data collection sensors and computer vision techniques. Results of research efforts based on large scale, geo-referenced, ground level video and LIDAR data collection as well as various challenging problems related to automatic feature extraction for mapping and navigation will be presented. Specifically, our approach to automatically reconstruct lane level maps, overpasses and traffic signs to create a virtual 3D map will be presented.

Biography: Alwar Narayanan is currently the Director of the Research & Emerging Technologies group at NAVTEQ. His research focus is to identify technologies that help recreate a rich set of navigable map content more accurately and efficiently. Alwar has been leading research projects at NAVTEQ since 1997. Prior to joining NAVTEQ, Alwar spent 12 years teaching and working on various research projects at the Department of Computer Science and Engineering, Indian Institute of Technology, Chennai, India. Alwar holds an M.S. degree in Computer Science from IIT, Chennai and an MBA from Northern Illinois University, DeKalb, IL. Alwar holds 7 U.S. patents in the area of digital mapping.

Multimedia Semantics -- Opportunities and Challenges
Apostol (Paul) Natsev, IBM T.J. Watson Research Center

Digital media production and consumption have skyrocketed in recent years and are now commonplace in many parts of our lives -- from the way we entertain and inform ourselves to the way we communicate, socialize, and learn. With the tremendous growth of multimedia come great opportunities but even greater expectations and challenges. Traditional approaches of multimedia description based on manual tagging, production metadata, and link analysis are typically coarse-grained and inadequate. Advances in semantic understanding of multimedia content over recent years are instrumental for unlocking the full potential of multimedia.

In this talk, I will describe a few case studies of multimedia semantics applications developed at IBM Research to address real world business problems, with emphasis on the key opportunities and challenges for each use case. The goal of this talk will be to raise questions and bring attention to open problems with practical implications, rather than to prescribe specific answers.

Biography: Dr. Apostol (Paul) Natsev is a Research Staff Member and Manager of the Multimedia Research Group at the IBM T. J. Watson Research Center. He received his M.S. (1997) and Ph.D. (2001) degrees in Computer Science from Duke University, and joined IBM Research in 2001. At IBM, he leads research efforts on multimedia analysis and retrieval, with an agenda to advance the science and practice of systems that enable users to manage and search vast repositories of unstructured multimedia content.

Dr. Natsev is a founding member and current team lead for IBM's award-winning IMARS project on multimedia analysis and retrieval, with primary contributions in the areas of semantic, content-based, and speech-based multimedia indexing and search, as well as video copy detection. Dr. Natsev is an avid believer in scientific progress through benchmarking, and has participated actively in a dozen open evaluation/showcasing campaigns, including the annual NIST TRECVID Video Retrieval evaluation, the CIVR VideOlympics showcase, and the CIVR Video Copy Detection showcase.

Dr. Natsev is an author of more than 60 publications and 15 U.S. patents (granted or pending) in the areas of multimedia analysis, indexing and search, multimedia databases and query optimization. His research has been recognized with several awards, including the 2004 Wall Street Journal Innovation Award (for IMARS), a 2005 IBM Outstanding Technical Accomplishment Award, a 2005 ACM Multimedia Plenary Paper Award, a 2006 ICME Best Poster Award, and the 2008 CIVR VideOlympics People's Choice Award (for IMARS). He is a Senior Member of ACM.

Banquet Speech (on Day 2)

Alberto Del Bimbo, University of Florence, Italy

Biography: Professor Del Bimbo is Full Professor of Computer Engineering and the Director of the Master in Multimedia of the University of Florence, Italy. He was the Director of the Department of Sistemi e Informatica, from 1997 to 2000 and the Deputy Rector for Research and Innovation Transfer of the University of Florence, from 2000 to 2006. Presently he is the President of the Foundation for Research and Innovation and the Director of the Media Integration and Communication Center of Excellence of the University of Florence.

His scientific interests are Pattern Recognition, Image and Video Analysis, Multimedia Information Retrieval and Natural Human Computer Interaction. He has authored over 250 publications in some of the most distinguished scientific journals and international conferences, and is the author of the monograph "Visual Information Retrieval", on content-based retrieval from image and video databases, published by Morgan Kaufmann in 1999.

From 1996 to 2000, he was the President of the IAPR Italian Chapter, and, from 1998 to 2000, Member at Large of the IEEE Publication Board. He was the General Chair of IAPR ICIAP'97, the International Conference on Image Analysis and Processing; IEEE ICMCS'99, the International Conference on Multimedia Computing and Systems; AVIVDiLib'05, the International Workshop on Audio-Visual Content and Information Visualization; VMDL07, the International Workshop on Visual and Multimedia Digital Libraries; and IEEE ISM2008, the International Symposium on Multimedia. He was also Program Co-Chair of ACM Multimedia 2008. He is the General Co-Chair of ACM Multimedia 2010 and of ECCV 2012, the European Conference on Computer Vision. He is an IAPR Fellow and Associate Editor of Multimedia Tools and Applications, Pattern Analysis and Applications, Journal of Visual Languages and Computing, and International Journal of Image and Video Processing, and was Associate Editor of Pattern Recognition, IEEE Transactions on Multimedia, and IEEE Transactions on Pattern Analysis and Machine Intelligence.


Invited Speakers Session (on Day 3)

Implementing a Content-Based Public-Oriented Audio and Video News Retrieval System
Gregory Grefenstette, Chief Science Officer, Exalead

Video is poised to largely replace both text and images as the medium for transmitting information in the coming years. The challenge for the Information Processing community is how to index the information found in this voluminous and dynamic media stream. This talk will describe our current research in providing an index into the content of video and audio streams. I will also describe and demonstrate the Voxalead News system, and other results of the French-German Quaero project, that integrate results from industry and research, for the next generation of video-based information searching.

Biography: Gregory Grefenstette is Chief Science Officer at Exalead. He received his B.S. from Stanford University in 1978, and a Ph.D. in Computer Science from the University of Pittsburgh in 1993. He has been Principal Scientist at the Xerox Research Centre (1993-2001), with Clairvoyance (2001-3) and at the French applied research centre, the CEA (2001-8). His research interests range from most subjects in Natural Language Processing to all aspects of Information Retrieval. He serves on the Editorial Board of the Journal for Natural Language Engineering, and edited the first book on Cross Language Information Retrieval (Kluwer 1998). In recent years, he has been working with Adrian Popescu on Geographical Indexing. He is the co-inventor of 15 patents, including the design of a photocopier for cross language information retrieval (US 6396951), for finding experts in a company by mining Web usage (US 6446035) and for creating documents that enrich themselves (US 6732090). He organized the first OntoImage Workshop 2007 on bridging the gap between text processing and image processing.

Recovering the Past through Computation - New Techniques for Cultural Heritage
Stephen M. Griffin, Program Director, National Science Foundation

Computation has provided new means for researchers and scholars in the humanities, fine arts and social sciences to address research questions long considered to be too difficult for conventional methodologies. This presentation will discuss emerging state-of-the-art scientific methodologies applied to discovery, recovery, restoration, representation, analysis and ultimately new understanding of a broad range of cultural heritage artifacts. Critically important remnants of the past are disappearing through neglect, incidental destruction, deterioration, and looting. Many ancient artifacts are scattered about the world and reside in public and private collections, inaccessible to scholars and far removed from their original location and context of creation.

Digital representation is possible for numerous cultural heritage resources: script and drawings on a variety of media, manuscripts and documents, images, objects of all shapes and textures, and historic sites and events, to name a few. Computation can provide means for recovering to some degree what was lost. Computation using geospatial and temporal data is central to visualizing and understanding mechanisms of change over extended periods of time, at once revealing and elucidating the events, social processes and practices that drive or accompany change. This task involves, in part, processing massive amounts of raw data from a wide range of instruments and combining these with historic records to produce new information. At this point scholarly work, creative approaches, imaginative thinking and international interdisciplinary collaboration can be undertaken to create new knowledge and understanding and bring to light new segments of the human record.

Biography: Stephen Griffin is a Program Director in the Information Integration and Informatics (III) cluster in the National Science Foundation's Division of Information and Intelligent Systems. For the period 1994-2004, Mr. Griffin managed the Special Projects Program which included the Interagency Digital Libraries Initiatives and the International Digital Libraries Collaborative Research and Applications Testbeds program. Prior to joining the Division of Information and Intelligent Systems, Mr. Griffin served in several research divisions, including the Divisions of Chemistry and Advanced Scientific Computing, the Office of the Assistant Director, Directorate for Computer and Information Science and Engineering, and staff offices of the Director of the NSF. He has been active in working groups for Federal high performance computing and communications programs, and serves on numerous domestic and international advisory committees related to digital libraries and advanced computing and networking infrastructure. In 2004-2005 he was on special assignment to the Library of Congress, Office of Strategic Initiatives, to assist with the National Digital Information Infrastructure and Preservation Program. His educational background includes degrees in Chemical Engineering and Information Systems Technology. He has additional graduate education in organizational behavior and development and the philosophy of science. His research interests are in topics related to interdisciplinary research and scholarly communication. He has been active in promoting cultural heritage informatics and computing and the humanities and arts.