Tutorials

1. Interacting with Image Collections – Visualisation and Browsing of Image Repositories (Oct. 29)
Gerald Schaefer (Loughborough Univ., UK)

2. Privacy Concerns in Multimedia and Their Solutions (Oct. 29)
Gerald Friedland (ICSI and Univ. of California, Berkeley, US)

3. Continuous Analysis of Emotions for Multimedia Applications (Oct. 29)
Hatice Gunes (Queen Mary, Univ. of London, UK)
Björn Schuller (Technische Univ. München, DE)

4. Dynamic Adaptive Streaming over HTTP – From Content Creation to Consumption (Oct. 29)
Christian Timmerer (Alpen-Adria-Univ. Klagenfurt, AT)
Carsten Griwodz (Simula Research Laboratory, NO)

5. Multimedia Recommendation (Oct. 29)
Jialie Shen (Singapore Management Univ., SG)
Meng Wang (Hefei Univ. of Technology, CN)
Shuicheng Yan (National Univ. of Singapore, SG)
Peng Cui (Tsinghua Univ., CN)

6. A Human-Centered Perspective on Multimedia Data Science (Oct. 29)
Alejandro Jaimes (Yahoo! Research, ES)

1. Interacting with Image Collections – Visualisation and Browsing of Image Repositories

Abstract
In this tutorial we will look at a variety of techniques and methods for effective and intuitive image database visualisation and browsing. While interaction with traditional image retrieval systems can lead to a confusing and frustrating user experience, image browsing systems attempt to provide the user with an intuitive interface to manage potentially large image databases.

In the tutorial we will look at how image databases are visualised in the three main approaches of mapping-based, clustering-based and graph-based image database navigation systems, how intuitive browsing operations are supported, how image databases can be browsed employing non-desktop systems (such as VR hardware or mobile devices), and how the effectiveness of image browsing systems can be evaluated.

Organizer
Gerald Schaefer gained his PhD in Computer Vision from the University of East Anglia. He worked at the Colour & Imaging Institute, University of Derby (1997-1999), in the School of Information Systems, University of East Anglia (2000-2001), in the School of Computing and Informatics at Nottingham Trent University (2001-2006), and in the School of Engineering and Applied Science at Aston University (2006-2009) before joining the Department of Computer Science at Loughborough University where he leads the Vision, Imaging and Autonomous Systems Research Division.

His research interests are mainly in the areas of colour image analysis, image retrieval, physics-based vision, medical imaging, and computational intelligence. He has published extensively in these areas with a total publication count exceeding 250. He is a member of the editorial board of more than 10 international journals, reviews for over 70 journals and served on the programme committee of more than 200 conferences. He has been invited as keynote or tutorial speaker to more than 30 conferences, is the organiser of various international workshops and special sessions at conferences, and the editor of several books, conference proceedings and special journal issues.

2. Privacy Concerns in Multimedia and Their Solutions

Abstract
The growth of multimedia, as demonstrated by social networking sites such as Facebook and YouTube, combined with advances in multimedia retrieval (geo-tagging, web search, face recognition, speaker verification, location estimation, etc.), provides novel opportunities for the unethical use of multimedia. On a small scale or in isolation, multimedia analytics have always been a powerful but reasonably contained privacy threat. However, when linked together and used on an Internet scale, the threat can be enormous and pervasive. At the same time, some of the solutions to security and privacy concerns are really simple and follow a limited set of basic principles which, when obeyed from the early stages of the development of a system, can avoid large unresolvable issues later. Many of them are well known in the security and privacy communities but not so much in the multimedia community. The objective of this tutorial is to introduce interested multimedia students and researchers who are not specialized in security and privacy issues to the thinking of a security and privacy researcher.

The tutorial will be a vivid class with many examples based on material developed for a CS294 course at the EECS department of UC Berkeley. Using real-world examples and their consequences, the tutorial will focus on privacy and security threats induced by modern social networking practices in combination with multimedia retrieval.

Organizer
Dr. Gerald Friedland is a senior research scientist at the International Computer Science Institute, a private lab affiliated with the University of California, Berkeley, where he is currently leading a group of multimedia researchers supported by NSF, DARPA, IARPA, and industry grants. For the last two years, he has been heading an NSF-sponsored project on the privacy implications of multimedia content analysis, resulting in frequent press appearances, invited talks, and guest lectures about the topic. Most importantly, Gerald has presented at and attended high-class security and privacy conferences, working on building a bridge between the multimedia and privacy communities. Gerald has published more than 100 peer-reviewed articles in conferences, journals, and books and is currently authoring a new textbook on multimedia computing together with Dr. Ramesh Jain. Despite being mainly a researcher, Dr. Friedland is a passionate teacher. Part of the tutorial will be based on material taught at Berkeley.

3. Continuous Analysis of Emotions for Multimedia Applications

Abstract
Multimedia content is loaded with emotion: in speech, music, sound, text, and video. People in video or audio media naturally communicate subtle emotions and affective states by means of language, vocal intonation, facial expression, hand gesture, head movement, body movement and posture, and possess a refined mechanism for understanding and interpreting information conveyed by these behavioral cues.
Enabling automatic and continuous emotion analysis in multimedia applications would be extremely beneficial for personalized and emotion-sensitive multimedia content analysis and processing, implicit tagging, multimedia event understanding, search and retrieval, multimedia interaction and digital art installations, etc. Therefore, this tutorial aims to become the initial but crucial step toward bringing together researchers from two very relevant yet disconnected fields of research and practice: affective computing and multimedia.

The tutorial aims to give a comprehensive introduction to automatic, dimensional and continuous analysis of emotions and affective signals, and provide indicators and examples of how the current developments in this field can be utilized to enhance a broad range of multimedia applications. More specifically, this tutorial aims at:

1) Introducing the existing efforts and major accomplishments in automatic, dimensional and continuous analysis of emotions from multiple cues and modalities;
2) Demonstrating the practical aspects, available frameworks, tools, databases, and automatic analyzers that can be easily used by multimedia researchers around the world;
3) Encouraging the integration of the recent developments in the field into multimedia applications, and inter-disciplinary cross-fertilization of affective computing and multimedia research fields.

The tutorial will also focus on providing a broad overview of recent algorithms and methodology, and predict potential oncoming trends for relevant multimedia applications. The presenters will draw on the most recent developments from the Journal Special Issues they have guest edited and the workshops they organized.

Organizers
Dr. Hatice Gunes (http://www.eecs.qmul.ac.uk/~hatice/) is a Lecturer (Assistant Professor) at the School of Electronic Eng. & Computer Science, Queen Mary University of London (QMUL), UK. Prior to joining QMUL, she was a postdoctoral researcher in the Dep. of Computing at Imperial College London, UK, and an Associate of the Faculty of Eng. & IT at University of Technology Sydney (UTS), Australia. She received her Ph.D. degree in Computer Science from UTS in 2007. Her research interests lie in the areas of affective computing, visual information processing, and machine learning, focusing on automatic affective behaviour analysis, continuous prediction, and multicue and multimodal emotion recognition. Dr Gunes has published more than 50 technical papers in these areas, and her work to date has received more than 600 citations (current h-index = 15). She has acted as the main organiser of the very first workshop on automatic affect analysis in continuous and multi-dimensional space (EmoSPACE), organised in conjunction with IEEE FG’11. In continuation of this effort, she is a Guest Editor of Special Issues in Int’l Journal of Synthetic Emotions and Image and Vision Computing Journal, and will be giving a tutorial on ‘Continuous Analysis of Emotions for Multimedia Applications’ at ACM Multimedia’12. She has also served as a member of the Editorial Advisory Board for the Affective Computing and Interaction Book (IGI Global, 2011), as an area chair for 2013 Int’l Conference on Affective Computing and Intelligent Interaction, as a general workshop chair of BCS HCI 2012, as an invited speaker at the Int’l Workshop on Social Signal Processing (WSSP 2011) and the Summer School on Affective Computing & Social Signal Processing (ACSSP 2010), and as a reviewer for numerous journals and conferences in these fields.
From 2004 to 2007, for her PhD research in affective computing, she was a recipient of the Australian Government Int’l Postgraduate Research Scholarship (IPRS) awarded to top quality international postgraduate students. Dr Gunes, together with co-authors, has also received a number of other awards for Outstanding Paper (IEEE FG 2011), Quality Reviewer (IEEE ICME 2011), Best Demo (IEEE ACII 2009), and Best Student Paper (VisHCI 2006). In March 2012, Dr Gunes received funding from the British Council under the UK-Turkey HE Partnership Programme and is currently a chair of the Int’l Workshop on Affective Computing for Mobile HCI (AC4MobHCI’12). She is a member of the IEEE, the ACM, and the HUMAINE Association.

Björn Schuller received his diploma in 1999 and his doctoral degree for his study on Automatic Speech and Emotion Recognition in 2006, both in electrical engineering and information technology from TUM (Munich University of Technology), one of Germany’s repeatedly highest ranked and among its first three Excellence Universities. He is tenured as Senior Lecturer in Pattern Recognition and Speech Processing heading the Intelligent Audio Analysis Group at TUM’s Institute for Human-Machine Communication since 2006. From 2009 to 2010 he lived in Paris/France and was with the CNRS-LIMSI Spoken Language Processing Group in Orsay/France dealing with affective and social signals in speech. In 2010 he was also a visiting scientist in the Imperial College London’s Department of Computing in London/UK working on audiovisual behaviour recognition. In 2011 he was guest lecturer at the Università Politecnica delle Marche (UNIVPM) in Ancona/Italy and visiting researcher of NICTA in Sydney/Australia. Best known are his works advancing Affective Computing, Human-Computer Interaction, Semantic Audio and Audiovisual Processing, and Music Information Retrieval.

Dr. Schuller is president-elect of the HUMAINE Association and member of the ACM, IEEE and ISCA and (co-)authored 3 books and more than 250 publications in peer reviewed books (21), journals (34), and conference proceedings in the field of signal processing, and machine learning leading to more than 2,600 citations – his current H-index equals 27. He serves as cofounding member and secretary of the steering committee, associate editor, and guest editor of the IEEE Transactions on Affective Computing, associate and repeated guest editor for the Computer Speech and Language, associate editor for the IEEE Transactions on Systems, Man and Cybernetics: Part B Cybernetics and the IEEE Transactions on Neural Networks and Learning Systems, and guest editor for the IEEE Intelligent Systems Magazine, Speech Communication, Image and Vision Computing, Cognitive Computation, and the EURASIP Journal on Advances in Signal Processing, reviewer for more than 40 leading journals and 30 conferences in the field, and as workshop and challenge organizer including the first of their kind INTERSPEECH 2009 Emotion, 2010 Paralinguistic, 2011 Speaker State, and 2012 Speaker Trait Challenges and the 2011 and 2012 Audio/Visual Emotion Challenge and Workshop and programme committee member of more than 30 international workshops and conferences. Steering and involvement in current and past research projects includes the European Community funded ASC-Inclusion STREP project as coordinator and the awarded SEMAINE project, and projects funded by the German Research Foundation (DFG) and companies such as BMW, Continental, Daimler, HUAWEI, Siemens, Toyota, and VDO. Advisory board activities comprise his membership as invited expert in the W3C Emotion Incubator and Emotion Markup Language Incubator Groups. So far, he has given two tutorials at the IEEE ICASSP and two at the ISCA INTERSPEECH.

4. Dynamic Adaptive Streaming over HTTP – From Content Creation to Consumption

Abstract
In this tutorial we present dynamic adaptive streaming over HTTP ranging from content creation to consumption. In particular, it provides an overview of the recently ratified MPEG-DASH standard, how to create content to be delivered using DASH, its consumption, and the evaluation thereof with respect to competing industry solutions. The tutorial can be roughly clustered into three parts. In part I we will provide an introduction to DASH, part II covers content creation, delivery, and consumption, and, finally, part III deals with the evaluation of existing (open source) MPEG-DASH implementations compared to state-of-the-art deployed industry solutions.

Organizers
Christian Timmerer is an assistant professor at the Institute of Information Technology (ITEC), Multimedia Communication Group (MMC), Alpen-Adria-Universität Klagenfurt, Austria. His research interests include the transport of multimedia content, multimedia adaptation in constrained and streaming environments, distributed multimedia adaptation, and Quality of Service/Quality of Experience. He was the general chair of WIAMIS’08, AVSTP2P’10 (co-located with ACMMM’10), WoMAN’11 (co-located with ICME’11), and TPC chair of QoMEX’12. He has participated in several EC-funded projects, notably DANAE, ENTHRONE, P2P-Next, ALICANTE, and SocialSensor. He is an Associate Editor for IEEE Computer Science Computing Now, Area Editor for Elsevier Signal Processing: Image Communication, Review Board Member of IEEE MMTC, editor of ACM SIGMM Records, and member of ACM SIGMM Open Source Software Committee. He also participated in ISO/MPEG work for several years, notably in the area of MPEG-21, MPEG-M, MPEG-V, and DASH (incl. DASH promoters group). He received his PhD in 2006 from the Klagenfurt University. Publications and MPEG contributions can be found under http://research.timmerer.com, follow him on http://www.twitter.com/timse7, and subscribe to his blog http://blog.timmerer.com. Full bio can be found at http://www-itec.uniklu.ac.at/~timse/cv/.

Carsten Griwodz is head of the Media department of government-owned research company Simula Research Laboratory, and professor of Computer Science at the University of Oslo. He received his Dipl.-Inf. degree from Paderborn University in 1993 and Dr.-Ing. degree from Technische Universität Darmstadt in 2000. He worked for IBM from 1993–98 and participated in the standardization of MHEG. His research is concerned with streaming media, ranging from scalable distribution architectures through operating system and protocol support to subjective visual quality assessment. He was co-chair of ACM NOSSDAV 2008, ACM/IEEE NetGames 2011, SPIE/ACM MMCN 2006 and 2007, Track chair of ACM MM 2008, TPC chair of ACM MMSys 2012 and is general chair of MMSys 2013. He is Associate Editor of ACM TOMCCAP and Editor-in-Chief of the newsletter ACM SIGMM Records. The Media group publishes news at http://mpg.ndlab.net. His publications can be found at http://simula.no/people/griff/bibliography.

5. Multimedia Recommendation

Abstract
Due to the rapid growth of online multimedia information, the problem of information overload has become more and more serious in recent decades. To address this problem, various multimedia recommendation technologies have been developed by different research communities (e.g., multimedia systems, information retrieval, and machine learning). Meanwhile, many commercial web systems (e.g., Flickr, YouTube, and Last.fm) have successfully applied recommendation techniques to provide users with personalized multimedia content and services in a convenient and flexible way.

While several tutorials and courses were dedicated to multimedia search in the last few years, to the best of our knowledge, this tutorial is the first to focus solely on multimedia recommendation technologies and their applications in various domains and to various media content. We plan to summarize the research along this direction and provide a good balance between theoretical methodologies and real system development (including several industrial approaches). It includes:

Introducing why accurate recommendation systems are important for web-scale multimedia retrieval and sharing.
Examining current commercial systems and research prototypes, focusing on comparing the advantages and the disadvantages of the various strategies and schemes for different types of media documents (e.g., image, video and audio).
Discussing and reviewing various limitations of the current generation of recommendation systems.
Reviewing key challenges and technical issues in building recommendation systems, and exploring some of the ways in which recommendation techniques can be used to improve different kinds of retrieval or sharing tasks over large-scale collections in the long run.
Discussing a few promising research directions and exploring potential solutions.
Making predictions about the road that lies ahead for scholars in MM and other related communities.

Organizers
Dr. Jialie Shen is an Assistant Professor in Information Systems, School of Information Systems, Singapore Management University, Singapore. He received his PhD in Computer Science from the University of New South Wales (UNSW), Australia in the area of large-scale media retrieval and database access methods. Dr. Shen’s main research interests include information retrieval, multimedia systems, economic-aware media analysis, and statistical machine learning. His recent work has been published or is forthcoming in leading journals and international conferences including ACM SIGIR, ACM Multimedia, ACM SIGMOD, CVPR, ICDE, WWW, IEEE Transactions on Circuits and Systems for Video Technology (IEEE TCSVT), IEEE Transactions on Multimedia (IEEE TMM), IEEE Transactions on Image Processing (IEEE TIP), ACM Multimedia Systems Journal, ACM Transactions on Internet Technology (ACM TOIT) and ACM Transactions on Information Systems (ACM TOIS). Besides being chair, PC member, reviewer and guest editor for several leading information systems journals and conferences, he is an associate editor of International Journal of Image and Graphics (IJIG).

Dr. Meng Wang is a professor at the Hefei University of Technology, China. His current research interests include multimedia content analysis, search, mining, recommendation, and large-scale computing. He has authored more than 100 book chapters, journal and conference papers in these areas. He is an associate editor of Information Sciences and Neurocomputing. He received the best paper awards successively at the 17th and 18th ACM International Conference on Multimedia and the best paper award at the 16th International Multimedia Modeling Conference.

Dr. Shuicheng Yan is an Assistant Professor in the Department of Electrical and Computer Engineering at the National University of Singapore, and the founding lead of the Learning and Vision Research Group (http://www.lv-nus.org). Dr. Yan’s research areas include computer vision, multimedia and machine learning, and he has authored or co-authored over 280 technical papers over a wide range of research topics (H-index = 36). He is an associate editor of IEEE Transactions on Circuits and Systems for Video Technology (IEEE TCSVT) and ACM Transactions on Intelligent Systems and Technology (ACM TIST). He received the Best Paper Awards from PCM 2011, ACM MM 2010, ICME 2010 and ICIMCS 2009, the winner prizes of the classification task in both PASCAL VOC 2010 and PASCAL VOC 2011, the honorable mention prize of the detection task in PASCAL VOC 2010, 2010 TCSVT Best Associate Editor (BAE) Award, 2010 Young Faculty Research Award, 2011 Singapore Young Scientist Award, and 2012 NUS Young Researcher Award.

Dr. Peng Cui is an Assistant Professor in the Department of Computer Science and Technology, Tsinghua University. He received his PhD in Computer Science from Tsinghua University. His research interests include multimedia content analysis, social network analysis, and social multimedia computing. His recent research work has been published in leading conferences and journals, such as IEEE TMM, IEEE TIP, DMKD, SIGIR, AAAI, and ICDM. He serves as sponsor chair or co-chair at ACM SIGKDD 2012, an ACM MM 2011 workshop, and an IEEE ICME 2012 special session, among others. He is a guest editor or reviewer for many refereed journals, including the Information Retrieval journal, IEEE TMM, IEEE TCSVT, IEEE TKDE, and ACM TKDD.

6. A Human-Centered Perspective on Multimedia Data Science

Abstract
In recent years, the amount of data available for analysis has exploded. This is creating many new opportunities for research, particularly in the field of social media. Given the importance of multimedia content in social media, there is no doubt that the two fields go hand in hand. A lot of the sharing and activity on the web currently occurs around multimedia materials. People often share images, videos, and links to multimedia content. Arguably, the social media phenomenon is having a strong impact on multimedia, and is creating opportunities for many new applications built around sharing of multimedia content. It is therefore clear that gaining a deep understanding of data in the social media context can have a strong impact on the multimedia field as a whole.

This tutorial will focus on analyzing user behavior through large-scale data analysis. This includes discovering and leveraging search and navigation patterns, understanding how elements of interaction impact behavior, and using controlled experiments in combination with user studies and other techniques to gain insights into human behavior, with a particular emphasis on multimedia in the context of social media.

Organizer
Dr. Alejandro (Alex) Jaimes is Senior Research Scientist at Yahoo! Research where he manages the Social Media Engagement group, which contributes to several products including Yahoo! News, Yahoo! Clues, and Yahoo! Image Search. Dr. Jaimes is General Chair for ACM Multimedia 2013, the founder of the ACM Multimedia Interactive Art program, Industry Track chair for ACM RecSys 2010 and UMAP 2009, and panels chair for KDD 2009. His work has led to over 70 technical publications, he has been granted several patents, and he serves on the program committee of several international conferences. He has been an invited speaker at MUM 2011 (keynote), ICME 2011, WWW 2011 (panels on Social Media), Practitioner Web Analytics 2010, CIVR 2010, ECML-PKDD 2010 and KDD 2009 (Industry tracks), ACM RecSys 2008 (panel), DAGM 2008 (keynote), 2007 ICCV Workshop on HCI, and several others. Before joining Yahoo!, Dr. Jaimes was a visiting professor at U. Carlos III in Madrid and founded and managed the User Modelling and Data Mining group at Telefonica Research. Prior to that Dr. Jaimes was Scientific Manager at IDIAP-EPFL (Switzerland), and was previously at Fuji Xerox (Japan), IBM TJ Watson (USA), IBM Tokyo Research Laboratory (Japan), Siemens Corporate Research (USA), and AT&T Bell Laboratories (USA). Dr. Jaimes received a Ph.D. in Electrical Engineering (2003) and an M.S. in Computer Science from Columbia U. (1997). He is an exhibiting artist and his Urban Sensoria workshops have been held in several cities around the world.