ICMR '17: Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval

SESSION: Keynote 1

  • Cees Snoek

Searching for A Thing

  • Arnold W.M. Smeulders
  • Ran Tao

SESSION: Keynote 2

  • Nicu Sebe

Making a Cultural Visit with a Smart Mate

  • Alberto Del Bimbo


  • George Awad

Video Indexing, Search, Detection, and Description with Focus on TRECVID

  • George Awad
  • Duy-Dinh Le
  • Chong-Wah Ngo
  • Vinh-Tiep Nguyen
  • Georges Quénot
  • Cees Snoek
  • Shin'ichi Satoh

SESSION: Oral Session 1: Vision and Language (Oral presentations)

  • Horia Cucu

Generating Video Descriptions with Topic Guidance

  • Shizhe Chen
  • Jia Chen
  • Qin Jin

Estimating the Information Gap between Textual and Visual Representations

  • Christian Andreas Henning
  • Ralph Ewerth

MSRC: Multimodal Spatial Regression with Semantic Context for Phrase Grounding

  • Kan Chen
  • Rama Kovvuri
  • Jiyang Gao
  • Ram Nevatia

Leveraging Multi-modal Prior Knowledge for Large-scale Concept Learning in Noisy Web Data

  • Junwei Liang
  • Lu Jiang
  • Deyu Meng
  • Alexander Hauptmann

SESSION: Oral Session 1: Vision and Language (Spotlight presentations)

  • Horia Cucu

Transductive Visual-Semantic Embedding for Zero-shot Learning

  • Xing Xu
  • Fumin Shen
  • Yang Yang
  • Jie Shao
  • Zi Huang

3D Facial Video Retrieval and Management for Decision Support in Speech and Language Therapy

  • Ricardo Carrapiço
  • Isabel Guimarães
  • Margarida Grilo
  • Sofia Cavaco
  • João Magalhães

Music-Guided Video Summarization using Quadratic Assignments

  • Thomas Mensink
  • Thomas Jongstra
  • Pascal Mettes
  • Cees G.M. Snoek

SESSION: Special Oral Session: Beyond Semantics: Multimodal Understanding of Subjective Properties

  • Miriam Redi

Insiders and Outsiders: Comparing Urban Impressions between Population Groups

  • Darshan Santani
  • Salvador Ruiz-Correa
  • Daniel Gatica-Perez

Deep Sentiment Features of Context and Faces for Affective Video Analysis

  • Claudio Baecchi
  • Tiberio Uricchio
  • Marco Bertini
  • Alberto Del Bimbo

Frame-Transformer Emotion Classification Network

  • Jiarui Gao
  • Yanwei Fu
  • Yu-Gang Jiang
  • Xiangyang Xue

SESSION: Oral Session: Brave New Ideas

  • Chong-Wah Ngo

The Geo-Privacy Bonus of Popular Photo Enhancements

  • Jaeyoung Choi
  • Martha Larson
  • Xinchao Li
  • Kevin Li
  • Gerald Friedland
  • Alan Hanjalic

Light Curve Analysis From Kepler Spacecraft Collected Data

  • Eduardo Nigri
  • Ognjen Arandjelovic

Health Multimedia: Lifestyle Recommendations Based on Diverse Observations

  • Nitish Nag
  • Vaibhav Pandey
  • Ramesh Jain

SESSION: Oral Session: Open Software

  • Mathias Lux

An Unsupervised Distance Learning Framework for Multimedia Retrieval

  • Lucas Pascotti Valem
  • Daniel Carlos Guimarães Pedronette

ClusterTag: Interactive Visualization, Clustering and Tagging Tool for Big Image Collections

  • Konstantin Pogorelov
  • Michael Riegler
  • Pål Halvorsen
  • Carsten Griwodz

Scalable Hadoop-Based Pooled Time Series of Big Video Data from the Deep Web

  • Chris A. Mattmann
  • Madhav Sharan

PACE: Prediction-based Annotation for Crowded Environments

  • Federico Bartoli
  • Giuseppe Lisanti
  • Lorenzo Seidenari
  • Alberto Del Bimbo

SESSION: Oral Session 2: Multimedia Indexing (Oral presentations)

  • Adrian Ulges

TEX-Nets: Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition

  • Rao Muhammad Anwer
  • Fahad Shahbaz Khan
  • Joost van de Weijer
  • Jorma Laaksonen

DeepHash for Image Instance Retrieval: Getting Regularization, Depth and Fine-Tuning Right

  • Jie Lin
  • Olivier Morère
  • Antoine Veillard
  • Ling-Yu Duan
  • Hanlin Goh
  • Vijay Chandrasekhar

Balanced Search Space Partitioning for Distributed Media Redundant Indexing

  • Andre Mourão
  • João Magalhães

Deep Supervised Hashing for Multi-Label and Large-Scale Image Retrieval

  • Dayan Wu
  • Zheng Lin
  • Bo Li
  • Mingzhen Ye
  • Weiping Wang

SESSION: Oral Session 2: Multimedia Indexing (Spotlight presentations)

  • Adrian Ulges

Accelerated Nearest Neighbor Search with Quick ADC

  • Fabien André
  • Anne-Marie Kermarrec
  • Nicolas Le Scouarnec

Improving Small Object Proposals for Company Logo Detection

  • Christian Eggert
  • Dan Zecha
  • Stephan Brehm
  • Rainer Lienhart

Discrete Multi-view Hashing for Effective Image Retrieval

  • Rui Yang
  • Yuliang Shi
  • Xin-Shun Xu

Quadruplet Networks for Sketch-Based Image Retrieval

  • Omar Seddati
  • Stéphane Dupont
  • Saïd Mahmoudi

SESSION: Oral Session 3: Multimedia Applications (Oral presentations)

  • Wei-Ta Chu

On the Automatic Identification of Music for Common Activities

  • Karthik Yadati
  • Cynthia C.S. Liem
  • Martha Larson
  • Alan Hanjalic

Improving Context-Aware Music Recommender Systems: Beyond the Pre-filtering Approach

  • Martin Pichl
  • Eva Zangerle
  • Günther Specht

Groups Re-identification with Temporal Context

  • Michal Koperski
  • Slawomir Bak
  • Peter Carr

Simple, Efficient and Effective Encodings of Local Deep Features for Video Action Recognition

  • Ionut C. Duta
  • Bogdan Ionescu
  • Kiyoharu Aizawa
  • Nicu Sebe

SESSION: Oral Session 3: Multimedia Applications (Spotlight presentations)

  • Wei-Ta Chu

Musical Instrument Recognition in User-generated Videos using a Multimodal Convolutional Neural Network Architecture

  • Olga Slizovskaia
  • Emilia Gómez
  • Gloria Haro

A Spatio-Temporal Category Representation for Brand Popularity Prediction

  • Gijs Overgoor
  • Masoud Mazloom
  • Marcel Worring
  • Robert Rietveld
  • Willemijn van Dolen

Bridging the Aesthetic Gap: The Wild Beauty of Web Imagery

  • Miriam Redi
  • Frank Z. Liu
  • Neil O'Hare

SESSION: Oral Session 5: Best Paper Candidates

  • Vasileios Mezaris

Multimodal Analysis of Image Search Intent: Intent Recognition in Image Search from User Behavior and Visual Content

  • Mohammad Soleymani
  • Michael Riegler
  • Pål Halvorsen

Nested Invariance Pooling and RBM Hashing for Image Instance Retrieval

  • Olivier Morère
  • Jie Lin
  • Antoine Veillard
  • Ling-Yu Duan
  • Vijay Chandrasekhar
  • Tomaso Poggio

Embedding Watermarks into Deep Neural Networks

  • Yusuke Uchida
  • Yuki Nagai
  • Shigeyuki Sakazawa
  • Shin'ichi Satoh

Learning to Detect Misleading Content on Twitter

  • Christina Boididou
  • Symeon Papadopoulos
  • Lazaros Apostolidis
  • Yiannis Kompatsiaris

SESSION: Special Oral Session: Identifying and Linking Interesting Content in Large Audiovisual Repositories

  • Maria Eskevich
  • Roeland Ordelman

On the Selection of Anchors and Targets for Video Hyperlinking

  • Zhi-Qi Cheng
  • Hao Zhang
  • Xiao Wu
  • Chong-Wah Ngo

Visual Descriptors in Methods for Video Hyperlinking

  • Petra Galuščáková
  • Michal Batko
  • Jan Čech
  • Jiří Matas
  • David Novák
  • Pavel Pecina

Linking Multimedia Content for Efficient News Browsing

  • Rémi Bois
  • Guillaume Gravier
  • Éric Jamet
  • Emmanuel Morin
  • Maxime Robert
  • Pascale Sébillot

Multi-view Manifold Learning for Media Interestingness Prediction

  • Yang Liu
  • Zhonglei Gu
  • Yiu-ming Cheung
  • Kien A. Hua

Utilising High-Level Features in Summarisation of Academic Presentations

  • Keith Curtis
  • Gareth J.F. Jones
  • Nick Campbell

SESSION: Oral Session 4: Cross-media Retrieval (Oral presentations)

  • Giorgos Tolias

How to Make an Image More Memorable?: A Deep Style Transfer Approach

  • Aliaksandr Siarohin
  • Gloria Zen
  • Cveta Majtanovic
  • Xavier Alameda-Pineda
  • Elisa Ricci
  • Nicu Sebe

Cross-modal Image-Graphics Retrieval by Neural Transfer Learning

  • Fabian Junkert
  • Markus Eberts
  • Adrian Ulges
  • Ulrich Schwanecke

DRAW: Deep Networks for Recognizing Styles of Artists Who Illustrate Children's Books

  • Samet Hicsonmez
  • Nermin Samet
  • Fadime Sener
  • Pinar Duygulu

AMECON: Abstract Meta-Concept Features for Text-Illustration

  • Ines Chami
  • Youssef Tamaazousti
  • Hervé Le Borgne

SESSION: Oral Session 4: Cross-media Retrieval (Spotlight presentations)

  • Giorgos Tolias

Leveraging Semantic Facets for Adaptive Ranking of Social Comments

  • Elaheh Momeni
  • Reza Rawassizadeh
  • Eytan Adar

Multi-task Deep Neural Network for Joint Face Recognition and Facial Attribute Prediction

  • Zhanxiong Wang
  • Keke He
  • Yanwei Fu
  • Rui Feng
  • Yu-Gang Jiang
  • Xiangyang Xue

Finger Vein Image Retrieval via Coding Scale-varied Superpixel Feature

  • Kuikui Wang
  • Lu Yang
  • Gongping Yang
  • Xin Luo
  • Kun Su
  • Yilong Yin

Joint Saliency Estimation and Matching using Image Regions for Geo-Localization of Online Video

  • Haoyue Shi
  • Jia Chen
  • Alexander G. Hauptmann


  • Petra Galuščáková

Panorama to Panorama Matching for Location Recognition

  • Ahmet Iscen
  • Giorgos Tolias
  • Yannis Avrithis
  • Teddy Furon
  • Ondřej Chum

Concept Language Models and Event-based Concept Number Selection for Zero-example Event Detection

  • Damianos Galanopoulos
  • Foteini Markatopoulou
  • Vasileios Mezaris
  • Ioannis Patras

Tiny Transform Net for Mobile Image Stylization

  • Shilun Lin
  • Pengfei Xiong
  • Hailong Liu

Query and Keyframe Representations for Ad-hoc Video Search

  • Foteini Markatopoulou
  • Damianos Galanopoulos
  • Vasileios Mezaris
  • Ioannis Patras

Manga FaceNet: Face Detection in Manga based on Deep Neural Network

  • Wei-Ta Chu
  • Wei-Wei Li

Generative Adversarial Networks for Multimodal Representation Learning in Video Hyperlinking

  • Vedran Vukotić
  • Christian Raymond
  • Guillaume Gravier

Efficient Indexing of Regional Maximum Activations of Convolutions using Full-Text Search Engines

  • Giuseppe Amato
  • Fabio Carrara
  • Fabrizio Falchi
  • Claudio Gennaro

Family Photo Recognition via Multiple Instance Learning

  • Junkang Zhang
  • Siyu Xia
  • Ming Shao
  • Yun Fu

TaiChi: A Fine-Grained Action Recognition Dataset

  • Shan Sun
  • Feng Wang
  • Qi Liang
  • Liang He

Conditional Fast Style Transfer Network

  • Keiji Yanai
  • Ryosuke Tanno

Improving Image Classification using Coarse and Fine Labels

  • Anuvabh Dutt
  • Denis Pellerin
  • Georges Quénot

Fast Multi-Modal Unified Sparse Representation Learning

  • Mridula Verma
  • Kaushal Kumar Shukla

Badminton Video Analysis based on Spatiotemporal and Stroke Features

  • Wei-Ta Chu
  • Samuel Situmeang


  • Jaeyoung Choi

Expo: An Expectation-oriented System for Selecting Important Photos from Personal Collections

  • Andrea Ceroni
  • Vassilios Solachidis
  • Claudia Niederée
  • Olga Papadopoulou
  • Vasileios Mezaris

Multimodal Video Retrieval with the 2017 IMOTION System

  • Luca Rossetto
  • Ivan Giangreco
  • Claudiu Tănase
  • Heiko Schuldt

The JORD System: Linking Sky and Social Multimedia Data to Natural Disasters

  • Kashif Ahmad
  • Michael Riegler
  • Ans Riaz
  • Nicola Conci
  • Duc-Tien Dang-Nguyen
  • Pål Halvorsen

LireSolr: A Visual Information Retrieval Server

  • Mathias Lux
  • Michael Riegler
  • Pål Halvorsen
  • Glenn MacStravic

VideoAnalysis4ALL: An On-line Tool for the Automatic Fragmentation and Concept-based Annotation, and the Interactive Exploration of Videos

  • Chrysa Collyda
  • Evlampios Apostolidis
  • Alexandros Pournaras
  • Foteini Markatopoulou
  • Vasileios Mezaris
  • Ioannis Patras

Visually Browsing Millions of Images Using Image Graphs

  • Kai Uwe Barthel
  • Nico Hezel
  • Klaus Jung

SESSION: Oral Session: Doctoral Symposium

  • Tomas Piatrik

A Generic Framework for Social Event Analysis

  • Shengsheng Qian
  • Tianzhu Zhang
  • Changsheng Xu

Semi-Automatic Retrieval of Relevant Segments from Laparoscopic Surgery Videos

  • Stefan Petscharnig

On the Effectiveness of Distance Measures for Similarity Search in Multi-Variate Sensory Data: Effectiveness of Distance Measures for Similarity Search

  • Yash Garg
  • Silvestro Roberto Poccia

Classification of sMRI for Alzheimer's disease Diagnosis with CNN: Single Siamese Networks with 2D+ε Approach and Fusion on ADNI

  • Karim Aderghal
  • Jenny Benois-Pineau
  • Karim Afdel

SESSION: Industry Keynotes

  • Neil O'Hare

With 5G Approaching, How will Audio/Video Technology that Serves 800 Million QQ Users Bring Forth New Ideas

  • Xiaozheng Huang

Information Retrieval from Multi-Sensor Data for Enriching Location Services at HERE Technologies

  • Matei Stroila

Intelligently Connecting People with Information

  • Changhu Wang