MCFR '22: Proceedings of the 1st Workshop on Multimedia Computing towards Fashion Recommendation

SESSION: Keynote Talk

Fashion Meets Computer Vision

  • Wen-Huang Cheng

Fashion is the way we present ourselves to the world and has become one of the world's largest industries. Fashion, mainly conveyed by vision, has thus attracted much attention from computer vision researchers in recent years. This talk aims to present our latest research results on computer vision for fashion and our experience on shipping the developed inventions and technologies to real applications.

SESSION: Workshop Presentations

On Leveraging the Metapath and Entity Aware Subgraphs for Recommendation

  • Muhammad Umer Anwaar
  • Zhiwei Han
  • Shyam Arumugaswamy
  • Rayyan Ahmad Khan
  • Thomas Weber
  • Tianming Qiu
  • Hao Shen
  • Yuanting Liu
  • Martin Kleinsteuber

In graph neural networks (GNNs), message passing iteratively aggregates nodes' information from their direct neighbours while neglecting the sequential nature of multi-hop node connections. Such sequential node connections, e.g., metapaths, capture critical insights for downstream tasks. Concretely, in recommender systems (RSs), disregarding the larger neighbourhood and focusing only on immediate neighbours leads to inadequate distillation of the collaborative signals. In this paper, we employ collaborative subgraphs (CSGs) and metapaths to form metapath-aware subgraphs, which explicitly capture sequential semantics in graph structures. We propose the metaPath and Entity-Aware Graph Neural Network (PEAGNN), which trains a GNN to perform metapath-aware information aggregation on such subgraphs. The information from different metapaths is then fused using an attention mechanism. To leverage the local structure of CSGs, we introduce entity-awareness, which acts as a contrastive regularizer on the node embeddings. Moreover, PEAGNN can be combined with prominent GNN layers such as GAT, GCN and GraphSAGE. Our empirical evaluation shows that the proposed approach outperforms competitive baselines on three public datasets. Further analysis demonstrates that PEAGNN also learns meaningful metapath combinations from a given set of metapaths.
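The attention-based fusion of per-metapath information described in the abstract can be sketched in plain Python. Everything below is illustrative: the embeddings, the attention query vector, and the two example metapaths are assumptions for the sketch, not parameters from PEAGNN itself.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fuse_metapath_embeddings(embeddings, query):
    """Fuse per-metapath node embeddings with attention.

    embeddings: list of equal-length vectors, one per metapath
    query: attention vector (learned in practice; fixed here for illustration)
    """
    scores = [dot(e, query) for e in embeddings]
    weights = softmax(scores)
    dim = len(embeddings[0])
    # Weighted sum of the per-metapath views of the node.
    fused = [sum(w * e[i] for w, e in zip(weights, embeddings))
             for i in range(dim)]
    return fused, weights

# Two hypothetical metapath views of the same node,
# e.g. user-item-user and user-item-category-item.
mp1 = [1.0, 0.0, 0.5]
mp2 = [0.2, 0.8, 0.4]
fused, weights = fuse_metapath_embeddings([mp1, mp2], query=[1.0, 1.0, 1.0])
```

The metapath whose embedding aligns best with the query receives the larger weight, so informative metapaths dominate the fused representation.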

I-MALL: An Effective Framework for Personalized Visits, Improving the Customer Experience in Stores

  • Federico Becattini
  • Giuseppe Becchi
  • Andrea Ferracani
  • Alberto Del Bimbo
  • Liliana Lo Presti
  • Giuseppe Mazzola
  • Marco La Cascia
  • Federico Cunico
  • Andrea Toaiari
  • Marco Cristani
  • Antonio Greco
  • Alessia Saggese
  • Mario Vento

In this paper we present I-MALL, an ICT hardware and software infrastructure that enables the management of services in places such as shopping malls, showrooms, and conference facilities. I-MALL offers a network of services that analyse customer behavior through computer vision and provide personalized recommendations on digital signage terminals. The user can also interact with a social robot. Recommendations are inferred from a profile of interests that the system computes by analysing the history of the customer's visit and his/her behavior, including cues from his/her appearance, the route taken inside the facility, and his/her mood and gaze.

Orthogonal Vector-Decomposed Disentanglement Network of Interactive Image Retrieval for Fashion Outfit Recommendation

  • Chen Chen
  • Jie Guo
  • Bin Song
  • Tong Zhang

Interactive image retrieval for fashion outfit recommendation is a challenging task that aims to retrieve the desired target image according to a multi-modal query (a reference image and a modification text). Previous studies focus on exploring effective feature-composition methods to achieve similarity matching between different modalities. However, feature redundancy and semantic inconsistency between modalities introduce much task-irrelevant information. It is intractable to correctly identify the particular information to be modified, and the resulting noise disturbances inevitably lead to suboptimal performance. To this end, we present a novel Orthogonal Vector-Decomposed Disentanglement Network (OVDDN) for image retrieval, which leverages the disentangled parts to learn a controllable denoising embedding space. First, we design an orthogonal disentanglement module, applied to both image and text features, that decouples each into two independent components (invariant and specific) through orthogonal constraints. A similarity-metric loss ensures semantic consistency of paired images. Then, an attention network composes the invariant part of the reference image with the task-related part of the text to match the target image. Finally, a differential feature-alignment module maintains cross-modal semantic consistency. Extensive experiments conducted on three benchmark datasets demonstrate that OVDDN achieves consistently superior performance. Ablation analyses further verify the effectiveness of the proposed model.
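The core idea of an orthogonal decomposition, splitting a feature vector into a component along some direction and an orthogonal remainder, can be sketched as follows. The feature vector and the direction are illustrative assumptions; in OVDDN the decomposition is learned under orthogonal constraints rather than computed by a fixed projection.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def decompose_orthogonal(feature, direction):
    """Split `feature` into a component along `direction` (a "specific" part)
    and an orthogonal remainder (an "invariant" part).

    direction: a hypothetical learned vector defining the specific subspace.
    """
    norm_sq = dot(direction, direction)
    coeff = dot(feature, direction) / norm_sq
    specific = [coeff * d for d in direction]          # projection onto direction
    invariant = [f - s for f, s in zip(feature, specific)]  # orthogonal remainder
    return invariant, specific

# Toy 2-D feature: the two parts are orthogonal and sum back to the feature.
feat = [3.0, 4.0]
inv, spec = decompose_orthogonal(feat, [1.0, 0.0])
```

By construction `dot(inv, spec) == 0` and `inv + spec` reconstructs the original feature, which is exactly the independence property the orthogonal constraint enforces.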

CI-OCM: Counterfactual Inference towards Unbiased Outfit Compatibility Modeling

  • Liqiang Jing
  • Minghui Tian
  • Xiaolin Chen
  • Teng Sun
  • Weili Guan
  • Xuemeng Song

As a key task in building intelligent fashion shops, outfit compatibility modeling, which aims to estimate whether a given set of fashion items makes a compatible outfit, has attracted much research attention. Although previous efforts have achieved compelling success, they still suffer from a spurious correlation between category matching and outfit compatibility, which hurts the generalization of the model and misleads it into biased predictions. To tackle this problem, we introduce the causal graph as a tool to analyze the causal relationships among the variables of outfit compatibility modeling. In particular, the causal graph reveals that the spurious correlation is attributed to the direct effect of the category information on the outfit compatibility prediction. To remove this harmful effect of the category information, we present a novel counterfactual inference framework for outfit compatibility modeling, dubbed CI-OCM. Specifically, we capture the direct effect of the category information on the model prediction in the training phase and then subtract it from the total effect in the testing phase to achieve debiased prediction. Extensive experiments on two splits of a widely-used dataset (i.e., under the independent and identically distributed and out-of-distribution assumptions) clearly demonstrate that CI-OCM achieves significant improvements over existing baselines. In addition, we have released our code to facilitate the research community.
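The subtract-the-direct-effect step at inference time can be sketched in a few lines. The scores, the `alpha` trade-off weight, and the function name are illustrative assumptions for the sketch, not the paper's actual formulation.

```python
def debiased_score(total_effect, category_only_effect, alpha=1.0):
    """Counterfactual-style debiasing sketch: subtract the direct effect of
    category matching from the full model's compatibility score.

    total_effect: score from the full model (content + category information)
    category_only_effect: score when only category information is available
    alpha: hypothetical trade-off weight for the subtracted branch
    """
    return total_effect - alpha * category_only_effect

# A category-matched but visually clashing outfit: the category-only branch
# alone scores high, so debiasing lowers the final compatibility estimate.
full = 0.9
category_branch = 0.6
score = debiased_score(full, category_branch, alpha=0.5)
```

Outfits whose high score comes mostly from category matching are penalized, while outfits with genuinely compatible content keep their score, which is the intended debiasing behavior.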