SUMAC'21: Proceedings of the 3rd Workshop on Structuring and Understanding of Multimedia heritAge Contents

SESSION: Keynote Talks

Deep Learning for Historical Data Analysis

  • Mathieu Aubry

This presentation will give an overview of my group's projects over the last three
years on leveraging deep learning for historical data analysis, partly in the context
of the ANR EnHerit project. I will first discuss how deep learning can be used to
retrieve and analyze repeated details across artwork collections [5, 6]. I will then
present several problems related to historical document analysis: historical watermark
recognition [7], document image segmentation [2], clustering for text modelling [3,
4], and the propagation of scientific illustrations in historical manuscripts [1].
In all cases, I will show that standard approaches can give useful baseline results
when tuned adequately, but that developing dedicated approaches that take into account
the specificity of the data and the problem significantly improves the results.

Analyzing CHANGE in Cultural Heritage Objects through Images

  • Jon Yngve Hardeberg

Cultural heritage (CH) objects constantly undergo change and degradation over time.
To pass on the legacy of these objects to future generations, it is important to
monitor, estimate and understand these changes as accurately as possible. Such
investigations help conservators plan necessary treatments in advance or slow down
specific deterioration processes. The dynamic characteristics of materials vary from
one object to another and are influenced by several factors. Detecting and predicting
their changes requires accurate documentation and analysis. Over the years, CH
digitization using scientific imaging techniques has become more widespread and has
produced a massive number of datasets in different 2D and 3D forms. Several past
projects focused on different aspects of technological development for better
digitization methods. There has been less focus on processing and analyzing these
datasets to make the greatest use of them, and on exploring them further to monitor
'changes' in CH artifacts for conservation purposes. The CHANGE project takes cultural
heritage digitization to a new level by exploring digital datasets for deeper analysis
and interpretation, developing methodologies that assess changes in CH objects by
comparing and combining digital datasets captured at different time periods.

SESSION: Oral Presentations

Built Year Prediction from Buddha Face with Heterogeneous Labels

  • Yiming Qian
  • Cheikh Brahim El Vaigh
  • Yuta Nakashima
  • Benjamin Renoust
  • Hajime Nagahara
  • Yutaka Fujioka

Buddha statues are a part of human culture, especially in Asia, and they have
accompanied human civilisation for more than 2,000 years. Over the course of history,
due to wars, natural disasters, and other causes, the records of the built years of
many Buddha statues have gone missing, which makes estimating the built years an
immense task for historians. In this paper, we pursue the idea of building a neural
network model that automatically estimates the built years of Buddha statues based
only on their face images. Our model uses a loss function that consists of three terms:
an MSE loss that provides the basis for built year estimation; a KL divergence-based
loss that handles the samples with both an exact built year and a possible range of
built years (e.g., dynasty or centuries) estimated by historians; and a regularisation
term that utilises both labelled and unlabelled samples based on the manifold
assumption. By combining those three terms in the training process, we show that our
method is able to estimate built years for given images with a mean absolute error of
37.5 years on the test set.
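
As an illustration only, a three-term loss of the kind described above could be
sketched as follows. The abstract does not give the exact formulation, so everything
here is an assumption: the function names, the year bins, the uniform target
distribution over a historian's range, the Gaussian similarity weights, and the
weighting factors alpha and beta are all hypothetical and are not taken from the paper.

```python
import numpy as np

def mse_term(pred_years, true_years):
    """Mean squared error on samples that have an exact built year."""
    return float(np.mean((pred_years - true_years) ** 2))

def range_target(bin_centers, low, high):
    """Assumed target: uniform distribution over year bins inside [low, high].

    Assumes at least one bin centre lies inside the range.
    """
    mask = (bin_centers >= low) & (bin_centers <= high)
    return mask / mask.sum()

def kl_term(target, predicted, eps=1e-12):
    """KL(target || predicted) for discrete distributions over year bins."""
    nz = target > 0  # bins with zero target mass contribute nothing
    return float(np.sum(target[nz] * np.log(target[nz] / (predicted[nz] + eps))))

def manifold_term(features, pred_years, sigma=1.0):
    """Assumed smoothness regulariser: samples with similar features
    (labelled or unlabelled) should receive similar predicted years."""
    d2 = np.sum((features[:, None, :] - features[None, :, :]) ** 2, axis=-1)
    weights = np.exp(-d2 / (2.0 * sigma ** 2))
    diff2 = (pred_years[:, None] - pred_years[None, :]) ** 2
    return float(np.sum(weights * diff2) / len(pred_years) ** 2)

def total_loss(pred_exact, true_exact, target_dist, pred_dist,
               features, pred_all, alpha=1.0, beta=1.0):
    """Weighted sum of the three terms; alpha and beta are hypothetical weights."""
    return (mse_term(pred_exact, true_exact)
            + alpha * kl_term(target_dist, pred_dist)
            + beta * manifold_term(features, pred_all))
```

In this sketch the MSE term only sees samples with an exact year, the KL term compares
a predicted distribution over year bins with the historian's range, and the
regulariser is the only term that can use unlabelled samples.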

Software and Content Design of a Browser-based Mobile 4D VR Application to Explore
Historical City Architecture

  • Sander Muenster
  • Jonas Bruschke
  • Ferdinand Maiwald
  • Constantin Kleiner

The Kulturerbe4D project aims at making the diversity and change processes of architectural
monuments in the urban context virtually visible and experienceable, especially for
children and young people, but also for residents and tourists. A virtual city tour
providing cultural and historical information is to be combined with the transfer
of knowledge about monuments, anthropogenic factors of influence, and protective measures.
This article focusses on three main challenges in producing city-scale mobile 4D applications:
(a) 4D content creation specifically for historical purposes is highly labour intensive,
(b) web applications are better accepted by users but require more adaptation to cope
with technical limitations, (c) historically accurate 4D content varies widely in visual
quality and visualization strategies are rarely empirically validated. Within this article
we present our research and development work to overcome those issues.

Evaluation of Deep Learning Techniques for Content Extraction in Spanish Colonial
Notary Records

  • Nouf Alrasheed
  • Shivika Prasanna
  • Ryan Rowland
  • Praveen Rao
  • Viviana Grieco
  • Martin Wasserman

Processing and analyzing historical manuscripts is considered one of the most challenging
problems in the document analysis and recognition domain. Manuscripts written in cursive
are even more difficult due to overlapping words with random spacing, irregular and
varying character shapes, poor scan quality, and insufficient labeled data. Despite
the significant achievements of deep learning approaches in computer vision, handwritten
word recognition is far from solved. Most of the existing methods focus on well-segmented
word datasets. In this paper, we present an empirical study investigating how well
state-of-the-art deep learning models perform on detection and recognition of handwritten
words in Spanish American notary records. Professional historians were involved in
preparing a labeled dataset of 26,482 Spanish words employed in the experiments. We
investigate the performance of several state-of-the-art models for optical character
recognition (OCR) of handwritten text documents: Keras-OCR, the object detection
algorithm "You Only Look Once" (YOLO), Tesseract OCR, Kraken, and Calamari-OCR. Since
YOLO does not include a text recognizer, we propose YOLO-OCR, an innovative model to
detect and recognize words in historical manuscripts written in Spanish. Our results
report the performance of pre-trained models on our dataset and show that the Keras-OCR
and YOLO-OCR models are highly valuable for content extraction.

How to Spatialize Geographical Iconographic Heritage

  • Emile Blettery
  • Nelson Fernandes
  • Valérie Gouet-Brunet

This article is dedicated to the spatialization of image contents, with a focus on
geographical iconographic heritage, i.e. digitized or born-digital image collections,
acquired at different time periods and showing the territory and its human-made
and natural visual landmarks. We present a panorama of the current solutions (manual,
semi-automatic and fully automatic alternatives) that exist to spatialize visual
content, with respect to the data available and the level of spatialization targeted.
In particular, we highlight the characteristics of the approaches dedicated to geographical
iconographic heritage, and in some cases, we present tests that we conducted, and the
practical feedback we gathered, on old photographic content in oblique aerial and
terrestrial imagery.

Searching Silk Fabrics by Images Leveraging on Knowledge Graph and Domain Expert Rules

  • Thomas Schleider
  • Raphael Troncy
  • Thibault Ehrhart
  • Mareike Dorozynski
  • Franz Rottensteiner
  • Jorge Sebastián Lozano
  • Georgia Lo Cicero

The production of European silk textile is an endangered intangible cultural heritage.
Digital tools can now be developed to help preserve it, or even to make it more
accessible to the public and the fashion industry. In this paper, we propose an
image-based retrieval tool that leverages a knowledge graph describing silk textile
production as well as rules formulated by domain experts. Out of several possible
similarity scenarios, two have proven to work best and have been integrated into an
exploratory search engine.