MAHCI'18 - Proceedings of the 2018 Workshop on Multimedia for Accessible Human Computer Interface


SESSION: Oral Session

Sound Event Detection and Haptic Vibration Based Home Monitoring Assistant System for the Deaf and Hard-of-Hearing

  • Gee Yeun Kim
  • Seung-Su Shin
  • Jin Young Kim
  • Hyoung-Gook Kim

Acoustic signals contain a significant amount of information generated by sound sources. Unfortunately, deaf and hard-of-hearing people cannot access this information, so assistive technology is required to help people with hearing loss. In this paper, we present a home monitoring assistant system based on sound event detection and sound-to-haptic conversion for the deaf and hard-of-hearing. The system detects sounds generated in the home environment, converts the detected sound into text and haptic vibration, and provides them to the deaf and hard-of-hearing. The proposed approach is mainly composed of four modules: signal estimation, reliable sensor channel selection, sound event detection, and conversion of sound into haptic vibration. During signal estimation, lost packets are recovered to improve signal quality. Next, reliable channels are selected using a multi-channel cross-correlation coefficient to improve the computational efficiency of distant sound event detection. Finally, the sounds of the two selected channels are used for environmental sound event detection based on bidirectional gated recurrent neural networks and for sound-to-haptic effect conversion using kernel-based source separation. Experiments show that the proposed approach achieves superior performance compared to the baseline.
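The abstract names a multi-channel cross-correlation coefficient as the basis for selecting reliable sensor channels. The sketch below illustrates one plausible reading of that step under assumed inputs: each channel is scored by its average zero-lag correlation with the other channels, and the two highest-scoring channels are kept. The function name, the scoring rule, and the toy data are illustrative, not the authors' implementation.

    # Hypothetical channel-selection sketch (not the paper's code):
    # score each channel by its mean correlation with the others and
    # keep the top two for downstream sound event detection.
    import numpy as np

    def select_reliable_channels(frames: np.ndarray, n_select: int = 2) -> np.ndarray:
        """frames: (n_channels, n_samples) array of time-aligned audio."""
        x = frames - frames.mean(axis=1, keepdims=True)
        x /= (x.std(axis=1, keepdims=True) + 1e-8)
        corr = (x @ x.T) / frames.shape[1]        # pairwise correlation coefficients
        np.fill_diagonal(corr, 0.0)
        scores = corr.mean(axis=1)                # average agreement with other channels
        return np.argsort(scores)[-n_select:][::-1]

    # Toy example: three channels share the same source, one is pure noise.
    rng = np.random.default_rng(0)
    source = rng.standard_normal(16000)
    channels = np.stack([source + 0.1 * rng.standard_normal(16000) for _ in range(3)]
                        + [rng.standard_normal(16000)])
    print(select_reliable_channels(channels))     # the noise-only channel is excluded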

Tactile Symbol Discrimination on a Small Pin-array Display

  • Fabrizio Leo
  • Caterina Baccelliere
  • Aleksander Waszkielewicz
  • Elena Cocchi
  • Luca Brayda

Pin-array displays are a promising technology for conveying visual information through touch, a crucial issue for blind and partially sighted users. Because such displays are programmable, they can considerably increase, vary, and tailor the amount of information presented compared to common embossed paper, and, beyond Braille, they can display graphics. Since the resolution needed to understand simple graphical concepts has not been established, we evaluated the discriminability of tactile symbols at different resolutions and complexity levels in blind, blindfolded low-vision, and sighted participants. We report no differences in discrimination accuracy between tactile symbols organized in 3x3 arrays as compared to 4x4 arrays. A metric based on search and discrimination speed does not change across resolutions in blind and low-vision participants, whereas in sighted participants it significantly increases as resolution increases. We suggest possible guidelines for designing dictionaries of low-resolution tactile symbols. Our results can help designers, ergonomists, and rehabilitators develop usable human-machine interfaces with tactual symbol coding.

ASYSST: A Framework for Synopsis Synthesis Empowering Visually Impaired

  • Shreya Goyal
  • Chiranjoy Chattopadhyay
  • Gaurav Bhatnagar

In an indoor scenario, visually impaired people lack information about their surroundings and find it difficult to navigate from room to room. Sensor-based solutions are expensive and may not always be comfortable for end users. In this paper, we focus on the problem of synthesizing a textual description from a given floor plan image to assist the visually impaired. The textual description, combined with text-reading software, can aid a visually impaired person while moving inside a building. In this work, for the first time, we propose an end-to-end framework (ASYSST) for textual description synthesis from digitized building floor plans. We introduce a novel Bag of Decor (BoD) feature to learn 5 classes of rooms from 1355 samples under a supervised learning paradigm. These learned labels are fed into a description synthesis framework to yield a holistic description of a floor plan image. Experimental analysis on a real, publicly available floor plan dataset demonstrates the superiority of our framework.
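As a concrete illustration of a bag-of-features room classifier in the spirit of the Bag of Decor idea, the sketch below builds a histogram of decor symbols detected in a room and feeds it to a nearest-neighbour classifier. The decor vocabulary, the classifier choice, and the toy labels are assumptions for illustration; the paper's exact feature definition and training setup are not reproduced here.

    # Hypothetical Bag-of-Decor-style sketch (not the authors' pipeline).
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    DECOR_VOCAB = ["bed", "sofa", "sink", "bathtub", "stove", "dining_table"]

    def bod_feature(detected_symbols):
        """Histogram of decor symbols detected inside one room."""
        hist = np.zeros(len(DECOR_VOCAB))
        for s in detected_symbols:
            if s in DECOR_VOCAB:
                hist[DECOR_VOCAB.index(s)] += 1
        return hist

    # Toy training data: decor symbols per room and the room label.
    train = [(["bed"], "bedroom"),
             (["sofa", "dining_table"], "living_room"),
             (["sink", "stove"], "kitchen"),
             (["sink", "bathtub"], "bathroom"),
             ([], "hall")]
    X = np.stack([bod_feature(symbols) for symbols, _ in train])
    y = [label for _, label in train]

    clf = KNeighborsClassifier(n_neighbors=1).fit(X, y)
    print(clf.predict([bod_feature(["stove", "sink"])]))   # expected: ['kitchen']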

Tactile Facial Expressions and Associated Emotions toward Accessible Social Interactions for Individuals Who Are Blind

  • Troy McDaniel
  • Diep Tran
  • Samjhana Devkota
  • Kaitlyn DiLorenzo
  • Bijan Fakhri
  • Sethuraman Panchanathan

For individuals who are blind, much of social interaction is inaccessible: the majority of information exchanged is non-verbal, e.g., facial expressions and body language. Little work has been done toward building social assistive aids for individuals who are blind. This work presents a mapping between facial action units and vibrotactile representations that may be presented through haptic displays. We present a study exploring how well individuals who are blind can learn to recognize the universal emotions of happiness, sadness, surprise, anger, fear, and disgust from vibrotactile facial action units. Results show promising recognition accuracy and subjective feedback, demonstrating that individuals who are blind can learn to understand the emotional content of facial movements presented through vibrations.
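The mapping described above pairs facial action units (AUs) with vibrotactile patterns and composes basic emotions from AU combinations. The following sketch shows the shape such a lookup could take; the tactor assignments, vibration durations, and the simplified AU prototypes for the two emotions shown are illustrative assumptions, not the paper's actual mapping.

    # Hypothetical AU-to-vibration lookup (illustrative values only).
    # AU -> (tactor index on the haptic display, vibration duration in ms)
    AU_TO_VIBRATION = {
        "AU1":  (0, 400),   # inner brow raiser
        "AU4":  (1, 400),   # brow lowerer
        "AU6":  (2, 400),   # cheek raiser
        "AU12": (3, 400),   # lip corner puller
        "AU15": (4, 400),   # lip corner depressor
    }

    # Simplified FACS-style emotion prototypes expressed as AU sets.
    EMOTION_PROTOTYPES = {
        "happy": {"AU6", "AU12"},
        "sad":   {"AU1", "AU4", "AU15"},
    }

    def vibration_sequence(emotion: str):
        """Tactor/duration pattern that would encode one emotion."""
        return [AU_TO_VIBRATION[au] for au in sorted(EMOTION_PROTOTYPES[emotion])]

    print(vibration_sequence("happy"))   # e.g. [(3, 400), (2, 400)]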

DynamicSlide: Exploring the Design Space of Reference-based Interaction Techniques for Slide-based Lecture Videos

  • Hyeungshik Jung
  • Hijung Valentina Shin
  • Juho Kim

Slide-based video is a popular format for online lecture videos. Lecture slides and narrations are complementary: while slides visually convey the main points of the lecture, narrations add detailed explanations to each item on the slide. We define a pair consisting of a slide item and the relevant sentence in the narration as a reference. To explore the design space of reference-based interaction techniques, we present DynamicSlide, a video processing system that automatically extracts references from slide-based lecture videos, together with a video player that leverages these references. Through participatory design workshops, we elicited useful interaction techniques specifically for slide-based lecture videos. Among these ideas, we selected and implemented three reference-based techniques: emphasizing the current item in the slide that is being explained, enabling item-based navigation, and enabling item-based note-taking. We also built a pipeline for automatically identifying references; it achieves 79% accuracy in finding 141 references in five videos from four different authors. Results from a user study suggest that DynamicSlide's features improve the learner's video browsing and navigation experience.
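Since the pipeline's core task is pairing each slide item with the narration sentence that explains it, a minimal sketch of that pairing step is given below, assuming the slide item text and a transcript are already available. The similarity measure (difflib's SequenceMatcher) and the example data are stand-ins; DynamicSlide's actual extraction pipeline is not reproduced here.

    # Hypothetical reference-pairing sketch: match a slide item to the
    # narration sentence with the highest textual similarity.
    from difflib import SequenceMatcher

    def best_reference(slide_item, narration_sentences):
        """Return (sentence, similarity) for the best-matching narration sentence."""
        scored = [(s, SequenceMatcher(None, slide_item.lower(), s.lower()).ratio())
                  for s in narration_sentences]
        return max(scored, key=lambda pair: pair[1])

    item = "Gradient descent update rule"
    narration = [
        "Let's start with an overview of today's topic.",
        "The gradient descent update rule moves the weights against the gradient.",
        "We'll come back to examples at the end of the lecture.",
    ]
    print(best_reference(item, narration))   # picks the second sentence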