Sonic Augmented Reality

Case Study • Engineering Student at Chalmers • Sweden 2018
My Role:

Research, Prototyping, Testing, Writing

Year:

2018

Team:

Solo project

Time:

3 weeks

Title Photo by Henry Be on Unsplash

Advisory: Prof. Palle Dahlstedt (Interaction Design, Department of Computer Science and Engineering)

Based on my research, I believe I have identified a new augmentation of sound, which I call "Sonic Augmented Reality": the technology of combining sonic interactions (sounds that convey information and meaning through an interactive context) from our technology-driven environment with computer-generated sound information, layered on top of the music we listen to.

Read the research paper here!

The Problem

In this project, I examine the impact of the growing use of headphones and earbuds, which isolate their wearers from the urban sonic environment. I investigate whether machine learning (ML) with urban sound classification is a useful tool to reduce accidents involving moving urban objects such as cars, trams, and bicycles, while simultaneously preserving the enjoyment of listening to music.

Learnings

  • To generate high-quality feedback, urban sound classification needs to be connected to other data, such as Geographic Information Services.
  • Controlling the music volume in urban settings makes listeners more aware in critical situations and more immersed in uncritical ones.
  • Urban sound classification takes longer as the size of the classification file increases.
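The volume-control learning above can be sketched as a simple ducking rule: lower the music volume in proportion to how critical the classified urban sound is. This is a minimal illustration; the labels, criticality scores, and base volume are my own assumptions, not values from the project.

```python
# Hypothetical criticality score (0.0 = harmless, 1.0 = urgent) per
# classified urban sound. These labels and scores are illustrative.
CRITICALITY = {
    "tram_bell": 1.0,
    "car_horn": 0.9,
    "bicycle_bell": 0.7,
    "children_playing": 0.2,
    "street_music": 0.1,
}

def music_volume(label, base_volume=0.8):
    """Return the ducked music volume for a classified urban sound.

    Fully critical sounds mute the music; harmless or unknown sounds
    leave the base volume untouched.
    """
    criticality = CRITICALITY.get(label, 0.0)
    return round(base_volume * (1.0 - criticality), 3)

print(music_volume("tram_bell"))      # tram bell mutes the music
print(music_volume("street_music"))   # harmless sound barely ducks it
```

A real system would smooth these transitions over time rather than jumping between volumes, but the mapping from classification to volume is the core of the learning.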

Design Process

The following describes how I explored the topic of wearing headphones or earbuds and putting oneself at risk through isolation from the urban sonic environment.

Overview of design process

Field research

I began my research by building an understanding of what sounds are and how they work; therefore, I researched topics such as:

  • Directional and Spatial Hearing
  • The peripheral auditory system: the inner ear, the middle ear, and the outer ear.

Secondly, I researched what projects are being done in the field of hearing that are relevant to the problem I want to solve. I looked at existing devices and art projects, such as:

  • Audio Guides: An audio guide is a handheld device that provides visitors with information in text or recorded form in an exhibition context (museum, gallery, etc.).
  • Acoustic Ecology (ecoacoustics), or soundscape studies, is the study of the relationship, mediated through sound, between humans and their environment.
  • Sonic City by M. Johansson and S. Learn: Sonic City was an interactive wearable technology that created music in real time, based on input from sensors measuring bodily and local factors.
  • Microsoft Soundscape, a soundscaping app: "Microsoft Soundscape is a research project that explores the use of innovative audio-based technology to enable people, particularly those with blindness or low vision (see fig. 5), to build a richer awareness of their surroundings."
  • (More research examples in the research paper)

Ideation, Experimentation, and Testing in a loop

Since I wasn't sure what my solution to the problem could look like, I tried four different experiments that might lead me to a potential concept I could develop further. Detailed information about each experiment is included in the research paper.

Experiment FOUR

Experiment FOUR consists of a video prototype and a participant who speaks their thoughts aloud (Think Aloud) while watching it; a semi-structured interview followed the viewing. In the video prototype, a person walks through New York City while warning sounds are generated based on the current environmental sound situation. The notifications are delivered via computer-generated voices and a variety of notification sounds, and the prototype is intended to make the user feel cared for. Alongside urban sound classification, the system uses different sources of 'Geographic Information Services' to provide more context for the classified sounds. For example, if you are in a high-crime area, the system can warn you that a high level of pickpocket crime is present.
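The combination described above, a classified sound plus geographic context, can be sketched as a small rule-based step that produces one notification. The labels, the area-data fields, and the message wording below are illustrative assumptions, not part of the original prototype.

```python
def build_warning(sound_label, area_info):
    """Combine a classified urban sound with location context into one
    notification string, or return None if there is nothing to say."""
    # Hypothetical messages for classified sounds.
    messages = {
        "tram_bell": "Caution: a tram is approaching.",
        "car_horn": "Caution: heavy traffic nearby.",
    }
    warning = messages.get(sound_label)
    # Geographic context is independent of the classified sound, e.g. a
    # known pickpocketing hotspot reported by a city crime-data service.
    if area_info.get("pickpocket_risk") == "high":
        extra = "High level of pickpocket crime in this area."
        warning = warning + " " + extra if warning else extra
    return warning

print(build_warning("tram_bell", {}))
print(build_warning("silence", {"pickpocket_risk": "high"}))
```

In the prototype, a string like this would be spoken by a computer-generated voice; the sketch only shows how the two information sources merge into one message.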

VR experience prototype featuring a city walk and warnings of surroundings

I failed to create a Hi-fi prototype

Because the fourth experiment was promising, I attempted to build a high-fidelity prototype to enable better test results in the field. To build it, I had to overcome several limitations; despite my best efforts, I could not make the prototype work.

In my first machine learning environment, I tried to train a model to recognize a tram bell. Training took a considerable amount of time due to the slow processing power of my computer, and the resulting model was not accurate enough to recognize the same sound through my headphones. This can also be attributed to the poor quality of my headphone microphone: it produced too much background noise, so I did some research on how to get it to work. Background noise from the microphone, combined with the noise of an urban environment (white noise, traffic, people, etc.), causes a 'demixing problem': "Demixing is the problem of identifying multiple structured signals from an overlaid, undersampled, and noisy observation". To solve this problem, I would have to apply preprocessing filter algorithms.

In an urban environment, it also takes a lot of time and patience to capture sounds such as a tram bell. Instead of standing outside for hours waiting for a tram to ring its bell, I used YouTube and the YouTube-8M dataset to collect training sounds.
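One common preprocessing filter for this kind of demixing problem is spectral subtraction: estimate the microphone's background-noise spectrum from a silent recording and subtract it, bin by bin, from each frame before classification. The sketch below uses plain Python lists in place of real FFT magnitude spectra, and all values are illustrative assumptions, not measurements from the project.

```python
def spectral_subtract(frame, noise_profile, floor=0.0):
    """Subtract a per-bin noise estimate from a magnitude spectrum.

    frame         -- magnitude spectrum of the noisy signal (one FFT frame)
    noise_profile -- average magnitude spectrum of microphone-only noise,
                     estimated from frames recorded in silence
    floor         -- minimum magnitude kept, to avoid negative energy
    """
    return [max(s - n, floor) for s, n in zip(frame, noise_profile)]

# Illustrative four-bin spectra: a noisy tram-bell frame and the
# microphone's noise floor.
noise = [0.2, 0.2, 0.1, 0.1]
noisy_tram_bell = [0.9, 0.3, 0.8, 0.1]

clean = spectral_subtract(noisy_tram_bell, noise)
print(clean)  # the bell's strong bins survive, the noise floor is removed
```

A production pipeline would do this on real FFT frames (e.g. via an audio library) and smooth the noise estimate over time, but the per-bin subtraction is the core of the filter.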

Build a neural network prototype to classify sounds

Last testing

After the high-fidelity prototype failed, I created a video prototype that incorporated the learnings from my fourth experiment. The final test of this prototype produced the list of learnings below, which also appears in the research paper.

Learnings

  • The test persons perceived the voice feedback as feedback on what they saw in their surroundings.
  • The test persons perceived the information as useful in most cases.
  • Location-based information about restaurants, bars, and waterfalls was perceived as annoying.
  • The test persons perceived the system as one capable of seeing, hearing, and recognizing its surroundings.
  • The test persons liked the idea of getting a nudge about things they should be aware of in certain situations (pickpockets in a crowd, heavy traffic, cyclists).
  • The warnings in critical situations were too slow (cyclists, tram bell).
  • The system-controlled music volume was perceived as very helpful.

Experience Prototype

The following is an experience prototype that illustrates various scenarios in which Sonic Augmented Reality might be used.