
OnomaCap: Making Non-speech Sound Captions Accessible and Enjoyable through Onomatopoeic Sound Representation

Author(s)
Kim, JooYeong; Hong, Jin-Hyuk
Type
Conference Paper
Citation
2025 CHI Conference on Human Factors in Computing Systems, CHI 2025
Issued Date
2025
Abstract
Non-speech sounds play an important role in setting the mood of a video and aiding comprehension. However, current non-speech sound captioning practices focus primarily on sound categories, which fails to provide a rich sound experience for d/Deaf and hard-of-hearing (DHH) viewers. Onomatopoeia, which succinctly captures expressive sound information, offers a potential solution but remains underutilized in non-speech sound captioning. This paper investigates how onomatopoeia benefits DHH audiences in non-speech sound captioning. We collected 7,962 sound-onomatopoeia pairs from listeners and developed a sound-onomatopoeia model that automatically transcribes sounds into onomatopoeic descriptions indistinguishable from human-generated ones. A user evaluation with 25 DHH participants using the model-generated onomatopoeia demonstrated that onomatopoeia significantly improved their video viewing experience. Participants most favored captions combining onomatopoeia and sound category, and expressed a desire to see such captions across genres. We discuss the benefits and challenges of using onomatopoeia in non-speech sound captions, offering insights for future practices. © 2025 Copyright held by the owner/author(s).
Publisher
Association for Computing Machinery
Conference Place
Yokohama
URI
https://scholar.gist.ac.kr/handle/local/31488
Access and License
  • Access type: Open
File List
  • No related files are available.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.