Automated Sound-Visualized Caption System to Enhance Accessibility for Deaf and Hard-of-Hearing Users: Focus on Speech and Non-Speech Sound Nuances
- Author(s)
- 김주영
- Type
- Thesis
- Degree
- Doctor
- Department
- Graduate School, Interdisciplinary Program of Integrated Technology (Culture Technology Program)
- Advisor
- Hong, Jin-Hyuk
- Abstract
- The sounds around us create rich information, such as intent, emotion, and atmosphere, through both linguistic and paralinguistic cues. For individuals who are deaf or hard-of-hearing, these layers of auditory information are often inaccessible, highlighting the need to translate sound into visual formats. Captions have been a powerful tool in this endeavor, providing a visual means to interpret sounds and enabling users to interact with sound visually. However, conventional caption systems mainly focus on spoken words or basic sound categories, limiting their ability to convey sound nuances. This limitation restricts the depth of experience for users who rely on captions to fully engage with audio-rich content. Recent advances in artificial intelligence (AI)-based sound recognition technologies have improved caption quality by accurately identifying speech and sound classes. Still, there has been limited research on recognizing and visualizing paralinguistic expressions. Due to the scarcity of datasets and studies related to sound nuances, research on sound-visualized captions remains challenging. To address this gap, this thesis proposes sound-visualized caption systems that automatically visualize nuanced information into captions. These systems include both speech and non-speech visualizations, using typographic design and punctuation to express speech nuances, and onomatopoeic descriptions to represent non-speech sound nuances. Specifically, this work explores methodologies for mapping caption design elements to effectively convey sound nuances, techniques for recognizing these nuances, and a system architecture that generates sound-visualized captions from auditory input. This thesis not only presents a novel framework for implementing sound-visualized captions but also offers an empirical understanding of how these captions can enhance sound accessibility and viewing experiences for deaf and hard-of-hearing users. In addition, it discusses the potential benefits and challenges of sound-visualized captions, design implications for improving the proposed systems, and future research opportunities to connect with other studies on sound accessibility and bidirectional sound-text conversion.
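- The abstract describes a pipeline that recognizes speech and non-speech sound nuances and maps them to caption design elements (typography and punctuation for speech, onomatopoeia for non-speech sounds). The minimal Python sketch below illustrates what such a mapping stage could look like; the class names, thresholds, and onomatopoeia table are illustrative assumptions only and do not reproduce the thesis's actual recognition models or design rules.

```python
# Illustrative sketch only: names, thresholds, and mappings are assumptions,
# not the thesis's actual system.
from dataclasses import dataclass


@dataclass
class SpeechSegment:
    text: str
    loudness_db: float      # relative loudness of the utterance
    pitch_variation: float  # 0..1, rough proxy for emotional emphasis


@dataclass
class NonSpeechEvent:
    label: str              # e.g. "dog_bark", "door_slam"


# Hypothetical mapping from non-speech sound classes to onomatopoeic descriptions.
ONOMATOPOEIA = {
    "dog_bark": "Woof woof!",
    "door_slam": "BANG!",
    "rain": "pitter-patter...",
}


def style_speech(seg: SpeechSegment) -> str:
    """Map speech nuances to typographic cues (capitalization, punctuation)."""
    text = seg.text
    if seg.loudness_db > 6.0:          # assumed threshold for "loud" speech
        text = text.upper()            # loudness -> capital letters
    if seg.pitch_variation > 0.7:      # assumed threshold for strong emphasis
        text = text.rstrip(".") + "!"  # emphasis -> exclamation mark
    return text


def caption_non_speech(event: NonSpeechEvent) -> str:
    """Represent a non-speech sound with an onomatopoeic description."""
    word = ONOMATOPOEIA.get(event.label, event.label.replace("_", " "))
    return f"[{word}]"


if __name__ == "__main__":
    print(style_speech(SpeechSegment("watch out", loudness_db=9.0, pitch_variation=0.9)))
    print(caption_non_speech(NonSpeechEvent("door_slam")))
```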
- URI
- https://scholar.gist.ac.kr/handle/local/18955
- Fulltext
- http://gist.dcollection.net/common/orgView/200000826500