Light-Weight Causal Speech Enhancement with Bone-Conduction
- Abstract
- Speech enhancement aims to improve the quality of speech degraded by various types of noise, particularly under challenging conditions such as extremely low signal-to-noise ratio (SNR). Traditional methods predominantly rely on speech data captured by air-conduction (AC), which are highly susceptible to noise, making speech enhancement at low SNRs challenging. In contrast, bone-conduction (BC) is more robust to noise but provides information confined to a limited frequency bandwidth. In this paper, we propose a novel fusion module that effectively integrates information from both air-conduction and bone-conduction. Additionally, we introduce a light-weight, causal network designed for low computational complexity, making it suitable for deployment on resource-constrained devices. Experimental evaluations demonstrate that the proposed model significantly outperforms the baseline, achieving superior speech quality while reducing model size without increasing computational complexity.
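- To make the fusion idea above concrete, the following is a minimal, hypothetical sketch (in PyTorch) of how per-frame air-conduction and bone-conduction spectral features could be gated and mixed frame by frame, which keeps the operation causal. The class FusionSketch, its channel sizes, and the sigmoid gating are illustrative assumptions only and are not the fusion module described in Chapter 3 of the thesis.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of an AC/BC feature fusion block (not the thesis's actual
# module). Per-frame spectral features from the noise-robust but band-limited
# bone-conduction (BC) stream gate the noise-sensitive air-conduction (AC)
# stream, and the two are mixed with a 1x1 convolution. Only the current frame
# is used, so the block is causal.
class FusionSketch(nn.Module):
    def __init__(self, ac_channels: int, bc_channels: int, out_channels: int):
        super().__init__()
        # Per-channel, per-frame gate in (0, 1) derived from the BC features.
        self.gate = nn.Sequential(
            nn.Conv1d(bc_channels, ac_channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Frame-wise (1x1) mixing of the concatenated streams.
        self.mix = nn.Conv1d(ac_channels + bc_channels, out_channels, kernel_size=1)

    def forward(self, ac: torch.Tensor, bc: torch.Tensor) -> torch.Tensor:
        # ac: (batch, ac_channels, frames), bc: (batch, bc_channels, frames)
        gated_ac = ac * self.gate(bc)           # attenuate AC channels the BC gate deems noisy
        fused = torch.cat([gated_ac, bc], dim=1)
        return self.mix(fused)                  # (batch, out_channels, frames)

if __name__ == "__main__":
    # Toy shapes: 257 frequency bins for AC, 129 for the band-limited BC signal.
    fusion = FusionSketch(ac_channels=257, bc_channels=129, out_channels=128)
    ac_feat = torch.randn(2, 257, 100)  # (batch, bins, frames)
    bc_feat = torch.randn(2, 129, 100)
    print(fusion(ac_feat, bc_feat).shape)  # torch.Size([2, 128, 100])
```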
- Author(s)
- 이상윤
- Issued Date
- 2025
- Type
- Thesis
- URI
- https://scholar.gist.ac.kr/handle/local/19455
- Alternative Author(s)
- 이상윤(Sangyun Lee)
- Department
- Graduate School, AI Graduate School
- Advisor
- Shin, Jong Won
- Table Of Contents
- Abstract (English) i
List of Contents ii
List of Tables iii
List of Figures iv
1 Introduction 1
1.1 Introduction 1
1.2 Related works 3
2 Baseline Model 4
2.1 Problem formulation 4
2.2 Architecture 4
2.2.1 Encoder-Decoder 4
2.2.2 Group LSTM 6
3 Proposed Model 7
3.1 Fusion Module 7
3.2 Encoder-Decoder 8
3.3 Frequency Refinement Module 9
3.4 Complex Skip-Connection 10
3.5 Group Dual-path Gated Recurrent Units 11
4 Experiment and Result 12
4.1 Datasets 12
4.2 Training Objective Functions 13
4.3 Evaluation Metrics 14
4.4 Experimental Results 15
5 Conclusion 20
References 21
- Degree
- Master
- Appears in Collections:
- Department of AI Convergence > 3. Theses(Master)