PLAF: Penalized Logit And Feature Knowledge Distillation
- Abstract
- Advances in computer vision using deep neural networks (DNNs) have led to improved performance, but these networks require significant hardware resources due to their high computational and memory costs. To overcome this challenge, knowledge distillation (KD) has been proposed as a means of transferring information from a large teacher network to a smaller student network while preserving performance. However, current KD methods primarily use either logits or features as a guide for the student model, rather than utilizing both simultaneously. In this thesis, we propose a novel approach to KD that uses both penalized logits and features to train student networks to mimic teacher networks. We revisit the Kullback-Leibler divergence and reformulate it to create penalized logits, and we design a convolutional layer to acquire features. Our proposed method is evaluated through experiments that demonstrate its effectiveness in reducing model size while maintaining performance. Overall, this thesis contributes to the development of more efficient DNNs through a novel approach to KD that uses both logits and features as training guides.
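- The abstract describes a loss that combines a logit-level term with a feature-level term aligned through a convolutional layer. The thesis' specific penalized-logit formulation is not detailed here, so the sketch below is only a hypothetical illustration of a combined logit-and-feature KD objective: the logit term is the standard temperature-scaled KL divergence, and the feature term projects student features to the teacher's channel width with a 1x1 convolution before an MSE match. Names such as `LogitAndFeatureKD`, `alpha`, and `beta` are assumptions for illustration, not the author's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LogitAndFeatureKD(nn.Module):
    """Hypothetical sketch of a combined logit- and feature-level KD loss.

    The penalized-logit term from the thesis is not specified in the
    abstract, so a standard temperature-scaled KL divergence stands in
    for it here; the feature term uses a 1x1 convolutional projector.
    """

    def __init__(self, s_channels, t_channels, temperature=4.0, alpha=0.5, beta=1.0):
        super().__init__()
        self.T = temperature
        self.alpha = alpha   # weight on the logit (KL) term
        self.beta = beta     # weight on the feature term
        # 1x1 conv aligns student feature channels to the teacher's
        self.projector = nn.Conv2d(s_channels, t_channels, kernel_size=1)

    def forward(self, s_logits, t_logits, s_feat, t_feat, labels):
        # Supervised cross-entropy with the ground-truth labels
        ce = F.cross_entropy(s_logits, labels)
        # Logit distillation: KL between temperature-softened distributions
        kl = F.kl_div(
            F.log_softmax(s_logits / self.T, dim=1),
            F.softmax(t_logits / self.T, dim=1),
            reduction="batchmean",
        ) * (self.T ** 2)
        # Feature distillation: MSE after projecting student features
        feat = F.mse_loss(self.projector(s_feat), t_feat)
        return ce + self.alpha * kl + self.beta * feat
```

- In this sketch the student and teacher are assumed to expose both their logits and an intermediate feature map of matching spatial size; only the channel dimension is reconciled by the projector.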
- Author(s)
- Byunggwan Jeon
- Issued Date
- 2023
- Type
- Thesis
- URI
- https://scholar.gist.ac.kr/handle/local/19586