MAE-based Hybrid Convolutional ViT for Self-Supervised Learning
- Abstract
- In this study, we aim to build lightweight models by adopting the Convolutional Vision Transformer (CvT) as the backbone and incorporating two key techniques from image inpainting, namely a mask update strategy and skip connections, to improve the model's performance. Experiments were conducted on the Tiny-ImageNet-200 and ImageNet-1k datasets. The results demonstrate the effectiveness of the lightweight model design and the novel training strategies in improving performance. This suggests a new direction for efficient model training when resources are limited.
- Author(s)
- Nami Seo
- Issued Date
- 2023
- Type
- Thesis
- URI
- https://scholar.gist.ac.kr/handle/local/19472
Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.