ActionFlowNet: Learning Motion Representation for Action Recognition
- Abstract
- We present a data-efficient approach to learning video representations from a small amount of labeled data. We propose ActionFlowNet, a multitask learning model that trains a single-stream convolutional neural network directly from raw pixels to jointly estimate optical flow and recognize actions, capturing both appearance and motion in a single model. The model effectively learns video representations from the motion information in unlabeled videos. It improves action recognition accuracy by a large margin (23.6%) over state-of-the-art CNN-based unsupervised representation learning methods trained without external large-scale data or additional optical flow input. Without pretraining on large external labeled datasets, our model, by exploiting motion information well, achieves recognition accuracy competitive with models trained on large labeled datasets such as ImageNet and Sports-1M. © 2018 IEEE.
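- The abstract describes a single-stream network trained jointly to estimate optical flow and classify actions. The sketch below illustrates that multitask setup in PyTorch; the layer sizes, loss form, and the `flow_weight` parameter are illustrative assumptions, not the architecture or objective from the paper.

```python
import torch
import torch.nn as nn


class ActionFlowNetSketch(nn.Module):
    """Illustrative multitask model: one shared encoder over raw frames,
    with an optical-flow head and an action-classification head.
    A simplified sketch, not the paper's actual architecture."""

    def __init__(self, num_classes: int):
        super().__init__()
        # Shared single-stream encoder over two stacked RGB frames (6 input channels).
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 64, kernel_size=7, stride=2, padding=3),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # Flow head: predict a 2-channel (dx, dy) flow field at encoder resolution.
        self.flow_head = nn.Conv2d(128, 2, kernel_size=3, padding=1)
        # Action head: global average pooling followed by a linear classifier.
        self.action_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(128, num_classes),
        )

    def forward(self, frames: torch.Tensor):
        feats = self.encoder(frames)
        return self.flow_head(feats), self.action_head(feats)


def multitask_loss(pred_flow, target_flow, logits, labels, flow_weight: float = 1.0):
    """Joint objective: flow regression plus action cross-entropy.
    The L1 flow term and the weighting are assumptions for illustration."""
    flow_loss = nn.functional.l1_loss(pred_flow, target_flow)
    cls_loss = nn.functional.cross_entropy(logits, labels)
    return cls_loss + flow_weight * flow_loss


if __name__ == "__main__":
    model = ActionFlowNetSketch(num_classes=101)  # e.g. UCF101 has 101 classes
    frames = torch.randn(2, 6, 224, 224)          # two stacked RGB frames per clip
    flow, logits = model(frames)
    target_flow = torch.randn_like(flow)          # stand-in for precomputed flow targets
    labels = torch.randint(0, 101, (2,))
    loss = multitask_loss(flow, target_flow, logits, labels, flow_weight=0.5)
    loss.backward()
```

- In the setting the abstract describes, the flow targets can come from motion estimated on unlabeled videos, so the flow branch trains without action labels while the classifier uses only the small labeled set.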
- Author(s)
- Ng, Joe Yue-Hei; Choi, Jonghyun; Neumann, Jan; Davis, Larry S.
- Issued Date
- 2018-03
- Type
- Conference Paper
- DOI
- 10.1109/WACV.2018.00179
- URI
- https://scholar.gist.ac.kr/handle/local/20005
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Citation
- Proceedings - 2018 IEEE Winter Conference on Applications of Computer Vision, WACV 2018
- Conference Place
- US
Appears in Collections:
- Department of AI Convergence > 2. Conference Papers
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.