Improving esports viewing experience through hierarchical scene detection and tracking

Abstract
The role of an observer in esports is to provide spectators with the most engaging scenes in real time. Several studies have sought to automate this process. In this study, we utilize Vision Transformer (ViT)-based object detection to enhance the accuracy of automatic observers. However, while ViT-based detection more accurately identifies engaging game scenes, it often leads to frequent and abrupt scene changes, reducing viewer comfort. To address this issue, we propose a novel hierarchical structure that combines scene detection with scene tracking, maintaining high accuracy while ensuring smoother transitions between scenes. This approach also improves inference speed, as the tracking model is faster than the detection model. We computationally evaluated six observer models in terms of accuracy and camera stability, with our method demonstrating significantly more stable camera control. Additionally, user testing indicated a strong preference for our model over those without tracking. A video comparing our method to the state-of-the-art can be viewed at https://youtu.be/gWiU4GACZEg. © The Author(s) 2025.
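The detect-then-track idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `observe`, the `TrackResult` type, the `min_confidence` and `max_track_frames` parameters, and the fallback rule are all hypothetical, chosen only to show how a slow detector and a fast tracker might be layered so that the viewport changes abruptly only when tracking fails or expires.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Camera viewport as (x, y, width, height); the layout is an assumption.
Box = Tuple[float, float, float, float]


@dataclass
class TrackResult:
    box: Box
    confidence: float  # hypothetical tracker score in [0, 1]


def observe(
    frames: List[object],
    detect: Callable[[object], Box],
    track: Callable[[object, Box], TrackResult],
    min_confidence: float = 0.5,
    max_track_frames: int = 30,
) -> List[Tuple[Box, str]]:
    """Return one (viewport, source) pair per frame.

    source is "detect" when the slow detector chose the viewport and
    "track" when the fast tracker followed the previous viewport.
    """
    output: List[Tuple[Box, str]] = []
    current: Optional[Box] = None
    tracked = 0  # consecutive frames handled by the tracker

    for frame in frames:
        # Prefer the fast tracker while it stays confident and fresh.
        if current is not None and tracked < max_track_frames:
            result = track(frame, current)
            if result.confidence >= min_confidence:
                current = result.box
                tracked += 1
                output.append((current, "track"))
                continue
        # Otherwise fall back to the slow detector and restart tracking.
        current = detect(frame)
        tracked = 0
        output.append((current, "detect"))
    return output
```

In this sketch the detector runs only on the first frame, when the tracker's confidence drops below `min_confidence`, or after `max_track_frames` consecutive tracked frames, which is one simple way to trade scene accuracy against camera stability and inference cost.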
Author(s)
Joo, Ho-Taek; Lee, Sung-Ha; Chung, Insik; Kim, Kyung-Joong
Issued Date
2025-03
Type
Article
DOI
10.1038/s41598-025-93692-0
URI
https://scholar.gist.ac.kr/handle/local/8995
Publisher
Nature Research
Citation
Scientific Reports, v.15, no.1
ISSN
2045-2322
Appears in Collections:
Department of AI Convergence > 1. Journal Articles
Access and License
  • Access type: Open
File List
  • No related files exist.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.