


Enhancing Video Analysis of Car Accidents Using Multimodal Large Language Models with Effective Prompting Techniques


Author(s): Inho Park

Type: Thesis

Degree: Master

Department: Graduate School, School of Mechanical Engineering

Advisor: Lee, Yong-Gu

Abstract: In this study, we applied instruction tuning to Large Language Models (LLMs) to ensure that their outputs align with users' expectations. This approach assumes that autoregressive (AR) LLMs take context into account when trained on large datasets. However, creating specialized datasets for instruction tuning, such as for accident video analysis, is challenging because data collection is difficult and extensive processing is costly and time-consuming. To address these challenges, this research introduces an approach that exploits the structural properties of prompts through the chain-of-prompts technique, without extensive data training. Additionally, this study introduces a prompt structure called Diagnosticity to enhance the robustness of Large Vision Language Models (LVLMs) on video data, diverging from traditional prompt styles that focus mainly on images or basic tasks. The experiments in this paper avoid training on specific data by using a zero-shot approach. For testing, the AccidentInsight (AI) Dataset, comprising 1,000 accident video clips with high-quality traffic accident-related summaries and six short questions, was used to evaluate models using only prompt techniques. This paper also takes a critical view of the evaluation methods used for recent LVLMs, which rely primarily on LLM-based evaluation. Instead of using LLMs alone, we incorporate traditional methods such as the Character n-gram F1 Score (CHRF) and MoverScore to propose the H-Score, a new evaluation metric that balances the strengths and weaknesses of n-gram and LLM evaluation methods. This comprehensive approach evaluates LVLM performance on both paragraphs and short texts to provide a more accurate assessment.
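The abstract names the Character n-gram F1 Score (CHRF) as one ingredient of the H-Score but does not give formulas. As an illustration only, the following is a minimal sketch of a character n-gram F-score in its commonly described form: precision and recall over character n-grams, averaged across n-gram orders and combined into an F-score. The function names, the choice to ignore spaces, and the defaults (`max_n=6`, `beta=1.0` to match the "F1" wording) are assumptions, not the thesis's implementation, and the sketch does not reproduce the H-Score itself.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams of `text`, ignoring spaces (a simplification)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 1.0) -> float:
    """Character n-gram F-score between a hypothesis and a reference string.

    Hypothetical helper for illustration; beta=1.0 gives an F1-style score.
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            # One of the texts is too short to contain n-grams of this order.
            continue
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


if __name__ == "__main__":
    summary = "a car hit the truck at the intersection"
    answer = "the car hit a truck at an intersection"
    print(round(chrf(answer, summary), 3))
```

Because a score like this only measures surface overlap, it can penalize a correct paraphrase, which is consistent with the abstract's motivation for combining n-gram metrics with embedding-based and LLM-based evaluation.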

URI: https://scholar.gist.ac.kr/handle/local/19253

Fulltext: http://gist.dcollection.net/common/orgView/200000878496

Alternative Author(s): 박인호

Appears in Collections: Department of Mechanical and Robotics Engineering > 3. Theses(Master)


Access and License

Access type: Public


OAK GIST Scholar was built as part of the National Library of Korea's OAK Repository distribution project.