
CXR-LLaVA: a multimodal large language model for interpreting chest X-ray images

Author(s)
Lee, Seowoo; Youn, Jiwon; Kim, Hyungjin; Kim, Mansu; Yoon, Soon Ho
Type
Article
Citation
EUROPEAN RADIOLOGY, v.35, no.7, pp.4374 - 4386
Issued Date
2025-07
Abstract
Objective: This study aimed to develop an open-source multimodal large language model (CXR-LLaVA) for interpreting chest X-ray images (CXRs), leveraging recent advances in large language models (LLMs) to potentially replicate the image interpretation skills of human radiologists.
Materials and methods: For training, we collected 592,580 publicly available CXRs, of which 374,881 had labels for certain radiographic abnormalities (Dataset 1) and 217,699 provided free-text radiology reports (Dataset 2). After pre-training a vision transformer with Dataset 1, we integrated it with an LLM influenced by the LLaVA network. Then, the model was fine-tuned, primarily using Dataset 2. The model's diagnostic performance for major pathological findings was evaluated, along with the acceptability of radiologic reports by human radiologists, to gauge its potential for autonomous reporting.
Results: The model demonstrated impressive performance in test sets, achieving an average F1 score of 0.81 for six major pathological findings in the MIMIC internal test set and 0.56 for six major pathological findings in the external test set. The model's F1 scores surpassed those of GPT-4-vision and Gemini-Pro-Vision in both test sets. In human radiologist evaluations of the external test set, the model achieved a 72.7% success rate in autonomous reporting, slightly below the 84.0% rate of ground truth reports.
Conclusion: This study highlights the significant potential of multimodal LLMs for CXR interpretation, while also acknowledging the performance limitations. Despite these challenges, we believe that making our model open-source will catalyze further research, expanding its effectiveness and applicability in various clinical contexts.
Key Points
Question: How can a multimodal large language model be adapted to interpret chest X-rays and generate radiologic reports?
Findings: The developed CXR-LLaVA model effectively detects major pathological findings in chest X-rays and generates radiologic reports with higher accuracy than general-purpose models.
Clinical relevance: This study demonstrates the potential of multimodal large language models to support radiologists by autonomously generating chest X-ray reports, potentially reducing diagnostic workloads and improving radiologist efficiency.
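The abstract describes a LLaVA-style design: a pre-trained vision transformer encodes the chest X-ray, its patch features are projected into the language model's embedding space, and the projected image tokens are prepended to the text tokens before autoregressive report generation. The following is a minimal, hypothetical sketch of that wiring in PyTorch, assuming such a projection-and-prepend scheme; the class names, dimensions, and the toy two-layer decoder are illustrative stand-ins and do not reproduce the actual CXR-LLaVA implementation.

import torch
import torch.nn as nn


class VisionProjector(nn.Module):
    """Projects vision-encoder patch features into the LLM embedding space."""

    def __init__(self, vision_dim: int, llm_dim: int):
        super().__init__()
        self.proj = nn.Linear(vision_dim, llm_dim)

    def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
        # patch_features: (batch, num_patches, vision_dim)
        return self.proj(patch_features)


class ToyMultimodalLM(nn.Module):
    """Prepends projected image tokens to text embeddings before a decoder."""

    def __init__(self, vocab_size: int = 32000, vision_dim: int = 768, llm_dim: int = 1024):
        super().__init__()
        self.text_embed = nn.Embedding(vocab_size, llm_dim)
        self.projector = VisionProjector(vision_dim, llm_dim)
        layer = nn.TransformerEncoderLayer(d_model=llm_dim, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)  # toy stand-in for the LLM
        self.lm_head = nn.Linear(llm_dim, vocab_size)

    def forward(self, patch_features: torch.Tensor, input_ids: torch.Tensor) -> torch.Tensor:
        image_tokens = self.projector(patch_features)   # (B, P, D)
        text_tokens = self.text_embed(input_ids)        # (B, T, D)
        sequence = torch.cat([image_tokens, text_tokens], dim=1)
        hidden = self.backbone(sequence)
        return self.lm_head(hidden)                     # next-token logits over the joint sequence


# Toy forward pass: 4 image patch embeddings plus a 6-token report prefix.
model = ToyMultimodalLM()
patches = torch.randn(1, 4, 768)
prompt_ids = torch.randint(0, 32000, (1, 6))
logits = model(patches, prompt_ids)
print(logits.shape)  # torch.Size([1, 10, 32000])

In the study's training recipe, the vision encoder is first pre-trained on the labeled CXRs (Dataset 1), and the combined model is then fine-tuned mainly on the report-paired CXRs (Dataset 2) so that the decoder learns to generate free-text reports conditioned on the image tokens.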
Publisher
SPRINGER
ISSN
0938-7994
DOI
10.1007/s00330-024-11339-6
URI
https://scholar.gist.ac.kr/handle/local/9085
Access and License
  • Access type: Open
File list
  • No related files are available.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.