Machine learning and multi-omics based drug candidates prediction modeling

Author(s): Eunyoung Kim

Type: Thesis

Degree: Doctor

Department: Graduate School, School of Electrical Engineering and Computer Science

Advisor: Nam, Hojung

Abstract: Undesired drug effects, including toxicity and side effects, are a major cause of failure in drug discovery and development and of withdrawal from the market. Because identifying them experimentally requires tremendous time and cost, in silico models have been developed for drug safety screening.
In this study, I first constructed a machine learning model to predict drug-induced liver injury (DILI), one of the most frequent and concerning issues in drug toxicology. Because DILI occurs rarely, two machine learning models were used to develop reliable predictors, with weighted molecular fingerprints as features. The Bayesian probability of each substructure was calculated to weight substructures that are frequent in DILI-positive compounds. The resulting Random Forest and Support Vector Machine models achieved accuracies of 73.8% and 72.6%, respectively, in internal validation and 60.1% and 61.1% in external validation. The weighted fingerprints contributed to the performance increase, and significant substructures were reported. Finally, I applied the constructed models to predict the hepatotoxic potential of compounds in natural products.
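The substructure weighting described above can be illustrated with a minimal sketch. It is not the thesis implementation: the Laplace-smoothed estimate of P(DILI-positive | substructure present), the function names `substructure_weights` and `weight_fingerprint`, and the toy fingerprints are all assumptions made for illustration.

```python
from typing import List


def substructure_weights(fps: List[List[int]], labels: List[int],
                         alpha: float = 1.0) -> List[float]:
    """Estimate P(positive | bit set) for each fingerprint bit.

    Assumption: Laplace smoothing with pseudo-count `alpha`; the abstract
    does not state which estimator was used.
    """
    n_bits = len(fps[0])
    weights = []
    for j in range(n_bits):
        # Number of compounds containing substructure j
        present = sum(fp[j] for fp in fps)
        # Number of DILI-positive compounds containing substructure j
        present_pos = sum(fp[j] for fp, y in zip(fps, labels) if y == 1)
        weights.append((present_pos + alpha) / (present + 2 * alpha))
    return weights


def weight_fingerprint(fp: List[int], weights: List[float]) -> List[float]:
    """Scale each set bit by its substructure weight; unset bits stay 0."""
    return [bit * w for bit, w in zip(fp, weights)]


# Toy data (hypothetical): 4 compounds, 3 substructure bits, label 1 = DILI-positive
fps = [[1, 0, 1], [1, 1, 0], [0, 1, 0], [0, 0, 1]]
labels = [1, 1, 0, 0]

weights = substructure_weights(fps, labels)
weighted = weight_fingerprint(fps[0], weights)
```

In this toy set, the first bit appears only in positive compounds and so receives the largest weight. The weighted vectors could then be passed to any standard classifier in place of the binary fingerprints.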
Second, I considered the prediction of undesired drug effects arising from drug-drug interactions (DDIs). DDIs can cause various side effects even when neither drug is harmful on its own. This is a significant concern because polypharmacy is commonly used to treat complex diseases. I developed a deep learning-based framework for DDI prediction that uses drug-induced gene expression signatures. In addition, feature engineering with a gating mechanism was used to mimic the effect of co-administration on each drug in a pair and to identify the genes that are significant when interactions occur. Moreover, a translational embedding method and a margin-based loss were adopted to learn a distance-based scoring function for the classification labels. As a result, the model achieved an AUC of 0.889 and an AUPR of 0.915 in predicting unseen interactions, outperforming previous methods. The model could also predict potential interactions involving new compounds, and feature analysis demonstrated its interpretability.
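The distance-based scoring with a margin-based loss mentioned above can be sketched as follows. This is a generic translational-embedding example, not the thesis model: it assumes the common form in which an interaction label acts as a translation vector between two drug embeddings, scored by Euclidean distance. The function names and the toy vectors are hypothetical.

```python
import math
from typing import List


def trans_distance(head: List[float], relation: List[float],
                   tail: List[float]) -> float:
    """Euclidean distance ||head + relation - tail||; smaller means a better fit."""
    return math.sqrt(sum((h + r - t) ** 2
                         for h, r, t in zip(head, relation, tail)))


def margin_loss(d_pos: float, d_neg: float, margin: float = 1.0) -> float:
    """Hinge loss that pushes the true label at least `margin` closer than a wrong one."""
    return max(0.0, margin + d_pos - d_neg)


# Toy embeddings (hypothetical 2-D vectors)
drug_a = [0.0, 1.0]
drug_b = [1.0, 1.0]
true_label = [1.0, 0.0]    # translates drug_a exactly onto drug_b
wrong_label = [0.0, -1.0]  # a corrupted (negative) interaction label

d_pos = trans_distance(drug_a, true_label, drug_b)
d_neg = trans_distance(drug_a, wrong_label, drug_b)
loss = margin_loss(d_pos, d_neg, margin=1.0)
```

Here the true label already sits more than one margin closer than the corrupted label, so the loss is zero and no update would be needed for this pair. At prediction time, the label with the smallest distance would be chosen under this assumed scoring scheme.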

URI: https://scholar.gist.ac.kr/handle/local/19466

Fulltext: http://gist.dcollection.net/common/orgView/200000883262

Alternative Author(s): 김은영

Appears in Collections: Department of Electrical Engineering and Computer Science > 4. Theses(Ph.D)

Access: Open
