Development of a drug-protein affinity prediction model using contrastive learning
- Abstract
- Efficient drug discovery increasingly relies on computational prediction of compound-protein interactions (CPI), which is vital given the vast chemical space and the multitude of protein types. Traditional structure-based and physics-based methods for predicting CPIs, while useful, often fail when 3D structural data are unavailable. To circumvent this limitation, our study introduces a deep learning-based model. To address the cold-start problem commonly encountered in CPI prediction, that is, predicting affinities for drugs or proteins not seen during training, we implemented two key strategies: (i) domain-specific embedding methods, which generate a generalized representation of each drug and each protein; and (ii) contrastive learning, which refines the model's capacity to predict affinities for unseen data by minimizing the vector distance between positive samples and maximizing it between negative samples. Our model outperforms the baseline models, as evidenced by a lower root mean squared error (RMSE) and a higher Pearson correlation coefficient. Ablation studies further underscore the value of the embedding methods and of contrastive learning in improving the model's accuracy and generalizability.
In summary, this study presents a novel approach that applies previously validated embedding methods to generate protein and compound embeddings separately, and uses contrastive learning to integrate them effectively.
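The contrastive objective and the two evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact loss used in the thesis, so the margin-based pairwise form below is an assumption, as are the function names, the `margin` parameter, and the convention that label 1 marks a positive (interacting) pair and label 0 a negative pair.

```python
import numpy as np

def contrastive_loss(drug_vecs, protein_vecs, labels, margin=1.0):
    """Generic margin-based pairwise contrastive loss (assumed form,
    not necessarily the one used in the thesis).

    drug_vecs, protein_vecs: arrays of shape (n_pairs, dim)
    labels: 1 for positive pairs, 0 for negative pairs
    """
    drug_vecs = np.asarray(drug_vecs, dtype=float)
    protein_vecs = np.asarray(protein_vecs, dtype=float)
    labels = np.asarray(labels, dtype=float)

    # Euclidean distance between the two embeddings of each pair.
    dist = np.linalg.norm(drug_vecs - protein_vecs, axis=1)

    # Positive pairs are pulled together (distance is penalized).
    pos_term = labels * dist ** 2
    # Negative pairs are pushed apart until they are at least `margin` away.
    neg_term = (1.0 - labels) * np.maximum(0.0, margin - dist) ** 2

    return float(np.mean(pos_term + neg_term))

def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted affinities."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def pearson(y_true, y_pred):
    """Pearson correlation coefficient between measured and predicted affinities."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])
```

Under these assumptions, a positive pair with identical embeddings and a negative pair already farther apart than the margin both contribute zero loss, which is the behavior the abstract describes: small distances for positives, large distances for negatives.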
- Author(s)
- Yoon, Jaesuk
- Issued Date
- 2024
- Type
- Thesis
- URI
- https://scholar.gist.ac.kr/handle/local/19112
Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.