Reconstruction of 1D Genome Sequences and Prediction of 3D Genome Organization
- Author(s)
- Yeonghun Lee
- Type
- Thesis
- Degree
- Doctor
- Department
- Graduate School, School of Electrical Engineering and Computer Science
- Advisor
- Lee, Hyunju
- Abstract
- Genome analysis is fundamental to biological research. Technical advances in next-generation sequencing have deepened our understanding of human genomes, and computational methods have developed alongside them. In this thesis, we present an integrative framework for genome reconstruction from whole-genome sequencing that overcomes the limitations of early-stage methods. In recent years, 3D genome organization has attracted attention as chromosome conformation capture techniques have advanced. These techniques provide 3D genome information beyond the 1D genome context, including promoter-enhancer interactions, loops, and topologically associating domains. Based on the 1D genome sequences reconstructed by our framework, we then investigate 3D genome organization using chromosome conformation capture sequencing.
In the first part of the dissertation, we present the Integrative Framework for Genome Reconstruction (InfoGenomeR), a graph-based framework that reconstructs genome sequences at the genome-wide level by integrating structural variations, total copy number alterations, allele-specific copy numbers, and haplotype information. Using whole-genome sequencing data sets from cancers, we demonstrate the analytical potential of InfoGenomeR. We discover derivative chromosomes and karyotype topologies including chromothripsis, homogeneously staining regions, and double minutes. Moreover, we show that InfoGenomeR can distinguish private from shared karyotype changes between primary and metastatic cancers, changes that could contribute to tumour evolution.
In the second part of the dissertation, we present the Integrative Framework for Hi-C prediction (InfoHiC), an architecture that predicts 3D genome folding from rearranged DNA sequences using a convolutional neural network. In combination with InfoGenomeR, InfoHiC provides the 3D genome organization of individual genomic contigs. By applying InfoHiC to cancer Hi-C data, we found newly formed loops and topologically associating domains. We demonstrate that super-enhancer hijacking events are crucial for gene overexpression, revealing the functional effects of noncoding structural variants.
- URI
- https://scholar.gist.ac.kr/handle/local/19629
- Fulltext
- http://gist.dcollection.net/common/orgView/200000884560
Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.