Cancer is an evolutionary process in which cells acquire genomic, epigenomic, and other cell-state alterations. These alterations can confer selective advantages, promote tumor growth, and drive treatment resistance. This evolutionary complexity is especially pronounced in highly heterogeneous cancers such as skin cutaneous melanoma, where diverse subclonal populations and high mutational burden complicate clinical targeting. Understanding how distinct tumor subclones arise, diverge, and acquire mutations is therefore essential for identifying cancer drivers and improving treatment.
The completed work investigates tumor evolution through the integrated analysis of somatic single-nucleotide variants, structural variants, copy number alterations, and DNA methylation in the B2905 (M4) melanoma model. Using long-read sequencing of 23 single-cell-derived melanoma subclones, we produce a high-resolution view of genomic and epigenomic evolution among the model's subclones, across its major clades and over time. We examine complex structural variants, parallel copy number alterations, and methylation differences between clades. To harmonize these diverse variant classes across an established phylogeny, we developed TreeHarmonizer, a method that places variants onto tree branches and enables a joint evolutionary interpretation of multiple alteration types.
The proposed work extends this framework in two directions. First, TreeHarmonizer will be expanded from a threshold-based heuristic into a generalized, loss-aware model capable of identifying homoplasy using constrained maximum likelihood estimation. This will allow it to work with broader data inputs and provide a faster alternative to joint-variant phylogeny reconstruction methods. The tool will also be developed into a standalone pipeline that accepts output from arbitrary variant callers and includes broader downstream analyses. Second, a new method will be developed to improve structural variant detection in repetitive genomic regions that are traditionally excluded from analyses of tumor evolution, including centromeres, telomeres, and tandem repeats. By combining motif-aware read recruitment, local normal graph assembly, and repeat-aware complex variant resolution, this approach aims to uncover somatic alterations in these difficult regions of cancer genomes. Together, these efforts advance computational methods for profiling tumor evolution and expand the genomic landscape accessible to tumor evolutionary analysis.
Anton Goretsky is a PhD student in Computer Science at the University of Maryland, College Park, and a Research Fellow in the Cancer Data Science Laboratory at the National Cancer Institute, NIH. They are co-advised by Dr. Mikhail Kolmogorov (NIH) and Dr. Erin K. Molloy (UMD). Their research centers on algorithm development in cancer genomics, with a focus on profiling tumor evolution and variant calling using long-read sequencing technologies.
Examining Committee Chair: Dr. Erin Molloy
Department Representative: Dr. Laxman Dhulipala
Members:
Dr. Mikhail Kolmogorov
Dr. Robert Patro
Dr. Stephen Mount

