Abstract

Accurately detecting structural variation in the genome is a challenging task. Many approaches have been developed over the last few decades, yet it is estimated that tens of thousands of variants are still missed in a given sample. Many of these variants are missed because of the limitations of short-read sequencing in identifying large variants. Although many of the missed variants lie within complex regions of the genome, some have been shown to be clinically relevant, which makes their discovery important. New platforms that sequence the genome using long reads show promise for overcoming many of these limitations, making it possible to identify the full spectrum of simple and complex structural variants. Because this technology is relatively young, new computational approaches that support the analysis of long-read sequencing data can aid in the discovery of variants that are still being missed. Beyond detecting novel variation in samples with long-read sequencing data, computational approaches can be developed that leverage these novel variant calls to reanalyze the hundreds of thousands of short-read datasets currently available. In this proposal, we plan to develop new computational approaches to identify novel structural variation in the genome. In Aim 1, we will apply a recurrence approach, utilizing deep neural networks, to analyze long-read sequencing datasets. In Aim 2, we will develop a tool that derives profiles of structural variants predicted in long reads, which can then be used to identify and genotype structural variant calls in short-read datasets. Together, these approaches will allow researchers to accurately characterize structural variation in both long-read and short-read datasets.