Noninvasive prenatal diagnosis using targeted DNA capture and sequencing

Abstract: Noninvasive prenatal diagnosis (NIPD) is the study of fetal genetic material circulating in maternal peripheral blood, which allows molecular diagnosis earlier in pregnancy and without risk to the health of the fetus. During the last decade, several approaches have been developed to determine fetal sex for sex-linked disorders, fetal rhesus D blood group status in rhesus D negative women, and paternally inherited mutations for dominant diseases. More recently, massively parallel sequencing of maternal plasma DNA has been applied to detect fetal chromosomal aneuploidies and a monogenic condition. However, the small amount of cell-free fetal DNA and the interference caused by the high maternal DNA background greatly limit accurate base calling of DNA variants, which depends on a sufficient number of sequence reads (coverage) across the genomic regions of interest. Multiplex DNA sequence capture provides a new strategy, but current methods such as hybridization-based target enrichment require DNA input amounts beyond those obtained from a maternal plasma sample and are relatively expensive. Here we propose to develop long padlock probes (LPPs) for applications in NIPD in order to overcome the difficulties associated with identifying genetic and structural variants in fetal DNA. LPP-based capture combined with next-generation sequencing (e.g., Illumina HiSeq) allows targeted resequencing of specific disease-related genomic regions with high accuracy and completeness, at comparatively lower cost and with lower DNA input requirements. To assess capture efficiency and the accuracy of the base-calling algorithms, and in particular to estimate and control the false discovery rate (FDR) for novel DNA variants, we will use sample sequences with known chromosomal aneuploidies and genotype content (e.g., trisomy 21 and cystic fibrosis).
We will establish computing resources, data distribution, and analysis software for calling genotypes from sequence data that take into account sequence quality and coverage distributions and provide feedback to the experimental strategy. Our targeted approach allows multiple samples to be analyzed per sequencing run through sample indexing and reduces sequencing costs to approximately $100/sample. Developing this integrated approach has important implications for the diagnosis, in early pregnancy, of genetic diseases or groups of disorders that are prevalent in high-risk families and in certain populations.
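
The FDR assessment against samples of known genotype content could be sketched as follows. This is only an illustrative Python sketch, not the proposed analysis software: the variant representation, the function names `false_discovery_rate` and `filter_to_target_fdr`, and the use of a single per-call quality score are all assumptions made for the example.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical variant representation: (chromosome, position, ref allele, alt allele)
Variant = Tuple[str, int, str, str]


def false_discovery_rate(called: Iterable[Variant], truth: Set[Variant]) -> float:
    """Fraction of called variants that are absent from the known (truth) genotype set."""
    called_set = set(called)
    if not called_set:
        return 0.0  # no calls, so no false discoveries
    false_positives = len(called_set - truth)
    return false_positives / len(called_set)


def filter_to_target_fdr(
    scored_calls: List[Tuple[Variant, float]],
    truth: Set[Variant],
    target_fdr: float,
) -> List[Variant]:
    """Keep the largest top-scoring prefix of calls whose empirical FDR is <= target_fdr.

    Assumes each variant appears once and that a higher score means higher confidence.
    """
    ranked = sorted(scored_calls, key=lambda call: call[1], reverse=True)
    best_size = 0
    false_positives = 0
    for size, (variant, _score) in enumerate(ranked, start=1):
        if variant not in truth:
            false_positives += 1
        if false_positives / size <= target_fdr:
            best_size = size
    return [variant for variant, _score in ranked[:best_size]]
```

In such a setup, the score cutoff found on the known samples would then be carried over to samples of unknown genotype, where the empirical FDR cannot be measured directly.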