This project will develop methods for retrospective multi-site research. Integrating data from multiple existing sources has the potential to substantially advance the study of behavior, particularly the understanding of complex behavioral disorders. Dyslexia serves as the model for developing these methods because people with dyslexia show significant behavioral heterogeneity, which could be used to develop a richer understanding of reading disability and of other complex disorders with equally varied behavioral profiles, such as autism or attention deficit hyperactivity disorder. Methods will be developed for three critical phases of retrospective multi-site research. Aim 1 focuses on subject privacy, including the development of an automated data de-identification software tool that can be used for multi-dimensional data sets. Aim 2 focuses on characterizing the behavioral heterogeneity within and across samples of complex disorders, including the use of multiple imputation to address missingness in multi-site data sets. Aim 3 focuses on how to appropriately analyze data from different samples that have included matched and unmatched case-control study designs. Our goal is to enhance quality control and scientific power for behavioral and biological studies of complex disorders, and to advance the behavioral and neurobiological understanding of dyslexia. With standards for integrating behavioral and biological data, retrospective databases could be used to reveal etiologies and to establish a consensus on the effective treatment of complex behavioral disorders.
PUBLIC HEALTH RELEVANCE: Integrating data from existing sources has great potential to speed the identification of endophenotypes and their neurogenetic origins for complex behavioral disorders, develop biomarkers for tracking disease and treatment efficacy, and evaluate treatment effects.
This project focuses on the development of methods for retrospective multi-site behavioral and biological studies, using dyslexia as a model. The results will establish standards for analysis of multi-site data that will enhance quality control and statistical power.