The advent of genomic and imaging technologies provides a great opportunity to study and understand health conditions, including substance use and mental illnesses, which are complex and depend on both genetic and environmental factors. Over the past decades, genome-wide association (GWA) studies have identified and robustly replicated numerous genetic variants that are associated with complex diseases. Despite those successes, it remains persistently difficult to identify genes and environmental factors, the so-called geneticist's nightmare. Most of the identified variants have low associated risks and account for little heritability, and increasing attention is focused on finding the "missing heritability" of complex diseases. Furthermore, it is documented that clinical contributions from neuropsychiatric research have been minimal because of traditionally small sample sizes, biologically incorrect diagnostic labels, and the comorbidity and heterogeneity of the diseases. To address these problems and advance clinical science, we need to develop novel models and methods that efficiently use and interpret the available data. This is the primary motivation for our project. We will develop more efficient approaches that utilize biological information (genetic and/or phenotypic data) and directly address the comorbidity issue. In addition, we will analyze large datasets such as UK Biobank that contain demographic, clinical, and genetic data. We will further take advantage of the investigators' many years of experience in the data collection and analysis of GWA studies and build on our successes in the development and application of statistical methods and software for complex studies. The primary aim of this application is to develop, evaluate, and apply new statistical (both parametric and nonparametric) models, methods, and software to conduct genetic analyses of complex diseases.
To deal with the challenges stated above, our proposed methods will address one or more of the following topics: (a) analysis of genetic, phenotypic, and environmental data; (b) modeling comorbidity through multivariate traits; and (c) identification and incorporation of novel genetic variants, including their interactions with environmental factors, by using and developing state-of-the-art statistical methodology and software, such as trees and forests. The success of our project will have a direct impact on our understanding and, ultimately, the treatment and prevention of diseases that are of significant public health concern.