Although recently published genome-wide association studies (GWASs) have localized many disease-associated genetic variants, these variants account for only a small proportion of heritable phenotypic variation. This suggests that only a small fraction of causal loci have been identified, owing to genetic heterogeneity (i.e., multiple genetic variants associated with a complex trait), the small to modest effect sizes of common genetic variants, and the limited statistical power of current analysis methods. At the same time, GWAS data offer an exciting opportunity for personalized medicine, which aims to assign the most suitable treatment or intervention to an individual based on his or her clinical and genetic information. However, a considerable gap remains between GWAS data and the practice of personalized medicine, largely due to the paucity of powerful analysis methods. This research is devoted to several emerging topics in personalized medicine with high-dimensional genetic and clinical data. Building on the advances in penalized regression and classification made during the previous funding period, we propose to develop innovative and powerful statistical methods for GWAS data to discover novel gene pathways and use them in personalized medicine. In particular, we aim to discover de novo gene pathways containing SNPs with individually weak, but collectively strong, effects on complex diseases and traits. We will combine the available Lung Health Study (LHS) clinical data with GWAS data from two different sources and apply the developed statistical methods to explore how genetic variants and baseline clinical variables may modify the effects of smoking interventions, and how to determine an optimal individualized intervention rule for any given subject.