Summary: Statistical modeling plays a central role in a wide range of scientific investigations. Studies of complex traits and disorders such as addictive behaviors, psychopathology, cardiovascular disease, obesity, and cancer now face a set of statistical challenges that require improved software. The challenges include: i) the difficulty of measuring behavioral traits; ii) the availability of technologies (such as magnetic resonance imaging, continuous physiological monitoring, and microarrays) that generate extremely large amounts of data, often with complex time-dependent patterning; and iii) increased sophistication in the statistical models used to analyze the data. The current project proposes to rewrite the popular Mx statistical package in order to address these challenges. Both the user specification of models and the algorithms used to fit them will be significantly improved. The software will be i) split into modules that interoperate with the R statistical package; ii) released as open source so as to provide a stable path for future maintenance and development; and iii) integrated with the VDL parallel workflow software. Grid/parallel computing and data management using VDL will provide significant speedup for processing large (up to multi-terabyte) data sets, through the use of analytical workflows that provide detailed provenance tracking and annotation of derived results. Revised algorithms for model specification and optimization will increase both the scope of the software and its performance. Both the code and its use will be documented and disseminated at national and international workshops.
The current project proposes to develop software that uses massively parallel computing grids ("cyberinfrastructure") to address these challenges.