Improved statistical methods that provide better classification performance and new analytical capabilities for categorical regression would be invaluable to the medical and health care research communities. Categorical regression models (e.g., binary logistic, multinomial logistic) are used extensively to identify patterns of alcohol-related symptoms, screen for disorders, and assess policies. Such models are also widely used in other areas of research, such as mental illness, cancer, traumatic injuries, and AIDS-related pathologies. However, many such models are developed with inadequate support for fully analyzing and exploiting the intrinsically probabilistic nature of their results. This is critically important because health researchers, clinicians, and administrators often rely on categorical regression models to make classification decisions that identify unacceptable risks, adequate outcomes, and acceptable guidelines for screening, diagnosis, treatment, and quality of care. Commercially available statistical software does not offer sophisticated methods for robust estimation of posterior probabilities in the presence of model misspecification, missing covariates, and nonignorable missing-data generating processes. Such robust missing-data handling methods provide natural mechanisms for dealing with verification bias and for modeling correlated, longitudinal, or survey data with complex sampling designs. Moreover, commercially available statistical software does not provide automated methods for using estimated posterior probabilities to make optimal classification decisions with respect to different optimality criteria.
In particular, automated features such as optimization of multiple decision criteria (allocation rules) that trade off specificity against sensitivity, decision threshold confidence intervals, statistical tests for evaluating correct specification of posterior probabilities, statistical tests for comparing competing classifier thresholds, and methods for multi-outcome classification and inference are not readily available. Phase II research will extend Phase I findings for binary logistic regression to develop and implement automated robust classification methods for multinomial logistic regression modeling; these methods also apply to the larger class of nonlinear categorical regression models that output posterior probabilities. The Phase II software prototype will provide: 1) new user-selectable robust decision threshold estimators, 2) robust confidence intervals on decision threshold estimators, 3) new classifier threshold comparison tests, 4) new outcome probability specification tests, 5) efficient missing data handling methods in the presence of nonignorable nonresponse, and 6) second-order analytic and simulation-based Bayesian methods for improved small-sample and rare-event outcome probability estimation. These new methodologies will be integrated into a user-friendly prototype software package, evaluated with extensive simulation studies, and then applied to real-world classification problems encountered in alcohol research, mental illness (depression, bipolar disorder, schizophrenia), cancer (prostate), trauma (emergency room), and infectious disease (AIDS) through collaborations with domain experts in those fields. In summary, Phase II research will establish the essential technical foundation for Phase III commercialization, with the objective of providing a suite of new classification analysis methods as an advanced statistical tool that improves epidemiologic, clinical, and public health research.