Project Summary
Current clinical practice in screening involves subjective interpretation of patients' test results, such as mammograms, by trained experts. Substantial variability is often reported between radiologists' visual classifications of breast images, which affects the accuracy and consistency of common screening tests including mammography. Factors related to patients, raters, and the technology itself may affect experts' ratings of breast cancer and of breast density, an important predictor of breast cancer. However, studying accuracy and consistency between radiologists' ratings in large-scale longitudinal cancer screening studies is challenging because the classifications are ordinal and many experts each contribute ratings. Newly emerging processes, including automated 3-D procedures, offer exciting potential for estimating breast density in routine clinical settings. Currently, very few statistical approaches and summary measures exist to model the consistency and accuracy of several radiologists' ordinal ratings. Further, few methods can investigate how patient and radiologist characteristics, the use of automated procedures, and differences between technologies influence accuracy and consistency. Our goals are to develop new statistical methods, based upon generalized linear mixed models and latent variable models, to study accuracy and consistency among many experts in large-scale screening studies. Our approach can flexibly accommodate many experts' ratings and other factors to examine their influence on consistency and accuracy. We will derive novel model-based summary measures of agreement and accuracy. We will apply our new statistical methods to recent large-scale breast imaging studies.
A key strength of our proposed research is that it provides medical researchers with a flexible modeling approach and novel summary measures that use all the data simultaneously, so that conclusions can be drawn about consistency for typical experts and patients in the underlying populations, greatly increasing efficiency and power. Studying how patient and rater characteristics affect the consistency and accuracy of raters' classifications will translate to improvements in the training of radiologists and in the practice of interpreting mammograms, and ultimately to a more effective breast screening procedure.