Study 1: Combine the results from multiple tests in a multi-stage screening study. Summary: In studies to ascertain true disease status, a definitive diagnostic test is often too invasive or expensive to be applied to all subjects, in which case a two-phase design is often used. The results of the Phase 1 test, which is inexpensive and non-invasive and is applied to all subjects, determine which subjects receive the gold standard test in a later phase. Analysis restricted to verified cases leads to verification bias. The multiple-phase design has been used in studies of dementia and in the diagnosis and screening of many other diseases, e.g., colorectal and breast cancer. The design usually involves more than two phases. For example, in a three-phase study the prevalent test in Phase 1 usually has high sensitivity but relatively low specificity; Phase 2 consists of a second application of the screening test or a more confirmatory test; and the test in the final phase is the gold standard. In this paper, we proposed a method for estimating the parameters of test efficacy and the ROC curves for continuous screening tests in a multiple-phase study in the presence of verification bias. The verification process and the efficacy of the screening test may also depend on covariates. We evaluated the estimates of the test-efficacy parameters after adjusting for verification bias, and we compared different schemes for combining the sequential tests using empirical studies. If we treat subjects with unverified dementia status as non-demented, we tend to be optimistic about the ROC curve. For example, for a subject who is 70 years old with no education, a cut-off of 75 yields FPR = 0.42 and TPR = 0.64 after adjusting for verification bias. If we ignore the verification bias, then FPR = 0.39 and TPR = 0.96, which over-estimates both the specificity (1 - FPR) and, more markedly, the sensitivity (TPR).
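To illustrate why counting unverified subjects as non-diseased gives optimistic estimates, the following is a minimal sketch on a small hypothetical two-phase cohort. It is not the estimation method proposed in Study 1. It uses a generic inverse-probability-weighting (IPW) correction as a stand-in, and it assumes that the verification probabilities are known, that verification depends only on the screening result, and that a score below the cut-off counts as screen-positive. All names (Subject, ipw_rates, naive_rates, toy_cohort) and all counts are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Subject:
    score: float              # continuous screening score
    verified: bool            # True if the gold standard test was applied
    diseased: Optional[bool]  # gold standard result; None when unverified
    p_verify: float           # ASSUMED known probability of being verified


def _rates(tp: float, fn: float, fp: float, tn: float) -> Tuple[float, float]:
    """Return (TPR, FPR) from the four cell totals."""
    return tp / (tp + fn), fp / (fp + tn)


def ipw_rates(subjects: List[Subject], cutoff: float) -> Tuple[float, float]:
    """IPW estimate: each verified subject is weighted by 1 / p_verify."""
    tp = fn = fp = tn = 0.0
    for s in subjects:
        if not s.verified:
            continue  # unverified subjects are represented through the weights
        w = 1.0 / s.p_verify
        positive = s.score < cutoff  # ASSUMPTION: low score = screen-positive
        if s.diseased:
            tp, fn = (tp + w, fn) if positive else (tp, fn + w)
        else:
            fp, tn = (fp + w, tn) if positive else (fp, tn + w)
    return _rates(tp, fn, fp, tn)


def naive_rates(subjects: List[Subject], cutoff: float) -> Tuple[float, float]:
    """Naive estimate: unverified subjects are counted as non-diseased."""
    tp = fn = fp = tn = 0.0
    for s in subjects:
        diseased = s.verified and bool(s.diseased)
        positive = s.score < cutoff
        if diseased:
            tp, fn = (tp + 1, fn) if positive else (tp, fn + 1)
        else:
            fp, tn = (fp + 1, tn) if positive else (fp, tn + 1)
    return _rates(tp, fn, fp, tn)


def toy_cohort() -> List[Subject]:
    """Hypothetical cohort: every screen-positive subject is verified,
    screen-negative subjects are verified with probability 0.25."""
    cohort: List[Subject] = []

    def add(n: int, score: float, verified: bool,
            diseased: Optional[bool], p: float) -> None:
        cohort.extend(Subject(score, verified, diseased, p) for _ in range(n))

    add(60, 70.0, True, True, 1.0)     # screen-positive, diseased
    add(40, 70.0, True, False, 1.0)    # screen-positive, non-diseased
    add(10, 85.0, True, True, 0.25)    # screen-negative, verified, diseased
    add(90, 85.0, True, False, 0.25)   # screen-negative, verified, non-diseased
    add(300, 85.0, False, None, 0.25)  # screen-negative, unverified
    return cohort


if __name__ == "__main__":
    cohort = toy_cohort()
    print("IPW   (TPR, FPR):", ipw_rates(cohort, 75.0))
    print("naive (TPR, FPR):", naive_rates(cohort, 75.0))
```

In this toy cohort the IPW estimate gives TPR = 0.60 and FPR = 0.10, whereas the naive estimate gives a TPR of about 0.86 and an FPR of about 0.09, so the naive analysis overstates both sensitivity and specificity, in the same direction as the numbers reported above.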
Comparing Figure 4a-d, we see that education level has a marked impact on test accuracy when the cut-off of 75 is used, whereas age makes little difference. For a subject who is 70 years old with 10 years of education, the FPR is 0.05 and the TPR is 0.30 at the cut-off of 75. So the screening test using the cut-off of 75 has high specificity and relatively low sensitivity for subjects with 10 years of education. For subjects with low education, both sensitivity and specificity are moderate. In terms of the area under the ROC curve (AUC), the CASI test performed better for subjects with higher education.

Study 2: ROC-GLM model for comparing multiple tests with verification bias. The diagnostic accuracy of a continuous screening test is often assessed using a receiver operating characteristic (ROC) curve, which requires the true disease status as ascertained by a deterministic gold standard test. In practice, a gold standard test may be too expensive or too invasive for regular use. As a result, only a subset of the study population is selected for ascertainment of disease status, based on the results of the screening tests. In the screening stage, one or more screening tests are used to identify the subjects who receive the final gold standard work-up. This study design is usually called a multi-stage (multi-phase) screening design. The approach used in Study 1 involves assumptions about the distribution of the cognitive test scores, which may be untestable in the presence of verification bias. Here we consider a parametric distribution-free method for comparing the accuracies of multiple tests. In particular, we parameterize the form of the ROC curve but make no additional assumptions about the distributions of the test results. We use a generalized linear model (GLM) to estimate the ROC curves of multiple tests in the presence of verification bias.
The ROC-GLM model is useful for comparing the accuracy of multiple screening tests in a multi-stage diagnostic study, e.g., the AGES-RS study.
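As a concrete picture of what parameterizing the form of the ROC curve can mean, the sketch below evaluates the binormal form ROC(t) = Phi(a + b * Phi^-1(t)), where Phi is the standard normal distribution function and t is the false positive rate. The summary above does not say which parametric form or link function Study 2 uses, so the binormal (probit) form is an assumption here, and the coefficients for the two tests are hypothetical values rather than fitted results. In a ROC-GLM, a and b would be the estimated intercept and slope; this sketch only evaluates the curve and its AUC and does not perform the fitting or the verification-bias adjustment.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def binormal_roc(t: float, a: float, b: float) -> float:
    """TPR at false positive rate t under ROC(t) = Phi(a + b * Phi^-1(t)).

    Assumes b > 0, so that the curve runs from (0, 0) to (1, 1).
    """
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    return _STD_NORMAL.cdf(a + b * _STD_NORMAL.inv_cdf(t))


def binormal_auc(a: float, b: float) -> float:
    """Closed-form area under the binormal ROC curve: Phi(a / sqrt(1 + b^2))."""
    return _STD_NORMAL.cdf(a / sqrt(1.0 + b * b))


if __name__ == "__main__":
    # HYPOTHETICAL coefficients for two screening tests (not fitted values).
    tests = {"test A": (1.0, 1.0), "test B": (0.5, 1.0)}
    for name, (a, b) in tests.items():
        print(name,
              "TPR at FPR 0.10:", round(binormal_roc(0.10, a, b), 3),
              "AUC:", round(binormal_auc(a, b), 3))
```

Because each curve is summarized by the pair (a, b), comparing two tests reduces to comparing their coefficients or a function of them such as the AUC.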