Prostate-specific antigen (PSA) is the only molecular marker routinely used for the early detection of a common cancer, with approximately 75% of men aged 50 years or older having had at least one PSA test. The results of the European Randomized Study of Screening for Prostate Cancer, which assessed the value of PSA testing in men who would otherwise not be screened, offered qualified support for PSA testing, showing a modest reduction in prostate cancer mortality. Yet PSA remains an imperfect test. Although PSA is highly specific to the prostate gland, a modestly elevated PSA in blood is not specific to cancer, because common benign conditions such as benign prostatic hyperplasia and prostatitis also lead to modest PSA elevations. Accordingly, most men with modestly elevated PSA do not have prostate cancer, and it has been estimated that at least 750,000 American men are needlessly subjected to prostate biopsy. Prostate biopsy is not only inconvenient, with 1 in 3 men having to take two or more days off work, but can also cause infection, urinary retention or other major complications. To aid the clinical decision as to whether to biopsy an individual patient, several investigators have proposed risk prediction models that include information such as PSA level, age, family history and the results of the digital rectal exam (DRE). These models provide a predicted probability of a positive biopsy that can be used to aid patient counseling. However, current prediction models have often been shown to be inaccurate on external validation. One possible reason is that all of the numerous models currently available were developed using a single cohort, a highly problematic approach given important differences between cohorts. In this proposal, we will establish a large, international, multicenter collaboration that will share data on biopsy outcome to allow statistical prediction modeling that incorporates cohort characteristics.
The collaboration includes centers in the US, Canada, the Caribbean, Germany and Italy; major academic centers as well as more community-based practices; and centers seeing predominantly Caucasian patients as well as those with a racially diverse patient population. In total, we will have data on approximately 7000 biopsies per year, over 1000 of which will be in African-Americans or Hispanics. We will establish an informatics infrastructure to allow these centers to send harmonized data on biopsy outcomes to the statistical center. We will then create a prediction model that incorporates both patient-level factors (such as age and PSA) and cohort-level factors (such as biopsy scheme). The model will be evaluated on an independent validation sample. We will formally test the hypothesis that including multiple cohorts in prediction modeling leads to more accurate models than those developed on single cohorts. As an additional aim, we will update the model as new data become available, allowing us to compare dynamic modeling with traditional static models. We will also evaluate statistical methods for incorporating data on novel markers into existing prediction models.
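
To illustrate what a model combining patient-level and cohort-level factors might look like, the following is a minimal sketch in Python, assuming a logistic form. Everything in it is hypothetical: the names `Patient`, `Cohort` and `predicted_risk`, the use of the number of biopsy cores as a stand-in for "biopsy scheme", and all coefficient values, which are placeholders chosen only so the example runs and are not estimates from any data.

```python
import math
from dataclasses import dataclass


@dataclass
class Patient:
    age: float            # years
    psa: float            # ng/mL, must be > 0
    family_history: bool  # family history of prostate cancer
    abnormal_dre: bool    # abnormal digital rectal exam


@dataclass
class Cohort:
    # Assumption: the biopsy scheme is summarized by the number of cores taken.
    biopsy_cores: int


# Placeholder coefficients for illustration only; NOT fitted to any data.
COEF = {
    "intercept": -3.0,
    "age_per_decade": 0.3,    # per 10 years above age 60
    "log_psa": 0.8,           # per unit of natural-log PSA
    "family_history": 0.4,
    "abnormal_dre": 0.9,
    "per_extra_core": 0.05,   # cohort-level term, relative to a 6-core scheme
}


def predicted_risk(patient: Patient, cohort: Cohort, coef=COEF) -> float:
    """Return the predicted probability of a positive biopsy (logistic model)."""
    linear = (
        coef["intercept"]
        + coef["age_per_decade"] * (patient.age - 60) / 10
        + coef["log_psa"] * math.log(patient.psa)
        + coef["family_history"] * patient.family_history
        + coef["abnormal_dre"] * patient.abnormal_dre
        + coef["per_extra_core"] * (cohort.biopsy_cores - 6)
    )
    return 1.0 / (1.0 + math.exp(-linear))


if __name__ == "__main__":
    patient = Patient(age=65, psa=5.2, family_history=False, abnormal_dre=False)
    # Same patient, two hypothetical cohorts with different biopsy schemes.
    for cores in (6, 12):
        risk = predicted_risk(patient, Cohort(biopsy_cores=cores))
        print(f"{cores}-core scheme: predicted risk = {risk:.3f}")
```

In this sketch the same patient receives a different predicted probability depending on the cohort-level term, which is the kind of cohort dependence a single-cohort model cannot represent. A fitted model would estimate the coefficients from the pooled data and might treat cohort effects differently, for example as random effects.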