Timely implementation of appropriate disease control measures is facilitated by earlier detection of disease outbreaks, whether due to bioterrorism or naturally occurring pathogens. There currently exists a range of automated health systems data concerning, e.g., ambulatory care and emergency department visits, hospitalizations, diagnostic tests and pharmaceutical drugs; moreover, the availability of these data is likely to increase with greater use of health information technology. While such information could be invaluable for disease outbreak detection, not enough is known about the relative merits of different data sources for early detection of disease outbreaks. In this project we will evaluate and compare the efficacy of different health services data sources for early disease outbreak detection, including telephone inquiries, ambulatory care visits, emergency department visits, laboratory test requests and results, radiology tests, hospitalizations, drug prescriptions and drug dispensings. As test-beds we will use two large integrated health delivery systems (Harvard Pilgrim Health Care / Harvard Vanguard Medical Associates and Kaiser Permanente Northern California) with comprehensive electronic medical information on over four million persons. This means that we will have information about each health encounter data source for exactly the same well-defined population, which is critical for proper comparison. The data sources will be evaluated using all three statistical signal detection algorithms chosen by the BioSense Initiative, plus the space-time permutation scan statistic.
The latter automatically adjusts for any purely temporal and purely spatial variation in the data, so that the data comparison does not depend on our relative success at modeling that noise through statistical regression models for different data sources. The different data sources will be evaluated with respect to the number, timeliness, accuracy and precision of signals in four different ways: (i) total number of signals compared to the number expected under the null hypothesis of no outbreaks; (ii) concordance between signals and known disease outbreaks as defined by, e.g., local public health departments; (iii) confirmation or rejection of signals by looking at subsequent detailed health information for those individuals generating the signals; (iv) presence or absence of signals when the real data are spiked with simulated outbreaks.
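
As an illustration of evaluation approach (iv), the following minimal Python sketch adds a simulated outbreak to a series of daily counts and checks whether, and how quickly, a detector signals. Everything in it is an assumption made for illustration: the function names (spike, detect, detection_delay), the example counts, and in particular the detector, which is a simple moving-baseline threshold rule used as a stand-in and is neither one of the BioSense algorithms nor the space-time permutation scan statistic.

```python
import statistics


def spike(counts, start, extra_cases):
    """Return a copy of the daily counts with simulated outbreak cases added.

    extra_cases[i] is added to day start + i (hypothetical outbreak shape).
    """
    spiked = list(counts)
    for offset, extra in enumerate(extra_cases):
        if start + offset < len(spiked):
            spiked[start + offset] += extra
    return spiked


def detect(counts, window=7, k=3.0):
    """Flag days whose count exceeds mean + k * sd of the preceding window.

    Stand-in detector for illustration only. The standard deviation is
    floored at 1.0 so that a flat baseline does not make every small
    fluctuation look like a signal.
    """
    flags = []
    for day in range(window, len(counts)):
        history = counts[day - window:day]
        mean = statistics.mean(history)
        sd = max(statistics.stdev(history), 1.0)
        if counts[day] > mean + k * sd:
            flags.append(day)
    return flags


def detection_delay(flags, start, length):
    """Days from outbreak start to the first signal inside the outbreak,
    or None if the outbreak was missed."""
    hits = [day for day in flags if start <= day < start + length]
    return min(hits) - start if hits else None


# Made-up daily visit counts, not real data.
baseline = [10, 12, 11, 9, 10, 11, 10, 12, 9, 11, 10, 10, 11, 9]
outbreak = [2, 6, 15]  # extra cases on three consecutive days
spiked = spike(baseline, start=10, extra_cases=outbreak)

print("signals without outbreak:", detect(baseline))
print("signals with outbreak:", detect(spiked))
print("delay in days:", detection_delay(detect(spiked), 10, len(outbreak)))
```

Repeating this over many simulated outbreaks and over each data source would give comparable estimates of sensitivity and timeliness, while signals in the unspiked series relate to the count expected under the null hypothesis in approach (i).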