Geographic data can be enormously beneficial for analyses. In studies of aging, for example, they can reveal areas where elderly people live in high densities; they can illuminate how environmental factors affect the health and quality of life of elderly people; and, through contextual data, they can yield insights into the social and economic conditions and lifestyle choices of the elderly. However, geographic variables are among the most challenging data to share when making a primary data source available to others. Fine geography enables ill-intentioned users to pinpoint the identities of individuals in the shared file. Thus, data collectors typically delete geographic identifiers or aggregate them to very coarse levels before sharing data. For example, both deletion and aggregation are applied to geography in the public use files of the Health and Retirement Study, and the Health Insurance Portability and Accountability Act requires that any geographic units on shared files comprise at least 20,000 people. These actions reduce the quality of analyses that depend on finer geographic detail, thereby sacrificing the benefits of using geography in analysis. We develop new methods to protect confidentiality in data with geographic identifiers. Our approach is to simulate values of geography and other identifying attributes, such as age, from statistical models that capture the spatial dependencies in the collected data. These simulated values replace the collected ones when sharing data. Partially simulated datasets can preserve confidentiality, since identification of units and their sensitive data is difficult when the geographies and other quasi-identifiers in the released data are not collected values. Moreover, when the simulation models faithfully reflect the relationships in the collected data, the shared data preserve spatial associations, avoid ecological inference problems, and provide details about the tails of distributions. We have three specific aims in this proposal.
First, using techniques from spatial modeling, we develop methods for simulating geographic variables conditional on attributes and for simulating attributes conditional on geography. Second, we apply our approach to a genuine dataset to evaluate the confidentiality protection and analytic utility of partially simulated data under three scenarios: only geography simulated, only non-geographic identifiers simulated, and both geographic and other identifiers simulated. Third, we compare our approach against aggregation techniques on the genuine dataset. Our long-term goal is to develop general-purpose methodology and publicly available software for sharing inference-valid, safe data that include finer details about geography than are currently released. This will provide statistical agencies, researchers, and other data producers with more and better options for data sharing than exist at present.
PUBLIC HEALTH RELEVANCE: This research has the potential to improve the way statistical agencies, research centers, individual researchers, and other data producers share data on aging, and more broadly any health or demographic data containing geography. Unlike existing approaches such as deletion and high-level aggregation, our approach promises to preserve fine geography and spatial relationships while protecting confidentiality. Ultimately, this enables secondary data analysts to make more and better inferences, leading to deeper understanding of public health.