Health information technology has enabled the healthcare community to store and share large amounts of health and healthcare data electronically. While secondary use of this data has significantly enhanced the quality and efficiency of medical and healthcare research, there is growing concern about the privacy risks created by such use of personal data. In response to this challenge, the goal of this research is to develop and test a novel data-masking technology that healthcare organizations can use to prevent or limit privacy disclosure when sharing patient data for research. To protect patient privacy, the Health Insurance Portability and Accountability Act (HIPAA) has established a set of rules specifying what information cannot be released to a third party. However, studies have shown that the HIPAA rules lack the flexibility to adequately meet the diverse needs of data users; they can be under-protective in some cases and over-protective in others. Recognizing this limitation, HIPAA also provides guidelines that enable a scientific assessment of privacy disclosure risk to determine whether the data is appropriate for release. This research focuses on this aspect of HIPAA and related topics. The specific aims of this research are: (1) to identify weaknesses in the HIPAA rule-based privacy protection mechanism and demonstrate them using data available to users with different access levels; (2) to propose metrics for assessing and quantifying privacy disclosure risk and data utility; (3) to develop methods and techniques for privacy protection when sharing and disseminating data; and (4) to conduct experiments to evaluate the aforementioned risk and utility metrics and data-masking techniques. The proposing team has identified an effective technique for systematically compromising data privacy. This provides a basis for a more thorough study to achieve specific aim 1.
Methods grounded in statistics and information theory will be employed to construct the metrics for specific aim 2. The data-masking approach for specific aim 3 employs an innovative divide-and-counter strategy, which first partitions data into subsets and then masks the data within each subset. The experimental design for specific aim 4 involves performance evaluations in terms of disclosure risk, data utility, and computational scalability, using three categories of data: clinical data, Medicare claims, and publicly available personal data. This research is highly relevant to the mission of NIH. By adequately protecting privacy, the proposed technology will alleviate concerns about loss of participant confidentiality and improve the quality and efficiency of research based on secondary use of data. This will greatly help design and develop "programs for the collection, dissemination, and exchange of information in medicine and health," thereby achieving NIH's goal to "expand the knowledge base in medical and associated sciences." This research will also offer valuable insights for policy makers to assess the tradeoff between privacy protection and data sharing and analysis.
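
The partition-then-mask idea described for specific aim 3 can be illustrated with a minimal sketch. This is not the proposed method, whose details the abstract does not give. It substitutes a standard technique, microaggregation, in which records are grouped by a numeric attribute and each value is replaced by its group mean. The function names, the `age` attribute, and the group size are all hypothetical choices made for illustration.

```python
from statistics import mean

def partition(records, key, size):
    """Sort records by one numeric attribute and cut them into groups of at least `size`."""
    ordered = sorted(records, key=lambda r: r[key])
    groups = [ordered[i:i + size] for i in range(0, len(ordered), size)]
    # Fold a short trailing group into the previous one so no group is smaller than `size`.
    if len(groups) > 1 and len(groups[-1]) < size:
        groups[-2].extend(groups.pop())
    return groups

def mask(groups, key):
    """Replace each value of `key` with its group mean (microaggregation, an assumed stand-in)."""
    masked = []
    for group in groups:
        avg = mean(r[key] for r in group)
        masked.extend({**r, key: avg} for r in group)
    return masked

def partition_and_mask(records, key, size):
    """Partition first, then mask within each subset."""
    return mask(partition(records, key, size), key)

# Hypothetical toy data: seven records with an `age` attribute.
records = [{"id": i, "age": a} for i, a in enumerate([23, 25, 31, 47, 52, 58, 90])]
masked = partition_and_mask(records, "age", 3)
```

In this sketch every masked value is shared by at least `size` records, which limits how precisely an individual can be singled out, while the overall mean of the attribute is preserved. This is one simple way to see the tradeoff between disclosure risk and data utility that the abstract refers to.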