There is growing interest in how knowledge can be effectively and efficiently extracted from large collections of healthcare data. In today's healthcare environment, large clinical and administration databases have been growing, facilitated by the proliferation of information systems. Those data have been actively used for healthcare research, with an emphasis on studying access to health care, costs, and quality of care. However, there have been limitations in working with these data that are known to be very large and complex and that easily exceed the capabilities of traditional analytic approaches and human cognition. Knowledge Discovery in Large Databases (KDD) by using data mining techniques is an emerging approach to the development of knowledge bases and prediction models in large collections of healthcare data. A Bayesian network is one promising data mining technique with a high predictive capacity for prediction problems. The Bayesian network has been successfully applied to diagnoses and prediction of prognoses in the medical domain. The purpose of this study is to explore the potential application of Bayesian networks to health outcomes by predicting outcomes using large healthcare databases. One area of particular interest to researchers, clinicians, and administrators is developing a better understanding of access to care. Sub-populations of HIV positive individuals in the U.S. are known to have decreased access to and use of care, which may contribute to increasing complications and mortality. The HIV Cost and Services Utilization Study (HCSUS) dataset has been used to examine access using traditional statistical modeling techniques, such as logistic regression. However, these techniques have limitations, particularly in untangling the complexity of predicting health service utilization. The specific aims of this study are to: 1) develop Bayesian network models predicting limited health service utilization using the HCSUS data; and 2) validate Bayesian network models by comparing them to the logistic regression models built in a previously published study. This study will provide researchers new insight into working with large datasets. The repeated testing and refinement of Bayesian network modeling techniques, particularly as they are applied to other health outcomes in large databases, contributes to knowledge discovery and promotes more sophisticated analyses.