The large and growing size of the healthcare system makes it imperative to understand what is happening to us, the recipients of healthcare, to conduct research efficiently to improve healthcare delivery and to improve the state of biomedicine by advancing its science. i2b2, "Informatics for Integrating Biology and the Bedside," seeks to provide this instrumentation using the informational byproducts of healthcare and the biological materials accumulated through the delivery of healthcare. This complements existing efforts to create prospective cohort studies or trials outside the delivery of routine healthcare. In the first round of i2b2, we demonstrated that we could identify known adverse events and phenotypically select and then genotype patients for genetic association at approximately 1/10th of the price and in less than 1/10th of the time usually entailed to develop such populations for study. The next methodological challenge we have set ourselves in i2b2 is the development of Virtual Cohort Studies (VCS), which encompass the population of a healthcare system as study subjects and ask questions of comparative effectiveness, unforeseen adverse events and identification of clinically relevant subpopulations, including both clinical and genome-scale measures. We will compare the results of the VCS to those of carefully planned and executed cohort studies such as the Framingham Heart Study. VCS will require multiple methodological advances and tool development, including in the disciplines of natural language processing, temporal reasoning, predictive modeling, biostatistics and machine learning. VCS methods will be tested by two driving biology projects, the first studying a collection of autoimmune diseases and the second studying type 2 diabetes.
In both projects, VCS methods will be applied to investigate the components of cardiovascular risk, from the genetic to the epigenetic, across the full range of clinical history, including medication exposure. A systems/integrative approach will be taken to identify commonalities in these risk profiles across these disparate disease domains. VCS methods will be shared with the i2b2 user community under open source governance, while i2b2 user community contributions are folded into the i2b2 toolkit.