The diversity of all living organisms is encoded within their DNA, where it is stably maintained through DNA replication, and then retrieved through transcription into RNA and translation into proteins. Thus, the diversity of life is limited by the four natural nucleotides that comprise DNA and RNA and the twenty natural amino acids that they encode. Drawn by the potential conceptual and practical ramifications, chemists and biologists have been fascinated by the idea of expanding the genetic alphabet (DNA and RNA) as well as the genetic code (proteins). This requires an unnatural base pair that is replicated within DNA, transcribed into RNA, and then able to direct the incorporation of an unnatural amino acid into a protein during translation. Expansion of the genetic alphabet/code would make available DNA, RNA, and proteins that are tailored to possess novel properties, just as these biopolymers are receiving increased attention for applications ranging from novel materials to therapeutics. The ability to synthesize or evolve these biopolymers with desired activities outside the scope encoded by their natural constituents promises to greatly increase their potential applications. Expansion of the genetic alphabet/code would also lay the foundation for the first semi-synthetic organism, able to store increased information in its DNA and retrieve it in the form of novel proteins. A survey of the literature demonstrates that the expansion of the genetic alphabet/code, at least in vitro, is currently limited by the absence of a functional unnatural base pair that is replicated and transcribed with sufficient efficiency and fidelity. In the previous funding period we developed the first unnatural base pair, d5SICS:dNaM, that is replicated and transcribed with an efficiency and fidelity that are beginning to approach those of a natural base pair.
Molecular recognition within this pair is based not on complementary hydrogen bonding, as with the natural base pairs, but on complementary hydrophobic and packing forces, more akin to the forces at work in proteins. We also developed a polymerase selection system capable of tailoring the enzymes responsible for the replication of DNA to better recognize the unnatural base pair. With these tools in hand, we now propose to use a variety of synthetic and biological methods to: 1) Characterize structural determinants of d5SICS:dNaM stability, replication, and transcription; 2) Optimize d5SICS:dNaM for natural-like replication and transcription; and 3) Develop in vitro applications of an expanded genetic alphabet. The completion of these aims should produce an expanded genetic alphabet/code with biophysical and biomedical utility, as well as lay the foundation for a living organism with a semi-synthetic genome.

Figure 1. The d5SICS:dNaM base pair