DESCRIPTION: Recent developments in automated techniques for DNA sequencing have led to an explosion of information on the complete genome sequences of several organisms. Entire genomic sequences of 11 microorganisms are now available, and the genomes of almost three dozen additional organisms will soon be completed. These revolutionary data have stimulated massive research efforts from the basic science, medical, and biotechnology communities, focused on determining the essential complement of genetic information and the functional attributes of an organism that are needed to sustain life. A striking observation made as each organism's genome is analyzed is that almost one third of the putative open reading frames, although conserved among several organisms, encode hypothetical proteins of no known function. The unknown physiological functions of these protein products represent a major gap in our understanding of the full complement of genetic information needed for the viability of any free-living organism. The overall goal of this research program is to elucidate the function of roughly 50 proteins from a set of 65 bona fide hypothetical proteins from Haemophilus influenzae, an organism of moderate genome size, by determining their high-resolution atomic structures. The first step is to develop high-throughput methodology for subcloning the open reading frames, which have already been screened for expressible polypeptides, into high-level expression vectors to optimize the yields of soluble protein products. The second step is to develop efficient protocols for high-yield purification of native proteins of suitable quality and in sufficient quantities to begin large-scale crystallization studies. The final step is to characterize the targeted proteins in terms of their quaternary structure and their solubility and stability in solution.
These data will permit protein targets to be identified for structure determination by NMR methods, provide clues for the crystallization of challenging proteins, and yield data on the physical and chemical properties of the hypothetical proteins that will be useful for functional determinations.