The Human Genome Project faces many difficult technical challenges. Among these are automatically recognizing coding regions in uncharacterized DNA sequences, assembling contig maps from YAC/STS content information, and other problems of knowledge discovery and database organization. The goal of this Special Emphasis Research Career Award is to study these and other problems that pose informatics and computational challenges, taking advantage of the candidate's experience as a computer scientist. The goals of the training program are to: (1) obtain a stronger background in basic molecular biology and genetics, (2) learn the fundamental laboratory techniques used for genome sequencing experiments, so as to better understand the problems actually faced by biologists in the lab, (3) become familiar with the technical details of the major genomic databases, including GenBank and the Genome Data Base (GDB), and (4) develop a long-term project focused on bioinformatics, combining the candidate's new biological knowledge with his background as a computer scientist. In particular, as part of (4) he will significantly extend his previous work on finding coding regions in uncharacterized human DNA and build a general gene-finding system to be distributed to the genomic research community. The candidate also plans to develop new projects for the analysis and management of DNA mapping data, for which the training under goal (3) will be invaluable. The project will begin with formal courses and laboratory work under the guidance of the distinguished molecular biologist Hamilton Smith, who will serve as the candidate's advisor. This will be followed by immersion in the activities of the Genome Data Base group at Johns Hopkins, where the candidate will have access to the latest computational resources for genomics research.
The training and experience during the NCHGR/SERCA grant period will provide an ideal basis upon which to build a new research program in computational biology.