We plan to provide bioinformatics support to the Human Genome Project and to work with the Whitehead Institute for Biomedical Research/MIT Center for Genome Research on comparisons between the human genome and other vertebrate genomes. We will continue to supply monthly assemblies of the currently available human genome sequence, which we have provided since May 2000. Along with each assembly we will provide an integrated database of links to ESTs, mRNAs, SNPs, gene predictions, markers, isochores, contigs, gaps, clones, and homologies with other vertebrates, all mapped to specific positions on the assembled genome and viewable side by side on the web through an interactive browser. This web browser will facilitate the use of the primary working draft and finished human genome data, stored by NCBI and EBI, by medical and scientific researchers worldwide. In addition to our support role, we propose to develop new methods that use comparative genomics, computational algorithms, and microarrays to uncover the structure and function of human genes. Sequence homologies with the genes of other vertebrates, combined with EST and mRNA data, will be used to determine gene structure, alternative splicing, and regulatory motifs for human genes. Custom microarrays will be built with specific oligos designed to confirm splicing and regulation for the most medically important genes. Hidden Markov models will be built to model the domain structure of the protein products and to suggest functional classification by homology. Expression data will be combined with predicted structural features and data from external sources to produce the most comprehensive computational and experimental classification possible. The centerpiece of the proposal is a 1000-CPU computer cluster, funded jointly by the Howard Hughes Medical Institute and through this award from NHGRI, which will enable us to perform the necessary genome-wide computations. 
The other financial component of the proposal is staff and student support. Funding from the Sloan Foundation and internal funding from the UCSC campus will also support the development of M.S. and Ph.D. programs in bioinformatics. By focusing funds from several sources, we will be able to train a new generation of scientists, explore new research methodologies, and provide a vital service to the Human Genome Project.