The "twilight zone" is a term coined by Burkhard Rost for remote protein homologs whose sequence similarity to proteins of known structure is so low that computational detection of the homology becomes difficult. We propose to construct new threading methods that extend protein structure prediction and sequence/structure alignment further into the twilight zone. We attack the problem on two fronts. First, we intend to extend the "wrapping" methods that we designed to solve two hard special cases of this problem to other SCOP superfamilies. This involves casting the pairwise dependencies in the wrapping portion of the energy function into the general framework of Markov Random Fields, while still allowing the methods to wrap multiple aligned sequences in order to narrow the search space, and then adopting a more sophisticated energy function based on a backbone-dependent rotamer library for side-chain packing. Second, our programs for the beta-helix and beta-trefoil folds relied on human intervention to construct the core structural templates onto which sequences are wrapped to predict whether they can fold into these structures. To construct a general threading program with reasonable fold-library coverage of the PDB, we must automate the construction of a structural template from a set of proteins that, for example, all belong to the same SCOP superfamily. Most current threading programs are trained too closely on the backbone of one particular structure to capture remote homologs. We propose a novel multiple structure alignment method that adds geometric flexibility to capture similarities between more distant homologs, and from which more general core templates can be abstracted. The applications of better computational protein structure prediction to speeding medical discovery are well known.
BetaWrap, our first beta-helix prediction program, has already uncovered a previously unknown relationship between the beta-helix fold and the virulence of microbial pathogens. Strikingly, BetaWrap predicts the beta-helix fold for many surface adhesins, toxins, and other recognition/penetration proteins of human pathogens. Our prediction that a major pollen allergen forms a beta-helix has recently been confirmed experimentally. PUBLIC HEALTH RELEVANCE: Advances in computational protein structure prediction can help guide the prediction of protein function and thus speed medical discovery. This proposal specifically targets improved prediction of beta-structural motifs, which include many protein families important for bacterial pathogenesis, with representatives ranging from whooping cough toxin (beta-helices) to botulism toxin (beta-trefoils).