The availability of a reference human genome sequence provides the foundation for studying the genetic and molecular basis of human health and disease. Using a variety of modern technologies to study differential gene expression, many researchers in both academic and industrial settings investigate how the activation or repression of particular genes and gene products contributes to the onset and progression of disease. Bioinformatic and computational approaches play a critical role in such studies. NewLink Genetics Corporation is a biopharmaceutical company developing novel drugs and functional genomics solutions with a primary focus on cancer. The company has its own proprietary technology for target gene discovery. To facilitate its in-house bioinformatic analysis, NewLink has licensed software for gene structure prediction ("GeneSeqer") from Iowa State University and software for genome-scale sequence matching ("Vmatch") from Prof. Stefan Kurtz, University of Hamburg, Germany. These programs are key components of NewLink's internal target-discovery bioinformatics pipeline. Vmatch is a versatile string matching application based on enhanced suffix arrays. Recognizing that hundreds of other companies have similar bioinformatics needs, NewLink acquired the exclusive distribution and development rights to Vmatch and is now licensing the program to for-profit customers (an academic version of the program remains freely available to non-profit researchers). This SBIR proposal seeks funds to develop VmatchNL, a GUI-driven application built on the Vmatch software. The overall goal of our business is to integrate all of this software into a GUI-driven, industry-quality, comprehensive platform for target gene discovery ("TargetSeqerNL").
The market for these products includes small companies that lack the resources to develop similar capabilities in-house, large companies seeking the superior performance of GeneSeqer and Vmatch relative to other programs, and academic researchers with similar needs to organize and analyze their gene expression data.