PROJECT SUMMARY Proteins are capable of performing a wide variety of complex molecular functions and play a central role in all biological processes. A detailed and quantitative understanding of the relationship between a protein's sequence and it's biochemical properties would have a profound impact across all areas of biology, medicine, and biotechnology. We are developing data-driven approaches for dissecting the molecular basis of protein function. Our general framework involves designing informative libraries of protein sequences, experimentally mapping the relationship between sequence and function, and extracting detailed functional information from large sequence-function data sets. This work leverages emerging technologies and methods in DNA sequencing and synthesis, microfluidic screening, large-scale statistical learning, and optimization. We will develop generalizable platforms that can be applied to study a wide variety of enzymes and membrane transport proteins.