ABSTRACT Next-generation genome-scale sequencing of patients is now becoming routine for two classes of disease: rare Mendelian traits and cancer. In favorable cases, these data allow identification of relevant mutations and thus aid diagnosis and therapy. In both classes of disease, the most common type of mutation is missense: single-base changes that result in an amino acid substitution in a protein. Uncertainty as to the impact of these mutations on in vivo protein activity has resulted in a very conservative approach to their interpretation in the clinic, so causing many missed opportunities for targeted treatment. The goal of this project is to use a combination of three strategies to make the interpretation of these mutations much more applicable in the clinic. There are already a large number of computational methods that attempt to determine the impact of missense mutations on function, and there is substantial evidence that these have useful accuracy. The primary difficulty is that the accuracy in any particular case is not reliably calibrated. Therefore, our first aim is to use a combination of these methods to develop an approach focused on more reliable estimates for the probability of high impact on protein function (i.e. more confident P values). The second aim is to maximize the utilization of three-dimensional structural information, largely ignored by most computational methods. A large fraction of missense mutations in these classes of disease act by destabilizing protein structure, and knowledge of structure allows these to be identified with much higher reliability. Also, structure provides a framework for detailed annotation and comprehension of function. To facilitate the utilization of structure, we will implement a modeling platform that leverages available experimental information to maximize the structural data available for analyzing mutation impact.
An important aspect of the platform is incorporation of methods for evaluating the reliability of the structural features relevant to analysis of each mutation. In the third aim we will build specific functional models for each protein of interest, integrating information from current databases, the literature, and community input, so as to provide the richest possible background against which to judge the impact of mutations. Proteopedia, a well-established mediawiki for proteins, will be used to provide an integrated view of text, data, and structure. A key component of the information resource will be contributions from curators, who will provide annotation and also solicit input from other experts. This aspect of the project builds on experience with other crowdsourcing endeavors, including CASP, CAGI and Proteopedia. There will be three primary outcomes from the project: First, improved reliability for the interpretation of missense mutations. Second, a prototype mutation annotation procedure suitable for use in a clinical setting. Third, the resource will provide information of benefit to a range of other scientists, thus facilitating the analysis of disease-related mutations.