This is the second annual report for the Machine Learning Team, and covers our first full year of operation. The team comprised Francisco Pereira, Charles Zheng, and Patrick McClure until June 2019, when Yenho Chen (postbac fellow) joined. We hosted Jessica Huang as our summer intern.

Research activity:

We continued research projects initiated in the previous year, bringing them to publication (papers in press or in preparation):

1) Martin Hebart and Chris Baker - Discover interpretable mental representations of objects from a large database of behavioral judgments, which can be used to predict human behavior on a variety of other tasks.
2) Emalie McMahon and Maryam Vaziri-Pashkam - Detect which parts of an optic flow image contained information about a decision the participant was making, and at what points during a trial that information was present.
3) Ana Inacio and Soohyun Lee - Predict various aspects of mouse behavior from calcium imaging, and use the prediction models to identify informative cells for experimentation.
4) Hanna Keren and Argyris Stringaris - Develop models to predict mood during a gambling task, based on participant characteristics, trial parameters, and experiment history.
5) Bob Cox - Develop a neural network that predicts the FreeSurfer segmentation of a brain image in minutes, with estimates of segmentation uncertainty.

In addition to these, we have ongoing research projects with Armin Raznahan, Bevil Conway, Zheng Li, and Peter Bandettini, none of which are close to publication yet. Finally, we have three major team-initiated research projects, one of which was concluded and published this year: Distributed Weight Consolidation. Its goal was to make it possible to train neural networks in situations where datasets cannot be made publicly available, for confidentiality, privacy, or other regulatory reasons.
We developed a method that allows a starter neural network to be trained for a particular prediction purpose, distributed to various sites, and specialized to make the same prediction from each site's data. The resulting networks can then be consolidated back into a single, better network, without ever requiring direct access to the data.

Consultation/Service activity

We carried out several service projects (those requiring only a standard machine learning technique to solve a practical problem):

1) Mark Histed - Altered existing code for simultaneous cell segmentation and deconvolution in calcium imaging to compensate for optogenetic stimulation.
2) Chris Baker - Accelerated training of standard image processing neural networks on very large image sets, reducing training time from weeks to less than a day.
3) Zheng Li - Developed a new approach for sorting tetrode waveforms, for use in situations where current automation fails; it is 10X faster than manual sorting.
4) Eli Merriam - Modified searchlight classification code to work on cortical meshes instead of volumes, and to run fast enough to handle more than 1 million voxels.
5) Peter Bandettini - Converted a 150-session functional imaging dataset from the IARPA Knowledge Representation in Neural Systems program to BIDS format.
6) Ben White - Helped improve the performance of deep learning code, and provided a working example of how to run it on the GPU resources in Biowulf.
7) Michal Ramot - Automated the creation of visual saliency maps on video stimuli.

In addition to these, we have ongoing service projects with Zheng Li, Soohyun Lee, and Maryam Vaziri-Pashkam. The remaining service activities were primarily ad-hoc consultations on machine learning methods (which can take hours to days if they require reading articles or finding/testing code), co-advising of postbac and postdoc trainees (days to weeks), or helping select and interview candidates by gauging their computational skills (hours).
In addition to the PIs mentioned above, we provided consulting to people in the groups of Drs. Zarate, Berman, Afraz, Pine, Atlas, Usdin, and Innis, as well as groups in NINDS, NCCIH, NHGRI, NIBIB, NCBI, and NCI.

We also engaged in software development, in response to requests from researchers in various groups. All three projects are in use, though still works in progress (the first will be released soon; the other two are still in development):

1) Searchlight classification/regression models - the purpose is to produce maps of where certain information is present in brain imaging data, as measured by the ability to predict it from new images, fast enough that the area around each voxel can be considered, even when there are millions of voxels.
2) Common latent variable models - the purpose is to wrap several different methods for identifying brain activation that corresponds across participants exposed to the same stimuli, and to automatically select parameter values from the data, making the methods easier to use and more reliable.
3) Agglomerative clustering - the purpose is to generate a brain parcellation in a completely data-driven way, by grouping adjacent voxels with similar behavior, and using splits of the data -- within and between participants -- to determine which parcels are stable.

In all three cases, our goal is to have the methods run fast enough to make it feasible to use them on very high-resolution functional MRI data from a 7 Tesla scanner.
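To give a sense of the searchlight idea described above, here is a minimal sketch of a volumetric searchlight in Python. The function name, the cubic neighborhood, and the choice of logistic regression are illustrative assumptions, not our actual implementation (which is optimized for speed and scale):

```python
# Illustrative searchlight sketch: at each in-brain voxel, ask how well a
# classifier can predict the labels from the voxel's local neighborhood.
# This is a simplified example, not the team's production code.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def searchlight_map(data, labels, mask, radius=2):
    """data: (n_images, X, Y, Z) array; labels: (n_images,);
    mask: (X, Y, Z) boolean array of in-brain voxels.
    Returns an (X, Y, Z) map of cross-validated accuracy."""
    acc_map = np.zeros(mask.shape)
    for x, y, z in np.argwhere(mask):
        # cubic neighborhood around the center voxel, clipped at the edges
        xs = slice(max(x - radius, 0), x + radius + 1)
        ys = slice(max(y - radius, 0), y + radius + 1)
        zs = slice(max(z - radius, 0), z + radius + 1)
        neigh_mask = mask[xs, ys, zs]
        # features: one column per in-brain voxel in the neighborhood
        features = data[:, xs, ys, zs][:, neigh_mask]
        # cross-validated accuracy of predicting labels from this neighborhood
        scores = cross_val_score(LogisticRegression(max_iter=1000),
                                 features, labels, cv=3)
        acc_map[x, y, z] = scores.mean()
    return acc_map
```

A naive loop like this is far too slow for millions of voxels; the point of our software project is precisely to make this computation fast enough for whole-brain, high-resolution data.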
Education and Training

We hosted the following talks as part of the Machine Learning in Brain Imaging and Neuroscience seminar:

- 9/4 - Tim Kietzmann - Recurrence required to capture the dynamic computations of the human ventral visual stream
- 4/29 - Irina Rish - AI and Neuroscience: Bridging the Gap
- 5/6 - Janice Chen - Brain dynamics underlying memory for continuous natural events
- 7/1 - Per Sederberg - Quantifying Cognition with an Experiment and Computational Modeling Ecosystem
- 7/29 - Alex Huth - Mapping Representations of Language Semantics in Human Cortex

We gave invited presentations at various academic institutions (Pennsylvania State University, Cognitive Psychology Conference 2018, University of Maryland College Park, University of Louisville).

After surveying PIs and trainees on their research challenges and the sorts of training activities that might provide them with tools to face those challenges, we decided to prepare two training activities:

1) Introduction to machine learning on functional MRI data. The goal of this training activity is to introduce attendees to the basic concepts and vocabulary of machine learning techniques, and to how those techniques have been used to answer scientific questions in cognitive neuroscience. We have tested the training at Massachusetts General Hospital (MGH), with three groups at NCCIH, and with individuals in NIMH, and plan to make it widely available during the fall.

2) Introduction to deep learning for computational neuroscience research. We are developing this training activity with feedback from Chris Baker's group, given that we already have relevant project collaborations with them.

In addition to these, we also provided ad-hoc training lectures to other NIH Institutes:

- deep learning Q&A at the NCBI Deep Learning Journal Club
- deep unsupervised learning at the NIH Image Segmentation Journal Club
- distributed weight consolidation for deep learning at the Cancer Data Science Laboratory, NCI