The 7th David Finney Lecture (2023)
We are excited to announce that the 7th David Finney Lecture will be presented by Bin Yu.
Title: Veridical data science with a case study to seek genetic drivers of a heart disease
Abstract:
"AI is like nuclear energy–both promising and dangerous." Bill Gates, 2019
Data Science is a pillar of AI and has driven most of the recent cutting-edge discoveries in biomedical research and beyond. Human judgment calls are ubiquitous at every step of a data science life cycle, e.g., in problem formulation and in choosing data cleaning methods, predictive algorithms and data perturbations. Such judgment calls are often responsible for the "dangers" of AI.
To mitigate these dangers, we introduce in this talk a framework based on three core principles: Predictability, Computability and Stability (PCS). The PCS framework unifies and expands on the ideas and best practices of statistics and machine learning. It emphasizes reality checks through predictability and takes full account of the sources of uncertainty in the whole data science life cycle, including those from human judgment calls such as those in data curation/cleaning. PCS consists of a workflow and documentation and is supported by our software package veridical or v-flow. Moreover, we illustrate the usefulness of PCS in the development of iterative random forests (iRF) for predictable and stable non-linear interaction discovery. Finally, in the pursuit of genetic drivers of a heart disease called hypertrophic cardiomyopathy (HCM) as a CZ Biohub collaborative project, we use iRF and UK Biobank data to recommend gene-gene interaction targets for knock-down experiments. We then analyze the experimental data to show promising findings about genetic drivers for HCM.
Bin Yu (University of California, Berkeley)
Bin Yu is Chancellor's Distinguished Professor and Class of 1936 Second Chair in Statistics, EECS, and Computational Biology at UC Berkeley. Her research focuses on statistical machine learning practice and theory and interdisciplinary data problems in neuroscience, genomics, and precision medicine. She and her team developed, in context, iterative random forests (iRF), hierarchical shrinkage (HS) for decision trees, Fast Interpretable Greedy-Tree Sums (FIGS), stability-driven NMF (staNMF), and adaptive wavelet distillation (AWD) from deep learning models. She is a member of the National Academy of Sciences and the American Academy of Arts and Sciences. She was a Guggenheim Fellow, Tukey Memorial Lecturer of the Bernoulli Society, and IMS Rietz Lecturer, and won the COPSS E. L. Scott Award. She is to deliver the IMS Wald Lectures and the COPSS DAAL (formerly Fisher) Lecture at JSM in August 2023. She holds an Honorary Doctorate from the University of Lausanne. She served on the inaugural scientific advisory board of the UK Turing Institute of Data Science and AI and is serving on the editorial board of the Proceedings of the National Academy of Sciences (PNAS).
Schedule
Wednesday 3rd May
2:30pm - 4:00pm: 7th David Finney Lecture, Larch Lecture Theatre, Nucleus Building (King's Buildings)
4:00pm - 5:00pm: Drinks reception, Magnet Cafe, JCMB (King's Buildings)