From In Vitro to In Vivo AI Evaluation

Kawin Ethayarajh (Stanford University)

Colloquium

Tuesday, April 18, 2023, 3:30 pm

Abstract

AI can fail spectacularly in the wild despite passing evaluation in the lab. Why? The in vitro evaluation done in research is divorced from the complexity of the real world in which AI models are eventually deployed. This has had catastrophic effects, including systemic discrimination and the loss of billions of dollars. By making interdisciplinary connections, Kawin's work develops in vivo evaluation paradigms for AI that bridge the gap between research and reality. In this talk, he will discuss (1) how we can create datasets that are as difficult as the underlying tasks we want to solve; (2) how we can measure and incorporate more context into the representations of AI models; and (3) why it is important to track the hidden costs of making predictions. With in vivo evaluation, we can be better assured that the progress we make on paper translates to progress in the real world.

Bio

Kawin Ethayarajh is a Ph.D. candidate in Computer Science at Stanford University, advised by Dan Jurafsky. His work makes interdisciplinary connections to bring realism to how AI is evaluated in the lab, a setting that has traditionally been much simpler than the real world in which AI is eventually deployed. He was awarded a Facebook Fellowship in 2021 and an NSERC PGS-D scholarship in 2019, and his work has received an Outstanding Paper Award at ICML 2022 and a Best Paper Award at Repl4NLP-ACL 2018. Prior to Stanford, he graduated from the University of Toronto with a B.Sc. and M.Sc. in Computer Science, and he has spent time at Google and the Allen Institute for AI.