A New Chapter for Data Science
From new models for tracking inheritable diseases to unprecedented accuracy in forecasting catastrophic drought, Columbia’s third annual Data Science Day showcased powerful illustrations of how researchers across campus help us better understand our world through data.
In a series of lightning talks at the March 28 summit hosted by Columbia’s Data Science Institute (DSI), institute members discussed the impact of increasingly rich and robust data sets on fields spanning medicine, epidemiology, climate science, finance, and law. Among the presentations, Garud Iyengar, chair of Columbia Engineering’s Department of Industrial Engineering and Operations Research, moderated a conversation about how analysts are using environmental data to measure and anticipate financial risk from extreme weather and variable agricultural yields. Junfeng Yang, a professor of computer science, explored vulnerabilities in the machine learning systems that are increasingly relied upon to track disease and malware.
David Blei, a computer scientist and statistician, explained how sophisticated probabilistic machine learning can now incorporate vast multimodal troves of data to make assumptions and discover patterns.
“We want to make sense of complicated data in ways that are predictive, exploratory, observational, and hopefully easily scalable,” Blei said. “The goal is to build a model of process.”
The day also included a sneak peek into a near future driven by increasingly autonomous AI.
We are on the cusp of an unparalleled explosion of data, said keynote speaker Diane Greene, CEO of Google Cloud, one that will power ever smarter machine learning across areas like business, healthcare, energy, and transportation. Already, a profound transformation is taking place, she argued, as data sets move from cumbersome and siloed human processing to more streamlined real-time generation by machines and algorithms.
“What we’ve seen in the cloud when you bring together data from different silos is that suddenly people become much more curious because the effort to ask questions goes way down,” Greene said.
Afterward, Jeannette Wing, Avanessians Director of DSI and professor of computer science, sat down with Columbia University President and noted First Amendment scholar Lee Bollinger to talk about the challenges that big data and groups like WikiLeaks pose to traditional conceptions of free speech and government secrecy.
The full-day event also featured student research and celebrated DSI’s fifth birthday. Columbia Engineering Dean Mary Boyce and Dean of the Faculty of Arts and Sciences David Madigan joined Wing on stage to announce a set of annual awards named for the institute’s founders: each year the top master’s candidate in data science will receive the Culligan Academic Achievement Award, named for Professor of Civil Engineering and Engineering Mechanics Patricia J. Culligan, while an outstanding research project will receive the McKeown Research Award, named for Henry and Gertrude Rothschild Professor of Computer Science Kathy McKeown.
“DSI has made a really remarkable impact,” Boyce said, “and become a great pathway for students and faculty.”
— Jesse Adams, Data Science Institute