Jesús Escudero
Artificial intelligence, one of science’s crowning achievements, is poised to come full circle to revolutionize science itself.
Indeed, some AI systems have already tackled vexing scientific problems, like predicting the structure of proteins (an advance that was recently recognized with a Nobel Prize) and discovering novel mathematical algorithms. With the technology rapidly advancing, there is every reason to believe that its impact will only grow over time.
“I’d argue the central question today in AI is about whether AI can make new scientific discoveries, which is commonly viewed as the crucial milestone toward artificial general intelligence,” says Dashun Wang, a professor of management and organizations at Kellogg, where he also directs the Center for Science of Science and Innovation (CSSI) and codirects the Ryan Institute on Complexity.
Given previous advances and the promise of more to come, Wang and Jian Gao, a research assistant professor at CSSI, wanted to better understand how AI is benefiting science today, how it will benefit science in the future, and whether the educational system is adequately training the next generation of scientists to take advantage of this new opportunity.
After analyzing tens of millions of research papers, Gao and Wang provide the first quantitative answers to these questions.
They find that, since 2015, AI’s influence has indeed spread to nearly every area of science—from biology and chemistry to geology and physics. Many researchers who employ AI techniques also enjoy a “citation premium,” meaning that their papers become more influential among their peers.
But Gao and Wang also find that the benefits of AI aren’t evenly distributed: they are lower in disciplines with a higher share of female and minority researchers.
And perhaps most pressingly, with the exception of a few disciplines, there is a sizeable gap between how well-trained a discipline’s scientists are to use AI in their work and the potential benefit of AI in that discipline.
“This is the crucial insight of the paper—the misalignment in terms of supply and demand for AI talents across the disciplines,” says Wang.
To measure AI’s use and benefits across the sciences, Gao and Wang analyzed a massive dataset containing the titles and abstracts of nearly 75 million academic papers from 19 disciplines and 292 fields, published between 1960 and 2019.
First, the researchers used the data to establish a broad definition of what “AI” means to working scientists. Within the discipline of computer science, Gao and Wang identified five AI subfields—machine learning, artificial intelligence, computer vision, natural language processing, and pattern recognition. From this subset of AI papers, the researchers extracted the most frequently used key phrases that corresponded to specific AI techniques (such as “supervised learning,” “word embedding,” and “generative adversarial network”). Then the researchers searched the full publication dataset—that is, all the papers published in each discipline and field—to see where and how often the AI-related phrases, or “n-grams,” showed up.
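The phrase-matching step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' actual pipeline: the phrase list holds just the three examples quoted in the article, and the helper names (`normalize`, `mentions_ai`, `ai_share_by_discipline`) and the "share of papers mentioning any phrase" score are hypothetical simplifications.

```python
import re
from collections import defaultdict

# Assumption: a tiny stand-in for the extracted phrase list; these three
# phrases are the examples quoted in the article.
AI_PHRASES = ["supervised learning", "word embedding", "generative adversarial network"]

def normalize(text):
    """Lowercase the text and reduce it to space-separated alphanumeric tokens."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))

def mentions_ai(text, phrases=AI_PHRASES):
    """Return True if any phrase occurs as a whole-token sequence in the text."""
    padded = f" {normalize(text)} "
    return any(f" {normalize(p)} " in padded for p in phrases)

def ai_share_by_discipline(papers):
    """papers: iterable of (discipline, title_and_abstract) pairs.

    Returns {discipline: share of its papers mentioning at least one phrase}.
    """
    totals = defaultdict(int)
    matches = defaultdict(int)
    for discipline, text in papers:
        totals[discipline] += 1
        matches[discipline] += mentions_ai(text)  # bool counts as 0 or 1
    return {d: matches[d] / totals[d] for d in totals}

sample = [
    ("biology", "We use supervised learning to identify genes."),
    ("biology", "A field survey of orchids."),
    ("physics", "Word-embedding models of spectra"),
]
shares = ai_share_by_discipline(sample)
```

Matching on whole tokens keeps "unsupervised learning" from counting as "supervised learning", but it also misses plurals such as "word embeddings"; a real phrase list would need to handle such variants.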
Their analysis showed that while the use of AI in science has been steadily rising for the past two decades, a hockey-stick-like “takeoff” began in many disciplines around 2015. (Perhaps not coincidentally, that was the year Nature published an influential review article on deep learning by three pioneering AI researchers, and AI algorithms surpassed human-level performance on ImageNet classification.)
From 2015 to 2019, the direct AI use scores in physics, engineering, geology, and psychology papers each increased by 24 percent compared with a hypothetical control. Other disciplines, from biology and economics to materials science and sociology, also saw increases ranging from 10 to 30 percent. The researchers also found that papers mentioning AI n-grams were roughly twice as likely to be a “hit” within their respective fields (defined as being in the top 5 percent of total citations among papers within the same field and year).
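The “hit” definition above (top 5 percent of citations within the same field and year) lends itself to a short worked example. The sketch below is a simplified illustration, not the study's code: the record fields (`field`, `year`, `citations`, `uses_ai`), the function names, and the tie-handling rule are all assumptions.

```python
from collections import defaultdict

def flag_hits(papers, top_percent=5):
    """Flag each paper as a hit if fewer than top_percent of the papers in its
    (field, year) group have strictly more citations. Integer arithmetic
    avoids floating-point edge cases at the cutoff."""
    groups = defaultdict(list)
    for p in papers:
        groups[(p["field"], p["year"])].append(p["citations"])
    flags = []
    for p in papers:
        cites = groups[(p["field"], p["year"])]
        n_greater = sum(c > p["citations"] for c in cites)
        flags.append(n_greater * 100 < top_percent * len(cites))
    return flags

def hit_rate_ratio(papers, top_percent=5):
    """Hit rate of AI papers divided by hit rate of all other papers.
    Returns None when the ratio is undefined."""
    flags = flag_hits(papers, top_percent)
    ai = [f for p, f in zip(papers, flags) if p["uses_ai"]]
    other = [f for p, f in zip(papers, flags) if not p["uses_ai"]]
    if not ai or not other or sum(other) == 0:
        return None
    return (sum(ai) / len(ai)) / (sum(other) / len(other))

# Toy data: 40 papers in one field and year with 1..40 citations.
# The 10 papers marked as AI are the top-cited one plus the nine least cited.
toy = [
    {"field": "biology", "year": 2018, "citations": c, "uses_ai": c == 40 or c <= 9}
    for c in range(1, 41)
]
ratio = hit_rate_ratio(toy)
```

In this toy group the two most-cited papers are hits; one is an AI paper (1 of 10) and one is not (1 of 30), so the ratio works out to about 3. The article's "roughly twice as likely" figure comes from the researchers' own analysis, not from a calculation like this one.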
Next, the researchers wanted to estimate AI’s potential benefits for scientific disciplines going forward.
To do so, they performed a different key-phrase analysis on scientific papers to extract “field tasks”—pairs of verbs and nouns describing what scientists in each discipline do. For biologists, one such field task might be “identify gene”; for chemists, it could be “catalyze reaction.” Gao and Wang also gathered similar verb–noun pairs from AI-related papers (as well as AI patents identified from 7.1 million patents granted by the United States Patent and Trademark Office between 1976 and 2019) to establish a set of “AI capabilities.” Finally, they compared the two sets of key phrases (i.e., verb–noun pairs) and looked for overlaps. If an “AI capability” also appeared as a “field task” in a certain discipline, that discipline was deemed more likely to benefit from AI in the future.
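As a rough illustration of the overlap comparison described above, the sketch below scores a discipline by the share of its task mentions that also appear among AI capabilities. It assumes the verb–noun pairs have already been extracted (the extraction itself needs a language parser and is not shown), and the frequency weighting, the function name `potential_benefit`, and the toy counts are hypothetical choices rather than the paper's exact measure.

```python
from collections import Counter

def potential_benefit(field_tasks, ai_capabilities):
    """field_tasks: Counter mapping (verb, noun) pairs to how often they occur
    in a discipline's papers. ai_capabilities: set of (verb, noun) pairs.

    Returns the frequency-weighted share of task mentions that overlap
    with AI capabilities, or 0.0 for an empty discipline."""
    total = sum(field_tasks.values())
    if total == 0:
        return 0.0
    overlap = sum(n for pair, n in field_tasks.items() if pair in ai_capabilities)
    return overlap / total

# Toy inputs: "identify gene" is the article's example; the other pairs
# and all counts are made up for illustration.
biology_tasks = Counter({("identify", "gene"): 3, ("grow", "plant"): 1})
ai_capabilities = {("identify", "gene"), ("classify", "image")}

score = potential_benefit(biology_tasks, ai_capabilities)
```

Here three of the four task mentions match an AI capability, giving a score of 0.75. Exact string matching on pairs is the simplest possible overlap rule; synonyms such as "detect gene" would be missed without further normalization.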
Using this measure, Gao and Wang found that AI has the potential to benefit nearly every scientific discipline.
They did find significant differences among subfields within a particular discipline. For example, tasks within the subfield of biological systems (which seeks to computationally model the complex interactions within organisms) were four times as likely to be impacted by AI compared with tasks in other subfields of biology, like horticulture or food science. But overall, says Gao, “AI has a pervasive impact and benefits for science across disciplines.”
That said, Gao and Wang also found that disciplines with a higher proportion of women and underrepresented minorities were less likely to benefit from AI, both in terms of the direct use of AI today and the potential benefits of AI tomorrow. For example, in sociology (where roughly half of the researchers holding a PhD are women and 16 percent identify with an underrepresented racial category), the current benefits of AI for the discipline are roughly half those of physics, which has a much higher proportion of male, Caucasian, and Asian researchers. Career-level analyses further revealed that underrepresented scientists who engage in AI-related research see a smaller increase in their “hit” rate (as measured by citations) than do other scientists.
“We’ve known for a very long time that technological change is often a source of inequality in labor,” says Gao. “If we predict that AI will continue to benefit scientific research in the future, we should also be concerned about how those benefits are distributed.”
But the biggest finding of all might just be how ill-prepared most disciplines are to take advantage of AI’s advance. After all, AI can only benefit a discipline if its scientists have the skills and training to use AI correctly.
The researchers examined the education system’s preparedness for AI by scanning a database of 4.2 million English-language university syllabi for scientific references to AI-related papers. “We wanted to know how well-prepared the next generation of scientists is to use AI advances,” Gao explains. By measuring the frequency of AI references in a discipline’s coursework, “we can estimate [that discipline’s] level of investment in AI education.”
Here the results can only be described as disappointing. With the exception of three computational disciplines (computer science, mathematics, and engineering), disciplines were not investing enough in teaching AI-relevant skills to graduate students and junior scientists to achieve the full benefits from AI.
To Wang, this finding should be a clarion call to policymakers around the world: “What kinds of levers in science policy might address this?”
Indeed, their research points to one answer. When Gao and Wang analyzed collaboration patterns in disciplines other than computer science (e.g., biology), they found that the number of AI publications produced via collaborations between computer scientists and, say, biologists was growing more quickly than those produced by biologists alone.
In other words, scientists in a wide range of disciplines are increasingly finding it useful to lean on their peers with more specialized knowledge of AI. This suggests that fully utilizing AI in science could require not just more funding to train scientists but more opportunities for interdisciplinary collaboration.
To some extent, this is already happening. “Some institutions are launching interdisciplinary research centers, which encourage faculties from different disciplines to have conversations on how to leverage different AI tools and advances,” Gao says. “That would give researchers more exposure to each other to potentially learn from each other, when they’re actually doing research together.”
But to maximize AI’s potential, say Gao and Wang, cross-collaboration will need to happen on a much larger scale. To that end, their results were included in a larger report presented at the National Academy of Sciences, which advises the U.S. government on science-related policy.
As for AI’s benefits to his own field of research, Gao is optimistic. “I’m excited about how AI can help automate labor-intensive tasks and enhance our creativity,” he says. “It can free us up to have more time to ask new questions, delve into more challenging areas, and push the boundary of knowledge.”
John Pavlus is a writer and filmmaker focusing on science, technology, and design topics. He lives in Portland, Oregon.
Gao, Jian, and Dashun Wang. 2024. "Quantifying the Use and Potential Benefits of Artificial Intelligence in Scientific Research." Nature Human Behaviour.