Can big data about whom we call be used to predict how a viral epidemic will spread?

It seems unlikely. After all, viruses do not spread over a cell network; they need us to interact with people in person.

Yet, it turns out that the patterns in whom we call can be used to predict patterns in where we travel, according to new research from Kellogg’s Dashun Wang. This in turn can shed light on how an epidemic would spread.

Both phone calls and physical travel are highly influenced by geography. The farther a shopping mall or post office is from our home, the less likely we are to visit it. Similarly, friends who live in our neighborhood are far more likely to hear from us frequently than extended family in Alberta.


But Wang and colleagues were able to take this a step further. By analyzing a huge amount of data on where people travel and whom they call, they derived a mathematical formula that links how distance affects these two very different activities. This understanding provides a framework for using data about long-distance interactions to predict physical ones, and vice versa.

As humans, we do not like to think that someone could anticipate our actions, says Wang, an associate professor of management and organizations. But his evidence says otherwise. “It’s just fascinating to see this kind of deep mathematical relationship in human behavior,” he says.

Wang’s conclusions were based on the analysis of three massive troves of cell phone data collected for billing purposes. The data, from three nations spanning two continents, included geographic information about where cell phone users traveled, as well as information about each phone call placed or received, and how far a user was from the person on the other end of the line.

The discovery of this underlying relationship between physical and nonphysical interactions has significant practical implications. For example, the researchers were able to model the spread of a hypothetical virus, which started in a few randomly selected people and then spread to others in the vicinity, using only the data about the flow of phone calls between various parties. Those predictions were remarkably similar to ones generated by actual information about where users traveled and thus where they would be likely to spread or contract a disease.
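To make the idea concrete, here is a minimal sketch of that kind of simulation. It is not the researchers' model: the function name, the `rate` parameter, and the tiny flow table are all hypothetical, and the rule that the chance of transmission scales with the share of interactions between two locations is an assumption made purely for illustration.

```python
import random


def simulate_spread(flows, seeds, steps, rate=0.5, rng=None):
    """Spread a hypothetical infection between locations.

    flows[i][j] is an illustrative weight for interaction from location i
    to location j (for example, a count of calls or of trips).
    Returns a list of sets: the infected locations after each step.
    """
    rng = rng or random.Random(0)  # fixed seed so runs are repeatable
    n = len(flows)
    infected = set(seeds)
    history = [set(infected)]
    for _ in range(steps):
        newly_infected = set()
        for i in infected:
            total = sum(flows[i])
            if total == 0:
                continue  # location i interacts with no one
            for j in range(n):
                if j in infected:
                    continue
                # Assumption: the chance of transmission grows with the
                # share of i's interactions that go to j.
                if rng.random() < rate * flows[i][j] / total:
                    newly_infected.add(j)
        infected |= newly_infected
        history.append(set(infected))
    return history


# Made-up call volumes between three locations.
call_flows = [
    [0, 8, 1],
    [8, 0, 1],
    [1, 1, 0],
]
history = simulate_spread(call_flows, seeds=[0], steps=5)
```

Running the same function once on a table of call volumes and once on a table of trip counts, then comparing the two histories, mirrors in miniature the comparison the researchers describe.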

“I think that’s a great example to illustrate the opportunities brought about by big data,” Wang says. “The paper represents a major step in our quantitative understanding of how geography governs the way in which we are connected. These insights can be particularly relevant in a business world that is becoming increasingly interconnected.”