Podcast: Think You Understand Why Ideas Go Viral? Big Data May Change Your Mind
Data Analytics Marketing Oct 3, 2016

From tweets to scientific discoveries, human behavior is surprisingly predictable.

A researcher uses big data to understand patterns

Yevgenia Nayberg

Based on the research of

Dashun Wang

Duncan Watts

Why do some ideas go viral while others go nowhere? Is it all about reaching that mythical tipping point, or is something else at work?

Kellogg Insight talked with two researchers who are starting to find answers by analyzing huge amounts of data. Microsoft’s Duncan Watts explains why we should stop worrying about a tipping point, and Kellogg Professor Dashun Wang discusses how human behavior is more predictable than you might think.

Podcast transcript

[music prelude]

Emily STONE: Remember the Ice Bucket Challenge? Or that striped dress that some people swore was blue and black and others were sure was white and gold? Or the latest hilarious cat video? You likely do, because they went viral. They were shared by millions on social media and picked up as stories in the news.

But for every Ice Bucket Challenge, there’s also a fundraising gimmick that falls flat. Is it possible to know what’s going to be a runaway success versus a dress that’s … just a dress? To put it another way: Is human behavior predictable?

[music interlude]

STONE: Hello, and welcome to Kellogg Insight’s monthly podcast. I’m your host, Emily Stone. In this episode, we talk with two researchers who use huge amounts of data to try to predict how humans will collectively act, despite all our diversity and complexity. So stay with us.

[music interlude]

Duncan WATTS: We are interested in a wide range of questions about collective human behavior: So what is the structure of organizational networks and how can we understand that and map those out using e-mail logs? How does information spread over large social networks and media networks like Twitter? How do people cooperate or solve problems in groups? So we’re interested, generally, in how people behave collectively and how we can shed new light on these old questions using modern digital technology.

STONE: That’s Duncan Watts, a researcher in the computational and social science group at Microsoft Research.

This line of questioning is not new. Scientists have long been interested in how people work together and how ideas, to use the language of social scientists, diffuse.

WATTS: For most of that time, it’s been a very theoretical exercise, where you sort of sit and think deeply about how you think things are and maybe you have some sort of anecdotal observations from your own experience, or you go and sit and watch a small group of people in a natural setting, or you administer some survey.

STONE: Even the ideal scenario, where you would run a formal experiment, has its downsides. You end up with a limited amount of data about a limited number of people to demonstrate really broad theories about what makes humans tick.

But this type of research is changing on a fundamental level. Today scientists have vast amounts of data available to use in their studies. Think of all those bits and bytes of information that pour out of your computer, your phone—perhaps even your thermostat or refrigerator—every single second. All of that can be collected and crunched by researchers who want to better understand our basic behavioral patterns.

And yet with that data comes the realization that we may have previously gotten things completely wrong.

WATTS: For example, in the diffusion literature, there’s many, many theoretical models about how things spread on networks and a lot of the focus is on what’s called the epidemic threshold or the tipping point.

It’s sort of this transition from when things don’t spread to when things start spreading. If you’re a marketing person, you want to get things above the threshold and if you’re an epidemiologist, you want to get them below the threshold. But all the focus is on this threshold.

STONE: Right.

We’ve all heard of the tipping point. And it’s such an appealing concept that once ideas reach that tipping point, they spread virally just like a disease. Think for a minute about how often you say something “went viral” when it really has nothing whatsoever to do with an actual virus.

Don’t feel bad. Watts found this to be a compelling parallel, too.

WATTS: I’ve written several papers about social contagion using different kinds of mathematical models that have exactly the sort of biological metaphor that I just described. So that’s how I was thinking things worked as well. But when you look, when you really look, you don’t see that. That’s just not how the world works.

STONE: His team discovered this by studying the diffusion of nearly a billion stories, videos, pictures, and petitions on Twitter. They learned that there really is no “patient zero” for social dissemination. Instead, they found another key player.

WATTS: The media is, first of all, an indispensable element in any kind of social diffusion. You always have these entities that are either formal media organizations or increasingly online celebrities who have, at this point, many tens of millions of followers and effectively act like media organizations. And anything that becomes popular invariably goes through these channels, right? There’s no real equivalent in epidemiology.

STONE: So how did so many social scientists get this wrong for so long? The problem, in large part, boils down to not having the right data to examine.

WATTS: Even though our mental model of how things spread is: a person gets infected with an idea or a new behavior or something and then passes that along to the people that he or she interacts with, and they pass it along to people they interact with, and you can imagine this network in the background and it’s kind of lighting up as this entity, this contagion spreads through it.

The kind of data that was available was not that kind of data.

You want individual, person-to-person level transmission data, and what you actually had historically was aggregate counts over time.
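Editor's illustration: the gap between aggregate counts and person-to-person transmission data can be sketched in a few lines of Python. The two cascades below are invented for illustration and are not data from Watts's study; the function names are hypothetical. Both cascades produce the same total count, yet one is a single broadcast and the other is a chain of person-to-person sharing.

```python
from collections import Counter

# Hypothetical cascades (invented data): each entry maps a person to whoever
# they received the item from. None marks the original poster.
broadcast = {"hub": None, "a": "hub", "b": "hub", "c": "hub", "d": "hub"}
chain = {"hub": None, "a": "hub", "b": "a", "c": "b", "d": "c"}


def adopters(cascade):
    """Aggregate count: how many people shared the item in total."""
    return len(cascade)


def depth(cascade, person):
    """Number of person-to-person hops between `person` and the original poster."""
    hops = 0
    while cascade[person] is not None:
        person = cascade[person]
        hops += 1
    return hops


def max_depth(cascade):
    """Longest chain of transmission in the cascade."""
    return max(depth(cascade, person) for person in cascade)


def largest_direct_audience(cascade):
    """Most people who received the item directly from one single source."""
    counts = Counter(src for src in cascade.values() if src is not None)
    return max(counts.values())


for name, cascade in [("broadcast", broadcast), ("chain", chain)]:
    print(name, adopters(cascade), max_depth(cascade), largest_direct_audience(cascade))
```

With only the aggregate count, the two cases are indistinguishable. The depth and direct-audience measures, which require knowing who got the item from whom, are what separate a media-style broadcast from a contagion-style chain.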

STONE: There was another big problem with the data: the sin of omission.

WATTS: You’re trying to understand what makes something successful, but you’re only studying successful things.

But of course, most of the things that successful people do are also done by unsuccessful people. So all successful people have breakfast, right? Maybe having breakfast is the key to being successful. Well, it turns out that’s not a very predictive feature. But you already know that if you study successful people and unsuccessful people. You have to have a total sample of the population, and the same is true for diffusion.

You have to look at not only the things that do spread but also the things that don’t spread, which are massively more numerous.
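Editor's illustration: the breakfast example can be made concrete with a small Python sketch. The population below is invented purely for illustration (it is not data from the research), and the function names are hypothetical. Looking only at successful people, breakfast appears to be nearly universal; looking at everyone, it turns out to predict nothing.

```python
# Hypothetical population (invented numbers): each record is
# (had_breakfast, successful).
people = (
    [(True, True)] * 9        # successful, had breakfast
    + [(False, True)] * 1     # successful, skipped breakfast
    + [(True, False)] * 81    # unsuccessful, had breakfast
    + [(False, False)] * 9    # unsuccessful, skipped breakfast
)


def breakfast_rate(records):
    """Share of the given records in which the person had breakfast."""
    return sum(1 for had_breakfast, _ in records if had_breakfast) / len(records)


def success_rate(records, had_breakfast):
    """Share of successful people among those with the given breakfast habit."""
    group = [r for r in records if r[0] == had_breakfast]
    return sum(1 for _, successful in group if successful) / len(group)


successful_only = [r for r in people if r[1]]
unsuccessful_only = [r for r in people if not r[1]]

# Studying only the successes: breakfast looks like a common trait of winners.
print("breakfast rate among successful:", breakfast_rate(successful_only))
# Studying the full sample: the unsuccessful eat breakfast just as often,
# and success is equally likely with or without it.
print("breakfast rate among unsuccessful:", breakfast_rate(unsuccessful_only))
print("success rate with breakfast:", success_rate(people, True))
print("success rate without breakfast:", success_rate(people, False))
```

The same logic applies to diffusion: a feature shared by every viral item tells you little unless you also check how often it appears among the far more numerous items that went nowhere.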

STONE: It’s easy to see why this research is of great interest to marketers, who are eager to diffuse their ideas and products to the masses.

But understanding networks and how people interact within them is crucial for business leaders, as well. Take, for example, the goal of improving how people communicate within a company or getting teams to interact more efficiently when they tackle a creative challenge.

WATTS: If you think about the size of the economy and how much of the economy depends on firms and on teams within firms, it’s sort of a multi-trillion-dollar question. If you could even improve efficiency by a small fraction, it would have a huge impact.

STONE: And, again, the advent of big data is key. Because just like the faulty notion of an epidemic tipping point for Tweets, leaders are making assumptions about improving their organizations that the data may completely invalidate.

If a team or firm is floundering, a CEO may be convinced of the need for layoffs or a reorganization. But are those truly the best options?

WATTS: It’s really sort of stunning how little we really know about any of these things.

STONE: Instead of basing these enormously consequential decisions on intuition, leaders could use big data, Watts says. For example, he’s starting to analyze email communications. Even completely anonymized data about how Person A communicates frequently with Person B but never with Person Z could shed light on networks of collaboration.

This type of data could be used to address a variety of questions about how we nurture success at work.

WATTS: These are going to be tricky questions, it’s hard to define performance, it’s hard to measure it, it’s really hard to predict it, but we might be able to build something like a comprehensive theory of performance that will allow managers to make more data-driven decisions about, not just who to promote, but how to reduce attrition, or how to compose teams, or when to move a team into an open office floor plan versus keeping them in some other kind of floor plan. I mean, these are all decisions that could be addressed with data, both observational and experimental, and this is sort of a big project that we’re really just getting started on.

[music interlude]

STONE: We’ve been talking about how data can help predict success in a specific context: whether that Tweet will go viral, whether reorganizing a team will improve communication.

But what can big data teach us about success more broadly?

Dashun WANG: Success by nature is a collective phenomenon. What this means is that you can only be successful if everybody else thinks you’re successful.

STONE: That’s Dashun Wang, an associate professor of management and organizations at Kellogg.

WANG: If we start to accept that success or influence fundamentally is a collective phenomenon, then what this means is that there must be fingerprints in the data surrounding how people react to that artifact, that innovation, that individual.

STONE: Wang’s research looks for those fingerprints inside giant data sets. In one study, he and his colleagues looked at success within the world of academic publishing.

In the same way that the success of a Tweet depends in large part on how often it’s retweeted, the success of a scientific paper depends in part on how often it’s cited by other academics.

But think about all the factors that could go into this. Is a paper cited more when there are more collaborators, or when the authors are from a more prestigious institution?

WANG: We can now precisely measure and model this phenomenon. We’re able to show that in a system that we thought was very, very noisy and unpredictable, there’s deep regularities underlying this system.

STONE: In other words, they developed a way to predict how much of a “hit” a given paper would be among fellow scientists.

They found three main factors that drive citation success. The first is what Wang refers to as the “rich get richer” phenomenon. When a paper gets cited, it is considered a well-cited paper and thus garners more citations. The second is what he calls the “aging effect.” New papers are fresh and exciting and get cited more often than they do once they start to age. The third has to do with the actual ideas in the paper: Are they high-quality? And how much of the scientific community could find them relevant?

WANG: What we find in our studies is by combining these factors we’re able to build precise mathematical formulas that are analytically solved that will be able to help us predict, understand what is the underlying formula that governs how citation is being cited.
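Editor's illustration: the transcript does not state the formulas, so the Python sketch below is only one plausible way to combine the three factors, not necessarily the formula from Wang's study. It assumes that the rich-get-richer effect makes citations grow exponentially, that the aging effect follows a lognormal curve in the paper's age, and that the quality of the ideas enters as a single "fitness" number. All parameter names and values (`fitness`, `mu`, `sigma`, `m`) are assumptions chosen for illustration.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_citations(t, fitness, mu, sigma, m=30.0):
    """Illustrative cumulative citations of a paper at age t (in years).

    Assumed ingredients, mirroring the three factors named in the episode:
      - rich get richer: each citation attracts more, hence the exponential
      - aging: attention follows a lognormal curve in t (set by mu and sigma)
      - fitness: how strongly the ideas resonate with the community
    m is an assumed scale constant, not a value taken from the study.
    """
    if t <= 0:
        return 0.0
    aging = normal_cdf((math.log(t) - mu) / sigma)  # rises from 0 toward 1
    return m * (math.exp(fitness * aging) - 1.0)


# Two hypothetical papers that age the same way but differ in fitness.
for years in (1, 5, 10, 20):
    modest = expected_citations(years, fitness=1.5, mu=1.5, sigma=1.0)
    strong = expected_citations(years, fitness=3.0, mu=1.5, sigma=1.0)
    print(years, round(modest, 1), round(strong, 1))
```

In this sketch every paper eventually levels off, because the aging term saturates, and the fitness parameter alone determines how high the plateau sits. That is one way a noisy-looking system could still follow a regular curve.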

STONE: Wang points out that once you know that these are the three critical factors, it makes sense. But if he had told you that citation success depends on the number of collaborators and how prestigious their institutions are, you probably would also say, sure, that makes sense. It’s only through analyzing huge reams of data that we know which intuitive model is correct.

Furthermore, Wang’s more recent research shows that these same factors often dictate success in other areas, too.

WANG: What becomes interesting is this idea of influence or success, if you will, that’s really generalizable across many, many different domains. They share a set of common underlying fingerprints and principles that they follow.

Think about tweets in the social media space. How things get viral? Think about how technology penetrates a population. Or think about an individual, how someone starts to produce a lot of work and that by itself creates a rich-get-richer effect.

STONE: Of course, conducting science is very different from getting your ad to go viral. And, indeed, Wang stresses that in some contexts, other factors are also important. But these are generally in addition to, rather than in place of, the underlying principles that Wang uncovered.

Wang is quick to point out that while giant data sets are rich with potential insights into human behavior, he would have no idea how to gain those insights without the computational tools that have been developed over the past 15 or so years.

For example, consider a study that Wang recently conducted that analyzed data on millions of mobile-phone users in three countries: Who called whom, how often, and from where?

He used sophisticated tools to analyze this massive dataset and saw a relationship that nobody had found before: a link between the patterns in where we travel and who we call.

[music interlude]

STONE: Most of us tend to make lots of short trips from home, with the occasional long-distance foray. The same pattern plays out in our communications. We make lots of phone calls to people who live nearby—colleagues, local businesses—and just a few to people who live further away—a family member once a week, or an old friend once a month.

In populations where long trips are more common, you can predict that long-distance phone calls will also be more common. And vice versa.

WANG: What we realized is that these two aspects, previously pursued as different lines of inquiry, are actually in fact connected through precise mathematical formulas, because they actually represent two facets of the same phenomena. Then once we have one set of phenomena, we’ll be able to derive the information for the other side.

It’s just fascinating to see this kind of deep mathematical relationship in the human behavior.

STONE: OK, so scientists can predict how far we’re likely to travel based on who we call; they can predict whether Tweet A or academic paper B will be a huge success while similar ones go nowhere; they may soon be able to predict whether our office will be more productive with an open floor plan. Is human nature really so predictable?

Here’s Watts again.

WATTS: One reaction that people have when they hear about advances in computational social science is you know that it’s going to kill all the mystery in the world, right? That everything will be predicted, and free will will disappear, and human experience will sort of be reduced to algorithms and numbers, and that just sounds sort of dismal, and there’ll be blank-faced social scientists kind of pulling the levers of society behind the scenes.

I don’t think that’s a plausible outcome. I’m more concerned that we won’t be able to figure anything out, right? That everything will be so complicated and so contingent and so dependent on randomness and context that nothing will generalize at all. What I would like to see is that there are some things that we can figure out well enough, that we can do better than just going with our guts, which is sort of how we’ve been doing it forever basically.

[music interlude]

STONE: This program was produced by Jessica Love, Fred Schmalz, Emily Stone, and Michael Spikes.

Special thanks to Microsoft’s Duncan Watts and Kellogg School professor Dashun Wang.

You can stream or download our monthly podcast from iTunes, Google Play, or from our website, where you can read more on data analytics, collaboration, and leadership. Visit us at insight.kellogg.northwestern.edu. We’ll be back next month with another Kellogg Insight podcast.
