You’re scrolling through Facebook and see a post from your favorite clothing store showcasing a great pair of jeans. You “like” it, perhaps even leave a comment that you are eager to buy a pair, and then scroll on.
How is that store putting your likes and comments to use? It’s probably using them to shape its social media marketing strategy. But it is much less likely that the retailer is using that data to make operations decisions, such as how many pairs of those jeans to manufacture or whether to mark down prices.
That could change. In a recent study, Antonio Moreno, an associate professor of operations at Kellogg, found that social media data can improve sales forecasts. When researchers incorporated information about a clothing company’s Facebook interactions into prediction models, they could more accurately estimate purchases the following week.
Using advanced algorithms was key to the improvements, meaning simply collecting social media data is not enough: companies should also upgrade their forecasting techniques.
“It’s important to get new data but also use more sophisticated prediction methodology,” Moreno says.
The study did not reveal why the Facebook information improves forecasts. Moreno speculates that the data might reflect how much attention customers are paying to the brand, as well as good or bad word of mouth.
But companies may care more about the technique’s effectiveness than the mechanism behind it.
“If something works,” he says, “sometimes they might be able to live without knowing why it works.”
Using Social Media Data
The idea of mining social media data to guide operations is in its infancy.
For instance, a company might show a forthcoming shirt in two colors, see which one generates more clicks, and then use that information to decide which color to produce. “But it’s still not mainstream,” Moreno says.
And while academic studies have explored whether social media posts boost sales, little research has been done on using the data for internal operations decisions.
Moreno decided to investigate that idea with Ruomeng Cui, at Emory University, and Dennis Zhang, at Washington University, both former Kellogg doctoral students, and with Santiago Gallino, at Dartmouth College.
The team worked with an online clothing company. Most of the firm’s social media–driven traffic came from its Facebook page, which had more than 300,000 followers at the time of the study. But to forecast sales, the company was largely relying on basic information such as its overall sales growth and weekly or seasonal patterns—such as the tendency to sell more on weekends.
Moreno’s team wrote software to extract information about the company’s Facebook posts from January to July 2013. The final data set included more than 171,000 users, 1,900 company posts, about 25,000 comments, and a quarter-million likes.
Next the researchers used language-processing software to categorize each comment as positive, negative, or neutral. In addition, the team obtained internal data on the company’s sales and advertising campaigns during that time.
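The article does not say which language-processing software the team used, so the following is only a toy sketch of the general idea of sorting comments into three categories. The word lists and the `classify_comment` function are hypothetical, invented here for illustration; real sentiment tools rely on far richer models.

```python
import re

# Hypothetical word lists, invented for illustration only.
POSITIVE = {"love", "great", "want", "beautiful", "perfect"}
NEGATIVE = {"hate", "ugly", "bad", "expensive", "disappointed"}


def classify_comment(text):
    """Label a comment as 'positive', 'negative', or 'neutral'.

    Counts words from each list and compares the totals. This is a
    simplified stand-in, not the method used in the study.
    """
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical comments, invented for illustration only.
for comment in ["I love these jeans!", "Too expensive and ugly", "Do you ship to Canada?"]:
    print(comment, "->", classify_comment(comment))
```

Counting the labels week by week would give a forecasting model a rough signal of whether word of mouth is running good or bad.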
Training the Forecasting Models
Using what they gathered, the researchers produced two sets of sales-forecasting models: the baseline forecast, which included only internal company information, and a second forecast that combined internal and social media data.
For both the baseline and social media forecasts, the team experimented with a variety of prediction methods. Most of the models relied on machine learning, in which the model trains itself to identify which factors are most important.
To assess accuracy, the researchers used a measure called mean absolute percentage error (MAPE), which captures how much the estimate deviates from actual sales. For instance, a MAPE of 10% would mean that, on average, the model’s estimates were 10% off.
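As a minimal sketch of how that measure is calculated, the function below averages each week's absolute error as a share of that week's actual sales. The function name `mape` and the sales figures are assumptions made up for illustration; they do not come from the study.

```python
def mape(actual, forecast):
    """Return the mean absolute percentage error, in percent."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if not actual:
        raise ValueError("need at least one observation")
    # Each term is the absolute error as a fraction of actual sales.
    # Assumes no week has zero actual sales (that would divide by zero).
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical weekly sales figures, invented for illustration only.
actual_sales = [100, 200, 400]
forecast_sales = [110, 180, 440]
print(mape(actual_sales, forecast_sales))
```

Each of the three hypothetical forecasts misses by a tenth of actual sales, so the result comes out at roughly 10%, whether the miss is high or low.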
The company’s existing sales forecasts for the next week had a MAPE of 12%. The researchers’ best-performing baseline model—the one without the social media data—brought the error down to about 7–9%.
Adding social media data lowered the error even further, to 5–7%. Yet the social media data alone was not enough. When the team plugged social media information into a poorly performing model, the forecast could be even less accurate than that of the best baseline model, which used no social media information.
The results suggest that both the data and methods are important. “By introducing social media data, we can do better,” Moreno says. “But it looks like the first step should be having better methods.”
Future research could explore in more detail why social media improves sales forecasts. Researchers could also perform similar studies to predict sales for individual products, rather than just total sales. And if data could be broken down by geographic area, the information could help companies decide how much of a particular product to carry in, say, Texas versus Idaho.
Moreno notes that the study’s results may not apply to all industries. Social media data is more likely to be relevant to products with highly uncertain sales or industries heavily influenced by trends, such as fashion and entertainment. But for consumer goods like breakfast cereal, sales are already fairly predictable, so adding Facebook data may not improve forecasts much.
Companies could also become more strategic about their social media posts, in order to specifically elicit information that will help guide their operations. For instance, more firms might adopt the practice of displaying potential products and deciding what to manufacture based on customers’ reactions.
“They can actually use this social media to learn and make better decisions,” Moreno says.