Jan 7, 2022

When a Bunch of Economists Look at the Same Data, Do They All See It the Same Way?

Not at all, according to a recent study, which showed just how much noise can be introduced by researchers’ unique analytical approaches.

Illustration by Lisa Röper

Based on the research of Robert Korajczyk, Dermot Murphy, and coauthors

If you handed the same data to 164 teams of economists and asked them to answer the same questions, would they reach a consensus? Or would they offer 164 different answers?


A new study put this exact proposition to the test. One hundred sixty-four teams of researchers analyzed the same financial-market dataset separately and wrote up their conclusions in 164 short papers. Teams were then given several rounds of feedback, mimicking the kind of informal peer-review process that economists engage in before they submit to an academic journal. All the researchers involved wanted to know how much variation would exist among their different papers.

It turns out, a lot.

Data can be messy, notoriously so. And so scientists and researchers have developed reams of strategies for cleaning and analyzing and ultimately harnessing data to draw conclusions. But this unusual study—an analysis of 164 separate analyses—suggests that the decisions that go into choosing how to clean the datasets, analyze them, and come to a conclusion can in fact add just as much noise as the data themselves.

In an increasingly data-driven world, this is important to keep in mind, according to Robert Korajczyk, a professor of finance at Kellogg. Korajczyk and a former Kellogg PhD student, Dermot Murphy, now a professor at the University of Illinois Chicago, served as one of the 164 research teams involved in the project.

Kellogg Insight recently spoke with Korajczyk about the experience, and what researchers and the general public can take away from the study’s surprising conclusion.

This conversation has been edited for length and clarity.

Kellogg Insight: Can you start by explaining the data that you and the other 163 research teams were asked to analyze?

Korajczyk: Yes. Each research team was given a dataset that covers 17 years of trading activity in the most liquid futures contract in Europe, the Euro Stoxx 50. That was essentially 720 million trades. And there were six research questions that teams were asked to look at. For example, did pricing get more or less efficient? Did the markets get more or less liquid? And did the fraction of agency trades change over time?

KI: These are pretty fundamental trends that you would want to understand if you were trying to gauge the health of this market.

Korajczyk: Yes, absolutely. But the broader goal of the research was what really interested me.

KI: Namely, how different research teams would approach the same set of questions?

Korajczyk: Yes. These types of “crowdsourced” projects have happened in other fields, but this is the first that I’m aware of in finance. And few projects are at this scale; it’s more typical to have 15 or 20 teams. A hundred and sixty-four is really large. So my coauthor Dermot Murphy and I decided to team up and get involved.

KI: Talk to me about the 164 different papers that were submitted. What should we understand?

Korajczyk: There’s a statistical concept called “standard error,” which tells you about the uncertainty in a parameter estimate such as a mean. The standard error of a mean is going to be larger when data are noisy and it’s going to be smaller when there are more observations.
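To make this concrete, here is a minimal sketch (mine, not the study’s) of how the standard error of a mean behaves; the simulated data and parameters are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def standard_error(x):
    # Standard error of the mean: sample standard deviation / sqrt(n)
    return x.std(ddof=1) / np.sqrt(len(x))

noisy = rng.normal(0, 5, size=100)     # high-variance data, few observations
quiet = rng.normal(0, 1, size=100)     # low-variance data, same sample size
big   = rng.normal(0, 5, size=10_000)  # high-variance data, many observations

print(standard_error(noisy))  # largest: noisy data, small sample
print(standard_error(quiet))  # smaller: quieter data
print(standard_error(big))    # smaller: more observations shrink the SE
```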

But then there is another kind of “error” or noise to take into consideration. And that’s all the decisions that go into getting to that point. There are a lot of different ways to measure market efficiency, for instance, so that’s one of the decisions a research team would have to make. When you clean the data, how do you handle outliers? Do you throw them out, or do you change them to another value that is large but not as large? What will be the form of your statistical model? What software are you using? Are you a good coder or a bad coder?

All those choices that are made by the research team, as well as their inherent ability, go into creating new variation in the output. We call this the “nonstandard error.”
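As a hypothetical illustration of where this extra variation comes from (again my sketch, not the study’s code), here three simulated “teams” estimate the same mean from identical heavy-tailed data but make different cleaning choices; the spread in their answers comes from the choices alone:

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.standard_t(df=3, size=5_000)  # heavy-tailed "returns", identical for every team

def drop_outliers(x, k=3.0):
    # Team A: discard observations beyond k sample standard deviations
    return x[np.abs(x - x.mean()) < k * x.std(ddof=1)]

def winsorize(x, p=1.0):
    # Team B: clip extremes to the 1st/99th percentiles instead of dropping them
    lo, hi = np.percentile(x, [p, 100.0 - p])
    return np.clip(x, lo, hi)

estimates = {
    "drop_outliers": drop_outliers(data).mean(),
    "winsorize":     winsorize(data).mean(),
    "keep_all":      data.mean(),  # Team C: no cleaning at all
}
print(estimates)
# Dispersion across teams that saw identical data: the "nonstandard error"
print("spread across teams:", np.std(list(estimates.values())))
```

In the actual project, of course, the choice set was far richer: efficiency measures, model specifications, software, and coding skill all varied across teams.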

KI: And when the teams originally submitted their papers, these nonstandard errors were about as large as the standard errors.

Korajczyk: Right, so I guess one way to think about it is if you’re going to read a paper and say, “Okay, how much credence do I place on these results?” the standard errors tell you something about the noise in the data. But the researchers made a lot of choices that I may or may not have made. So maybe that noisiness in the results is actually double what it looks like from just looking at the standard errors.

KI: Did that surprise you?

Korajczyk: It doesn’t surprise me that there was variation. The size was larger than I thought it would be. There were also some clear outliers that seemed totally outlandish to me.

Another surprise was that some of these outlandish results were there in every round. At each stage you learn something about what reviewers think or what other teams have done, and you’re allowed to revise your paper with that knowledge. But even after peer review and the opportunity to see other teams’ papers, a lot of outlandish results stuck around.

In each stage, though, the dispersion across teams did go down somewhat.

KI: It seems there were some true philosophical differences in how the questions should be approached and how the analyses should be conducted.

Korajczyk: Absolutely. And in a sense this project actually constrained these differences. We were told, “Here are the data and you’re only allowed to use these data.” You weren’t allowed to grab other data that might be relevant for answering that question and add them to the database. That would have likely increased the dispersion across teams.

KI: There’s certainly a “researchers beware!” message to this work, as you determine just how much you can trust the conclusions in the literature. This only adds to growing concern among scientists about a “replication crisis.”

Are there certain changes that you think should be made to account for these ubiquitous nonstandard errors? For instance, should academic articles allot more space to methods sections so that researchers can communicate more transparently about their choices?

Korajczyk: The standard has always been that someone who’s read your paper and decides to replicate it should be able to do that from what you’ve written in the paper. If you have truncated some outliers, they should know exactly how you truncated them. Now I can’t guarantee you that every paper is written that way. But that’s the standard of good writing, and that standard has been there for a long time.

But what a paper doesn’t normally tell people is, “I tried this specification and decided not to use it, and I tried that specification and decided not to use it. And, oh yeah, I should have controlled for this other variable.”

But there are some changes for the better. These days it is much more common to have lengthy appendices available on the journal’s website. These can go into much more detail about the robustness of the results. That can give the reader some confidence that you can look at the data in a lot of different ways and get the same results. Does everyone go through and read the 120-page appendix? No, but people who are very interested in that topic might. Another thing that’s getting more common is requiring researchers to post their code. That makes it easier to replicate results and determine whether they are robust.

KI: What should the general public make of this research? If I’m reading an article in Bloomberg or The Wall Street Journal that cites a new finance study, how seriously should I take those conclusions?

Korajczyk: Well, whether it’s finance research or medical research or psychology or sociology, it’s always helpful to be skeptical. If I’m listening to the news, for instance, one thing that news reports rarely tell you is the sample size of the study. Now, with Covid-19, this is changing somewhat, but knowing the sample size tells me a lot about whether I want to take this result seriously.

I also think it’s helpful to ask, “What are the incentives?” If it is someone trying to get tenure, there is a bias toward finding statistically significant results. If it is someone who works for a money-management firm, their financial incentives could be aligned with economically significant results going in a particular direction.

Finally, be cognizant of the fact that there are many different choices that researchers have to make. If you read, “we did X” in one line in a paper or footnote, it may not be as innocuous as it seems.

Featured Faculty

Robert Korajczyk, Harry G. Guthmann Professor of Finance; Co-Director, Financial Institutions and Markets Research Center

About the Writer

Jessica Love is editor in chief of Kellogg Insight.

About the Research

Menkveld, Albert J., Anna Dreber, Felix Holzmeister, Juergen Huber, et al. 2021. “Non-Standard Errors.” SSRN. November 23.

