When ancient Greeks had questions about the future, they consulted the Delphic oracle. Today individuals with queries on such issues as future stock prices, energy trends, football point spreads, and even next week’s weather consult experts in those fields. Many of those experts base their advice on scientific reasoning. But some rely on clever marketing, pseudoscience, and carefully calibrated predictions rather than genuine understanding of the areas on which they prognosticate. All too frequently those individuals maintain their reputations for predictive skill because it is surprisingly difficult to tell the difference between bogus and genuine experts. Now, however, a Kellogg team has produced a test that does precisely that in certain circumstances.
“Our test looks at informed and uninformed experts,” says Alvaro Sandroni, a professor of managerial economics and decision sciences (MEDS) who worked on the project with Nabil Al-Najjar, also a professor of MEDS; Jonathan Weinstein, an associate professor of MEDS; and Rann Smorodinsky, an associate professor at the Technion-Israel Institute of Technology. “If you know what’s happening, you will pass the test. If you don’t know, we’re not saying that you won’t pass the test, but there’s no absolute guarantee that you’ll pass.”
Sandroni continues, “It’s part of a research agenda underlain by a simple question: How do we know that an expert has information that we don’t and how do we know that science is based on something and goes beyond ordinary understanding of things?” Weinstein adds, “Expertise is very important to evaluate. We need to be able to tell whom to trust.”
Al-Najjar puts the work into a broader context. The basic idea, he says, “is to understand the boundaries between parts of knowledge that are testable and others that are not. This is the first paper that introduces restrictions on the structure of beliefs that makes these beliefs testable.”
A Startling Finding
The research expands on a series of studies on testing expertise that have reached what the team’s paper calls a “most robust—and startling—finding…that all reasonable tests can be manipulated.” That fact may be counterintuitive, but researchers universally accept it. “It is possible to hide absolute ignorance through the language of probability, and showing up that ignorance is very difficult,” Sandroni summarizes. “You give a false impression of expertise when there is nothing but ignorance. Of course, you have to do it in a very special and specific way—extremely carefully and in a very precise manner—to be successful.”
Weinstein points out the finding’s implications for would-be testers of experts. “It means that we can’t have a perfect world with a perfect test,” he says. Tests with certain restrictions remain possible, however. The Kellogg team set out to identify limitations that are neither too restrictive nor too lenient, and to incorporate them into a test of expertise.
“We assume the worst case: that the false experts are very understanding of the test and how to manipulate it. We assume that there are master manipulators out there,” Weinstein says. “But we recognize that there’s a fine line. If your test is too restrictive, real experts will fail. But if it’s not fine enough, you’ll pass people who try to flimflam you.”
Learnability and Predictiveness
The test, developed using standard mathematical tools, relies on two key phenomena: learnability and predictiveness. “We look at predictions and what actually happened, and based on that, we want to know if the predictor knows something about what he’s trying to predict,” Sandroni says. Weinstein outlines the process in detail. “First of all, experts being tested have to set in advance an amount of time they’ll need to learn something about what they’re trying to predict; that’s learnability,” he explains. “Then, when they reach that deadline, they have to make a very specific prediction that can be checked—for example, that over fifty days the market will move up 80 percent of the time; that’s predictiveness.”
The team sums up those two requirements in its paper. “There must be a point at which [the expert’s] theory makes predictions that can be tested,” the researchers note. In addition, Sandroni says, “the expert has to give reasons for why he is predicting this way or that way—reasons based on a bunch of parameters. Then data is used to a certain point and the parameters are tested separately. The test follows customary scientific procedures to identify core elements using data and, having identified core elements, uses more data to confirm them. The point is that, in this particular context, our test cannot be manipulated while other similar-looking tests can be manipulated.”
As Weinstein’s example of market prediction shows, the test requires experts to state their prognostications as probabilities rather than as simple yes-or-no answers. Weather forecasters illustrate that criterion: they typically frame their forecasts in terms of the percentage likelihood of rain, snow, sunshine, or other meteorological phenomena. “This is becoming a proper way of presenting consultations in other fields, such as political analysts, medical studies, and sports betting: any area in which consultations come in terms of odds,” Sandroni says. The demand for percentages automatically limits where the test can be applied. It requires what Weinstein calls “a fairly long repeated data stream”: continuously variable factors such as financial information that moves with the market or point spreads that change on the basis of injury reports and game-day weather forecasts.
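Weinstein’s fifty-day market example can be made concrete with a toy check. The Python sketch below is only an illustration of the two requirements, not the procedure from the team’s paper: the `Commitment` structure, the `passes` function, and the tolerance value are all hypothetical names and numbers chosen for this example. The expert fixes a learning period in advance (learnability), then states what share of the following days the market will rise (predictiveness), and the tester compares that claim with what actually happened, allowing some leeway.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Commitment:
    """What the expert fixes in advance (hypothetical structure for illustration)."""
    learning_days: int        # learnability: observation period before the forecast is due
    window_days: int          # number of days the forecast covers
    predicted_up_rate: float  # predictiveness: claimed share of "up" days in the window
    tolerance: float          # leeway the tester allows (an assumed value, not from the paper)


def passes(commitment: Commitment, market_up: Sequence[bool]) -> bool:
    """Compare the claimed share of up days with the realized share, within the tolerance."""
    if commitment.window_days <= 0:
        raise ValueError("the forecast window must cover at least one day")
    start = commitment.learning_days
    end = start + commitment.window_days
    if len(market_up) < end:
        raise ValueError("data stream is too short to check the forecast")
    window = market_up[start:end]
    realized = sum(window) / len(window)
    return abs(realized - commitment.predicted_up_rate) <= commitment.tolerance


# Made-up data: 30 learning days, then 40 up days and 10 down days.
stream = [False] * 30 + [True] * 40 + [False] * 10

confident = Commitment(learning_days=30, window_days=50,
                       predicted_up_rate=0.8, tolerance=0.05)
coin_flip = Commitment(learning_days=30, window_days=50,
                       predicted_up_rate=0.5, tolerance=0.05)

print(passes(confident, stream))  # True: realized share is 40/50
print(passes(coin_flip, stream))  # False: claim misses by more than the tolerance
```

A single window like this shows only the shape of the commitment. The researchers’ claim is stronger: that with suitable requirements on learning time and precision, a strategic forecaster cannot reliably pass. This sketch does not attempt to demonstrate that.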
How Much Leeway?
Restrictions such as that add to the value of the new test. “The more restrictions there are, the harder it is to manipulate the test, because you have to manipulate it in a certain way,” Sandroni points out. But the test also gives its subjects a certain amount of latitude. “We try to acknowledge that people can make accurate predictions even if they’re not precise,” Weinstein says. “We have to allow them some leeway. The issue is how much leeway to give.”
Sandroni emphasizes that the test focuses on individuals who set out to give false impressions of their predictive skills. “The honest uninformed expert would be very easy to differentiate from the honest informed expert,” he says. “We are looking at the dishonest expert who can maintain a false reputation by making strategic predictions.”
What is the take-away message of the research project? “If we make specific requirements on how long the expert has to learn and the precision with which he has to predict, then we can tell the difference between the flimflam artists and the people who are able to make that kind of prediction,” Weinstein says. Al-Najjar looks at the implications of the work for corporations and other institutions. “The tension between commitment and flexibility has been recognized by philosophers and strategists,” he explains. “This paper’s higher-level message is that this tension is rooted in the problems of testing and learning. Both testing one’s frameworks and learning from a changing environment are essential to a dynamic, adaptive organization.”