When training new employees, should you set them loose on a task by themselves or have them watch a colleague work?
Conventional wisdom says that it can be useful to have a newbie look over someone else’s shoulder. A salesperson might ride along on a sales call with a veteran, for instance, and a novice programmer, rather than laboring over a chunk of code on their own, might peek at how another coder did something similar.
Which is why, in 2014, eBay started using a software platform that allowed its data analysts to look at one another’s work. “I can just go in and see, ‘How did you do this?’” says Jan Van Mieghem, Harold L. Stuart Professor of Operations at the Kellogg School. Then, depending on their needs, the programmers might study the techniques of the peers whose work they viewed, or even copy whole chunks of their code.
But does this kind of collaboration really make someone into a better coder?
The answer is complicated, Van Mieghem and Yue Yin, a PhD candidate at Kellogg, find in a new paper with Cornell’s Itai Gurvich, Stephanie McReynolds of Alation, Inc., and eBay’s Debora Seys. Using an extensive dataset capturing the detailed behavior of thousands of eBay data analysts, the team investigates whether viewing the work of others is a useful learning technique.
“We establish that learning by viewing isn’t necessarily more effective than learning by doing,” says Van Mieghem. Rather, it depends on whom you’re learning from. The authors find that viewing the work of veteran programmers indeed helps analysts code faster. But viewing the work of inexperienced coders—even those who are perceived as programming superstars—can actually be detrimental to your productivity.
Yin and Van Mieghem think that this insight could be valuable to any organization hoping to get its newest members up to speed quickly. “Our work seems to indicate that there is huge value for the experienced people to showcase and share their work, especially with the folks who just started,” says Van Mieghem.
The project grew out of previous research in which Van Mieghem had also explored the value of collaboration.
After an article about that work was published in Kellogg Insight, Van Mieghem heard from a representative of Alation, an all-encompassing platform where programmers can write code and view one another’s work. The company wanted Van Mieghem to quantify the impact of that viewing, and offered to share data from one of their clients, eBay, whose data analysts had been using it.
“We show that as the cumulative number of queries the analyst has written increases, they get better and faster at writing the next query.”
— Jan Van Mieghem
In August of 2016, Yin and Van Mieghem flew to eBay headquarters in Silicon Valley. “We interviewed a bunch of data analysts, as well as their managers, to understand what these data analysts do,” says Yin.
As the researchers discovered, data analysts at eBay generate insights about customers and their behavior by writing “queries” in a programming language called SQL. An analyst might be tasked with finding out which were the most searched-for automotive items in South America in a given month, for example, or whether smartphone users prefer to buy items directly or bid in auctions. By writing specialized code to query databases, these analysts “generate business reports to show to managers, ‘Oh, here are the trends, here are the opportunities,’” Yin explains.
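A query of this kind is typically a short piece of declarative code. As a purely illustrative sketch (the table and column names here are hypothetical, not eBay’s actual schema), the automotive example above might look something like this:

-- Hypothetical schema: top ten automotive search terms
-- in South America for one month
SELECT search_term,
       COUNT(*) AS searches
FROM   search_logs
WHERE  category = 'Automotive'
  AND  region = 'South America'
  AND  search_date BETWEEN '2016-03-01' AND '2016-03-31'
GROUP  BY search_term
ORDER  BY searches DESC
LIMIT  10;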
Alation provided the researchers with detailed data tracking the actions of 2,001 eBay data analysts between 2014 and 2018. For each of the 79,797 queries composed by the analysts in this period, the researchers were able to see when (down to the second) the analyst started and finished the query, how many queries that analyst had previously written at that point, and how many queries created by other analysts they had viewed.
Yin and Van Mieghem wanted to know which method improved an analyst’s programming skills faster: completing queries on their own (“learning by doing”) or looking at other analysts’ work (“learning by viewing”). To quantify improvement, the researchers calculated how long the programmer took to write a new query—the idea being that, as a programmer learns, they should be able to compose code more efficiently.
There were reasons to suspect that viewing others’ work could either help or hinder learning. Yes, collaborating could teach analysts how to use new techniques or solve thorny problems, but if analysts are simply copying and pasting the work of peers, they may not be learning the general, transferable skills that would make them better coders overall.
Indeed, the traditional idea of the “learning curve” suggests that only individual effort, repeated time and again, can impart new skills. “It says that the more often you execute a task, the better you get at the task,” explains Van Mieghem. Merely viewing the work of others may not replace those hours spent studying one’s own work and engaging in trial and error.
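In its textbook form (a simplification, not the specific model estimated in the paper), the learning curve is a power law:

$$T_n = T_1\,n^{-b}$$

where $T_1$ is the time the first attempt takes, $T_n$ the time for the $n$-th repetition, and $b > 0$ the learning rate, so that every doubling of cumulative experience shaves a fixed percentage off the task time.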
Moreover, viewing comes at a cost: every minute spent looking at someone else’s code is a minute that might have been spent thinking through a problem on your own, or consulting a programming textbook.
So which is the superior method for learning programming skills?
Applying statistical techniques to their dataset, and controlling for other variables (such as the analyst’s workload and experience level), the authors were able to parse out how learning by viewing stacked up against learning by doing.
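In broad strokes (this is a stylized sketch with our own notation, not the authors’ exact specification), such an analysis relates the time analyst $i$ takes on their $n$-th query to their cumulative doing and viewing experience, plus controls:

$$\log(\mathrm{duration}_{i,n}) = \alpha_i - \beta_{\mathrm{do}}\,\log(\mathrm{written}_{i,n}) - \beta_{\mathrm{view}}\,\log(1+\mathrm{viewed}_{i,n}) + \gamma^{\top}X_{i,n} + \varepsilon_{i,n}$$

Here $\alpha_i$ is an analyst fixed effect, $X_{i,n}$ collects controls such as workload and experience level, and a significantly positive $\beta_{\mathrm{do}}$ or $\beta_{\mathrm{view}}$ would indicate learning through that channel.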
The researchers first confirmed that learning by doing worked.
“We show that as the cumulative number of queries the analyst has written increases, they get better and faster at writing the next query,” says Van Mieghem. And the more specific that experience, the better. Productivity on a particular database grew especially fast when analysts had queried the same database repeatedly, rather than moving between different databases.
Viewing the work of peers, on the other hand, seemed to have very little impact on analysts’ efficiency overall. “It was kind of surprising to us,” says Yin.
The researchers began to wonder whose work the analysts were viewing. After all, not all peers’ queries were of equal caliber, and an analyst may be less likely to improve after looking at low-quality code.
So the team sorted the queries into groups based on two attributes of the author: how many queries the author had written, and the author’s reputation, measured both by their page rank on Alation’s internal search platform and by the monthly viewership of their queries.
“We should think about if performance can be evaluated based on the impact you have on peers—how others learn from you and what kind of improvement you give to the community.”
— Yue Yin
They used these attributes to define four types of authors: “All-stars” had both a large query portfolio and high social standing, while “non-stars” had neither. The “lone wolf” had written loads of queries but achieved low social standing; the “maven” was the opposite, someone who was well-regarded despite having written relatively few queries.
Yin and Van Mieghem’s team then took another look at learning by viewing, now accounting for whether the analyst was viewing the work of an all-star, non-star, lone wolf, or maven.
This time, they found that viewing the work of others had a strong—but highly variable—effect on learning.
Looking at code written by authors who were prolific query writers (all-stars and lone wolves) knocked significant time off an analyst’s future queries. “This result echoes a lot of results in the computer science and the learning literature,” says Yin.
However, looking at the work of mavens or non-stars actually made analysts less efficient, increasing the time they took to write queries. This suggests that viewing a piece of code written by a novice—even one with a sterling reputation—may often confuse analysts or set them on the wrong path, wasting their time.
“It actually may be counterproductive,” says Van Mieghem. “And that’s exactly what we found in the data.”
The findings demonstrate that, for would-be teachers, actual experience matters much more than reputation. “You can only be an influencer if you actually write code,” Van Mieghem says.
Beyond the world of programming, the study also hints at ways that all organizations could reap greater benefits from collaboration: experienced employees should be encouraged to share their work with everyone. “But maybe we should say to the rookies, ‘You guys can view [others’ work], but you don’t have to share yours yet,’” says Van Mieghem.
Furthermore, Yin thinks the study suggests a new way for managers to evaluate employees whose work is highly collaborative: “We always think performance should be evaluated by the work you have done. But we also should think about if performance can be evaluated based on the impact you have on peers—how others learn from you and what kind of improvement you give to the community.”
More generally, the research pushes back against the idea that people only learn by accumulating experience on their own. Collaboration, when done right, can lead to real learning, says Van Mieghem.
“When I teach, 60 percent of the work I assign is group work,” he says. “And some may say, ‘Well, is that the best way of proceeding?’ And I think our work gives more evidence that that is the right way.”