Operations Mar 1, 2023
At Their Best, Self-Learning Algorithms Can Be a “Win-Win-Win”
Lyft is using “reinforcement learning” to match customers to drivers, leading to higher profits for the company, more work for drivers, and happier customers.
Sébastien Martin was working at Lyft as a postdoctoral fellow when Covid hit. Suddenly, there were massive changes in the number of passengers and drivers using the app, and the company tried to quickly adapt.
Lyft had always used an algorithm to match drivers and passengers, so they figured they could tweak it to make their Covid plan work. But it ended up being much harder than expected. “It showed the limit of the system,” says Martin, who is now an assistant professor of operations at the Kellogg School.
The main issue, Martin explains, is that simple algorithms—such as matching the closest driver to a passenger—actually don’t work that well.
It got Martin thinking about how the matching algorithm could be improved, even after rideshares recovered from the pandemic. What if the algorithm could teach itself how to better allocate drivers and then make those adjustments in real time?
He and a team from Lyft have accomplished just that. It took more than a year—an eternity at a tech firm, Martin says—to create an algorithm that could engage in “reinforcement learning.” And while designing the algorithm was difficult, so was getting buy-in across the company to even attempt this.
After all, with reinforcement learning, “you give away a lot of control,” Martin says. “A machine that can make decisions without telling you? Imagine if it’s making those decisions about work that’s your bread and butter.”
But the results were worth it: The company began making more money, drivers had more work, and passengers gave more five-star reviews. Plus, their project was named one of six finalists last month for the Franz Edelman Award, the most prestigious award in the field of analytics and operations research. If you’ve taken a Lyft in the last year or two, then this algorithm has helped you get matched to a driver, and the data from your trip in turn helped the algorithm improve.
Against the backdrop of growing apprehension about self-learning algorithms (think ChatGPT), the Lyft story shows that some of these tools truly do improve everyone’s lives, Martin says.
“It’s not always a zero-sum game” of trade-offs between winners and losers, he says. “Passengers are happier. Drivers are busier. The platform is making more money. There is literally no downside.”
Why closest isn’t always best
For most people, especially those of us who have had to stand on a rainy corner waiting for a rideshare, it seems logical that sending the closest driver makes the most sense. But that is not always the case.
The problem arises when it’s busy and drivers are in limited supply, Martin explains. When that happens, the closest driver to a passenger might be pretty far away. If you send that driver, they’ll spend a lot of time “driving empty,” and the passenger is stuck waiting for a long time and may even cancel the ride while the driver is en route. And, crucially, any new passengers who try to hail a ride will need to wait even longer: the available drivers are spending so much of their time getting to their next fare that fewer and fewer are free to shuttle people around.
“It’s like a death spiral for platforms,” Martin says.
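A rough back-of-the-envelope sketch shows why long pickups shrink capacity. The function name and all numbers below are hypothetical, chosen only for illustration, and the model assumes every driver is always either driving to a passenger or carrying one.

```python
def rides_per_hour(num_drivers: int, trip_minutes: float, pickup_minutes: float) -> float:
    """Upper bound on completed rides per hour when every driver is always busy.

    Each ride occupies a driver for the pickup drive plus the trip itself.
    """
    cycle_minutes = pickup_minutes + trip_minutes
    return num_drivers * 60.0 / cycle_minutes


# Hypothetical fleet: 100 drivers, 15-minute trips.
short_pickups = rides_per_hour(num_drivers=100, trip_minutes=15, pickup_minutes=5)
long_pickups = rides_per_hour(num_drivers=100, trip_minutes=15, pickup_minutes=15)

print(short_pickups)  # 300.0 rides per hour
print(long_pickups)   # 200.0 rides per hour
```

In this toy model, tripling the pickup time cuts the fleet's capacity by a third with no change in the number of drivers, which in turn leaves more passengers waiting for the next match.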
The ideal solution, then, would be a matching algorithm that could forecast what the situation will look like over the next few minutes. Will a new, closer passenger appear? Will traffic clear on a certain road, making the drive faster? If the driver does pick up someone, will there be another passenger near the destination point, making that next transition more efficient?
Essentially, the algorithm would need to be able to predict what will happen next. And that’s what Martin and the team at Lyft were able to teach it to do.
They did this by focusing on the “value” of available drivers at any given time, with that value being the estimate of how much money the driver will earn while they work that day. Then they trained their algorithm to continuously analyze what was happening in real time in order to train itself to anticipate what was most likely to happen next.
It’s similar to reinforcement-learning algorithms that play chess, Martin says. They are trained on millions and millions of actual chess games and are then able to use that knowledge to forecast their opponents’ next move.
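The driver-value idea can be illustrated with a toy example. This is a minimal sketch, not Lyft's algorithm: the scoring formula, the helper `best_matching`, and every number are assumptions made up for illustration. In a real system the value estimates would be learned from data rather than hard-coded.

```python
from itertools import permutations


def best_matching(scores):
    """Return (assignment, total) maximizing the summed score.

    scores[d][p] is the score of sending driver d to passenger p.
    assignment[d] is the passenger given to driver d. Brute force,
    so only suitable for a tiny example with passengers >= drivers.
    """
    n_drivers, n_passengers = len(scores), len(scores[0])
    best_total, best_assignment = float("-inf"), None
    for perm in permutations(range(n_passengers), n_drivers):
        total = sum(scores[d][perm[d]] for d in range(n_drivers))
        if total > best_total:
            best_total, best_assignment = total, perm
    return best_assignment, best_total


# Hypothetical inputs: 2 drivers, 3 waiting passengers.
pickup_minutes = [[2, 6, 7],   # driver 0 to passengers 0, 1, 2
                  [8, 3, 4]]   # driver 1 to passengers 0, 1, 2
fare = [10.0, 10.0, 10.0]
# Assumed estimate of what a driver goes on to earn after each drop-off.
value_at_destination = [8.0, 1.0, 9.0]
cost_per_minute = 0.5

# Closest-driver style: minimize total pickup time only.
closest_scores = [[-m for m in row] for row in pickup_minutes]
closest_match, _ = best_matching(closest_scores)

# Value-aware: fare plus future value, minus the cost of driving empty.
value_scores = [[fare[p] + value_at_destination[p] - cost_per_minute * row[p]
                 for p in range(len(fare))] for row in pickup_minutes]
value_match, value_total = best_matching(value_scores)

print(closest_match)  # (0, 1)
print(value_match)    # (0, 2)
```

Here the distance-only rule sends driver 1 to passenger 1, the nearest request. The value-aware rule sends the same driver one minute farther to passenger 2, whose destination is assumed to offer better earnings afterward.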
The team tested their algorithm by creating experimental hours, where Lyft matched drivers and passengers using the reinforcement-learning algorithm, and control hours, where matching was done by Lyft’s regular algorithm.
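The comparison of experimental and control hours can be sketched as follows. The article does not say how hours were assigned or which statistics were used, so the alternating schedule, the function `compare_hours`, and the ride counts are all made-up assumptions for illustration.

```python
from statistics import mean


def compare_hours(hourly_metric, is_experimental):
    """Average a metric separately over experimental and control hours."""
    treated = [m for m, flag in zip(hourly_metric, is_experimental) if flag]
    control = [m for m, flag in zip(hourly_metric, is_experimental) if not flag]
    return mean(treated), mean(control)


# Hypothetical completed rides in six consecutive hours (not real data).
rides = [100, 110, 90, 105, 95, 115]
# Assumed schedule: odd-numbered hours use the new matching algorithm.
is_experimental = [hour % 2 == 1 for hour in range(len(rides))]

treated_mean, control_mean = compare_hours(rides, is_experimental)
print(treated_mean, control_mean, treated_mean - control_mean)
```

A real analysis would also need many more hours and a significance test before drawing any conclusion from the difference.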
After more than a year of refining, they found a new algorithm that bested the old one across all important measures. It generated the equivalent of more than $30 million a year in increased revenue for the company, along with a corresponding increase in drivers’ earnings. Passengers were 3 percent less likely to cancel a ride request, and there were 13 percent fewer ride requests that resulted in having no available driver. At the same time, passengers’ five-star reviews also increased.
“There weren’t more people using Lyft,” Martin says. “The improvement comes from the fact that the drivers are better utilized.”
Beyond the math
Their success is the first documented case of a rideshare company using reinforcement learning. But designing the algorithm was not the only difficult piece.
“More important than the math is how do you do this within the company,” Martin says.
Reinforcement learning means that the humans involved don’t always know what’s going on. That becomes tricky for an organization in a number of ways, Martin says. For example, say the team that works on pricing wants to run its own experiment. They would want all other factors at the time to be kept constant so that they could understand their data. But if a matching algorithm is changing things on its own at the same time, it’s difficult to know how to interpret the data from the pricing experiment.
“It makes a lot of other things much more complicated,” Martin says.
Additionally, it makes it difficult for the team working on the algorithm to understand how to continue to innovate. “If humans lose a sense of what is happening, how can they keep innovating?” Martin asks. He is collaborating with a PhD student, Yudi Huang, who is currently working with Lyft on precisely that question.
Furthermore, at Lyft, the development of this algorithm took more than a year. “A year is a long time for a tech company. Two months is a long time! It’s very rare to spend a year on something that doesn’t work for that long,” he says.
Ultimately, the team kept up its morale and was able to convince the rest of the company to let it keep experimenting. There was no high-tech strategy for this, he says. “It’s the same way you do things anywhere,” he says. “You talk to the right people. You earn the trust of people. You form a team that is excited and then you show proof that it works. It’s common in research to think that the idea itself is enough. But in an organization, it’s the process that leads to something happening.”
The fact that, at least in this case, the process led to a “win–win–win” situation is particularly exciting to Martin.
Each time the team tested a revised algorithm, they would watch a dashboard of important metrics that would turn red if the experiment was worse than the status quo and green if it was better.
The day they landed on their winning algorithm, “the screen was just green,” he says. “That’s really what optimization in operations is all about: finding that fully green thing.”
Emily Stone is senior editor at Kellogg Insight.