At Their Best, Self-Learning Algorithms Can Be a “Win-Win-Win”
Operations Mar 1, 2023

Lyft is using “reinforcement learning” to match customers to drivers, leading to higher profits for the company, more work for drivers, and happier customers.

Riley Mann

Based on the research of

Sébastien Martin

Sébastien Martin was working at Lyft as a postdoctoral fellow when Covid hit. Suddenly, there were massive changes in the number of passengers and drivers using the app, and the company tried to quickly adapt.

Lyft had always used an algorithm to match drivers and passengers, so they figured they could tweak it to make their Covid plan work. But it ended up being much harder than expected. “It showed the limit of the system,” says Martin, who is now an assistant professor of operations at the Kellogg School.

The main issue, Martin explains, is that simple algorithms—such as matching the closest driver to a passenger—actually don’t work that well.

It got Martin thinking about how the matching algorithm could be improved, even after rideshares recovered from the pandemic. What if the algorithm could teach itself how to better allocate drivers and then make those adjustments in real time?

He and a team from Lyft have accomplished just that. It took more than a year—an eternity at a tech firm, Martin says—to create an algorithm that could engage in “reinforcement learning.” And while designing the algorithm was difficult, so was getting buy-in across the company to even attempt this.

After all, with reinforcement learning, “you give away a lot of control,” Martin says. “A machine that can make decisions without telling you? Imagine if it’s making those decisions about work that’s your bread and butter.”

But the results were worth it: The company began making more money, drivers had more work, and passengers gave more five-star reviews. Plus, their project was named one of six finalists last month for the Franz Edelman Award, the most prestigious award in the field of analytics and operations research. If you’ve taken a Lyft in the last year or two, then this algorithm has helped you get matched to a driver, and the data from your trip in turn helped the algorithm improve.

Against the backdrop of growing apprehension about self-learning algorithms (think ChatGPT), the Lyft story shows that some of these tools truly do improve everyone’s lives, Martin says.

“It’s not always a zero-sum game” of trade-offs between winners and losers, he says. “Passengers are happier. Drivers are busier. The platform is making more money. There is literally no downside.”

Why closest isn’t always best

For most people, especially those of us who have had to stand on a rainy corner waiting for a rideshare, it seems logical that sending the closest driver makes the most sense. But that is not always the case.

The issue arises when it’s busy and drivers are in limited supply, Martin explains. When that happens, the closest driver to a passenger might be pretty far away. If you send that driver, they’ll spend a lot of time “driving empty,” and the passenger is stuck waiting for a long time and may even cancel the ride while the driver is en route. And, crucially, any new passengers who try to hail a ride will need to wait even longer: the available drivers are spending so much of their time getting to their next fare that fewer and fewer of them are free to shuttle people around.

“It’s like a death spiral for platforms,” Martin says.

The ideal solution, then, would be a matching algorithm that could forecast what the situation will look like over the next few minutes. Will a new, closer passenger appear? Will traffic clear on a certain road, making the drive faster? If the driver does pick up someone, will there be another passenger near the destination point, making that next transition more efficient?

Essentially, the algorithm would need to be able to predict what will happen next. And that’s what Martin and the team at Lyft were able to teach it to do.

They did this by focusing on the “value” of available drivers at any given time, with that value being the estimate of how much money the driver will earn while they work that day. Then they built the algorithm to continuously analyze what was happening in real time, so that it could teach itself to anticipate what was most likely to happen next.

It’s similar to reinforcement-learning algorithms that play chess, Martin says. They are trained on millions and millions of actual chess games and are then able to use that knowledge to forecast their opponents’ next move.

The team tested their algorithm by creating experimental hours, where Lyft matched drivers and passengers using the reinforcement-learning algorithm, and control hours, where matching was done by Lyft’s regular algorithm.
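A comparison of this kind can be sketched in a few lines of Python. The hourly numbers below are invented, and the metric, the log layout, and the function names `cancel_rate` and `relative_change` are assumptions for illustration, not Lyft’s data or tooling. The sketch pools cancellations over the hours in each arm and reports the relative change between them.

```python
# Invented hourly logs: (arm, ride requests, cancellations).
hourly_logs = [
    ("control", 1000, 100),
    ("experiment", 1000, 90),
    ("control", 1000, 100),
    ("experiment", 1000, 94),
]


def cancel_rate(logs, arm):
    """Cancellations divided by requests, pooled over all hours in one arm."""
    requests = sum(r for a, r, _ in logs if a == arm)
    cancels = sum(c for a, _, c in logs if a == arm)
    return cancels / requests


def relative_change(treated, baseline):
    """Fractional change of the treated metric relative to the baseline."""
    return (treated - baseline) / baseline


control = cancel_rate(hourly_logs, "control")
experiment = cancel_rate(hourly_logs, "experiment")
change = relative_change(experiment, control)
print(f"{control:.3f} {experiment:.3f} {change:+.1%}")  # 0.100 0.092 -8.0%
```

A real analysis would also need to check whether a gap like this is larger than ordinary hour-to-hour noise, which this sketch leaves out.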

After more than a year of refining, they found a new algorithm that bested the old one across all important measures. It generated the equivalent of more than $30 million a year in increased revenue for the company, along with a corresponding increase in drivers’ earnings. Passengers were 3 percent less likely to cancel a ride request, and there were 13 percent fewer ride requests that resulted in having no available driver. At the same time, passengers’ five-star reviews also increased.

“There weren’t more people using Lyft,” Martin says. “The improvement comes from the fact that the drivers are better utilized.”

Beyond the math

Their success is the first documented case of a rideshare company using reinforcement learning. But designing the algorithm was not the only difficult piece.

“More important than the math is how do you do this within the company,” Martin says.

Reinforcement learning means that the humans involved don’t always know what’s going on. That becomes tricky for an organization in a number of ways, Martin says. For example, say the team that works on pricing wants to run its own experiment. They would want all other factors at the time to be kept constant so that they could understand their data. But if a matching algorithm is changing things on its own at the same time, it’s difficult to know how to interpret the data from the pricing experiment.

“It makes a lot of other things much more complicated,” Martin says.

Additionally, it makes it difficult for the team working on the algorithm to understand how to continue to innovate. “If humans lose a sense of what is happening, how can they keep innovating?” Martin asks. He is collaborating with a PhD student, Yudi Huang, who is currently working with Lyft on precisely that question.

Furthermore, at Lyft, the development of this algorithm took more than a year. “A year is a long time for a tech company. Two months is a long time! It’s very rare to spend a year on something that doesn’t work for that long,” he says.

Ultimately, the team kept up its morale and was able to convince the rest of the company to let it keep experimenting. There was no high-tech strategy for this, he says. “It’s the same way you do things anywhere,” he says. “You talk to the right people. You earn the trust of people. You form a team that is excited and then you show proof that it works. It’s common in research to think that the idea itself is enough. But in an organization, it’s the process that leads to something happening.”

The fact that, at least in this case, the process led to a “win-win-win” situation is particularly exciting to Martin.

Each time the team tested a revised algorithm, they would watch a dashboard of important metrics that would turn red if the experiment was worse than the status quo and green if it was better.

The day they landed on their winning algorithm, “the screen was just green,” he says. “That’s really what optimization in operations is all about: finding that fully green thing.”

About the Writer

Emily Stone is senior editor at Kellogg Insight.
