It’s the familiar demand of sales and services businesses played out over the Web: Executives want to ensure that their customers obtain maximum satisfaction while keeping the costs of doing business as low as possible. That problem looms particularly large for online organizations. The combination of growing demand for e-commerce and the increasing complexity of the Internet’s infrastructure has put intense pressure on information technology (IT) planners to harmonize the two requirements in inventive ways. An innovative research project headed by Wuqin Lin, associate professor of managerial economics and decision science at the Kellogg School, provides guidance for doing just that.
Lin’s team applied nonlinear programming and probability theory to the issue of allocating Web systems’ capacity most effectively. The result: “We tell people how to solve this problem,” Lin says. “We tell them to focus on bottlenecks. That’s the key idea. Anyone involved in the design of infrastructure for e-commerce can benefit from it.”
The difficulty of both guaranteeing customer satisfaction and holding down e-commerce costs stems from the intricacy of the Internet’s infrastructure. “You have different types of resources—Web browsing servers, application servers, and database servers,” Lin explains. “As an IT planner, you have to decide how many different types of machines you will need to provide a certain quality of service, such as answering 95 percent of customers’ questions within five minutes.”
Satisfying customers in this instance is critical in two ways. First, individuals who wait too long for a vendor’s Web site to respond to their query will likely log off. Beyond the lost sale, the delay can lead to negative publicity as disappointed customers bad-mouth the vendor. Second, time can equal money: Some online vendors offer specific guarantees of service quality and must pay a penalty if they fail to respond within a specified amount of time.
Lin started the project when he was working as an intern at IBM’s Watson Research Center. IBM scientists Zhen Liu, Cathy Xia, and Li Zhang collaborated on the study. The group looked at online customers’ satisfaction from two points of view. First, they studied the traditional measure of service quality: the average amount of time the Web site takes to respond to a customer. Second, they looked at the “tail distribution” of response times, which captures the chance that a customer will have to wait for a reply far longer than the average response time. Lin describes it as “the likelihood that a call will be answered in twenty seconds, twenty minutes, or any other length of time.”
The study’s inclusion of tail distribution is new and significant. “Very few papers on resource allocation of Web systems have studied the tail distribution,” Lin explains.
The significance of this extension can be demonstrated simply: Suppose there are two Web stores, A and B. Each time a customer visits store A, he waits approximately three minutes for a response from the Web site. If a customer visits store B, depending on the day, he waits zero, three, or six minutes with probabilities of 10 percent, 80 percent, and 10 percent, respectively. The waiting times and the associated probabilities are summarized in Table 1. The average waiting time is the same at both stores: three minutes. However, if every customer loses patience and closes the browser after four minutes, store B loses the 10 percent of its potential sales that face a six-minute wait. Earlier studies would treat the two allocations as yielding the same sales, but Wuqin Lin and his co-authors show that they are not equivalent, highlighting the importance of bottlenecks.
Table 1: Probability of waiting time for a new customer

Waiting time (minutes)    Store A    Store B
0                         0%         10%
3                         100%       80%
6                         0%         10%
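The arithmetic behind the two-store comparison can be checked in a few lines. The sketch below is purely illustrative and is not the authors' method: it assumes store A's wait is exactly three minutes, uses the four-minute patience threshold from the example, and the function names are hypothetical.

```python
# Waiting-time distributions from the example: minutes -> probability.
# Assumption: store A's wait is treated as exactly three minutes.
STORE_A = {3: 1.0}
STORE_B = {0: 0.1, 3: 0.8, 6: 0.1}

def mean_wait(dist):
    """Average waiting time in minutes."""
    return sum((minutes * p for minutes, p in dist.items()), 0.0)

def tail_probability(dist, threshold):
    """Probability that the wait exceeds `threshold` minutes."""
    return sum((p for minutes, p in dist.items() if minutes > threshold), 0.0)

for name, dist in (("A", STORE_A), ("B", STORE_B)):
    print(name, round(mean_wait(dist), 2), round(tail_probability(dist, 4), 2))
```

Both stores average three minutes, yet only store B has a 10 percent chance of keeping a customer waiting past the four-minute mark. That difference is invisible to the average and shows up only in the tail.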
That extra measure of customer satisfaction is just one of the complexities the team faced while trying to develop an approach to maximizing service and minimizing cost. A typical e-commerce system consists of multiple clusters of machines, each of which handles a particular service function. Front-end Web servers, for example, deal with requests for static Web pages. Application servers process requests for dynamic pages and obtain or update information from the system’s database server. Each specific transaction can involve multiple visits to multiple clusters.
Each server can also react to customers’ requests in two ways. Some servers use a first come, first served protocol to deal with transactions. Others use processor sharing, a form of multitasking in which a server divides its capacity among all the requests it is handling at once rather than finishing them one at a time. To some extent, designers of Web systems can select which protocol to use. “The choice will have an impact on the performance,” Lin says. “But sometimes you don’t have a choice.”
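To see why the choice of protocol affects performance, consider a minimal sketch of a single server handling a few requests that all arrive at the same moment. It assumes the standard queueing-theory meaning of processor sharing, in which the server splits its capacity equally among the unfinished requests. The job sizes and function names are hypothetical and do not come from the study.

```python
def fcfs_completion_times(sizes):
    """First come, first served: jobs run one at a time in arrival order."""
    finished, clock = [], 0.0
    for size in sizes:
        clock += size
        finished.append(clock)
    return finished

def ps_completion_times(sizes):
    """Processor sharing: capacity is split equally among unfinished jobs."""
    order = sorted(range(len(sizes)), key=lambda i: sizes[i])
    finished = [0.0] * len(sizes)
    clock = 0.0
    served = 0.0              # work each unfinished job has received so far
    remaining = len(sizes)    # jobs still in service
    for i in order:
        # Each unfinished job gets 1/remaining of the server, so closing the
        # gap to the next-smallest job takes gap * remaining units of time.
        clock += (sizes[i] - served) * remaining
        served = sizes[i]
        finished[i] = clock
        remaining -= 1
    return finished

# Hypothetical workload: a long request arrives just ahead of a short one.
jobs = [3.0, 1.0]
print(fcfs_completion_times(jobs))
print(ps_completion_times(jobs))
```

With these two jobs, first come, first served finishes them at times 3.0 and 4.0, while processor sharing finishes them at 4.0 and 2.0. The short request is no longer stuck behind the long one, although the long request takes a little longer to complete.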
However the e-commerce system is designed, it has to perform a sequence of several actions in response to a customer’s arrival online. A single visit might involve several browsing search requests followed by an order to buy a product. Interspersed among those actions are delays caused by the network as it processes information or by the customer as he or she thinks about the details of the order.
Lin and his team developed a complete picture of the online service environment by mathematically modeling each facet of the service in turn. Their model gives IT designers a mathematical template that they can apply to individual online systems of different sizes, complexities, and service needs. The modeling also reveals a simple truth. “The performance of the whole system,” explains Lin, “largely depends on the bottlenecks.” So the message for designers is to identify the bottlenecks and plan system capacity around them.
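One simple way to picture "focus on bottlenecks" is to compare how heavily each cluster is loaded. The sketch below is a back-of-the-envelope heuristic based on the standard utilization law (arrival rate times visits times service time, divided by the number of servers). It is not the nonlinear program from the paper, and it ignores the tail distribution entirely. All tier names, numbers, and function names are hypothetical.

```python
import math

# Hypothetical tiers: average visits per customer session, service time per
# visit in seconds, and servers currently in the cluster.
TIERS = {
    "web":         {"visits": 10, "service_s": 0.02, "servers": 4},
    "application": {"visits": 4,  "service_s": 0.10, "servers": 4},
    "database":    {"visits": 2,  "service_s": 0.15, "servers": 2},
}

def utilizations(sessions_per_s, tiers):
    """Utilization law: arrival rate * visits * service time / servers."""
    return {
        name: sessions_per_s * t["visits"] * t["service_s"] / t["servers"]
        for name, t in tiers.items()
    }

def bottleneck(sessions_per_s, tiers):
    """Name of the tier with the highest utilization."""
    util = utilizations(sessions_per_s, tiers)
    return max(util, key=util.get)

def servers_needed(sessions_per_s, tier, target_utilization):
    """Smallest server count keeping the tier at or below the target."""
    demand = sessions_per_s * tier["visits"] * tier["service_s"]
    return math.ceil(demand / target_utilization)

rate = 5  # hypothetical arrival rate, sessions per second
print(utilizations(rate, TIERS))
print(bottleneck(rate, TIERS))
print(servers_needed(rate, TIERS["database"], 0.6))
```

In this made-up configuration the database cluster is the most heavily loaded tier, so adding capacity there would help more than adding Web servers that are already lightly used.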
Lin, Wuqin, Zhen Liu, Cathy H. Xia, and Li Zhang (2005), “Optimal capacity allocation for Web systems with end-to-end delay guarantees,” Performance Evaluation, 62: 400-416.