Inefficient incentives and nonprice allocations: Experimental evidence from big‐box restaurants

Funding information: Social Sciences and Humanities Research Council of Canada

Abstract

Queues are puzzling because they are consistent with wasted profit in equilibrium. Standard rationales trace the puzzle to the pricing of goods. This article uses field experimental evidence from large‐scale restaurants to trace the puzzle to the pricing of labor. The customary wage contract in these settings fosters congestion and longer queues because it can encourage workers to emphasize the quality rather than quantity of output. To study this problem, the field experiment pays waiters bonuses for customer volume on days with excess demand, in addition to the tips and hourly wages they customarily receive. The experimental contract shortens queues substantially, generating surplus gains for consumers with no discernible cost in terms of perceived service quality. Workers earn more via the bonuses and because they earn more in tips. Short‐run profits increase by at least 49%. There is no discernible reduction in long‐run profit. The firm reverted to the baseline contract on excess demand days after many months of evidence, even after acknowledging the gains from the experimental contract. The evidence suggests the puzzle may partly be explained by inefficient wage contracting.

informal arrangements are operational, but that the gains are small relative to the experimental contract. I also assess whether the firm generally makes good choices. I show they do, and are particularly skilled at keeping costs down, suggesting that something other than poor decision making explains the initial use of the customary contract.
Despite generally making good choices, and becoming fully aware of a more profitable alternative, the firm decided against permanent adoption of the experimental contract on excess demand days. What explains this decision? There are two factors, according to the CEO. The first is the money. The dollar cost was $200 per shift (20 workers × $10 per worker), which was financed externally by a grant that aims to bridge the gap between academia and the business community. The firm would rather not finance it itself, despite knowing that it generates $2,000 more in revenue per shift. The second factor is that "workers earn enough already"; they earn a fair wage that obliges them to act in the firm's interests. These factors suggest the CEO has an aversion to material spending, particularly when it pays workers more than their perceived worth.
The baseline experimental results contribute to a literature that examines the effects of individual incentive pay (Bellemare & Shearer, 2013; Copeland & Monnet, 2009; Jayaraman, Ray, & de Véricourt, 2016; Lazear, 2000; Paarsch & Shearer, 1999; Shearer, 2004), and that focuses more specifically on the effects in jobs where workers carry out multiple tasks (Muralidharan & Sundararaman, 2011; Dumont, Fortin, Jacquemet, & Shearer, 2008; Brickley & Zimmerman, 2001). Unlike previous studies, this article examines the effects of a scheme that effectively pays workers to lower the quality of their output. It shows that such a scheme can make workers, the firm, and consumers no worse off.
The decision to not pay for customer volume in the pre-experimental period alone raises questions about contractual efficiency, as the decision violates the informativeness principle (Holmstrom, 1979, 1982; Shavell, 1979), wherein all easy-to-observe signals of worker effort are written into wage contracts. Such concerns are reinforced by the decision to not pay for customer volume in the postexperimental period. In this regard, the article contributes to a literature that questions the optimality of decisions by the firm. Early contributions debated the assumption that firms maximize profit (Hall & Hitch, 1939; Machlup, 1946). Recent studies have examined the optimality of behavior by professional athletes or their coaches (Abramitzky, Einav, Kolkowitz, & Mill, 2012; Romer, 2006), by sellers of tickets to sporting events (Sweeting, 2012), by cable companies (Byrne, 2015), by firms in spot markets for electricity (Hortaçsu & Puller, 2008), and by rental car companies (Cho & Rust, 2010). This article examines the optimality of wage contracting in big-box restaurants.

| CONTEXT
The franchises are part of a long chain of higher end big-box restaurants. They are only open for dinner and located in the suburbs of Toronto. The franchises are isolated from competing firms, having their own dedicated building, such that consumers must drive to get there. Consumers cannot be solicited from the street, therefore. The franchises serve approximately 2,500-3,000 customers on average per week.
Prices and product offerings are set centrally by the chain. They are the same for all days of the week. They are fixed over an extended period of time, usually lasting more than a year. Prices and product offerings are the same for all stores within the same broad geographic area.
Approximately half the weekly customer volume comes from busy days (Friday and Saturday evenings). Busy days almost always have queues for seating during the high season, which begins in September and ends in May. Upward of 100 customers can walk away in a single night after learning the expected wait time for a table. The revenue loss from having 100 customers walk away is substantial. Much of this amounts to lost profit because labor costs are effectively fixed.
Figures 1a and 1b use data from busy and slow days during 2006 and 2007 (pre-experiment) to illustrate the importance of customer volume. Figure 1a plots revenue per customer against customer volume, showing it holds constant at roughly 35 dollars, and implying a relatively small opportunity cost to increasing volume. Figure 1b plots the wage bill per customer against customer volume, showing labor is essentially a fixed cost, as the average wage bill declines steeply before stabilizing below 2 dollars per customer. It pays, therefore, to allocate workers to shifts in a way that spreads the wage across as many customers as possible.
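The logic behind Figure 1b can be illustrated with a minimal sketch: if the wage bill is fixed for a shift, the wage bill per customer falls mechanically as volume rises. The function name, the $1,000 wage bill, and the customer volumes below are hypothetical values chosen for illustration, not figures from the firm's data.

```python
def wage_bill_per_customer(fixed_wage_bill, customers):
    """Average wage cost per customer when the shift's wage bill is fixed."""
    return fixed_wage_bill / customers


# Hypothetical shift with a fixed $1,000 wage bill (assumed value).
for volume in (200, 500, 800):
    cost = wage_bill_per_customer(1000.0, volume)
    print(f"{volume} customers -> ${cost:.2f} per customer")
```

Under these assumed numbers, moving from 200 to 800 customers cuts the per-customer wage cost by three quarters, which is the sense in which spreading the wage across more customers pays.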
There are 2-4 managers on duty per shift. Table and customer assignments are usually done in consultation with other managers and support staff (greeters, e.g.). Managers have direct authority over busboys, telling them which tables need to be bussed and when. Managers rotate through these duties. They earn a percentage of the firm's accounting profits.

| Worker incentives
Workers are paid tips and a fixed hourly wage, equaling the minimum wage for servers in Ontario. Tips are at the discretion of customers and effectively proportional to the revenue each worker generates. Tips are not shared with other waiters, but they are shared with the support staff. Workers transfer 4% of the revenue they generate to the support staff at the end of each shift. To fix ideas while keeping things simple, assume tips equal τrn, where τ is the tip rate, r is revenue per customer, and n is customer volume. Workers have several channels for increasing tip earnings at a cost: they can try to increase the tip rate by, for example, spending more time socializing with customers; they can try to generate more revenue from customers by, for example, convincing them to purchase add-on goods; they can try to serve more customers by, for example, moving faster or discouraging purchases of desserts.
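As a minimal sketch of the tip expression above, the following snippet computes gross tips τrn and the 4% revenue transfer to the support staff. The function name and the shift values (a 14% tip rate, $35 per customer, 30 customers) are hypothetical illustrations loosely based on magnitudes reported in the article, not values taken from payroll records.

```python
def tip_earnings(tip_rate, revenue_per_customer, customers, transfer_share=0.04):
    """Return (gross tips, transfer to support staff, net tips) for one shift.

    Gross tips follow the simplification tau * r * n from the text. The
    transfer is a share of revenue (r * n), paid to the support staff.
    """
    revenue = revenue_per_customer * customers
    gross = tip_rate * revenue
    transfer = transfer_share * revenue
    return gross, transfer, gross - transfer


# Hypothetical shift (assumed values): tau = 0.14, r = $35, n = 30.
gross, transfer, net = tip_earnings(0.14, 35.0, 30)
print(f"gross tips {gross:.2f}, transfer {transfer:.2f}, net tips {net:.2f}")
```

Each of the three channels in the text maps to one argument: socializing targets `tip_rate`, add-on sales target `revenue_per_customer`, and moving faster targets `customers`.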
In the opinion of managers, the busy-day losses stem from worker tendencies to focus on customer service. In theory, the problem can stem from the formal wage contract. The contract is problematic either because it overemphasizes personal service (at the expense of customer volume) or because it provides weak incentives for customer volume. The contract can overemphasize personal service because better service can yield higher tip rates and more revenue from each customer. Better service means less to the firm because they do not accrue, at least not fully, the gains from higher tip rates. Because better service often comes at the expense of customer volume, the worker serves fewer customers than the firm would prefer. The contract can provide weak incentives if tips are simply insufficient for encouraging the customer volume the firm prefers.
Our problem departs from classic incentive problems in an important way. In the classic multitasking incentive problem (Holmstrom & Milgrom, 1991), the contract pays the worker for performance in the easy-to-measure task (e.g., quantities), unless it comes at the expense of performance in the hard-to-measure task (e.g., quality). In our setting the customary contract pays the worker relatively more for their performance in the hard-to-measure task: service quality. This could come at the expense of their performance in the easy-to-measure task: customer volume. In the classic problem the firm wants more focus on the hard-to-measure task. They would pay for it but, because the task is hard to measure, are unable to. In our setting the firm wants more focus on the easy-to-measure task. They could easily pay for it, but choose not to.
The firm has incentivized customer volume historically but has typically avoided schemes that involve large outlays of money. They often run contests where the prize is in-kind (a free drink) or reward workers with earlier start times and better tables (for customer volume) if they have a proclivity for generating customer volume without sacrificing much service quality. Further to this point, the firm regularly reminds workers of the importance of customer volume, even going so far as to discourage them from offering customers the dessert menu, encouraging them instead to politely deliver the bill once the main course is done.
That the goal is well-known, and even incentivized, reinforces the research design. A standard concern with experiments inside firms relates to whether treatment effects reflect implicit incentives to appease bosses, rather than the explicit incentives experiments typically offer. If the worker believes treatment nonresponse can cost them the job, for example, then treatment effects will reflect career concerns as well as the response to the bonus. This is less of a concern in our setting because workers knew the goal well before the experiment began.

| EXPERIMENTAL CONTRACT
The bonus amounted to $b\,(n - n^{*})\,I(n > n^{*})$, where $b$ is the bonus rate, $n^{*}$ is a performance standard for customer volume, and $I$ is the indicator function. The bonus rate was chosen so that a worker who exceeded the performance standard by one standard deviation would earn between $20 and $30, or more than 10% of average daily earnings. The performance measure and standard, $n$ and $n^{*}$, were both adjusted for average shift length (hours worked) and average section size (the number of seats a worker is responsible for). The performance standard was calculated using historical data. It helps prevent workers from earning the bonus without changing their behavior.
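The bonus rule described above can be sketched as a short function. This is an illustration only: the function name, the bonus rate of $5 per customer, and the standard of 30 customers are hypothetical, and the inputs are assumed to be already adjusted for shift length and section size.

```python
def volume_bonus(n, n_star, b):
    """Bonus b * (n - n_star) * I(n > n_star).

    The indicator makes the bonus zero whenever customer volume n is at or
    below the performance standard n_star.
    """
    return b * (n - n_star) if n > n_star else 0.0


# Hypothetical values (assumed): b = $5 per customer, n_star = 30 customers.
print(volume_bonus(34, 30, 5.0))  # worker exceeds the standard by 4 customers
print(volume_bonus(28, 30, 5.0))  # worker falls short, so no bonus is paid
```

Because the payout is zero below the standard, a worker who only matches historical performance earns nothing extra, which is how the standard guards against paying for unchanged behavior.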
The experiment was run at one of the franchises during the high season (September until May). The experiment had two treatment blocks. In the first block, workers at the treated franchise were paid bonuses on every Friday and Saturday in November and January of the 2009-2010 season. In the second block, treated-franchise workers were paid bonuses on every Friday and Saturday in late April and May of 2010. The second treatment block differs from the first in that the performance standard was tailored to the individual worker on the basis of the table section they were assigned that evening. The standard was increased if the worker's tables facilitated customer volume historically, and decreased if the tables impeded customer volume historically.
In early October of 2009, the CEO informed workers that someone would be conducting interviews at the firm. The ethics review board requested that I identify myself as an unpaid researcher from the University of Toronto, inform workers that the general purpose of the interviews was to understand factors underlying wait times for customers, and request their participation in the study. The CEO asked me to handle administration of the experiment.
At the time of the interviews, workers were unaware that they were to receive bonuses for good performance starting in November of 2009. During this month, I also made daily appearances at the firm. The purpose was to familiarize workers with the experimenter and to reduce the chances of experimental outcomes reflecting efforts to appease the experimenter. My presence during the control period helps because, if workers were trying to appease me by, for example, undertaking activities that usually reduce wait times, they were probably already doing so before the experimental contract was implemented.
In November of 2009, workers were informed about the performance bonuses on the day of (after they arrived) their previously scheduled shift. To smooth the transition to the experimental contract, the bonuses were introduced as an otherwise typical contest. To ensure that workers understood the scheme, they were asked to demonstrate their understanding in the context of several hypothetical examples before the start of their shift. To ensure that one dollar in tip earnings (which are paid immediately) was equivalent to one dollar in experimental earnings, workers were paid (privately) once the shift was completed. To minimize the influence of sorting on the empirical results, the length of the treatment period was not revealed to workers. Each worker's experience followed a similar pattern on subsequent treated days.
While workers were unaware of the start or end date for the experimental contract, they may form and revise expectations concerning whether and when the contract is available. These expectations can affect worker and manager decisions pertaining to the shifts workers work, and thereby compromise estimates of the pure incentive effect of the experiment. Online Appendix Table A1 uses data on worker bids for shifts and their acceptance by the firm to speak to the concern that the experimental contract altered the matching of workers to shifts (cf. Ackerberg & Botticini, 2002). It shows that the treatment effect on bids is weak, in line with what I was told by workers, namely that other commitments (other jobs, school, family) drive decisions about whether and when to work. It also shows a weak effect on accepted bids, in line with what the CEO instructed managers to do during the study, namely to go about their business as they normally would.
The transfer of tips to the support staff raises questions about the effect of the experimental contract on the support staff, as well as the implications for the experimental results. There are two issues. The first relates to whether the experimental contract elicited a direct response from the support staff. They had an incentive to respond because the contract can increase revenue, and because more revenue implies more transfer income. The second relates to whether workers paid support staff side payments in exchange for help with customer volume.
The support staff showed little to no interest in the study. Many were completely unaware of the particulars of the experiment, having said things like "whatever you're doing with the waiters, it is working." They at best responded indirectly to the extra volume the experimental contract generated. The scale makes side payments implausible in practice because they would lead to significant congestion. The firm would moreover fire any employee caught doing this.
Note that the research design either treats all workers in a shift or none. Complementarity in production stopped us from randomizing within shifts. If the treatment motivates workers to move faster, and the extra speed alleviates congestion, control workers would find it easier to serve more customers. This spillover onto control workers would then lead to underestimates of the true effect of the bonuses.

| DESCRIPTIVE STATISTICS
Managers and greeters log detailed records of the number of customers who leave after learning the expected wait time. I use this variable to measure excess demand. The mean and standard deviation are 33.9 and 34.8, respectively. The maximum is 180. Note that the number of leavers measures excess demand with error, as consumers may leave before learning the expected wait time, perhaps because they infer lengthy wait times upon observing long queues. Later I will discuss the implications for treatment effect estimates for the number of leavers.
I will sporadically make use of tip rate information from bill-level data, which is based on every bill paid by either credit or debit (more than 75% of bills). I will estimate the relationship between tip rates and worker effort and time use. These regressions can be problematic if the payment method varies systematically with personal attributes of customers. This is a problem if, for example, the attributes are used by the firm to assign pay-by-cash customers to poor-service workers, and pay-by-cash customers generally tip less. Then we could overstate the relationship between tip rates and worker effort and time use. Quasi-random matching stops these selective assignments from happening.
Tip rates and service quality measures from the preexperimental period are summarized in Online Appendix Table A2. The tip rate mean is 14.4% with a standard deviation of 4.5 percentage points. More than seven items are sold at a price per item of just over 6 dollars. Bills take about 90 min on average, with about 20 min between bill settlement and the first order on the next bill, and 20 min between the last dessert order and bill settlement (time to linger). As I will explain, these five variables (average price, items sold, time with and between customers, and time to linger) are reasonable measures of worker inputs when aggregated to the daily level.

economic crisis that began in 2008 raises doubts about whether this was the case. The crisis might have led to a smaller change in 2008-2009 relative to 2009-2010, as big spenders may have stopped visiting the franchise in November of 2008 and January of 2009, or the customers who continued to visit spent less than usual. Either way, the crisis could bias the estimates upward.

Data from a comparable franchise help with the concern. Both franchises were opened by the same ownership group. The franchises have identical menus, and thus prices, compensation schemes, variable costs, procedures, organizational structures, and even similar physical layouts. The franchises are located in adjacent subdivisions, being about 30 min apart by car. The main difference between the franchises is scale, as the treated franchise is bigger. Online Appendix A1 explores the similarities and differences in greater detail, explaining and showing how the scale only generates intercept differences between the franchises. Ultimately, with the control franchise, the treated franchise this year and last (and 2006-2007), our full sample consists of about 120 workers, and more than 4,400 worker-calendar date observations.

Table 1 summarizes the unconditional effects explicitly.
Rows 1 and 2 describe the effects on the trade off between customer volume and revenue per customer. Rows 3 and 4 describe the effect on the money payoffs for workers and the firm. Rows 5 and 6 describe earnings from the experiment, as well as the share of workers who earned the bonus. Moving left to right, the table summarizes the sources of variation we will use to interpret the effects causally. Note that the treated franchise targets 20 workers per shift on days with excess demand. The control franchise has fewer workers per shift because, as noted elsewhere, the primary difference between the franchises is scale. Table 1 shows the effects on revenue and tip earnings are robust. The revenue increase ranges from $92 to $163. The increase in tip earnings ranges from $12 to $15. The estimates imply a more than 10% gain in money payoffs for workers and the firm.
Row 2 shows workers serve more customers: around 3 more in November 2009 and January 2010 than they did in October of 2009. The same holds when the difference is compared with the difference for 2008-2009. The estimate varies between 3.3 and 3.5, depending on the sample we use. It is always statistically significant. The effect on revenue per customer is less clear. Columns 3 and 4 imply workers generate less revenue from each customer served. Column 5 shows an increase in revenue per customer. Two of the estimates are statistically insignificant. The ambiguity ultimately highlights the importance of data from multiple franchises. Comparing (3)

Outcomes by the calendar date provide good measures of worker behavior because customer idiosyncrasies are averaged out at this level of aggregation. The idiosyncrasies are averaged out because of quasi-random matching of customers with workers, and because on days with excess demand there are lots of customers. The matching process implies that two workers, with open tables, are equally likely to draw a good customer. Because each worker sees lots of customers, differences in draws balance out over the course of a shift. The average customer is less likely to differ systematically across workers in the same shift.

| CONSUMER SURPLUS
I assume the number of leavers at franchise f on date d is generated by

$$\text{Leavers}_{fd} = \beta T_{fd} + X_{fd}\theta + \gamma_{d} + \varepsilon_{fd},$$

where $T_{fd}$ indicates the availability of the performance incentive. $X_{fd}$ includes an indicator for tailored performance standards and the total number of customer arrivals. Arrivals helps us control for level differences in consumer demand across franchises, as depicted in Online Appendix Figure A1. $\gamma_{d}$ includes fixed effects for the day of the week, week, season, as well as weather controls. The influence of the weather is the same for the two franchises because they are located within 1 hr driving distance of each other.
$\text{Leavers}_{fd}$ measures the true number of leavers with error. It counts the number of consumers who left after hearing the wait time, not the number who left before hearing the wait time. Miscounts are common. Both types of measurement error are more likely during busy service periods. While the error should correlate positively with the true number of leavers, the implications for estimates of β are ambiguous, differing depending on how the error varies with the experimental contract. However, if the experimental contract decreased miscounts or the number who left before hearing the wait time, perhaps because consumers are seated more quickly, then this measurement error will lead to underestimates of the effect on leavers.
Estimates are found in Table 2. Column 2 differs from Column 1 in that it includes fixed effects for the franchise. A comparison of the columns tells us how much of the overall effect comes from within the treated franchise (because the control franchise is never treated) and from differences between the treated and untreated franchises. The comparison provides suggestive evidence on the role of scale in the franchise-level response to the experimental contract because, as noted in the last section and Online Appendix A1, scale is the primary difference between the franchises. I base inference on robust standard errors, but the inferences are robust to several other approaches, including the wild cluster bootstrap clustered at the franchise level.
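The comparison between a pooled estimate and one with franchise fixed effects can be illustrated on synthetic data. The sketch below is not the article's estimation: the data are invented and noiseless, the treatment effect (10 fewer leavers) and the franchise level difference (6 fewer leavers at the treated franchise) are assumptions, and arrivals, date fixed effects, and weather controls are omitted. Franchise fixed effects are implemented through the within transformation (demeaning by franchise).

```python
from statistics import mean

# Synthetic panel (assumed values): franchise 1 is treated on dates 4-7,
# franchise 0 is never treated.
rows = []
for franchise in (0, 1):
    for date in range(8):
        treated = 1.0 if (franchise == 1 and date >= 4) else 0.0
        leavers = 40.0 - 10.0 * treated - 6.0 * franchise
        rows.append((franchise, treated, leavers))


def slope(pairs):
    """OLS slope of y on x (with an intercept) from (x, y) pairs."""
    x_bar = mean(x for x, _ in pairs)
    y_bar = mean(y for _, y in pairs)
    num = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    den = sum((x - x_bar) ** 2 for x, _ in pairs)
    return num / den


def demean_by_franchise(data):
    """Subtract franchise means from treatment and outcome (within transformation)."""
    out = []
    for f in sorted({r[0] for r in data}):
        group = [r for r in data if r[0] == f]
        t_bar = mean(r[1] for r in group)
        y_bar = mean(r[2] for r in group)
        out.extend((r[1] - t_bar, r[2] - y_bar) for r in group)
    return out


beta_pooled = slope([(t, y) for _, t, y in rows])  # no franchise effects
beta_within = slope(demean_by_franchise(rows))     # franchise fixed effects
print(beta_pooled, beta_within)
```

In this constructed example the pooled slope mixes the treatment effect with the assumed level difference between franchises, while the within estimate isolates the response inside the treated franchise.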
Column 1 shows 15.88 fewer leavers under the experimental contract (p < .01), amounting to a 47% reduction relative to a mean of 34. This evidence is consistent with the idea that the customary contract is an important driver of congestion and queues. Column 2 shows 7.48 fewer leavers once we condition on the franchise (p > .1). Approximately half the overall decrease can be attributed, therefore, to the within treated franchise response. The other half comes from between franchise variation, generated by differences in scale, for example. Shorter queues should generate surplus gains for low value consumers, as some of these were reallocated from their outside option (another restaurant, eating at home, e.g.) to a more preferred option. Shorter queues should also generate surplus gains for high value consumers, who would have stayed even with longer queues. These consumers gain via savings on costly waiting time.
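The percentages quoted for Columns 1 and 2 can be checked with simple arithmetic. The sketch below uses the point estimates reported in the text and the sample mean of 33.9 leavers from the descriptive statistics; the variable names are illustrative.

```python
overall_effect = 15.88  # fewer leavers, Column 1
within_effect = 7.48    # fewer leavers, Column 2 (franchise fixed effects)
mean_leavers = 33.9     # sample mean of leavers

# Reduction relative to the mean, and the share of the overall decrease
# that remains once franchise fixed effects are included.
pct_reduction = overall_effect / mean_leavers
within_share = within_effect / overall_effect

print(f"reduction relative to mean: {pct_reduction:.1%}")
print(f"within-franchise share of overall effect: {within_share:.1%}")
```

Both ratios round to 47%, in line with the reported 47% reduction and the statement that approximately half the overall decrease comes from the within-franchise response.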
To evaluate the potential cost of the surplus gains, I estimate

$$\text{TipRate}_{b} = \beta T_{b} + X_{b}\theta + q_{b}\lambda + (T_{b} \times q_{b})\phi + \varepsilon_{b},$$

where b indexes the bill. Note that bills can differ depending on the worker who handles them, the customer, franchise, date, and table. $T_{b}$ indicates whether the bill was handled at the treated franchise on a treated day. $X_{b}$ includes fixed effects for time of day, day of the week, calendar week, and franchise. $q_{b}$ is a compact representation of effort and time use measures. The interaction terms reflect the indirect experimental effect that operated via changes in effort and time use. Note that the specifications exclude time between customers and time to linger in the interest of parsimony and because their coefficient estimates are fragile statistically.

Estimates are found in Table 3. Column 1 gives the base effect, Column 2 includes effort and time use, and Column 3 their interaction with the treatment. The time and effort measures are standardized, so their coefficients measure the effect of a one standard deviation increase. Tip rates increased by about a tenth of a percentage point, but this increase is not statistically different from 0. Column 3 suggests customers became less sensitive to the service measures. The coefficient on average price went from −0.14 to 0.03 (a statistical 0). The coefficient on quantities (items sold) fell from 0.24 to 0.13. There was no change in the coefficient for time with the customer.
A couple of factors explain the negligible effect on tip rates and customers' diminished sensitivity to prices and quantities. First, the bonuses made workers move faster. This is consistent with the reduction in the time between customers and observations made by managers and support staff. Second, customers appeared to spend less time waiting. By shortening wait times, the experiment may have improved customer perceptions of the overall service quality, and may even have improved customer perceptions of the service quality of the worker who served them. In either case, the results suggest that the cost of the surplus gains for consumers was marginal at most.

TABLE 3 Changes in perceived service quality

| WORKER SURPLUS
I assume the outcome $y_{ifd}$ for worker i at franchise f on date d is generated by

$$y_{ifd} = \alpha_{i} + \beta T_{fd} + X_{ifd}\theta + \gamma_{d} + \varepsilon_{ifd}. \qquad (1)$$

$T_{fd}$ indicates the availability of the performance incentive, $\alpha_{i}$ is a worker fixed effect, $\gamma_{d}$ includes fixed effects for the day of the week, the calendar week, and season, and $X_{ifd}$ includes a count of the days worked in our sample, the average days in sample for coworkers who work that day, the worker's start time, and fixed effects for the tables they were assigned. Note that days in sample excludes days from after the start of the treatment. $\gamma_{d}$ reflects trends that influence the franchises in similar ways, such as the weather. Coworker days in sample proxies for the help from others (Drago & Garvey, 1998; Itoh, 1991). It presumes seasoned coworkers are better at helping while, at the same time, balancing their own responsibilities. Start times and table fixed effects help account for differential opportunities to produce and earn more. Early starts and better tables (e.g., ones near the kitchen) give the worker better opportunities to serve more customers. Some tables afford customers greater privacy and comfort.
The parameter of interest is β. β should be positive when the dependent variable is customer volume. β should be negative when the dependent variable is revenue per customer or service quality, if there is a trade off with customer volume. β can be positive or negative when the dependent variable is total revenue. Table 4 reports estimates for revenue and tip earnings. Column 1 reports results for specifications that only include worker fixed effects. The remaining columns illustrate the influences of controls. Columns 1 through 4, and 6 through 9, make no distinction between the two treatments, even though earning a bonus was more difficult when workers had individual performance standards. Columns 5 and 10 separate the effects of the two treatments, and thus of the standard and bonus rate.
Columns 5 and 10 have three notable patterns. First, making no distinction draws down estimates of the incentive effect. From Column 4 to Column 5, the effect on revenue increases from $88 to $113. From Column 9 to Column 10, the effect on earnings increases from $6 to $9. Second, tailoring the standard decreases output and earnings. It decreases revenue by $78, and tip earnings by $10. That said, the estimates imply workers and the firm were no worse off monetarily.
My preferred specification for the remainder of this section includes binary variables for the performance incentive and tailored standard, fixed effects for the worker, day, week, and season. It excludes inconsequential variables, like days in sample (own and peers). It excludes start times and table fixed effects because they are inconsequential and bad controls, reflecting the direct treatment effects on workers and indirect effects that operate through managerial behavior.

TABLE 4 Production and earnings under the experimental contract

                                  Revenue                                 Earnings
                                  (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)     (10)
Performance incentive available   163***  141***  84***   84***   107***  10***   11***   6*      6*      9**
                                  (23)

Note: The unit of observation is the worker-franchise-date combination. All regressions include fixed effects for the worker. Standard errors (in parentheses) are clustered on the worker, with ***p < .01, **.01 < p < .05, and *p < .1.

Table 5 examines the effects on more direct measures of worker behavior: the top panel reports the effect on revenue per customer and customer volume; the middle the effect on revenue per item (average price), items sold (quantities), time with and between customers; the third the effect on base good and add on sales, and on the time to linger. The second and third panels are nested in accordance with the input or task the variable affects.
Workers served 2.5 (p < .01) more customers under the experimental contract, but generated 68 cents less (p < .1) revenue from each. The left nest of the middle panel (under Revenue per customer) shows workers sell 0.1 fewer items to each customer they serve (p < .1). The same nest of the third panel shows workers sell 0.7 fewer items in the part of the bundle that most customers order (p < .05), and 0.3 fewer dessert items (p < .1). The estimates imply the trade off from serving more customers includes a small change in the revenue they generate from each.
The right nest of the second panel shows the experimental contract induces workers to spend 3.4 fewer minutes with customers (p < .01). The right nests of the second and third panels show less time between customers, and less time for customers to linger, but that p > .1 in both cases. The estimates imply a negative relationship between customer volume and time per customer.

Table 6 evaluates whether these patterns are consistent with reductions in service quality. The bottom row correlates the tip rate with the time and effort measures. The estimates come from Column 3 of Online Appendix Table A3, which shows the estimates are largely robust to the inclusion of various controls. The estimates there help to some extent with concerns that "time" and "effort" are equilibrium outcomes. To make the comparison obvious, the top row replicates relevant treatment effects from Table 5.
Customers tip less when they pay higher prices. They tip more when they consume more items, the bill takes more time, there is more time to linger, and when there is more time between customers. The signs are the opposite of the signs for the treatment effects on these inputs. Opposite signs support a reduction in service quality under the experimental contract.

Figure 3 visualizes the evolution of revenue in the treated franchise in the treated and control seasons (2009-2010 and 2008-2009). In between the two treatment blocks, revenue returned to pretreatment levels, and to its levels from the same period last year. There was then a large increase during the second treatment block, similar to the first block increase.

Figure 3 has three implications. First, it implies the results are not a consequence of transitory responses to treatment, including responses that arise because of Hawthorne, placebo, or experimenter demand effects. Second, it implies the problem with workers is not that they are unaware of greater earnings opportunities via less time with individual customers. If they were previously unaware, and learned it from the experiment, then they should continue to deliver worse service after the contract is taken away. Third, in suggesting that the experimental contract is compensating workers for their effort costs, it implies workers are no worse off under the experimental contract, at least in relation to a worker preference that depends on money earnings and effort costs.

| PROFITS
Profits from a shift equal revenue net of variable input costs, summed over all workers, less fixed costs, where R is revenue, F are fixed costs, and ψ denotes the share of revenue allocated to variable (food) inputs. Our primary interest is in measuring the percentage change in profits, %ΔΠ, where I is the experimental incentive cost per worker and δ is the share of every dollar earned that becomes profit (after accounting for fixed costs). The owner told me the variable margin ψ_p on each product category p. We will replace ψ with the maximum across all categories. The owner also told me some recent values for δ. We will take the minimum of these values because the minimum is a lower bound on the profitability of each dollar earned. The values for δ were similar across product categories; the choice for the lower bound has little to no effect on the calculation. After plugging in the reported margins, we obtain %ΔΠ ≈ 49%.¹⁸ This section elaborates on why 49% might provide a good approximation for the change in overall profit.
Note: This table shows that the experiment induced behaviors which are typically bad for tip rates. The top row reports estimates from a single regression of tip rates and measures of worker effort and time use. The bottom row reports estimates from 5 regressions: of the average price on a treatment dummy, items sold on a treatment dummy, time with customers on a treatment dummy, etc. The bottom-row estimates are taken from Table 5. ***p < .01, **.01 < p < .05, and *p < .1.
A revealed preference argument implies that we should not expect a significant negative effect on repeat business.¹⁹ Why would the owner let us conduct the experiment if he expected otherwise? I asked him about it. This is what he said:
"Consumers are smart. They sort themselves into days that best suit their needs. Regulars avoid busy days, as they prefer more attention from the worker. On busy days we get one-timers, consumers who dine out once a year and who prefer a place that is lively and busy."

His statement has two messages. First, busy-day customers are unlikely to return, at least not for a while, no matter the service quality. Second, because of their inexperience, busy-day customers find it more difficult to detect slight changes in service quality. In line with his statement, when I asked if we could pay for customer volume on slow days, the answer, unequivocally, was no. Table 7 corroborates the owner's claim that "diners" visit on slow days and thus that the negative effect on repeat business should be moderate. Slow-day customers pay more in tips, pay more for each item they buy, and buy more items. They are also more likely to linger at the table. It also takes longer to reseat their tables. Somewhat surprisingly, workers spend more time overall with busy-day customers (Column 4). Column 6 makes this less surprising, as it shows extra time with customers is more than balanced out by the extra time it takes to reseat the table. It is also less surprising because slow days have less congestion in the kitchen.
In the Online Appendix I use credit card information to provide more direct evidence of a moderate effect on repeat business. The data allow me to group customers by a "type" and follow types over time. The data identify the customer by the type of card (Gold, Platinum, etc.), the financial institution or bank that issued the card, as well as the financial service provider (Visa, Mastercard, etc.). Online Appendix Figure A2 shows that these repeat types follow a similar evolution, across the franchises, after the end of the first treatment.
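The grouping described above can be sketched as follows. This is a minimal illustration, not the procedure used in the Online Appendix: the record layout, the field names (`tier`, `issuer`, `network`, `week`), and the sample rows are hypothetical, and a "type" is assumed to be the triple of card type, issuing institution, and service provider.

```python
from collections import defaultdict

# Hypothetical transaction records; field names and values are illustrative only.
transactions = [
    {"tier": "Gold", "issuer": "Bank A", "network": "Visa", "week": 1},
    {"tier": "Gold", "issuer": "Bank A", "network": "Visa", "week": 5},
    {"tier": "Platinum", "issuer": "Bank B", "network": "Mastercard", "week": 1},
    {"tier": "Gold", "issuer": "Bank A", "network": "Visa", "week": 9},
]


def customer_type(record):
    """Assumed definition of a customer "type": (card tier, issuer, network)."""
    return (record["tier"], record["issuer"], record["network"])


def visits_by_type(records):
    """Map each type to the sorted list of weeks in which it appears."""
    weeks = defaultdict(set)
    for record in records:
        weeks[customer_type(record)].add(record["week"])
    return {ctype: sorted(seen) for ctype, seen in weeks.items()}


def repeat_types(records):
    """Types observed in more than one week, i.e. candidates for repeat business."""
    return {ctype for ctype, seen in visits_by_type(records).items() if len(seen) > 1}


visits = visits_by_type(transactions)
repeats = repeat_types(transactions)
```

A type is coarser than an individual: two different customers whose cards share the same three characteristics are pooled, so the counts track repeat types rather than repeat customers.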
Several factors can explain the moderate effects on repeat business. First, the surplus gains to consumers under the experimental contract may offset or even outweigh the losses from reductions in some aspects of personal service quality. This could explain the moderate effect and could even increase repeat business. Second, the time horizon for the data may be inadequate for properly capturing the long-run effects on repeat business. The fact that busy-day customers visit infrequently supports this possibility on its own, as in some cases it can take years before the customer returns. Third, and relatedly, it may be difficult to fully capture the long-run costs of the experimental contract. For example, consumers might tip less once they find out about the bonuses, even if there is no reduction in perceived service quality. They may simply feel less pressure to tip because they know the firm is covering more of worker effort costs.
I can say little about the long-run effects of adjustments by competing firms in the absence of exogenous variation in competitive pressure. It is important to note, however, that for various reasons, including the strong brand name and loyal customer base of a big-box restaurant and (perceived) product differentiation, the firm is a leader in local product markets. It generates excess demand on busy days because it is a choice destination for many consumers. Shorter queues allow the firm to keep the lower value consumers among this group. From this perspective, it will be difficult for other firms to erode all the gains from the experimental contract.

| PROPENSITY FOR GOOD DECISIONS
The evidence implies nonnegligible surplus gains from the experimental contract on excess demand days. Why then would the firm initially use the customary contract on these days? The simplest answer might be that the firm is run poorly. I present evidence against this argument. However, I also explain that despite having a propensity for decisions which enhance productive efficiency, and despite becoming fully aware of a more profitable alternative that benefits consumers without imposing significant costs on workers, the firm reverted to using the customary contract on excess demand days following the experiment. I briefly discuss the rationale for reversion.

| Implicit contracts
Managers can assign early start times to workers who tend to serve more customers. They can also assign high-volume workers to more tables or tables that turn over more quickly. I evaluate whether the historic use of these instruments on excess demand days reflects attempts to increase revenue. I will use the ratio of the worker's historical revenue to that of their coworkers (in a shift) to measure their relative productivity. I will then evaluate whether the firm takes relative productivity into account when assigning start times and tables on excess demand days. If it does, then it would support productive efficiency as a central goal.
The top panel of Table 8 reports estimates for regressions of start times and table assignments on relative productivity during the pre-experimental period. Columns 2-5 examine the effects on the quality and quantity of the seats. Booth, bench, and chair seats measure quality in the sense that customers tend to prefer booth seats to benches to chairs. Column 6 examines the effect on how easy it is, given the table assignment, for workers to serve more customers. The historical turn rate is based on the ratio of the historical customer volume of the table to the number of seats.²⁰ Column 7 examines the effect on whether the firm fills the table of the worker. Specifically, it reports the effect on the share of seats that go unused. This column provides a check on the exogeneity of the procedure that matches customers with workers. Column 8 looks at work hours.
Columns 1, 5, and 8 show that high-productivity workers are assigned earlier start times, more seats, and later end times. The evidence in Columns 2-4 and 6 shows relative productivity has modest effects on the quality of seating assignments, and on the extent to which the assignments facilitate volume. The estimates suggest start times, table assignments, and hours are instruments for increasing revenue. The top panel of Table 8 ultimately has two takeaways. First, it seems that the firm was trying to use implicit contracts to increase profit. Second, informal incentives are insufficient for solving the firm's problem. If they were, the experiment would not have had the effect that it did.
This raises concerns about whether the experimental results reflect managerial responses. Accordingly, the bottom panel reports the experimental contract's effect on various assignments by managers. Most estimates imply nonresponse by managers. The lone exception is with start times, which shows workers started 15 min earlier under the experimental contract. Having said that, Online Appendix A.2 explains that the main results are unaffected by controls for start time.
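The two measures used in this subsection can be sketched as follows. The sketch assumes relative productivity is the worker's historical revenue divided by the mean historical revenue of the other workers on the same shift, which is only one reading of "the ratio of the worker's historical revenue to that of their coworkers". The function names and sample figures are hypothetical.

```python
def relative_productivity(historical_revenue, worker):
    """Worker's historical revenue relative to the mean of coworkers on the shift.

    Assumption: "coworkers" means everyone else on the shift, and their
    revenues are averaged. The source does not spell out the aggregation.
    """
    others = [rev for name, rev in historical_revenue.items() if name != worker]
    if not others:
        raise ValueError("need at least one coworker on the shift")
    return historical_revenue[worker] / (sum(others) / len(others))


def historical_turn_rate(historical_customers, seats):
    """Historical customer volume of a table divided by its number of seats."""
    if seats <= 0:
        raise ValueError("a table needs at least one seat")
    return historical_customers / seats


# Hypothetical shift: historical revenue per worker, in dollars.
shift = {"A": 1200.0, "B": 1000.0, "C": 800.0}
ratio_a = relative_productivity(shift, "A")  # 1200 / mean(1000, 800)
ratio_c = relative_productivity(shift, "C")  # 800 / mean(1200, 1000)
turn = historical_turn_rate(historical_customers=600, seats=4)
```

Under this reading, a ratio above 1 marks a worker who has historically out-earned the rest of the shift. Regressions such as those in Table 8 relate start times and table assignments to this ratio.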

| Transfers
The firm told me the transfer rate helps the chain deal with growth in the minimum wage for waiters. A higher transfer rate allows the chain to delay wage hikes for the support staff (not all of whom are paid the minimum). The firm thus uses minimum wage increases to pass support staff costs onto waiters. This is why the transfer rate is positive, and has risen steadily over the past few years. A few years ago it was 3%. In our sample it was 4%.
The increase is consistent with the response of a profit-maximizing principal who faces a limited liability constraint (Sappington, 1983). The introduction or increase of the minimum wage constrains the set of feasible transfers.²¹ It forces the firm to pay workers more in every state of nature. The best response of the firm is to increase the transfer rate, so that workers transfer more in good states, and to therefore alter worker incentives on the margin. The response moves the relationship away from the first best, even if workers are risk neutral.²²

| Reversion
Despite its propensity for efficiency, and despite becoming aware of a surplus-enhancing alternative, the firm reverted to the customary contract on excess demand days. Why? While there are several explanations, including implementation costs (Ferrall & Shearer, 1999), costs of switching contracts across slow and busy days, and worker morale costs (Englmaier & Wambach, 2010; Fehr & Schmidt, 1999), the most informative may be the one offered by the CEO. After the experiment I reminded him that an extra $10 of incentive pay per worker ($200 per shift) delivers about $100 more in revenue per worker ($2,000 per shift). His immediate reaction was "is there a cheaper way to do it?" He added that he felt workers were already earning a fair wage. My takeaway from this conversation was that this particular CEO is very careful with money, especially when it comes to spending more than his reference point of what something is worth.
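The arithmetic behind this exchange can be checked directly. The sketch below uses only the figures quoted above ($10 of bonus per worker, $200 of bonus per shift, and about $100 of extra revenue per worker). The variable names are illustrative, and the break-even margin is a simple implication of those figures, not an estimate reported in the article.

```python
# Figures quoted in the text; everything else is derived from them.
BONUS_PER_WORKER = 10.0          # dollars of incentive pay per worker per shift
BONUS_PER_SHIFT = 200.0          # dollars of incentive pay per shift
REVENUE_GAIN_PER_WORKER = 100.0  # approximate extra revenue per worker per shift

workers_per_shift = BONUS_PER_SHIFT / BONUS_PER_WORKER
revenue_gain_per_shift = workers_per_shift * REVENUE_GAIN_PER_WORKER
revenue_per_bonus_dollar = REVENUE_GAIN_PER_WORKER / BONUS_PER_WORKER

# Smallest share of each extra revenue dollar that must remain after
# variable costs for the bonus to pay for itself.
break_even_margin = BONUS_PER_WORKER / REVENUE_GAIN_PER_WORKER


def net_gain_per_shift(margin):
    """Extra profit per shift for a given margin on the additional revenue.

    `margin` is a hypothetical input: the text here does not report a
    margin figure, so any value passed in is an assumption.
    """
    return margin * revenue_gain_per_shift - BONUS_PER_SHIFT
```

On these figures each bonus dollar is associated with about ten dollars of revenue, so the bonus finances itself whenever the margin on the additional revenue exceeds 10%.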

| CONCLUSION
Queues are puzzling for economists because they are consistent with wasted profit in equilibrium. The puzzle has inspired a generation of research that tries to rationalize queues via conventional profit-maximizing pricing models. These rationales almost always trace the presence of queues to some feature of the goods market, such as a consumer preference for goods that are valued by other consumers (Becker, 1991). The present article traces the presence of queues to a feature of the labor market, namely the wage contracts of workers who facilitate transactions with consumers. In doing so, it accounts for an important reality of production settings where queues are common: their susceptibility to externalities due to congestion.
The article does this using field experimental evidence from big-box franchise restaurants, where an experimental contract paid workers bonuses for customer volume in addition to the tips and hourly wages they normally receive. The treatment was run on busy days when excess demand is predictable. Queues were shorter under the experimental contract. Shorter queues generate surplus gains for high value consumers who were already willing to wait and for low value consumers who would have otherwise pursued their outside option. There was no discernible effect on tip rates, suggesting no significant cost to the surplus gains in terms of service quality. This is not so surprising because shorter waits may have compensated customers for marginal reductions in service quality and because workers tended to move faster during the experiment. The treatment increased worker earnings by 10%, and the evidence suggests that they experienced nonnegative surplus gains. The treatment increased revenue by 10% and profit by at least 49%. I find no evidence of an effect (positive or negative) on profit in the longer run. Altogether the evidence suggests that the experimental contract may be Pareto improving.
It would be of practical interest to compare different mechanisms for exploiting queues. Conventional wisdom among industry insiders attributes long queues to sales of items like dessert, which increase revenue and profit marginally while increasing service times substantially. By this token, an alternative to paying workers for customer volume would be to increase dessert prices sufficiently during predictably busy service periods. Of course, this raises questions about how high the price would have to be to generate gains comparable to the ones found here. Comparable gains through price increases may be prohibitive from the perspective of repeat business or reputation among consumers more broadly. To this end, and to properly compare mechanisms, better long-run measures of customer satisfaction are needed.
This article ultimately complicates rather than solves the puzzle. It raises questions about why neither wage contracts nor goods prices nor some combination thereof are used to deal with queues. It begs for an answer that explains why a well-functioning firm would use the customary contract on excess demand days initially, why such a firm would revert to the customary contract on these days even after becoming aware of a more profitable alternative, as well as the prevalence and practice of uniform contracting more generally.

ACKNOWLEDGMENTS
I thank my supervisors, Dwayne Benjamin, Gustavo Bobonis, and Nicola Lacetera for invaluable guidance during the course of this study. I thank Josse Delfgaauw, Robert Dur, Arvind Magesan, and Bauke Visser for detailed and insightful comments on later versions of the paper. The paper has benefited from conversations with Victor Aguirregabiria, Iwan Barankay, Michael L. Bognanno, Branko Boskovic, David P. Byrne, Yoram Halevy, Joshua Lewis, Hugh Macartney, Robert McMillan, Daniel Parent, Imran Rasul, Vincent Rebeyrol, Carlos Serrano, Aloysius Siow, Trevor Tombe, and Tom Wilkening. The research was supported by the Canadian Labour Market and Skills Research Network, and the Social Sciences and Humanities Research Council (of Canada). All omissions and errors are my own.

ENDNOTES
1 See Lott and Roberts (1991) for a discussion of the costs of charging customers explicitly for the service time.
2 Recent legislation tells us just how "customary" the contract is. In 2008 the U.S. federal government introduced legislation that integrates tips in the calculation of the minimum wage. Under the new legislation, the firm need only compensate the worker up to the point where tips plus the hourly wage equal the mandated minimum. Before the new legislation, the hourly wage had to equal the mandated minimum.