Approximating Purchase Propensities and Reservation Prices from Broad Consumer Tracking

A consumer's web-browsing history, now readily available, may be much more useful than demographics for both targeting advertisements and personalizing prices. Using a method that combines economic modeling and machine learning, I find a striking difference. Personalizing prices based on web-browsing histories increases profits by 12.99%. Using demographics alone to personalize prices raises profits by only 0.25%, suggesting the percent profit gain from personalized pricing has increased 50-fold. I then investigate whether regulations intended to prevent price gouging increase aggregate consumer surplus. The two feasible regulations considered offer at best modest improvements.


INTRODUCTION
Consumers are now tracked on the web by thousands of tracking and telecom companies. 2 These tracking data may be sold for the purpose of targeting advertisements and personalizing prices. In this article, I investigate whether web-browsing data improve predictions of purchase propensities, and whether they increase profits from personalized pricing compared to the scenario where only demographics are available for such targeting.
Historically, advertisers and sellers have had a limited ability to target advertisements and prices due to the nature of advertising transmission and the high costs of acquiring data beyond basic demographics. For example, television advertisers would place ads on shows that catered primarily to particular demographic groups, but individual viewers of a given show could not be separately targeted and shown different ads. Similarly, firms might offer discounts off the regular price to seniors, women, students, or other easily verifiable demographic groups. In recent years, however, robust markets for large data sets on individual behavior, popularly referred to as "big data," have developed. These data can be used to form a hedonic estimate of individuals' purchase propensities and reservation prices. Thus, on the Internet, ads and prices can be targeted to individuals instead of broad demographic groups.
Concerns related to data collection and targeting, such as privacy concerns or concerns about the equity of personalized pricing, are influencing policy. For example, the European Union recently introduced substantial data use regulations, the General Data Protection Regulation (GDPR), which requires Web sites to inform consumers of the specific uses of their data (e.g., personalizing prices) and to obtain their consent. In the United States, President Obama used his address to the Federal Trade Commission (FTC) in January 2015 to express his intent to introduce new legislation, a Consumer Privacy Bill of Rights, which would institute a framework for promoting transparent use of data that is limited by consumers' consent. 3 Excessive privacy invasions and personalized pricing were identified as two major motivating concerns in a White House Report that same month (Office of the President, 2015). Bergemann et al. (2015) validate these concerns by proving that improved price discrimination can have a wide range of impacts on consumer welfare, ranging from eliminating consumer surplus entirely to passing the entire surplus gained from price discrimination on to consumers. Intuitively, firms may extract a greater share of surplus when setting personalized prices close to reservation prices. However, consumer surplus can rise if previously unserved customers are offered lower prices when prices are personalized. 4 This article empirically investigates the impacts of price personalization.
The closest article on behavioral targeting using web-browsing data is Goldfarb and Tucker (2011), which investigates the impact of a law limiting web tracking on stated purchase intentions among banner advertising viewers. They find a 65% reduction in advertising effectiveness. 5 In this article, using a revealed preference approach, I also find evidence suggesting web-browsing histories are very effective for identifying avid consumers.
The prior literature on personalized pricing, beginning with Rossi et al. (1995, 1996) and Chintagunta et al. (2005), among many others, typically considers personalized pricing based on past purchase history of the same product. 6,7 The basic idea is intuitive: If a consumer buys a product frequently, or was previously willing to pay a high price, he or she likely has a high reservation price for the product and can be charged higher prices in future interactions. One can thus use a small set of variables that are intuitively excellent predictors of willingness to pay to set prices. 8 The models described in their papers were designed for contexts like pharmacies and grocery stores, where their methods are widely used today to generate personalized coupons, a less conspicuous form of personalized pricing. A concurrent paper, Dube and Misra (2017), also incorporates machine learning techniques and validates counterfactual predictions of the profit gained from personalized pricing via a field experiment. Their paper is complementary, although it has a somewhat different focus: It investigates the impact of price discrimination in business-to-business transactions, uses a much smaller set of explanatory variables (excluding online browsing histories), and uses an estimation algorithm too burdensome to be applied to data sets with very large numbers of explanatory variables. 9 Finally, other recent papers have examined the impact of telematics-based monitoring on price discrimination in insurance markets (Reimers and Shiller, 2019) and digital disintermediation on negotiated prices (Peukert and Reimers, 2018).
This article presents a method for estimating purchase propensities and optimal individual-level prices from big observational data sets. The method is then used, in one context, to estimate individual purchase propensities and aggregate profits from personalized pricing if nearly 5,000 web-browsing variables are available. Of course, these analyses cannot, by themselves, prove that price discrimination has recently become more effective. Therefore, I compare these results to the analogous outcomes when only demographics-which have long been available-can be used for such targeting.
I employ a two-step estimation procedure. First, I use a least absolute shrinkage and selection operator (LASSO) penalized logit model to estimate individuals' purchase probabilities at observed, nonpersonalized prices. Next, I estimate individual-specific utility parameters in a multinomial logit framework to match the predicted purchase propensities from the first step. Note this method estimates heterogeneous preferences using variation across individuals, and parameters are identified even if only a single market (time period/location) is observed. Due to limitations of available data (lack of price variation, specific tier choice unobserved), not all parameters are identified without additional moment conditions, so I introduce a supply-side optimal pricing condition and an aggregate tier choice moment condition to address this issue.
The estimated individual-specific model parameters govern the relationship between prices and expected individual-level demand (e.g., the probability a particular consumer chooses each of Netflix's products). The slope of expected demand reflects the level of statistical uncertainty when predicting a consumer's choice. Given estimated individual-level demand, I calculate individual purchase propensities at observed prices, simulate counterfactual prices and profits under personalized pricing, and evaluate possible regulatory interventions.
Netflix provides an auspicious context for study. First, analyzing these questions requires individual-level data on both web-browsing histories and all purchases of a particular item, information that rarely appears together in data sets available to academic researchers. 10 However, Netflix subscription status can easily be imputed from web-browsing histories, implying that all needed data reside in available online browsing data sets. Second, because purchases occur online, Netflix could implement personalized pricing based on web data. Third, because interactions with consumers take place online, Netflix could preempt consumer resentment by framing personalized pricing as automatically applied customized coupons-a strategy currently employed by some large online retailers-because coupons are already widely used and tolerated by consumers (Venkatesan and Farris, 2012).
Estimates from the first step are used to investigate the effectiveness of browsing data for targeting advertisements. I find that using web-browsing histories to predict which consumers subscribe yields a higher true positive rate, and a lower false positive rate, than using demographics alone, in a sample of 5,000 households held out from the estimation sample. Web-browsing data are thus more effective than demographics for finding consumers that are reasonably likely to make a purchase.
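The holdout comparison above is based on true and false positive rates. As a minimal illustration (not the paper's actual code), both rates can be computed from predicted subscription probabilities and observed subscription indicators at a chosen classification threshold; the function name and the 0.5 default threshold are hypothetical choices:

```python
def tpr_fpr(probs, labels, threshold=0.5):
    """True/false positive rates at a given threshold.

    probs: predicted subscription probabilities (one per household).
    labels: observed 0/1 subscription indicators.
    Assumes both classes are present in labels.
    """
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    tn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 0)
    return tp / (tp + fn), fp / (fp + tn)
```

In the sense used above, one predictor dominates another if, at comparable thresholds, it attains a higher true positive rate together with a lower false positive rate.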
Simulations further reveal that incorporating web-browsing behaviors substantially raises the amount by which personalizing markups raises profits. Profits are only 0.25% higher if using demographics alone to personalize markups, but 12.99% higher if using all data. These findings suggest web-browsing data make personalized pricing substantially more appealing to firms.
In principle, price discrimination can raise both firm profits and consumer welfare (Bergemann et al., 2015). However, aggregate consumer surplus is found to be lower when markups are personalized, suggesting regulations may be appealing. I simulate outcomes under two different feasible types of regulation to investigate whether price regulations might yield a scenario where both firms and consumers are better off than they are under uniform pricing. In both cases, I find that firms charge somewhat higher prices to less avid consumers to satisfy regulations, preventing regulations from meaningfully raising consumer welfare.
The remainder of the article is organized as follows. Section 2 describes the context and industry background. Section 3 describes the data. Section 4 presents the model, and Section 5 provides estimation details. The main results of the article are then presented in Section 6. A brief conclusion then discusses concerns over perceived unfairness of personalized prices and explains how firms have circumvented these concerns and begun personalizing prices.

BACKGROUND
Netflix, a DVD rentals-by-mail provider, was very popular in the year studied, 2006. Over the course of the year, 11.57 million U.S. households subscribed at some point (Netflix, 2006). This implies that about 16.7% of Internet-connected households consumed Netflix during 2006. 11 Netflix was differentiated from competing alternatives in at least three important ways. First, consumers' main alternative was to travel to a brick-and-mortar store (incurring a travel cost), instead of having movies arrive at their residences. 12 Second, by operating a few large warehouses outside of cities on cheaper land, Netflix was able to stock a much larger variety of videos than local brick-and-mortar rental outlets. Third, Netflix developed a well-regarded recommendation algorithm, which reduced consumers' search costs on the platform. A recent study (Gomez-Uribe and Hunt, 2016) suggests that 80% of viewing choices are attributable to the recommendation algorithm, whereas only 20% of viewing is attributable to consumer search. Given these advantages, Netflix is expected to have had some pricing power, at least during the period studied.
Netflix's subscription plans can be broken into two categories. Unlimited plans allow consumers to receive an unlimited number of DVDs by mail each month, but restrict the number of DVDs in a consumer's possession at one time. Limited plans set both a maximum number of DVDs the consumer can possess at one time and the maximum number sent in one month.
In 2006, there were seven plans to choose from. Three plans were limited. Consumers could receive one DVD per month for $3.99 monthly; two DVDs per month, one at a time, for $5.99; or four per month, two at a time, for $11.99. The unlimited plans, for one to four DVDs at a time, were priced at $9.99, $14.99, $17.99, and $23.99, respectively. 13 None of the plans allowed video streaming, since Netflix did not launch that service until 2007 (Netflix, 2006).
Key statistics for later analyses are the marginal costs of each plan. The marginal costs for the unlimited plans were estimated using industry statistics and expert guidance. They are assumed to equal $6.28 for the one DVD at-a-time plan, $9.43 for the two DVDs at-a-time plan, and $11.32 for the three DVDs at-a-time plan. 14

DATA
The data for this study were obtained from ComScore, through the Wharton Research Data Services (WRDS) interface. The data contain, for a large panel of computer users, demographic variables and the following variables for each Web site visit: the top-level domain name, the time the visit was initiated and its duration, the number of pages viewed on that Web site, the referring Web site, and details on any transactions. 15 For further details on this data set, refer to previous research using it (Moe and Fader, 2004; Montgomery et al., 2004; Huang et al., 2009).

Footnote 11: The total number of U.S. households in 2006, according to Census.gov, was 114.384 million (http://www.census.gov/hhes/families/data/households.html). About 60.6% were Internet-connected, according to linear interpolation between the respective numbers of connected homes in 2003 and 2007 from the CPS Computer and Internet Use supplements. 11.57/(0.606 × 114.384) × 100 ≈ 16.7.
Footnote 12: Blockbuster did offer a DVD-by-mail service starting in 2004 (https://tinyurl.com/yc9fmpwx). However, the service was marginalized because it competed with the core business, which had invested in local brick-and-mortar outlets. The program only grew after incorporating in-store exchanges starting in November 2006. Subscriptions then increased quickly, reaching two million in total by January 2007 (Netflix, 2006).
Footnote 13: A very small number of buyers were observed paying $16.99 per month for the three DVDs at-a-time unlimited plan. These observations were interspersed over time, suggesting the discrepancy was not due to a change in the posted price.
Footnote 14: A former Netflix employee recalled that the marginal costs of each plan were roughly proportional to the plan prices (i.e., the marginal cost for plan j approximately equaled x × P_j, where x is a constant). I further assume that the marginal cost of a plan is unchanging and thus equal to the average variable cost. With this assumption, one can find x by dividing total variable costs by revenues. According to Netflix's financial statement, the costs of subscription and fulfillment, a rough approximation to total variable costs, were 62.9% of revenues, implying x = 0.629. Subscription and fulfillment costs include postage, packaging, cost of content (DVDs), receiving and inspecting returned DVDs, and customer service. See Netflix (2006) for further details. The implied markup over variable cost is 1/0.629 − 1 ≈ 0.59. The markups implied by financial reports from neighboring years were similar but slightly lower (0.53 in 2007, 0.46 in 2005).
Netflix subscription status can be imputed in these data. For a small sample of computer users observed purchasing Netflix on the tracked computer during 2006, subscription status is known. For the rest, I assume that a computer user is a subscriber if and only if he or she averaged more than two subpage views within Netflix's Web site per visit. The reasoning behind this rule is that subscribers have reason to visit more subpages within Netflix.com to search for movies, visit their queues, rate movies, and so forth. Nonsubscribers cannot access as many pages because they cannot sign in. According to this rule, 15.75% of households in the sample subscribe. This figure is within a single percentage point of the estimated share of U.S. Internet-connected households subscribing, found in Section 2. This small difference may be attributed to approximation errors in this latter estimate and ComScore's sampling methods.
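The imputation rule just described can be sketched as follows; the function name and input format are illustrative, not taken from the paper:

```python
def impute_subscriber(netflix_visits):
    """Label a computer user a Netflix subscriber iff they averaged more than
    two subpage views per visit to Netflix's Web site.

    netflix_visits: list of page-view counts, one entry per visit to
    netflix.com; an empty list means the user never visited the site.
    """
    if not netflix_visits:
        return False
    avg_pages = sum(netflix_visits) / len(netflix_visits)
    return avg_pages > 2
```

A user who browses many subpages per visit (searching, rating, managing a queue) is classified as a subscriber; a user who never visits Netflix, or who only hits the landing page, is not.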
For a small subset (a few hundred) of consumers, their Netflix transactions are recorded in the ComScore data, and hence their choice of Netflix tier (one, two, or three DVDs at a time) is observed. I use their tier choices to infer the aggregate share choosing each Netflix tier, assuming this group is representative of the larger population of Netflix subscribers. 16 Specifically, the aggregate share choosing each tier is inferred by multiplying the fraction of consumers choosing any of Netflix's offerings (in the entire sample) by the fraction of consumers selecting each tier among the sample of consumers whose specific tier choice is recorded (in the small sample of consumers, who all subscribed).
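The share computation described above amounts to a simple multiplication. A sketch, with hypothetical tier counts (only the 15.75% overall subscription rate comes from the text):

```python
def aggregate_tier_shares(subscribe_rate, tier_counts):
    """Infer aggregate tier shares by scaling within-sample tier fractions.

    subscribe_rate: fraction of all consumers choosing any Netflix offering.
    tier_counts: {tier: count} among the small sample of consumers whose
    exact tier choice is recorded (all of whom subscribed).
    """
    total = sum(tier_counts.values())
    return {tier: subscribe_rate * n / total for tier, n in tier_counts.items()}

# Hypothetical example: 200 observed transactions split 50/100/50 across tiers.
shares = aggregate_tier_shares(0.1575, {1: 50, 2: 100, 3: 50})
```

By construction, the inferred tier shares sum to the overall subscription rate, so the decomposition is internally consistent with the imputed subscriber count.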
I derived several web behavior variables from the raw data. These include the percent of a computer user's visits to all Web sites that occur at each time of day and on each day of the week. Time of day is broken into five categories: early morning (midnight to 6 a.m.), mid morning (6 a.m. to 9 a.m.), late morning (9 a.m. to noon), afternoon (noon to 5 p.m.), and evening (5 p.m. to midnight).
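Constructing the time-of-day shares can be sketched as below; the bucket boundaries follow the five categories in the text, while the function name and input format are illustrative:

```python
from collections import Counter

# Time-of-day buckets from the text: [start hour, end hour), label.
BUCKETS = [(0, 6, "early_morning"), (6, 9, "mid_morning"),
           (9, 12, "late_morning"), (12, 17, "afternoon"),
           (17, 24, "evening")]

def time_of_day_shares(visit_hours):
    """Share of one consumer's Web site visits falling in each time bucket.

    visit_hours: nonempty list of visit start hours (0-23).
    """
    counts = Counter()
    for h in visit_hours:
        for lo, hi, name in BUCKETS:
            if lo <= h < hi:
                counts[name] += 1
                break
    n = len(visit_hours)
    return {name: counts[name] / n for _, _, name in BUCKETS}
```

Analogous shares by day of week would be computed the same way over seven buckets.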
I then cleaned the data by removing Web sites associated with malware, third-party cookies, video rental chains, and pornography, leaving 4,600 popular Web sites to calculate additional variables. 17,18 The total number of visits to all Web sites and to each single Web site were computed for each computer user.
The cross-sectional data set resulting from the above steps contains Netflix subscription status and a large number of variables for each of the 61,312 computer users. 19 These variables can be classified into three types: standard demographics, basic web behavior, and detailed web behavior. Variables classified as standard demographics were race/ethnicity, children (Y/N), household income ranges, oldest household member's age range, household size ranges, census region (North, South, East, West), and the area and population density of their ZIP code tabulation area from the 2010 Census. Variables classified as basic web behavior include the percent of online browsing by time of day and by day of week, and a broadband indicator. Summary statistics for the demographic and basic browsing variables are shown in Table 1. Note the substantial variation in demographics that could be used to personalize prices. Variables classified as detailed web behavior indicate the number of visits to a particular Web site, with one variable for each of the 4,600 Web sites, as well as the household's total Web site visits and transactions on the web during 2006. The yearly frequency of visits to individual Web sites, averaged across consumers, is shown in Figure 1. The most popular Web site in 2006, msn.com, was visited about 200 times a year by the average person, slightly more than once every other day. However, visits decline quickly with rank. The 50th-ranked Web site (usps.com) was visited by the average customer about 2.2 times per year, partly because patrons visit it infrequently, and partly because 61% of households in the sample never visited the site in 2006. Subsequent sections analyze whether such variations in behaviors across consumers, presumably indicative of preferences and habits, are useful for segmenting consumers and personalizing prices.

Footnote 15: In our correspondence, ComScore representatives stated that demographics were captured for individual household members as they complete "a detailed opt-in process to participate," for which they were incentivized.
Footnote 16: In unreported tests, I confirmed that this sample had a similar predicted propensity to subscribe to Netflix (according to Equation 10 using the full data set) as subscribers whose exact tier choice was unobserved, which might suggest the sample is akin to a random sample. However, it is not possible to test explicitly whether this sample of consumers is random for the purposes of the model; that is, it is not possible to verify whether this sample had the same propensity to consume higher quality tiers as the population at large. If the shares selecting each tier are atypical in this sample, it will bias estimates of the marginal intrinsic utilities for higher quality tiers in the model described in Section 4.
Footnote 17: yoyo.org provides a user-supplied list of some Web sites of dubious nature. Merging this list with the ComScore data reveals that such Web sites tend to have very high (≥ 0.9) or very low (≤ 0.1) rates of visits referred from another Web site, relative to sites not on the list, and rarely appear in Quantcast's top 10,000 Web site rankings. Web sites were removed from the data accordingly, dropping sites with low or high referral rates or not appearing in Quantcast's top 10,000. Manual inspection revealed that these rules were very effective in screening out dubious Web sites. In addition, video rental Web sites were dropped.
Footnote 18: Pornography might contain valuable information, but might also require listing Web site names in publication that might offend some readers.
Footnote 19: ComScore's data set was a rolling panel. Computers not observed for the full year were dropped. A couple hundred computer users with missing demographic information were also dropped.
Prior to estimation, each explanatory variable is normalized to have a mean of zero and a standard deviation of one, as is common when using machine learning techniques that employ regularization. The data were then randomly split into two samples of individuals: an estimation (training) sample of 56,312 individuals, used to estimate model parameters, and a holdout sample of the remaining 5,000 consumers, used to test the model's fit.
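The normalization and sample split might be sketched as follows, in a stdlib-only form with illustrative names (real work would typically use numpy/pandas):

```python
import random

def standardize(column):
    """Rescale one explanatory variable to mean zero, standard deviation one."""
    n = len(column)
    mean = sum(column) / n
    var = sum((x - mean) ** 2 for x in column) / n
    sd = var ** 0.5 or 1.0          # guard against constant columns
    return [(x - mean) / sd for x in column]

def train_holdout_split(ids, holdout_size, seed=0):
    """Randomly partition consumer ids into (training, holdout) samples."""
    rng = random.Random(seed)
    shuffled = ids[:]
    rng.shuffle(shuffled)
    return shuffled[holdout_size:], shuffled[:holdout_size]
```

Applied to the full data set, `train_holdout_split(ids, 5000)` would yield the 56,312/5,000 split described above.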

MODEL AND IDENTIFICATION
This section describes the strategy for estimating individual-level expected demand for Netflix's vertically differentiated products. The model presented is designed for the context studied. In other contexts, with richer data, one may be able to estimate a more flexible model.
In the chosen context (Netflix in 2006), prices did not vary across consumers or time. This alleviates endogeneity concerns, because prices cannot be correlated with changes in unobserved product attributes if prices do not change at all. Hence, the typical concern that endogenous prices will bias the price sensitivity parameter does not apply in this context. However, without price variation, price sensitivities are not identified from the demand-side model alone. Furthermore, without data on each individual's choice of quality tier, individual-specific marginal utilities for higher tiers are not identified.
In order to address these issues, I introduce a supply-side model and aggregate tier choice moment conditions (and model restrictions) that are used to augment the demand-side model. The augmented estimation model assumes that the researcher has ex ante knowledge of marginal costs. The supply-side model yields an optimal-pricing first-order condition used to estimate consumers' mean price sensitivity (assuming marginal costs are known). The augmented model also assumes that consumers agree on the marginal intrinsic utilities provided by higher-quality tiers. Aggregate tier choice shares are used to estimate the mean marginal intrinsic utilities for higher-quality tiers when individual-level tier choice is not consistently observed. Details are below.

4.1. Demand. The aim of the model is to estimate the relationship between prices and expected demand (i.e., probability of purchase) at the individual-consumer level. As in McManus (2008), a heterogeneous agent logit framework is used to model demand for vertically differentiated products. 20 Specifically, consumer i's conditional indirect utility for product j equals

u_ij = ν_i + δ_j + αP_j + σε_ij,

where P_j denotes product j's price, and α and ν_i denote price sensitivity and individual i's intrinsic valuation for the product line, respectively. 21 δ_j represents the marginal intrinsic valuation for higher tiers (δ_1, corresponding to the lowest quality tier, is assumed to equal zero, because it is not separately identified from the mean of ν_i). ε_ij is an i.i.d. error term following the type 1 extreme value distribution. 22 Finally, σ denotes the standard deviation of the error term. The mean utility of the outside good is normalized to zero:

u_i0 = σε_i0.

In the model, the parameters ν_i, δ_j, and α represent estimable components of consumer preferences. Together, they determine consumer i's expected reservation price, based on information available to the firm. The error term (ε_ij) reflects remaining uncertainty. As the error's scale (σ) declines, prediction of which product provides highest utility (product choice) improves (i.e., product choice probabilities move closer to the extremes [0,1]). Analogously, as the scale of all remaining parameters (ν_i, δ_j, α) increases, holding the error scale (σ) fixed, predictions of consumers' choices improve. 23 In practice, the latter (the scale of the remaining parameters [ν_i, δ_j, α]) determines predictive ability, because the error scale (σ) is typically normalized (to one) for the model to be identified. Following convention, I subsequently assume σ = 1. Hence, if predicted-choice probabilities are near the extremes, zero and one, then the model will reflect these strong predictions by estimating a large scale for the observable components (ν_i, α, δ_j).
This would imply that the components of an individual consumer's demand inferred from their browsing history are large compared to the unobserved components of their demand, implying that precise targeting is feasible.
The probability consumer i chooses product j is

s_ij(ν_i, α, δ, P) = exp(ν_i + δ_j + αP_j) / (1 + Σ_{k∈J} exp(ν_i + δ_k + αP_k)),

where J denotes the set of inside products, and δ and P denote the product-tier vectors [0, δ_2, δ_3] and [P_1, P_2, P_3], respectively.
Correspondingly, the probability that consumer i chooses one of the inside products (Netflix's products), as opposed to the outside good (no Netflix product), is

s_ij≠0(ν_i, α, δ, P) = Σ_{j∈J} exp(ν_i + δ_j + αP_j) / (1 + Σ_{k∈J} exp(ν_i + δ_k + αP_k)).

Footnote 20: Similar models are often referred to as "random coefficient" models. However, coefficients are not random in this context; the model explicitly estimates separate coefficients for each observed individual.
Footnote 21: In other contexts, a model that interacts the consumer preference parameter ν_i with tier quality δ_j may be more appropriate. However, in this context, I found that consumers with large subscription probabilities were not substantially more likely to choose higher quality tiers (in contrast to a priori expectations under a multiplicative model), in a small sample of consumers whose specific tier choice was observed.
Footnote 22: I assume ε follows the type 1 extreme value distribution with location parameter equal to the negative of Euler's constant and scale parameter equal to one.
Footnote 23: Suppose that, based on estimable parameters (excluding the error ε), product j is preferred to product k (i.e., αP_j + δ_j > αP_k + δ_k). The probability that product j provides higher utility equals F(αP_j + δ_j − αP_k − δ_k), where F(·) denotes the cumulative distribution function (CDF) of the difference in error terms. Increasing the scale of the estimable parameters by multiplying them by the same constant (> 1) increases the positive difference between (αP_j + δ_j) and (αP_k + δ_k), which increases F(αP_j + δ_j − αP_k − δ_k) (i.e., the probability product j is preferred to product k). Hence, increasing the scale of the estimable parameters ν_i, α, and δ_j raises the probability of choosing the product providing higher utility based on estimable measures toward one, and reduces the probability that the other product(s) are chosen toward zero, implying better predictions.
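Given the utility specification ν_i + δ_j + αP_j + σε_ij with σ = 1, the choice probabilities take the standard logit form above. A minimal sketch with hypothetical parameter values (the prices are Netflix's three unlimited-tier prices from Section 2):

```python
import math

def choice_probs(nu_i, alpha, delta, prices):
    """Logit choice probabilities for the inside products and the outside good.

    The outside good's mean utility is normalized to zero, contributing the
    1.0 in the denominator. Returns (list of inside probs, outside prob).
    """
    exp_v = [math.exp(nu_i + d + alpha * p) for d, p in zip(delta, prices)]
    denom = 1.0 + sum(exp_v)
    inside = [v / denom for v in exp_v]
    return inside, 1.0 / denom

# Hypothetical consumer: nu_i = 0.5, alpha = -0.2, delta = [0, 0.4, 0.6].
inside, outside = choice_probs(nu_i=0.5, alpha=-0.2,
                               delta=[0.0, 0.4, 0.6],
                               prices=[9.99, 14.99, 17.99])
```

The probabilities necessarily sum to one across the three tiers and the outside option, which is the property the moment conditions below exploit.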
The demand model is used to construct two sets of moment conditions: (1) the (machine learning) estimated probability that each individual i subscribes, less the corresponding multinomial logit model prediction (ŝ_ij≠0(X_i) − s_ij≠0(ν_i, α, δ, P)), and (2) the observed aggregate product-tier shares, less the corresponding model predictions (ŝ_j − s_j).

4.2. Supply. The supply-side model closely follows supply-side models in typical random coefficient models, with one important difference. Typically, price sensitivity is identified from the demand side, and the supply-side optimal-pricing first-order condition is used to recover estimates of marginal costs. However, because prices do not vary in the studied context, mean price sensitivity is not identified from the demand side. Instead, ex ante information on marginal cost is assumed (see Section 2), and the supply side is used to estimate mean price sensitivity.
The equation for profit can be simplified based on conversations with the former vice president of marketing at Netflix. He stated that Netflix's prices followed a simple pricing rule: Each tier had roughly the same percent markup over marginal cost. Specifically, P_j(θ) = (1 + θ)c_j, where c_j is the marginal cost of tier j and θ is a markup parameter. Information in Netflix's annual report implies θ = 0.59, and the marginal costs for the three tiers are thus $6.28, $9.43, and $11.32. 24 Under this pricing strategy, the expression for profit is

π(θ) = M Σ_{j∈J} (P_j(θ) − c_j) s_j − FC = M Σ_{j∈J} θ c_j s_j − FC,

where M is the mass of consumers, FC is the fixed cost, and s_j is the aggregate share selecting tier j.
The first-order condition yields

dπ/dθ = M Σ_{j∈J} c_j (s_j + θ ds_j/dθ) = 0,

where ds_j/dθ equals

ds_j/dθ = ∫ (∂s_ij(ν_i, α, δ, P(θ))/∂θ) f(ν_i) dν_i,

where P(θ) denotes the price vector as a function of the markup (P(θ) = (1 + θ) × [c_1, c_2, c_3]), f(ν_i) denotes the density of the individual-specific parameters, and s_ij follows the logit-model framework. 25 Specifically,

∂s_ij/∂θ = α s_ij(ν_i, α, δ, P(θ)) (c_j − Σ_{k∈J} c_k s_ik(ν_i, α, δ, P(θ))).

The third moment condition used in estimation is the derivative of profits with respect to markup (dπ/dθ), from Equation (5). It should equal zero, assuming that the firm chooses the markup to maximize profits.

Footnote 24: See Section 2 for details.
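The supply-side logic can be illustrated numerically: under the constant-markup rule P_j = (1 + θ)c_j with the marginal costs from Section 2, per-consumer variable profit is Σ_j θ c_j s_j(θ), and the optimal markup sets the derivative of profit with respect to θ to zero. The sketch below uses a single hypothetical consumer type and a finite-difference derivative, so the resulting markup is illustrative only, not the paper's estimate:

```python
import math

COSTS = [6.28, 9.43, 11.32]     # marginal costs of the three tiers (Section 2)

def shares(theta, nu=0.5, alpha=-0.2, delta=(0.0, 0.4, 0.6)):
    """Logit tier shares for one hypothetical consumer type at markup theta."""
    exp_v = [math.exp(nu + d + alpha * (1 + theta) * c)
             for d, c in zip(delta, COSTS)]
    denom = 1.0 + sum(exp_v)                 # outside good normalized to zero
    return [v / denom for v in exp_v]

def profit(theta):
    """Per-consumer variable profit: sum_j theta * c_j * s_j(theta)."""
    return sum(theta * c * s for c, s in zip(COSTS, shares(theta)))

def dprofit(theta, h=1e-6):
    """Finite-difference approximation to dpi/dtheta."""
    return (profit(theta + h) - profit(theta - h)) / (2 * h)

# Bisect on the sign of the derivative to locate the markup satisfying the FOC.
lo, hi = 0.01, 5.0
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if dprofit(mid) > 0:
        lo = mid
    else:
        hi = mid
theta_star = 0.5 * (lo + hi)
```

In the article's estimation the logic runs in reverse: the markup θ = 0.59 is observed, and the first-order condition pins down the price sensitivity α instead.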

4.3. Moment Conditions. To recap, the aim is to estimate the preference parameters (ν_i, α, δ_j). Three sets of moment conditions are used in estimation, given that the data are less than ideal (no price variation, and tier choice only rarely observed).
The objective function used to estimate these parameters is

min over (ν, α, δ) of G(ν, α, δ)′G(ν, α, δ),

where G stacks the three sets of moment conditions. The first set of moment conditions is the difference between estimates of individuals' probabilities of subscribing to any tier of Netflix's products less the corresponding multinomial logit-model predictions. The second is the difference between the aggregate share selecting tier j (ŝ_j) and the corresponding model prediction (s_j), ∀j, where s_j = ∫ s_ij(ν_i, α, δ, P(θ)) f(ν_i) dν_i. The last moment condition is the derivative of profits with respect to markup. Estimated demand parameters (ν_i, α, δ) minimize the objective. 26

4.4. Identification and Restrictions. Note that the implied probability of subscribing (s_ij≠0(ν_i, α, δ, P)) is monotonically increasing in ν_i. Hence, conditional on the values of other parameters, the ν_i are identified by the first set of moment conditions, which equate the predicted probability each consumer subscribes (to any of the inside tiers) under the structural framework with the corresponding machine learning prediction from the first step. 27 In the model presented above, there is an analytic solution for ν_i. 28 Note that identification of ν_i does not require observations from multiple markets: the ν_i are identified by variation in subscription probabilities across individuals.
The incremental valuations for higher tiers (δ_2, δ_3) are identified by the second set of moment conditions, which match observed aggregate tier shares (ŝ_j) and model predictions (s_j = ∫ s_ij(ν_i, α, δ, P(θ)) f(ν_i) dν_i). Note that a unique set of δ_j satisfies these moment conditions, because the predicted aggregate share choosing a given tier j is monotonically increasing in δ_j.
Next, note that information on tier choice probabilities under a single price schedule is not sufficient to identify all preference parameters (ν_i, α, δ_j). If one both (i) reduces the price sensitivity (α) by a constant A, and (ii) adds a constant A × P_j to consumer i's intrinsic utility for each inside product j (ν_i + δ_j), then the mean utility for each tier j (and hence the choice probability for each tier) remains unchanged. 29 However, additional information on demand elasticities identifies α and (ν_i + δ_j) separately. Note that if choice probabilities s_ij(ν_i, α, δ, P) are held fixed to satisfy the first two sets of moment conditions (i.e., holding conditional indirect utility fixed by adjusting ν_i + δ_j to compensate for any changes in α), then demand elasticities depend only on the price sensitivity α. For example, in the logit model, an individual's own-price elasticity equals αP_j(1 − s_ij(ν_i, α, δ, P)), implying a monotonic relationship between α and demand elasticity. Thus, only one set of α and (ν_i + δ_j) can rationalize both the choice probabilities and elasticities. Any added information on choice elasticities allows separate identification of α and ν_i + δ_j.
In our context, prices do not vary. Hence, the price sensitivity (α) and the intrinsic values for the inside products (ν_i + δ_j) are not separately identified from the demand side. One can address this issue by using information from the supply side, assuming a constant markup. Assume that for any change in α, the parameters ν_i + δ_j are adjusted so that the subscription probabilities (s_ij) are unchanged, and the first and second sets of moment conditions remain satisfied. Then, larger negative values of α imply more elastic demand and thus a lower optimal markup θ (over ex ante known marginal costs). Only one value of α implies that observed markups are optimally set, satisfying the third moment condition (dπ/dθ = 0).30
4.5. Estimation Routine. I search for parameter values minimizing the objective function stated above in Equation (8). To speed computation, I search over the representative price sensitivity (α) and the marginal utilities for higher tiers (δ_j, for j > 1), nesting within it a search for the individual-level parameters (ν_i) equating the logit-model predicted probabilities of selecting an inside product (s_i,j≠0(ν_i, α, δ, P)) with the corresponding estimates from the LASSO penalized logit model (ŝ_i,j≠0(X_i)).31
26 The model is just identified. Hence, at the optimum, the objective function equals zero.
27 See Lewbel (2019) and Matzkin (1992).
28 When the individual-specific term ν_i enters additively into the utility function, and the error is i.i.d. and follows the type 1 extreme value distribution, an analytic solution exists.
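The analytic solution for ν_i referenced above can be sketched as follows. Under the logit assumptions, the probability of choosing any inside tier equals e^ν_i·A/(1 + e^ν_i·A), where A = Σ_j exp(δ_j + αP_j), so ν_i = ln(ŝ/(1 − ŝ)) − ln A. The parameter values below are hypothetical, for illustration only.

```python
import math

def invert_nu(s_hat, alpha, delta, prices):
    """Analytic nu_i that makes the logit model's subscription probability
    match the first-step (LASSO) prediction s_hat."""
    A = sum(math.exp(d + alpha * p) for d, p in zip(delta, prices))
    return math.log(s_hat / (1.0 - s_hat)) - math.log(A)

def prob_subscribe(nu, alpha, delta, prices):
    """Probability of choosing any inside tier (one minus the outside share)."""
    A = sum(math.exp(d + alpha * p) for d, p in zip(delta, prices))
    return math.exp(nu) * A / (1.0 + math.exp(nu) * A)

# hypothetical values: a 16% predicted subscription probability, three tiers
prices = [9.99, 14.99, 17.99]
delta = [0.0, 0.4, 0.6]
alpha = -0.2
nu = invert_nu(0.16, alpha, delta, prices)
```

Inverting each consumer's ν_i analytically, rather than numerically, is what makes nesting this step inside the outer search over (α, δ_j) computationally cheap.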

5. ESTIMATION
The model is estimated in two steps. The first step-the main focus of this section-estimates purchase probabilities for individual consumers. In the second step, remaining model parameters are estimated as described in the preceding section.
A logistic regression with LASSO regularization is used to estimate the probability that each consumer subscribes to any one of Netflix's services (one, two, or three DVDs at a time), as a function of individual-level observables (e.g., browsing data). The penalized likelihood function equals
L(φ, β) = Σ_i [ I(buy_i) ln(s_i,j≠0(X_i)) + (1 − I(buy_i)) ln(1 − s_i,j≠0(X_i)) ] − λ Σ_k |β_k|,
where s_i,j≠0(X_i) denotes the model-predicted probability of subscribing, I(buy_i) indicates whether consumer i subscribes, λ is the LASSO penalty parameter, β_k denotes the value of parameter k, and
s_i,j≠0(X_i) = exp(φ + X_i β) / (1 + exp(φ + X_i β)).
The estimation routine searches for the parameters (φ, β) maximizing the penalized likelihood for a given penalty parameter. The penalty parameter (λ) is estimated by maximizing the out-of-sample likelihood using twofold cross-validation.
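The first-step estimator can be sketched in miniature. The actual estimation uses thousands of regressors; the toy version below implements the same ingredients on made-up data: a Bernoulli log-likelihood, an L1 penalty whose soft-thresholding step zeroes out weak coefficients, and twofold cross-validation over λ. The data and tuning values are hypothetical.

```python
import math

def predict(phi, beta, x):
    """Logit probability: exp(phi + x*beta) / (1 + exp(phi + x*beta))."""
    z = phi + sum(b * xj for b, xj in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-z))

def loglik(phi, beta, X, y):
    """Unpenalized Bernoulli log-likelihood (used for out-of-sample evaluation)."""
    return sum(yi * math.log(predict(phi, beta, xi)) +
               (1 - yi) * math.log(1 - predict(phi, beta, xi))
               for xi, yi in zip(X, y))

def fit(X, y, lam, steps=3000, lr=0.05):
    """Proximal gradient ascent on the L1-penalized likelihood: a gradient step
    on the likelihood, then soft-thresholding of beta, which sets small
    coefficients exactly to zero (the LASSO's variable selection)."""
    k, n = len(X[0]), len(X)
    phi, beta = 0.0, [0.0] * k
    for _ in range(steps):
        gp, gb = 0.0, [0.0] * k
        for xi, yi in zip(X, y):
            r = yi - predict(phi, beta, xi)   # score residual
            gp += r
            for j in range(k):
                gb[j] += r * xi[j]
        phi += lr * gp / n                    # the intercept is not penalized
        t = lr * lam / n                      # soft-threshold width
        for j in range(k):
            b = beta[j] + lr * gb[j] / n
            beta[j] = math.copysign(max(abs(b) - t, 0.0), b)
    return phi, beta

def twofold_cv(X, y, lams):
    """Pick lambda maximizing the out-of-sample log-likelihood over two folds."""
    half = len(X) // 2
    folds = [(X[:half], y[:half], X[half:], y[half:]),
             (X[half:], y[half:], X[:half], y[:half])]
    return max(lams, key=lambda lam: sum(
        loglik(*fit(Xtr, ytr, lam), Xte, yte)
        for Xtr, ytr, Xte, yte in folds))

# made-up data: the first regressor drives choice; the second is dead (always zero)
X = [[1, 0], [-1, 0], [2, 0], [-2, 0], [3, 0], [-3, 0]]
y = [1, 0, 1, 0, 1, 0]
phi_hat, beta_hat = fit(X, y, lam=0.1)
best_lam = twofold_cv(X, y, [0.01, 0.1, 1.0])
```

The dead regressor's coefficient comes back exactly zero, mirroring how the LASSO discards uninformative variables in the full model.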
A set of 4,633 explanatory variables is considered. In all models, 18 demographic variables are included: (i) indicators for children, race, Hispanic ethnicity, and census region (North, West, and South); (ii) for each of the age, income, and household-size groupings, the ordinal group number (e.g., 18- to 20-year-olds are group number 1), the group number squared, and an indicator for censoring from above; and (iii) linear measures of the area and population density of the household's ZIP code tabulation area. Additionally, some models include the remaining 4,615 explanatory variables summarizing individuals' web browsing, including indicators for the browsing habits listed in Table 1, the intensity of web use (number of site visits) and its square, and the number of visits at each of 4,600 Web sites. All explanatory variables are normalized to have mean 0 and standard deviation 1 prior to estimation.
30 Combining Equations (5), (6), and (7) and solving for α yields an analytic solution for the unique value of α satisfying the last (optimal-pricing) moment condition.
31 An analytic solution for ν_i exists.
NOTES (Table 2): Column (4) shows the corresponding percentage-point increase in subscription probability arising from a one standard deviation increase in the respective variable, averaged across consumers.
A useful feature of LASSO models is that the procedure selects potentially meaningful explanatory variables, setting the coefficients on all other variables to zero.32 In the full model, 938 of the 4,633 candidate explanatory variables retain nonzero coefficients. Of these, only five are demographic variables, suggesting that the web-browsing data provide richer information on consumers' tastes.
The top 30 variables in the full model, ordered by coefficient magnitude, are reported in Table 2. Because all variables are normalized prior to estimation, coefficient size provides a measure of the influence of the variable on estimated subscription probabilities. Note that the 30 most influential explanatory variables do not include any demographic variables, suggesting demographic variables provide little marginal information about subscription probabilities. Also note that the list does not include any basic browsing behaviors like timing or intensity of web use. Rather, a household's tendency to visit particular Web sites seems to contain the most information about their affinity for Netflix.
The intuition linking Web site visits to an innate affinity for Netflix is obvious for some sites. GameFly (#1), Audible (#4), and Amazon (#30), all positive predictors of Netflix subscription, indicate a preference for receiving products by mail, suggesting a higher valuation for Netflix, which at that time delivered DVDs by mail. 4chan.org (#7) is a site for anime enthusiasts interested in content that is typically available at Netflix (which operates large warehouses) but may not be available at the local brick-and-mortar outlets comprising Netflix's competition in 2006. The top sites also include a site catering to consumers interested in technology startups (#26, techdirt) and a movie review Web site (#9, imdb.com). To address the concern that movie review Web sites might be complements for Netflix, raising concerns of reverse causality, the model is reestimated excluding all Web sites categorized as related to "movies" or "TV" (including imdb.com, DVDfab.com, and slysoft.com) according to Alexa web rankings.33 These results are reported in Subsection 6.4.
The impacts of the most influential variables are large. The last column of Table 2 shows the corresponding marginal change in subscription probability, on average across consumers, when visits to the Web site increase by one standard deviation. For example, users who visit GameFly.com one standard deviation more often than average are 11.8 percentage points more likely to subscribe to Netflix at observed prices, implying a roughly 75% (11.8/16 × 100) higher probability of subscribing than the mean consumer, whose subscription probability is about 16%.
The logit demand model parameters (ν_i, α, δ_j) are then estimated via the generalized method of moments, as described in Section 4. Recall that the estimated subscription probabilities (s_i,j≠0(X_i)) from the first stage of the model, the LASSO penalized logit model described in this section, are an input into the second stage, which estimates the structural parameters (ν_i, α, δ_j). Thus, the structural parameter estimates depend on the set of variables used to estimate the subscription probabilities in the first stage. I estimate the model parameters (ν_i, α, δ_j) twice: once using the estimated probability each consumer subscribes based on demographics alone, and a second time using the estimated probability when all variables (demographics and web-browsing data) are used. Profits and welfare under counterfactual pricing schemes can be simulated for each of these sets of parameter estimates, and are compared in Subsection 6.2.
6. RESULTS
6.1. Efficacy of Targeting. One can analyze how well various data types identify likely consumers using only the first stage of the model, which predicts the probability a consumer subscribes from a large set of variables without relying on the assumptions of the structural model. Several tests yield the same conclusion: web-browsing data yield superior predictions of consumers' choices.
I first employ a conventional method used to evaluate fit in the machine learning literature, the receiver-operating characteristic (ROC) plot (Zweig and Campbell, 1993). The ROC plot shows the true positive rate (the share of Netflix subscribers predicted to subscribe) against the false positive rate (the share of nonsubscribers predicted to subscribe). The plot is created by varying an arbitrary threshold. For example, the true and false positive rates can be calculated using a 75% threshold, that is, classifying individuals as predicted subscribers if their subscription probability according to the model exceeds 75%. The ROC plot shows the true and false positive rates for an array of threshold values. A 45 degree line implies the model holds no predictive value; curves farther above the 45 degree line imply better fit. The area under the ROC curve (AUC) yields an intuitive ordinal measure of fit that can be used to compare models. An AUC value of 0.5 implies the model has no predictive value; AUC values closer to the maximum possible value, one, imply better fit. The ROC plot generated using the sample held out from estimation is shown in Figure 2. Visual inspection confirms that the curve for the model predicting subscription using all variables, including browsing histories, lies much farther above the 45 degree line than the curve for the model using only demographic variables, suggesting the model based on all variables yields a much better fit. The AUC of the full model, 0.71, is substantially larger than the AUC of the demographics-based model, 0.55, confirming that the model with web-browsing variables yields much better predictions of which consumers subscribe. Figure 3 illustrates the extent of heterogeneity that can be captured by the different variable sets. The consumer type, defined as ν_i/|α|, is shown on the x-axis.
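The ROC construction and its AUC summary can be sketched directly. The scores and labels below are made up; the auc() function uses the rank-statistic form of the area under the curve (the probability that a randomly chosen subscriber receives a higher predicted probability than a randomly chosen nonsubscriber), which equals the area traced out by sweeping the threshold.

```python
def roc_points(scores, labels, thresholds):
    """(false positive rate, true positive rate) at each classification threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    pts = []
    for t in thresholds:
        tp = sum(1 for s, yl in zip(scores, labels) if s >= t and yl == 1)
        fp = sum(1 for s, yl in zip(scores, labels) if s >= t and yl == 0)
        pts.append((fp / neg, tp / pos))
    return pts

def auc(scores, labels):
    """Probability a random positive outscores a random negative (ties count half)."""
    pos = [s for s, yl in zip(scores, labels) if yl == 1]
    neg = [s for s, yl in zip(scores, labels) if yl == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# hypothetical predicted subscription probabilities and actual choices
scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0]
area = auc(scores, labels)   # 8 of 9 positive/negative pairs correctly ordered
```

A threshold of zero classifies everyone as a subscriber, giving the ROC point (1, 1); a threshold above the maximum score gives (0, 0), reproducing the endpoints of the curve described above.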
It represents the expected intrinsic utility for the lowest-quality tier, normalized by the price sensitivity. Note that the consumer type omits the unobserved error ε_i1; a consumer's actual valuation for the lowest-quality tier equals (ν_i + ε_i1)/|α|. Also note that the consumer type (ν_i/|α|) can be negative, just as intrinsic utilities in classic random coefficient models are negative when purchase probabilities are somewhat small: only those consumers with a large enough unobserved utility error (ε) choose to purchase the respective product. The range of consumer type values is much wider when all information, including web-browsing histories, is used to predict which consumers are most likely to purchase.
6.2. Personalized Pricing Counterfactuals. This section simulates counterfactual environments in which Netflix implements personalized pricing, proper second-degree price discrimination, or both. Specifically, optimal profits under each pricing scheme are simulated separately, first using demographics alone and then using all variables to explain a consumer's willingness to pay.34 They are then compared with simulated profits under the status quo environment, in which Netflix employed constant percent-markup pricing.
Table 3 shows the percent increase in profits from personalizing markups (θ_i), where the percent markup differs across individuals but, for a given individual, is the same across products.35 When all variables are used to personalize markups, profits increase 12.99%.36 If personalized markups are based only on demographics, the increase in total profits is much smaller, 0.25%. The nascent availability of these data thus increases the likelihood that firms will implement price discrimination, because adding web-browsing data substantially increases the amount by which personalizing markups raises profits. Personalizing markups reduces aggregate consumer surplus by 0.50%.37 However, most consumers receive lower prices when prices are personalized, and hence the majority are slightly better off.
35 Percentages instead of absolute profits are reported because simulated variable profits in the status quo case depend on the demand estimates, which can vary slightly depending on which set of variables was used in estimation. In practice, the two status quo profit estimates were quite close, within about half a percent of each other.
36 In this calculation, variable costs are defined as the "cost of revenues" reported in Netflix's 2006 Annual Report (Netflix, 2006). The "operating expenses" in the 2006 financial report are assumed to be fixed costs. These definitions imply variable costs were about $627 million and fixed costs were about $305 million. Revenues in 2006 were about $997 million, implying variable profits were about $370 million and total profits were about $65 million. Multiplying the percent change in variable profits by 370/65 yields the percent change in total profits.
NOTES (Table 3): Linear pricing denotes a consistent percent markup (θ) over cost across all three tiers. Second-degree PD (price discrimination) denotes that the percent markup (θ_j) may differ across the tiers. Personalized denotes that consumers are charged different prices based on their browsing histories and demographics. Bootstrapped standard errors in parentheses.
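Footnote 36's conversion from variable-profit changes to total-profit changes is a one-line scaling, reproduced below using the annual-report figures quoted there (in millions of dollars).

```python
# Figures from Netflix's 2006 Annual Report, as cited in footnote 36 ($ millions)
revenues = 997.0
cost_of_revenues = 627.0      # treated as variable costs
operating_expenses = 305.0    # treated as fixed costs

variable_profit = revenues - cost_of_revenues          # about $370 million
total_profit = variable_profit - operating_expenses    # about $65 million

def pct_change_total(pct_change_variable):
    """Scale a percent change in variable profits into a percent change in
    total profits; fixed costs are unchanged by the pricing counterfactuals."""
    return pct_change_variable * variable_profit / total_profit
```

Because fixed costs do not move, any percent change in variable profits is amplified by the ratio 370/65 when expressed as a percent of total profits.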
Next, as a comparison, I consider second-degree price discrimination, where all consumers face the same prices, but there is a separate markup for each product tier (θ_j). Suppose that when a consumer arrives, preferences are private (i.e., the firm has no information on individual-specific preferences). Under second-degree price discrimination, the markups for each tier are designed so that consumers self-select the tier with an incremental price about equal to their willingness to pay for the incremental quality.38 Despite the fact that all consumers face the same price schedule, consumers end up paying different prices per unit of quality, and the firm's expected profits earned from a randomly arriving consumer increase.
Personalization and second-degree price discrimination are not mutually exclusive, and combining the strategies by setting a separate markup (θ ij ) for every combination of consumer and product tier may increase profits further. When a consumer arrives, the firm may have some information on his or her preferences (e.g., based on their web-browsing history). However, a consumer's exact preferences remain private information. Based on available information and the remaining uncertainty, the firm can form an expected distribution of parameters governing a consumer's preferences. From the firm's perspective, an arriving consumer represents a group of heterogeneous consumers with private preferences, from which the arriving consumer's actual realized preferences are drawn. Just as classic second-degree price discrimination can increase expected profits from an anonymous consumer, combining personalization with the mechanics of second-degree price discrimination can raise profits when consumers retain some private information about their preferences.
The results are shown in Table 4. Changing from the status quo case (constant percent markup over cost, θ) to second-degree price discrimination (separate markup θ j for each tier) increases profits by 22.48%. Switching instead to personalized markups (θ i ) raises profits by 12.99%. Combining the strategies by setting a separate markup (θ ij ) for each combination of consumer i and tier j raises profits by 36.8%.
Second-degree price discrimination raises profits by raising the price of the lowest-quality tier (≈ 18%) and reducing the price of the highest tier (≈ 6%). Increasing the price of the lowest tier reduces profits from that tier, but it also reduces the price differences between the higher tiers and the lowest tier, and thus loosens the incentive compatibility constraints.39 More consumers are willing to pay the resulting smaller incremental price to acquire a higher-quality tier, shifting some consumers to higher tiers that have larger absolute markups. Overall, profits increase. Similar patterns occur when personalizing separate markups for each tier (personalized second-degree price discrimination).
37 The percent change in consumer surplus equals ((CS_personalized − CS_status quo)/CS_status quo) × 100, where, under the logit modeling assumptions, consumer surplus for a given pricing scheme equals Σ_i ln(1 + Σ_{j∈J} exp(αP_j + ν_i + δ_j)) / |α|.
38 In classic theoretical models of second-degree price discrimination, if the firm chooses qualities for an arbitrarily large set of vertically differentiated goods, each consumer pays exactly their full willingness to pay for marginal quality. With a small discrete set of goods, some consumers may pay less than their full valuation for marginal quality.
NOTES (Figure 4): In each panel, personalized prices for each tier are plotted against consumer type. The top left panel shows the range of prices (for each tier) across consumers when the percent markup over cost is personalized using demographic information. The top right panel shows prices across consumers when the percent markup is personalized using the full set of variables. The bottom left panel shows the range of prices across consumers when the firm personalizes the prices of each tier (personalized second-degree price discrimination) using demographics, and the bottom right shows the corresponding range of prices when the prices of each tier are personalized using the full set of available variables.
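The consumer surplus formula in footnote 37 can be computed directly. The sketch below uses hypothetical parameters (not the paper's estimates) and verifies the obvious comparative static: raising prices lowers aggregate consumer surplus.

```python
import math

def consumer_surplus(alpha, nus, delta, prices):
    """Aggregate logit consumer surplus:
    sum_i ln(1 + sum_j exp(alpha*P_j + nu_i + delta_j)) / |alpha|."""
    cs = 0.0
    for nu in nus:
        inclusive = 1.0 + sum(math.exp(alpha * p + nu + d)
                              for d, p in zip(delta, prices))
        cs += math.log(inclusive) / abs(alpha)
    return cs

# hypothetical parameters: three consumer types, three tiers
alpha = -0.2
nus = [-1.0, 0.5, 2.0]
delta = [0.0, 0.4, 0.6]
base_prices = [9.99, 14.99, 17.99]
higher_prices = [p + 1.0 for p in base_prices]

cs_base = consumer_surplus(alpha, nus, delta, base_prices)
cs_high = consumer_surplus(alpha, nus, delta, higher_prices)
pct_change = (cs_high - cs_base) / cs_base * 100   # negative: consumers lose
```

The log-sum term is the standard logit inclusive value; dividing by |α| converts utils into dollars, since α is the (negative) marginal utility of price.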
Using the full set of variables to personalize markups substantially increases the range of prices charged to different individuals for the same product. Figure 4 shows price against consumer type, separately by tier, in four scenarios. The first scenario, appearing in the top left panel, shows that if the percent markup is personalized using demographic variables, the highest offered prices are about 6% higher than the lowest offered prices, for each respective tier. If markups are personalized using the full set of variables (top right panel), the highest prices are 67% higher than the lowest prices offered, for each tier, and 55% higher than the prices offered when markups are not personalized. The 99th percentile consumer is offered a price about 16% higher than the nonpersonalized price, and the 95th percentile consumer is offered a price 5% higher. The 84th percentile consumer faces essentially the same prices whether or not markups are personalized. This is not surprising, because under status quo prices only the 16% of consumers with the highest valuations actually subscribed to Netflix. All consumers with lower predicted valuations are offered lower prices when markups are personalized, some receiving discounts as large as 7%.
39 The derivative of variable profit from tier j with respect to its own price equals Σ_i [s_ij + (P_j − c_j) α s_ij(1 − s_ij)]; it is negative for the first tier. The derivative of the profit from tier j, for j > 1, with respect to the price of the first tier is Σ_i −(P_j − c_j) α s_ij s_i1, which is positive because α < 0.
The bottom panels of Figure 4 show the ranges of prices offered to different consumers when prices are fully personalized, that is, when the firm personalizes prices separately for each tier using demographics (left panel) and using all variables (right panel). Note that, compared to the case where the firm only personalizes markups (top right panel), when the firm personalizes the prices of all tiers separately (bottom right panel), the price of the lowest-quality tier is noticeably higher for consumers with high predicted valuations, to encourage captive consumers to select a higher-quality tier yielding a larger absolute markup. This is confirmed in Figure 5. Raising the price of the lowest-quality tier encourages high-value consumers to switch to the higher-quality tiers, which have larger absolute markups. Even though raising the price of the lowest-quality tier leads those consumers who are (wrongly) predicted to have high valuations to leave Netflix, the gain from encouraging others to switch to more profitable tiers more than offsets these losses.
6.3. Regulations. In principle, price regulations could yield personalized prices that raise both profits and aggregate consumer surplus, compared with the status quo case, a uniform markup charged to all consumers. This can be demonstrated with a simple example. Suppose personalized markups are offered to consumers whose optimal personalized markup is below the flat markup observed, and all other consumers continue to receive the same markup as before. No consumers are offered higher prices. Thus, no one is worse off. However, consumers offered a lower price are better off than before, so aggregate consumer surplus rises. Personalizing prices to a subset of consumers, as opposed to no consumers, increases the firm's profit as well.
In practice, however, mutually beneficial regulations may be elusive. In a world with price discrimination, regulators may not be apprised of the uniform-pricing markup, or of consumer demand in general. Hence, the simplest regulation, explicitly setting a price or markup ceiling, may be challenging to implement well.
Instead, two more feasible price regulations are considered. The first type of feasible regulation will be referred to as a discount penetration regulation, as it limits the percentage of consumers receiving a discount off a list price chosen by the firm. If the discount penetration ceiling is set to 100% (no regulation), then the firm sets a high list price and offers each consumer a personalized discount. This is equivalent to fully personalized pricing. At the other extreme, a discount penetration ceiling of 0% prohibits the firm from offering a discount to any consumer and is therefore equivalent to banning price discrimination. If regulators set the discount penetration rate somewhere in between, at, say, 85%, then the firm could offer as many as 85% of consumers a discount off the list prices.
The impacts of price regulations capping discount penetration at n% are simulated. The simulation consists of an inner step and an outer loop. The outer loop performs a grid search over possible markup ceilings (list prices) to find the markup ceiling that maximizes profits. The inner step selects which consumers to offer discounts off the list prices in order to maximize profits. First, expected profits are calculated for each consumer as if that consumer were offered the list prices. Consumers are then ordered according to the difference between the expected profits they generate at the list prices and at their optimal personalized markup. To satisfy the price regulation while minimizing the impact on profits, the (100 − n) percent of consumers with the lowest difference are offered the list prices; the remaining n% receive personalized discounts. Summing expected profits across consumers yields the total expected profits for a given guess at the markup ceiling (list prices) in the outer loop. The markup ceiling maximizing profits (in the outer loop) is chosen.
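The inner step and outer loop just described can be sketched with a deliberately simplified demand model (unit demand with known willingness to pay, rather than the paper's logit model), which is enough to show the mechanics: an outer grid search over list prices, and an inner step that awards the capped number of discounts to the consumers whose profits gain most from them. All numbers are hypothetical.

```python
def simulate_penetration_cap(wtps, cost, cap_share, list_price_grid):
    """Toy discount-penetration simulation. Each consumer buys one unit iff
    price <= willingness to pay; a personalized 'discount' charges exactly
    the consumer's WTP. At most cap_share of consumers may be discounted."""
    n = len(wtps)
    k = round(cap_share * n)                   # max number receiving discounts
    best_profit, best_list_price = float("-inf"), None
    for lp in list_price_grid:                 # outer loop: grid over list prices
        # profit from each consumer if offered the list price
        at_list = [lp - cost if w >= lp else 0.0 for w in wtps]
        # gain from instead offering a personalized price (the consumer's WTP),
        # which only helps for consumers priced out at the list price
        gains = [w - cost if cost <= w < lp else 0.0 for w in wtps]
        # inner step: give the k discounts where they raise profit the most
        extra = sum(sorted(gains, reverse=True)[:k])
        profit = sum(at_list) + extra
        if profit > best_profit:
            best_profit, best_list_price = profit, lp
    return best_profit, best_list_price

# hypothetical consumers and candidate list prices
wtps = [1.0, 2.0, 3.0, 4.0, 5.0]
grid = [1.0, 2.0, 3.0, 4.0, 5.0]
p_none, _ = simulate_penetration_cap(wtps, 0.0, 0.0, grid)   # discrimination banned
p_some, _ = simulate_penetration_cap(wtps, 0.0, 0.4, grid)   # 40% cap
p_full, _ = simulate_penetration_cap(wtps, 0.0, 1.0, grid)   # no regulation
```

Even this toy version reproduces the qualitative pattern in the text: profits rise monotonically as the cap loosens, and under a binding cap the profit-maximizing list price sits above the unregulated uniform price.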
This entire simulation process is repeated for a range of discount penetration regulation limits, ranging from n = 0% (same markups for all consumers) to n = 100% (no regulation). The impact of these regulations on the firm is intuitive. Profits are monotonically declining in the strength of price regulations (inverse of discount penetration rate).
The simulations reveal that prohibiting price discrimination entirely maximizes consumer surplus. However, consumer surplus can be higher absent regulation than with limited regulation. Reducing the percent of consumers allowed to receive personalized discounts from 100% (no regulation) to 97% reduces consumer surplus by 0.13%. For discount penetration cap regulations between 1% and 96%, consumers in aggregate are better off than they are under full price discrimination, but worse off than they are under uniform pricing. Thus, although it is theoretically possible for less stringent regulations to induce personalized prices that increase consumer surplus, such an outcome is not realized. Consumer surplus is maximized when price discrimination is banned entirely.
In order to understand why consumers are unable to benefit from the less stringent regulations, consider the impact at the individual consumer level. Consider the case where the discount penetration cap is set at 85.43% (i.e., the percent of consumers whose profit-maximizing personalized markup is below the uniform markup observed empirically). The profit-maximizing markup ceiling (determining list prices) is 72%, which is higher than the observed uniform markup of 59%. This higher markup aims to extract surplus from avid customers. However, to adhere to the regulation's quota, the firm must offer this markup to 14.57% of consumers. The list-price markup (72%) thus must be applied to some consumers who would be offered lower personalized prices absent regulation. Not surprisingly, the optimal personalized markups for some of these consumers are close to the list markup. However, the firm also charges list prices to some consumers who are unlikely to buy at any price, consumers with much lower optimal personalized markups, because these consumers generate little profit for the firm regardless. The firm is better off using these low-value consumers to satisfy the regulation's list-price quota, as illustrated in Figure 6. Notice that limited regulations result in some consumers paying higher prices than they would absent regulation, and some paying higher prices than they would if price discrimination were banned entirely. Thus, some consumers are made worse off by this limited regulation, regardless of whether the alternative is full regulation (banning price discrimination) or no regulation.
Next, I investigate the impacts of a second feasible price regulation that limits the depth of the discount, that is, the maximum difference in prices across consumers, instead of limiting the fraction of consumers receiving a discount. In this scenario, the firm is free to set the maximum markup (price) and to provide personalized discounts to any number of consumers.
However, the depth of the discount cannot exceed some set threshold. In order to evaluate this type of regulation, I simulate outcomes under different discount depth limits between 5% and 50%, in increments of five percentage points.
Although outcomes differ slightly under a regulation that limits discount depth, the conclusions are essentially unchanged. Regulating markup personalization by setting a maximum discount depth weakly reduces firm profits. When the discount percentage is capped at 10%, aggregate consumer surplus is slightly higher, by 0.09%. However, for the other limits considered, aggregate consumer surplus is lower, and in most instances lower than it is without regulation. Again, the reason is that consumers predicted to have low willingness to pay actually face higher prices under the regulation, and are made worse off. Referring to the bottom panel of Figure 6, one can see personalized prices when the discount depth is capped at 15%. Compared to the first feasible price regulation (limiting the fraction of consumers receiving a discount), the least avid consumers (consumer types ν_i/|α| below −6) receive lower price offers. However, the lowest price offered to anyone when discount depth is capped at 15% is 3.3% larger than the lowest price offered under the first regulation, and 4% larger than the lowest price offered absent regulation. The firm's optimal behavior under these regulations is not necessarily surprising. Firms sacrifice profits from the least avid consumers to satisfy regulations, while preserving the ability to profit by raising prices to avid consumers. In fact, 42% of the gains from personalization are attributed to the 0.5% of consumers predicted to have the highest valuations for Netflix, 59% of the gains come from the 1% most avid, and 78% of the gains from the 5% most avid. Note that because only 16% of Internet-connected consumers subscribed, the top 0.5% of consumers account for over 3% of subscribers, and the top 5% of consumers account for nearly one third of actual subscribers. Still, these findings show that most of the profit gains from personalizing markups arise from increasing prices to the most avid consumers.
In summary, discriminatory pricing can in theory increase both producer surplus and aggregate consumer surplus. Regulators, therefore, might attempt to use limited regulations to reach such an outcome, instead of prohibiting personalized pricing entirely. However, the simulations reveal that less stringent price regulations may either yield very modest gains in aggregate consumer surplus or, in many cases, actually reduce consumer surplus. Moreover, these regulations have the unintended consequence of increasing prices for the least avid consumers, the very consumers who would otherwise be offered the largest discounts. Less stringent regulations are thus unappealing in some ways.
6.4. Robustness Checks. Table 5 shows that the increase in profits from personalized pricing is robust to some modeling assumptions. The first concern is that Netflix may have underpriced in the short run to grow the business, and hence the static optimal-pricing conditions may rest on a questionable assumption. However, I find that even if one assumes the optimal prices were double the observed prices (implying a much higher markup) and reestimates the model, the increase in profits from price personalization is roughly the same, at least in percentage terms. The second concern is that movie review Web sites such as IMDB.com and RottenTomatoes.com might be complements for Netflix's products. If so, visiting them might not just indicate a higher intrinsic affinity for Netflix, but also indicate that a consumer already subscribes. Although this may be true, the impact of movie review Web sites on the main results is relatively small. Dropping all Web sites categorized by Alexa Web Rankings as related to movies or TV, along with Wikipedia, and rerunning the model lowers the percent gain from price personalization from 12.99% to 10.49%.
Even with such sites dropped, personalized pricing based on browsing histories is much more effective than personalized pricing based on only demographics. Finally, I repeat the personalized pricing simulations excluding the most and least avid consumers, the top and bottom half percent, out of concern that functional form assumptions may cause imprecision in the estimates at the extremes. Excluding these extreme consumers, the profit increase from personalized pricing is estimated to be 0.075% when only demographics are used, and 6.9% when browsing data are used to personalize prices. Although the gain to personalized pricing falls when the extreme consumers are excluded, the main conclusion remains the same: web-browsing data are much more effective than demographic data for personalizing prices.
SOURCE (Figure 8): https://tinyurl.com/y827fuzr. NOTES: Screenshot taken on Amazon.com on September 21, 2018, while logged into a user's account. In order to redeem the coupon, the consumer merely needs to click the box. At checkout, the price is listed at $63.99, instead of the originally listed price of $79.99.

7. DISCUSSION AND CONCLUSION
This article finds, in the context of Netflix, that web-browsing data substantially improve targeted advertising and increase profits from personalized pricing, relative to when only demographic data are available. Better ad targeting may benefit advertisers, firms, and even consumers, but the efficiency and equity effects of widespread personalized pricing are less well understood. Most textbooks espouse the efficiency of personalized pricing based on partial equilibrium analysis. However, I find that the benefits accrue to firms, not to consumers in aggregate, and feasible regulations allowing limited amounts of price discrimination do not appear to yield large benefits.40
A related question is whether it is fair for consumers to pay different prices for the same product. Kahneman et al. (1986) find that personalized pricing was viewed as unfair by 91% of respondents. Yet the prevalence of third-degree price discrimination (and coupons) suggests firms can profit by (effectively) offering different prices to different groups, and that consumers do, in practice, accept paying different prices than others for the exact same good.
Still, perceived fairness remains an important business consideration, and Amazon's multiple attempts at personalized pricing are an instructive example. In 2000, customers who discovered they were being offered different prices on Amazon reacted with fury.41 In response, Amazon refunded the price differences and stopped personalizing prices. Over the last decade, a variety of firms began employing personalized pricing but were more careful than Amazon about framing. Personalized prices were not called by their true name, but rather were labeled "customized coupons" or "personalized discounts." They are not coupons in the traditional sense; they are automatically applied when the consumer reaches the Web site and typically require no action on the customer's part (or sometimes just one click). They are merely called "coupons" or "discounts" to address concerns over perceived unfairness. In 2017, Amazon followed suit. Specifically, Amazon began allowing third-party sellers to offer coupons to consumers who had viewed or purchased certain products, designated by Amazon's internal product codes (ASINs).42 Figure 7 shows an example of the seller interface allowing targeted couponing. Using the coupon requires trivial effort from the targeted consumer: merely clicking the coupon box, as shown in the example in Figure 8. Hence, Amazon allows third-party sellers to offer different prices to consumers based on their browsing histories at Amazon. Such pricing may not be widespread at Amazon (yet), but its existence and growing use constitute a conspicuous example of a major retailer embracing personalized pricing, under a different name, to address concerns over perceived unfairness.