How Video Rental Patterns Change as Consumers Move Online

How will consumption patterns for popular and “long-tail” products change when consumers move from brick-and-mortar to Internet markets? We address this question using customer-level panel data obtained from a national video rental chain as it was closing many of its local stores. These data allow us to use the closure of a consumer's local video store as an instrument, breaking the inherent endogeneity between channel choice and product choice. Our results suggest that when consumers move from brick-and-mortar to online channels, they are significantly more likely to rent “niche” titles relative to “blockbusters.” This suggests that a significant amount of niche product consumption online is due to the direct influence of the channel on consumer behavior, not just due to selection effects from the types of consumers who decide to use the Internet channel or the types of products that consumers decide to purchase online. This paper was accepted by Pradeep Chintagunta, marketing.


Introduction
A variety of papers and articles have documented large differences between the types of products purchased online and offline (e.g., Brynjolfsson, Hu, and Smith 2003; Anderson 2006; Brynjolfsson, Hu, and Simester 2011). However, these differences in purchase patterns across channels could be solely due to selection effects: heterogeneous consumers may sort into channels based on their tastes, or consumers may choose the channel based on the types of products they want to purchase. Our objective in this paper is to investigate whether observed differences in consumption patterns between online and offline markets are solely due to selection, or whether the nature of the channel has some impact on consumers' choices.
Answering this question requires the use of customer-level panel data on online and offline purchases and an exogenous shock affecting consumers' channel choices. In this paper we use just such a dataset for DVD rentals. We base our empirical analysis on household-level panel data from a large video rental chain that closed many physical locations during our study period. The market for DVD rentals has traditionally exhibited “superstar” effects, where a few top-selling products take the majority of all revenues (Rosen 1981); our focus is to empirically examine how the introduction of online commerce has changed the share of transactions taken by superstar versus niche DVDs.
In terms of descriptive statistics, we note that the top 100 most popular DVDs make up 85% of in-store rentals for our focal company but account for only 35% of the company's online rentals. However, we cannot use these simple statistics to conclude that online markets change consumer behavior because these statistics may be solely due to selection effects. Our approach to study whether online markets change consumer behavior is to examine how household-level rental patterns for popular and niche titles change when the exit of physical stores forces consumers to move from offline to online channels.
Specifically, we use the exit of physical stores as an instrumental variable for the online versus offline channel choice. This instrument exploits transportation cost changes experienced by individuals located near the exiting stores. Our findings indicate that characteristics of the online channel cause superstar DVD titles to take a smaller share of the market as consumers shift from offline to online marketplaces.
From the standpoint of theory, online markets may (or may not) transform markets that have traditionally exhibited “superstar” effects. Various supply-side or demand-side mechanisms can cause long tail or superstar markets (Brynjolfsson, Hu, and Smith 2006; Brynjolfsson, Hu, and Smith 2010). For example, the selection of products available from the Internet channel is often much wider than the selection available at physical stores.
Online marketplaces can offer a larger selection of products than traditional physical stores can because the online channel has lower storage and inventory costs and there are no shelf space limitations. As a consequence, the concentration of overall sales across products may tend to decrease as transactions shift from offline to online channels.
Online channels could also change consumers' product choices even when the sets of products offered online and offline are identical. In part, this might happen because the ways consumers search for products online and offline are fundamentally different. At physical stores, finding a popular product may be easier than finding a niche product, even when both are available. Popular products typically occupy more prominent shelf space in physical stores versus niche products that are relegated to less visible positions.
In online marketplaces, search tools may be used to promote the discovery of niche titles tailored to individual customers' preferences. However, personalization and recommendation engines and other search tools could also increase the concentration of product sales. For example, top 10 seller lists may tend to reinforce the popularity of already popular products. Similarly, recommendation systems may increase the concentration of product sales because they base their recommendations on actual sales and there is limited data for products that have low historical sales (Fleder and Hosanagar 2009; Oestreicher-Singer and Sundararajan 2011).
The answer to how consumption patterns change when consumers move online is important for both the academic literature and for managerial practice. If the observed differences between online and offline markets are solely due to selection, without an effect on aggregate consumption, then there is no need for producers to change their behavior, and in our context in particular, motion picture studios should continue to focus on producing blockbuster titles. However, if using online markets changes consumer behavior, then producers may wish to reexamine their current strategies and shift their production toward more “long tail” products.

Literature
Our results contribute most directly to a small empirical literature studying the effect of information technology on sales concentration patterns. While the “Long Tail” was considered one of the best ideas of 2005 by industry observers (Businessweek 2005), it is important to note that there is no general agreement in the academic literature regarding how online commerce will affect the concentration in product sales. In this literature, Elberse and Oberholzer-Gee (2007) use aggregate data by title to study how online commerce affected the distribution of sales in the United States' home video industry from 2000 to 2005. They find that, although the number of product choices increases, by the end of their study period superstar products comprised a larger proportion of sales than before.¹ Brynjolfsson, Hu, and Simester (2011) examine the concentration of product sales for a retailer of women's clothing selling through both Internet and catalog channels. Using cross-sectional data on sales, aggregated by item and channel, they find that the concentration of product sales is lower for the Internet channel than for the catalog channel.
Our paper is also related to the research examining grocery shopping using household-level data for households that shop interchangeably at online and offline stores from the same grocery chain (e.g., Chu, Chintagunta, and Cebollada 2008). In this literature, our paper is most closely related to Pozzi (2012), who finds that brand exploration is more prevalent in physical stores than online. As consumption goods, however, groceries are substantially different from DVD rentals because groceries are typically consumed more repeatedly than DVDs.²
Our examination of superstar and long tail effects uses individual-level panel data including information on consumers' transactions from both online and brick-and-mortar channels. We use these data to analyze how individuals change their consumption patterns when they are induced by store closures to move from in-store to online consumption.
¹ Partly motivated by this result, Bar-Isaac, Caruana, and Cuñat (forthcoming) formulated a model in which a reduction in search costs generates both superstar and long tail effects.
² Our results also contribute to a growing literature on the impact of popularity and recommendation information on online sales of niche and popular products (Tucker and Zhang 2011, Fleder and Hosanagar 2009, and Oestreicher-Singer and Sundararajan 2011). In contrast to our study, these studies do not examine sales from physical stores or cross-channel choices. Our paper is also related to Waldfogel (2012), who documents a decrease in the degree of music sales concentration in a few artists.

Data and Setting
Our data come from a large video rental company that operates both brick-and-mortar stores and online DVD rental channels. For a monthly flat rate subscription, customers in our data can rent DVDs online and receive them in the mail, and then exchange these DVDs either through the mail or at a physical store.
The selection of DVD titles available for rental at these physical stores is a subset of the selection of titles available for rental online. While a typical store has a rotating selection of approximately 2,000 titles, the online channel has over 100,000 titles. The Internet channel has a much larger DVD selection than the selection available at physical stores because the online channel has lower storage and inventory costs.³ Due to these storage capacity limitations, our focal company's physical stores stock more copies of new releases than of older titles. Inventory costs are also lower online than in physical stores because the company we study ships DVDs to its customers from a small number of centralized warehouses, compared with a substantially larger set of physical stores. Thus, as these shipping locations reach a much larger number of consumers than a physical store would, the law of large numbers indicates that the company can reduce inventory costs by more accurately predicting demand from the online channel.
³ Storage costs are even lower for video streaming services; however, video streaming was in its infant stages of development during our study period, and our focal company did not offer a video streaming service during our period of analysis.
Our data cover DVD rental activity from both the online and in-store channels for all subscribing customers, and include more than 49 million rental transactions for the thirty-week period from October 2, 2009 through April 29, 2010. Although consumers without a monthly subscription can rent DVDs from our company's physical stores, our data only include the information from consumers with a monthly subscription that allows the rental of an unlimited number of DVDs.⁴ Our customers maintain an online queue of DVD titles they wish to watch, and when they return a DVD, the company sends the next DVD title from that queue to the subscriber's home.
Our data include the renting subscriber, DVD title, transaction date, and whether the DVD was delivered by mail or exchanged at a physical store. In addition, we have the zip code for each subscriber, the address for each physical location operated by this firm, and the closing date for the locations that were closed during our study period. For approximately 56% of the subscribers we also have exact addresses, which, when combined with the address for each store in our sample, allows us to calculate the distance between each of these customers and their closest store.⁵
Table 1 presents the summary statistics for our data.
The DVD rental market has experienced important changes during the last decade. Industry trends show that traditional physical stores have been displaced by online DVD rental services, and more recently by video streaming services and by physical kiosks. We do not know exactly how the number of subscribers changed during our period of analysis, because subscribers may not rent every week and we do not have a list of subscribers indicating when they signed up for service or canceled their subscriptions. But the decrease in the number of subscribers renting DVDs by the end of our study period shown in Table 1 suggests that the number of subscribers decreased during this period.
Our data come from a company that closed 15.2% of its physical stores during our period of analysis (see the last column of Table 1). Our focal company did not open new physical locations during the thirty-week study period, and thus the number of physical store locations is entirely driven by store exit. The substantial change in the number of physical rental stores will play a central role in our identification strategy since we will use store exit as an instrumental variable for online versus offline channel selection by consumers.
Rentals via mail represent 68.3% of all rentals, and DVD exchanges at physical stores represent the remaining 31.7%. Averaging our information across subscriber-week observations with positive DVD rentals (subscribers may not rent every week and our data only record the rental instances), subscribers rented an average of 2.25 DVDs per week: 0.71 DVDs from the store and 1.54 DVDs by mail.
Importantly, a monthly subscription fee allows the rental of a certain number of DVDs at a time, but subscribers do not pay a price each time they rent a DVD from either the online or offline channel. Thus, for subscribers, the DVD rental price is neither a driver of the online versus offline channel selection, nor of the specific DVD title choice.

Popular and Niche Product Definitions
A stream of prior research, while focusing on examining the potential of information technologies to transform the distribution of sales across products and channels, has classified products as either niche or popular. Products are typically classified as niche when they are less likely to be stocked in physical stores, or are only available after incurring a high search cost. In spite of this definition, because of data restrictions, prior studies classified products as niche or popular based primarily on product sales from the online channel, and not on product sales from the offline channel. Our data have the advantage that they allow us to define the popularity of DVD titles during a week using information from both online and offline rental channels.
Classifying goods as either niche or popular based on online sales might be problematic if the distributions of sales across products online and offline are different. We know that firms choose which products to stock online and offline, and furthermore know that product availability by channel may influence consumers' channel choices. As an example of the possible problems that can occur when classifying products based solely on online sales, suppose that consumers buy a product online only when this product is not available at the physical store. If this situation is common for a given product, then this product could be classified as popular using online sales, even though many consumers are buying it online precisely because it is not available in physical stores and therefore could be considered a niche product. Additionally, some online retailers, because they face low competition from physical stores, may specialize in selling only niche products that are less likely to be available at physical stores. For these retailers, classifying top-selling products as popular and the remaining products as niche may be incorrect.
Using our data, we can only provide comparisons of transactions online and offline for a single product category (DVD rentals) and within a single firm. However, within this major firm, we can compare the extent to which the selection of DVDs rented online is different than the selection of DVDs rented offline. We do this by using online and offline data aggregated at the national level, and computing the total number of rentals for each DVD title and each channel during each week. We then rank DVDs by popularity, computing two separate weekly ranks of DVDs using either online or offline rental information.
Our definition also allows for the popularity of a DVD to vary from week to week: a DVD that is popular at the beginning of the study period can become niche by the end of the study period since DVDs have short-lived popularity cycles. For example, 91 (228, 347) different DVD titles are among the top 10 (top 50, top 100) DVD titles for at least one week during our thirty-week study period. Approximately 70% of the DVD titles are included in both the online and offline rankings, and the remaining 30% are included in only one ranking.
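The ranking procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the paper: the record layout `(week, channel, title)`, the function names `weekly_top_titles` and `overlap_share`, and the toy rentals are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical transaction records: (week, channel, title).
# The layout and values are illustrative assumptions, not the paper's data.
rentals = [
    (1, "online", "A"), (1, "online", "A"), (1, "online", "B"),
    (1, "online", "C"), (1, "offline", "A"), (1, "offline", "D"),
    (1, "offline", "D"), (1, "offline", "B"),
]

def weekly_top_titles(records, top_n):
    """Return {(week, channel): set of the top_n most-rented titles}."""
    counts = defaultdict(Counter)
    for week, channel, title in records:
        counts[(week, channel)][title] += 1
    top = {}
    for key, counter in counts.items():
        # Rank by rental count (descending); break ties alphabetically
        ranked = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        top[key] = {title for title, _ in ranked[:top_n]}
    return top

def overlap_share(top, week, top_n):
    """Share of the top_n list that is common to both channels in a week."""
    online = top[(week, "online")]
    offline = top[(week, "offline")]
    return len(online & offline) / top_n

top2 = weekly_top_titles(rentals, top_n=2)
print(sorted(top2[(1, "online")]))    # ['A', 'B']
print(sorted(top2[(1, "offline")]))   # ['A', 'D']
print(overlap_share(top2, week=1, top_n=2))  # 0.5
```

Computing the two rankings separately by channel is what makes a commonality comparison of the kind reported above possible.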

Figure 1: Commonality Between Online and Offline Popular Titles
Differences between the online and offline rankings of DVD titles may be due to selection effects and may also be due to other cross-channel differences on the demand side or on the supply side. For example, these differences could be driven by preference heterogeneity between consumers who disproportionately choose the online versus the offline channel, different display and promotional activities across channels, or differences in the selection and inventory of titles available from the offline and online channels. Using our data it is difficult to disentangle the degree to which each factor may contribute to the differences between online and offline rankings. This is partly because we do not observe inventory or title assortments online and offline, and moreover title assortments may vary across our company's physical stores.⁶
In this section we have noted that the specific titles that are popular in the online channel are somewhat different from the titles that are popular in the offline channel. Using our data we can generate popularity lists based on both online and offline rentals, and we will examine whether popular titles defined in this way change as consumers move online.
⁶ Having acknowledged the limitations of our data for distinguishing between alternative explanations for the differences in the titles included in the rankings online versus offline, Table EA1 in the extended online appendix may provide some relevant information regarding the selection of titles available from each channel.

Superstars: Online versus Offline DVD Rental Distributions
In Table 2 we see that superstar DVD titles take a substantially larger share of total rentals offline than they do online. For example, the top 100 DVD titles in our sample represent 84.6% of in-store rentals, but only 35.1% of online rentals.
Although the statistics in Table 2 may be suggestive of how consumption patterns change when consumers move from offline to online markets, we must be cautious when interpreting Table 2. From these statistics alone we cannot conclude that online commerce decreases the superstar nature of the DVD rental market, since these differences in rental patterns across channels could be solely due to selection effects. For example, different rental concentrations may be due to heterogeneous tastes of the consumers who rent primarily online versus consumers who rent primarily offline. These differences could also be explained by selection effects due to consumers' impatience. If consumers are impatient to watch a newly released DVD and do not wish to wait for the DVD to arrive in the mail, then consumers who typically rent from both channels may choose to rent from the physical store more often than from the online store when they wish to watch popular versus non-popular DVDs.
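The kind of concentration statistic reported in Table 2 can be illustrated with a short sketch. The function name `top_n_share` and the rental lists below are made up for illustration, and the sketch pools all rentals rather than using weekly rankings.

```python
from collections import Counter

def top_n_share(titles, top_n):
    """Fraction of rentals accounted for by the top_n most-rented titles."""
    counts = Counter(titles)
    top_total = sum(count for _, count in counts.most_common(top_n))
    return top_total / len(titles)

# Made-up rental lists (one entry per rental), for illustration only
offline = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
online = ["A"] * 3 + ["B"] * 2 + ["C", "D", "E", "F", "G"]

print(top_n_share(offline, 2))  # 0.9
print(top_n_share(online, 2))   # 0.5
```

A gap of this kind between channels is purely descriptive: as the text notes, it does not separate selection from a causal effect of the channel.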
Our objective in this paper is to examine whether consumption patterns change as consumers move online, or whether the observed differences in online and offline consumption are primarily due to selection effects. Although in this paper we do not seek to identify why consumers change consumption patterns when they move online, beyond selection effects, the differences in the statistics in Table 2 may be an effect of the channel. We argued in the introduction that the literature has identified a variety of long tail effects arising from both demand and supply factors. Some demand and supply factors are more specific to our context. For example, the focal company's different display and promotional activities across channels may partly explain the statistics in Table 2. Popular products occupy a disproportionate amount of prominent shelf space in the company's physical stores as compared to the online channel. In addition, the queue system for video rentals may also partly explain the statistics in Table 2. Consumers who do not frequently update their online queue of DVDs may end up watching older and less popular titles when they move to the online channel.⁷
We now present our empirical approach for examining whether consumers change rental patterns as they move online, or whether the differences in online versus offline consumption are primarily due to selection effects.
⁷ Two other possible “channel related” explanations for our results are worth mentioning. First, one might wonder whether our results are influenced by stock-outs being more common online than in stores, driving customers to long tail titles because of the unavailability of (otherwise preferable) popular titles online. To partially test for this supply side mechanism, we monitored stock-outs for a matched set of newly released “popular” titles at three physical stores and through the company's online channel. Our brief examination showed that, if anything, stock-outs of popular titles are somewhat more common in physical stores than online, suggesting that our results are not due to this supply side mechanism. Similarly, one might wonder whether the increase in online subscribers over time as stores closed increased the number of stock-outs for popular titles online relative to what these customers would have experienced in physical stores, leading to a similar shift in consumption unrelated to customer preferences. While we do not observe inventory levels in our data and thus cannot strictly rule out this possibility, we confirmed that our focal company's policy was to shift physical inventory of popular titles from closing stores to the online channel (while selling off some “long tail” inventory when a store closed). This shift, combined with the increased availability associated with a single centralized warehouse/queue versus multiple queues in physical stores, should, if anything, have increased the relative availability of popular titles online as stores closed. Moreover, even if the increase in online usage led to a temporary reduction in the availability of popular titles, our focal company's access to wholesale DVD suppliers should have allowed them to respond to this shift quickly by ordering more DVDs. Finally, as we noted above, our data suggest that some customers may have cancelled their subscription when their local stores closed, which, combined with the shift of popular DVDs from physical stores to online inventory, should have further increased the availability of popular DVDs online relative to in stores.

Econometric Model
As noted above, our objective is to study whether changes in the rental channel that consumers use affect their selection of DVD rental titles, and to establish whether online markets affect the consumption of superstar versus long tail DVD titles. We showed in Table 2 that the distributions of DVD rentals online and offline are quite different. Superstar DVD titles in particular take a substantially larger share of all rentals made in physical stores than they do online. However, while these distributions are suggestive about what would be expected when consumers move from offline to online markets, we cannot use these statistics alone to conclude that the rental channel changes a household's selection of DVD rental titles. Specifically, the different rental distributions online and offline in Table 2 could be explained solely by selection effects. Cross-section regressions would suffer from a similar problem, because these regressions obtain empirical identification from comparing DVD rental selections across heterogeneous consumers.
Our empirical approach then is to control for unobserved heterogeneity using panel data, exploiting changes in DVD rental activity across time and across rental channel for each household. For each household i in each week t we define the following variables: $\text{Superstar}_{it}$ is the number of superstar DVD title rentals (weekly top 10, top 50, and top 100) divided by the total number of rentals, and $\text{Offline}_{it}$ is the number of rentals made offline divided by the total number of rentals made both online and offline.
We then use these variables to estimate the following fixed effect model:
$$\text{Superstar}_{it} = \beta\,\text{Offline}_{it} + \gamma\,\text{Rentals}_{it} + \mu_i + \tau_t + \theta_{z(i)}\,t + \varepsilon_{it} \qquad (1)$$
The variable $\text{Rentals}_{it}$ in Model (1) represents the total number of DVD rentals made by household i in week t. The coefficient $\beta$ in Model (1) measures how weekly changes in the share of DVDs rented from the physical store relate to weekly changes in the share of popular DVD rentals. We control for the weekly total DVD rentals from both online and offline channels because our objective is to examine the effect of channel choice conditional on the total amount of rental consumption (the online appendix presents results not controlling for weekly total DVD rentals). The model includes fixed effects for each household ($\mu_i$) and for each week ($\tau_t$), and includes zip code-specific trends ($\theta_{z(i)}\,t$, where z(i) denotes the zip code of household i).
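As an illustration of how the household-week variables entering Model (1) could be built from transaction records, consider the following sketch. The tuple layout, the channel labels "store" and "mail", and the function name `household_week_shares` are assumptions made for this example only.

```python
from collections import defaultdict

def household_week_shares(rentals, superstar_titles):
    """Compute household-week shares from rental transactions.

    rentals: iterable of (household, week, title, channel) tuples, where
        channel is "store" or "mail" (hypothetical labels).
    superstar_titles: {week: set of that week's top-N titles}.
    Returns {(household, week): (superstar_share, offline_share, total)}.
    """
    tally = defaultdict(lambda: [0, 0, 0])  # [superstar, offline, total]
    for household, week, title, channel in rentals:
        cell = tally[(household, week)]
        cell[0] += title in superstar_titles.get(week, set())
        cell[1] += channel == "store"
        cell[2] += 1
    return {key: (s / n, o / n, n) for key, (s, o, n) in tally.items()}

# Toy example: one household, one week, four rentals
toy = [
    ("h1", 1, "A", "store"), ("h1", 1, "B", "mail"),
    ("h1", 1, "X", "mail"), ("h1", 1, "Y", "mail"),
]
shares = household_week_shares(toy, {1: {"A", "B"}})
print(shares[("h1", 1)])  # (0.5, 0.25, 4)
```

Only household-weeks with at least one rental appear in the result, which matches the fact that the data record rental instances only.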
By using a longitudinal model we can “difference out” the time invariant unobserved characteristics of each household (for example, household fixed effects capture income levels or household sizes that are unlikely to change substantially during a seven month period). The week fixed effects capture aggregate changes over time, such as changes in DVD rental consumption that can be caused by school breaks or seasons. To account for pre-existing trends at the level of the zip code, Model (1) also includes zip code-idiosyncratic trends. For example, these idiosyncratic trends may account for market-level changes, such as changes in Internet or cable television usage that might have affected rental consumption patterns during our study period. Identification in Model (1) arises from deviations from zip code-level trends in changes in the DVD rental selection and the rental channel within households from week to week.
While our panel data approach allows us to control for the time invariant tastes of each household, and therefore accounts for the sorting of heterogeneous consumers into channels, Ordinary Least Squares estimates of Model (1) may still provide a misleading measurement of how the rental channel affects the selection of DVD rentals when a household's desire for popular versus non-popular DVDs changes over time. For example, consumers may choose to rent a popular DVD title from the physical store in weeks when they feel impatient about watching a popular newly released title and do not wish to wait for the DVD to arrive in the mail. The rental channel is a choice; and individuals' changes in their desires to watch popular versus non-popular DVDs may influence their channel selection, creating an endogeneity problem. In order to identify how changes in the rental channel affect the overall selection of DVD rental titles we need to observe changes in individuals' shares of offline rentals that are not caused by weekly changes in the desire to watch popular versus non-popular DVDs.
To break this endogeneity problem, we use the exit of physical stores as an instrumental variable. The rationale for using the exit of physical stores as an instrument is that the exit of a store, by changing the transportation cost of traveling to the store for the individuals that previously rented DVDs from the closing store, increases the relative cost of renting DVDs from the physical channel. In turn, the increase in the relative cost of renting DVDs from the physical channel may induce consumers to shift their rentals from the offline to the online channel. Our instrument is valid as long as it affects channel selection and can be excluded from Model (1). Specifically, the exit of a physical store is a valid instrument even when store closures are not random and are possibly related to a decrease in the local aggregate demand for DVD rentals, as long as store closures are unrelated to relative rental demands for popular versus non-popular DVD titles.
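The logic of the instrument can be illustrated with a toy within-household example. Everything below is hypothetical: the numbers, the variable names, and the assumed true effect of 0.5 are chosen only so that the arithmetic is exact, and the single-instrument ratio estimator shown is a simplification of the full specification with controls.

```python
def demean(values):
    """Subtract the household mean (the within, or fixed-effects, transform)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# One hypothetical household observed for four weeks; all numbers are made up.
distance = [1.0, 1.0, 3.0, 3.0]      # miles to closest store; it closes after week 2
impatience = [-0.1, 0.1, -0.1, 0.1]  # unobserved weekly taste shock
true_effect = 0.5                    # assumed causal effect of the offline share

z = demean(distance)
# The closure lowers the offline share; impatience raises it
offline_share = [0.4 - 0.1 * zi + ui for zi, ui in zip(z, impatience)]
x = demean(offline_share)
# Impatience also raises the superstar share directly, which confounds OLS
superstar_share = [0.5 + true_effect * xi + ui for xi, ui in zip(x, impatience)]
y = demean(superstar_share)

ols = dot(x, y) / dot(x, x)  # within-household OLS slope
iv = dot(z, y) / dot(z, x)   # just-identified IV estimate using distance
print(round(ols, 6), round(iv, 6))  # 1.0 0.5
```

In this constructed example the taste shock is uncorrelated with the closure, so the IV ratio recovers the assumed effect while OLS overstates it. This mirrors the exclusion argument in the text; it is not evidence for it.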
We will use two alternative models to test whether households change the share of transactions made from physical stores when the stores in their geographical market exit.
First, following Brynjolfsson, Hu, and Rahman (2009), we assume that the transportation cost of traveling to the physical store increases when the number of physical stores in the zip code decreases. Brynjolfsson, Hu, and Rahman (2009), however, treat zip codes as isolated markets. By computing distances in miles among the zip codes' centroids using data from the United States Census we can extend Brynjolfsson, Hu, and Rahman (2009) to account for changes in the number of physical stores located in adjacent zip codes.
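Specifically, we estimate the following first stage model for $\text{Offline}_{it}$, the share of rentals made offline by household i in week t:
$$\text{Offline}_{it} = \sum_{j=1}^{6} \pi_j\,\text{Stores}_{ijt} + \lambda\,\text{Rentals}_{it} + \mu_i + \tau_t + \theta_{z(i)}\,t + u_{it}$$
where $\text{Stores}_{ijt}$ is the number of physical stores in zip code group j in week t, and the remaining terms are the total number of rentals, the household and week fixed effects, and the zip code-specific trends of Model (1). Here j equal to 1 represents the zip code where household i resides, and j equal to 2 (3, 4, 5, and 6) represents zip codes with centroids located between zero and five (five and ten, ten and fifteen, fifteen and twenty, and twenty and thirty) miles away from the centroid of the zip code where household i resides. We also note that the number of stores in a zip code changes over time through store exit.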
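Second, we obtained the latitude and longitude for all physical stores and for the subset of consumers for whom we have an address. For these consumers we can compute the distance to the closest store in each week, and estimate the following model:
$$\text{Offline}_{it} = \pi\,\text{Distance}_{it} + \lambda\,\text{Rentals}_{it} + \mu_i + \tau_t + \theta_{z(i)}\,t + u_{it}$$
where $\text{Distance}_{it}$ represents the geodesic distance between the location of household i and the closest physical store in week t, $\text{Offline}_{it}$ is the household's share of offline rentals, and the remaining terms are the total number of rentals, the household and week fixed effects, and the zip code-specific trends of Model (1). Note that the distances to the closest store change over time for households living near closing stores.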
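A distance-to-closest-store variable of this kind can be sketched as follows. This is an approximation under stated assumptions: it uses the haversine (great-circle) formula on a spherical Earth rather than a true ellipsoidal geodesic, and the store tuple layout `(lat, lon, closing_week)` and the function names are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; spherical approximation

def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine distance in miles between two (latitude, longitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def closest_store_distance(household, stores, week):
    """Distance to the nearest store still open in `week`.

    household: (lat, lon).
    stores: list of (lat, lon, closing_week), with closing_week None
        for stores that never close (hypothetical layout).
    Returns None if no store is open.
    """
    open_stores = [(lat, lon) for lat, lon, closing in stores
                   if closing is None or week < closing]
    if not open_stores:
        return None
    return min(great_circle_miles(household[0], household[1], lat, lon)
               for lat, lon in open_stores)

# Toy example: the nearby store closes in week 10
home = (40.0, -75.0)
stores = [(40.1, -75.0, 10), (41.0, -75.0, None)]
before = closest_store_distance(home, stores, week=5)
after = closest_store_distance(home, stores, week=10)
print(round(before, 1), round(after, 1))
```

The jump in distance when the nearby store closes is the within-household variation that a distance-based first stage relies on.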
The focal company may naturally close its least successful stores, and the selection of which stores to close may be related to local demographic characteristics or to changes in the local market environment. However, we believe that the exit of stores is unlikely to be affected by individuals' high frequency changes in their relative desires to rent popular versus non-popular DVDs, in which case our instrument is orthogonal to the error.
Moreover, the zip code-specific trends in our regression control for pre-existing trends at the level of the zip code that might have induced stores closures. These trends may include trends induced by demographics, Internet or cable connectedness, or the local market environment. In sum, our instrument is valid if the high-frequency timing of store closure is unrelated to the relative desire to rent popular versus non-popular titles. We also show in online appendix EA2 that store closures are unrelated to the number of highspeed Internet providers in each zip code. 8 It is important to note that our company faces competition from other DVD rental companies, and during our study period other companies rented DVDs exclusively -by mail,‖ from kiosks, and from physical stores. 9 The entry of DVD rental kiosks (and even the mere existence of rental kiosks prior to our study period) and competition from other physical DVD rental stores might be thought to represent a challenge to our identification strategy because households living near closing stores may rent popular DVD titles from other companies while they continue to rent niche DVD titles from our focal company. If this happens we would observe a change in the relative demand for popular versus niche titles associated with the exit of stores, when the unobserved consumption bundle of niche and superstar titles from both the focal company and the competitors might remain unchanged. This would invalidate the use of our instrument.
However, as noted above, our data only include information from consumers with a “Rent by Mail” subscription, and these subscribers do not pay a separate price each time they rent a DVD from either the online or offline channel. Although in theory the consumers in our data may simultaneously have subscriptions with our focal company and may be willing to pay extra to have a separate subscription with other competing companies, or to rent DVDs from competitive outlets, we are doubtful that this is a common practice. In this regard, we note that the market leader in rentals via kiosks, Redbox, considers that “people who use the kiosks tend to be casual viewers who don't want to be tied down to subscriptions or membership fees” (Green 2009). More importantly, in the Appendix, we use historical data on Redbox locations to show that our conclusions in the main text are robust to restricting the analysis to locations without a local Redbox kiosk.
Of course, it is also true that households living near closing stores may decide to cancel their subscriptions from the focal company and begin renting from other companies (e.g., from Netflix), in which case their rentals will not be recorded in our data. For this reason, in the results below, we analyze the sensitivity of our results to attrition using a balanced panel of consumers.

… per zip code is 0.52. We will use these statistics below in interpreting our regression results.

Table 3: Additional Summary Statistics
† In the data, zip codes have between zero and three stores.
The sizes of the coefficient estimates on the fraction of offline rentals indicate that a household that decreases the fraction of DVDs rented from the physical store from twenty-eight percent to zero, as might be the case when physical stores are eliminated as a choice for consumers (note that the mean of the share of offline rentals in Table 4 is 0.28), would decrease the fraction of top 10 (top 50, top 100) DVD rentals by 10.2 (13.9, 13.1) percentage points. These effects are substantial.

Table notes: Standard errors in parentheses are clustered by household. The mean of the dependent variable is 0.20 in Column I, 0.39 in Column II, and 0.47 in Column III. * significant at 10%; ** significant at 5%; *** significant at 1%

The sign of the coefficient estimates on total DVD rentals per week is negative and economically small. The negative sign may be unsurprising because individuals may tend to rent top DVDs first, and rent DVDs that are further down the popularity distribution during weeks when they increase the number of DVD rentals. Renting an additional DVD in a week reduces the fraction of top 10, top 50, or top 100 DVD rentals by between 0.6 and 1.2 percentage points.

Instrumental Variable Results
The regressions in Table 5 present our first stage results examining how channel choice is affected by the exit of stores. The results in Column I of Table 5 show that when one store exits from a zip code, consumers living in that zip code decrease their share of offline rentals by an average of 2.2 percentage points (or approximately 7.8% of the transactions made at physical stores). This result is expected because, by increasing the transportation cost, the exit of a store from a zip code increases the relative cost of renting from the physical store for households that reside in that zip code.
Our first stage results are consistent with the prior literature showing that the likelihood of purchasing products online decreases as the number of stores in the zip code increases (Brynjolfsson, Hu, and Rahman 2009). However, Brynjolfsson, Hu, and Rahman (2009) treat zip codes as isolated geographic markets, while we can also provide information regarding the size of the geographic market.
Column I of Table 5 shows how the impact of the exit of a store on channel choice dissipates for households living further away from the closing store. The results in Column I of Table 5 indicate that the closure of a store within the zip code where a household resides has an impact on the household's channel choice that is seven times larger than the impact of the closure of a store in other zip codes with centroids located less than five miles away from the centroid of the zip code where the household resides.
The results in Column I of Table 5 also show that the exit of stores in zip codes located farther away has no impact on households' channel choices.
We also use the geodesic distance from consumers' locations to the closest physical stores as an alternative instrument. We acknowledge that some consumers may use stores that are not the closest to their home address (e.g., stores near their workplace or on the way when running errands), but we still believe that using the closest store to the home address is useful as an approximation for the transportation costs of using the offline channel. Comparing unconditional means, households living less than one mile away from a physical store make 29.4% of their rentals offline and households living more than 20 miles away from physical stores make 10.3% of their rentals offline.
Column II of … when the new closest store for these households is ten miles away. Moreover, the sizes of the coefficients indicate that households that reside near a closing physical store will decrease the transactions made from physical stores to approximately zero when the new closest store for these households is thirty miles away.

Table notes: The mean of the dependent variable is 0.28 in Column I and 0.25 in Column II. * significant at 10%; ** significant at 5%; *** significant at 1%
The results for the second stage of Model (1) in Table 6 still show that individuals increase the fraction of popular DVD rentals when they rent more DVDs from the physical store. The first three columns use Column I in Table 5 for the first stage regression and the last three columns use Column II in Table 5 for the first stage regression. In the first three regressions in Table 6 … Table 6 the sizes of the coefficient estimates on the fraction of offline rentals indicate that when a household decreases the fraction of DVDs rented from the store from twenty-five percent to zero (note that the mean of the share of offline rentals for the last three columns of Table 6 is 0.25), the fraction of top 10 (top 50, top 100) DVD rentals decreases by 10.4 (9.9, 8.9) percentage points. 11

Comparing the Instrumental Variables results in Table 6 with the OLS results in Table 4, we observe that the sizes of the coefficient estimates on the fraction of offline rentals are similar for top 10 titles and smaller, but still significant both economically and statistically, for top 50 and top 100 titles.

11 We note that excluding the square of the distance from the first stage (or alternatively including higher order polynomials) causes no substantial change in the second stage results.
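To illustrate why an instrument helps in this setting, the sketch below simulates a stylized version of the problem: an unobserved taste for blockbusters raises both the offline share and the share of popular rentals, so OLS overstates the channel effect, while the instrumental variable estimator recovers it. Everything here is an assumption made for illustration (the data-generating process, the parameter values, a binary store-presence instrument, and the absence of fixed effects and trends); it is not the specification estimated in Tables 5 and 6.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """Slope from a simple regression of y on x."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Just-identified IV estimate: equals 2SLS with one instrument z."""
    return cov(z, y) / cov(z, x)


def simulate(n=20000, beta=0.4, seed=0):
    """Simulated households; `beta` is the true effect of the offline share."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        store_open = rng.randint(0, 1)   # instrument: local store present
        taste = rng.gauss(0, 1)          # unobserved taste for blockbusters
        offline = 0.1 + 0.2 * store_open + 0.05 * taste + rng.gauss(0, 0.05)
        popular = 0.2 + beta * offline + 0.05 * taste + rng.gauss(0, 0.05)
        z.append(store_open)
        x.append(offline)
        y.append(popular)
    return z, x, y


z, x, y = simulate()
print("OLS slope:", ols_slope(x, y))    # biased upward by the taste confounder
print("IV slope: ", iv_slope(z, x, y))  # close to the true beta
```

The linear shares in this simulation are not constrained to lie between zero and one; the example is only meant to show the direction of the selection bias and how an exogenous shifter of channel choice removes it.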
As in Table 4, the coefficient estimates on total DVD rentals per week in Table 6 are negative and economically small.

However, the value of a subscription may be greater as the distance to an offline store decreases, since having a physical store nearby provides the additional value of exchanging DVDs at the store. Since attrition in our data is likely correlated with the exit of physical stores, attrition might bias our instrumental variable results. Tables 8 and 9 present first and second stage regressions analogous to those in Tables 5 and 6, but using the balanced sub-sample of our data.

Sensitivity of Results to Attrition
The results for both the first stage and second stage regressions in Tables 8 and 9 using the balanced sub-sample are similar to those in Tables 5 and 6 using the entire sample. This similarity suggests that our previous results using the entire sample are not driven by changes in the profile of customers over time.

Table notes: The mean of the dependent variable is 0.30 in Column I and 0.28 in Column II. * significant at 10%; ** significant at 5%; *** significant at 1%
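A balanced sub-sample of the kind used in this check can be built with a simple filter. The sketch below assumes that "balanced" means a household appears in every week of the sample; the exact rule used for Tables 8 and 9 is not stated in this section, and the row layout (`household` and `week` keys) is hypothetical.

```python
def balanced_panel(rows):
    """Keep only households observed in every week present in `rows`.

    `rows` is a list of dicts with 'household' and 'week' keys
    (a hypothetical layout chosen for this example).
    """
    all_weeks = {r["week"] for r in rows}
    weeks_by_household = {}
    for r in rows:
        weeks_by_household.setdefault(r["household"], set()).add(r["week"])
    keep = {h for h, weeks in weeks_by_household.items() if weeks == all_weeks}
    return [r for r in rows if r["household"] in keep]
```

Households that cancel their subscription partway through the study period drop out of the filtered data entirely, so any remaining effect cannot be attributed to a changing mix of subscribers.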

Regressions in the online appendix show various other sensitivity tests for our results, including results that exclude total DVD rentals as a covariate in Tables 4, 5, and 6 (Appendix EA3, Tables EA3, EA4, and EA5), results that only use subscribers where we have the consumer's physical address in Table 4, Column I of Table 5 and Columns I through III of Table 6 (Appendix EA3, Tables EA6, EA7, and EA8), and results examining rentals of “top 2,000” titles (those that are likely to be stocked in both physical and online channels) (Appendix EA4). Our main results are robust to each of these considerations.

Discussion
As the proportion of commerce conducted online increases, will producers and retailers need to re-evaluate their investment and inventory choices? Answering this question is complicated by selection effects surrounding the types of consumers who purchase online and the types of products that consumers choose to purchase online. While early research has observed a large proportion of sales online in niche products, it is unclear whether this observation is merely a reflection of the characteristics of the consumers who select the channel, of the types of products that consumers select to purchase online versus offline, or whether it might reflect a change in consumption patterns caused by the characteristics of the Internet channel.
Breaking this endogeneity requires an exogenous shift in the cost of purchasing online, and the ability to observe customer-level purchase decisions by channel before and after the shift. Our data provide us with just such an opportunity. Our data document customer-level rental decisions before and after a customer's local video rental store closes, and our empirical analysis suggests that when consumers move online they are much less likely to rent blockbuster titles than they were previously.
While our objective in this paper has been to examine how channel selection affects consumption patterns, our results showing how the impact of store exit on channel choice varies with where consumers live relative to the closing store also complement and extend the prior literature on transportation costs and channel selection (e.g., Brynjolfsson, Hu, and Rahman 2009; Forman, Ghose, and Goldfarb 2009). Specifically, our data are substantially more granular than those used in previous work studying this question. Although our results extend the prior literature on channel selection and transportation costs, our examination of how changes in home-store distances affect channel selection is presented in the context of a first stage regression and is not the main research focus. Conducting a more detailed investigation of how transportation costs influence the online versus offline channel selection using our data is a potential avenue for further research.
Our main result, indicating that when consumers move to online channels they decrease their likelihood of renting popular titles, is of course not without limitations. Importantly, while we examine how consumption changes when consumers move online, we do not examine why this change in consumption patterns occurs. Moreover, our results only provide evidence concerning a specific market, and are not necessarily generalizable to other environments. Online commerce could have heterogeneous impacts across industries, and transform different markets into either a “long tail” or “superstar” market based on the specific nature of each industry. For example, a mechanism that may partly underlie our results is that our focal company does not display or promote popular products as heavily in the online channel as it does in the brick-and-mortar channel. While this characteristic is typical across various online versus brick-and-mortar channels, it might be more pronounced in the specific market we study than in other markets.
The queue system for online consumption is also specific to our setting. If consumers were more likely to experience stock-outs in the online channel relative to the physical channel, or if the relative prevalence of stock-outs increased over time as stores closed, our results could be explained by supply-side effects related to stock availability rather than demand-side effects related to customer preferences. However, our limited checks of stock levels in both channels described above suggest that, if anything, popular titles have a higher availability online than in physical channels. Similarly, as we noted above, our focal company's practice of moving inventory for popular titles from physical stores to the online channel as stores closed, combined with the increased efficiency of a single online warehouse queue relative to multiple queues in physical stores and with the observed reduction in total subscribers as stores closed, should, if anything, increase the relative availability of popular titles online as stores close. 12 Nonetheless, our inability to observe channel-specific stock levels in our data represents a limitation of our analysis. Identifying how our observed effects are determined by supply versus demand side factors is an important avenue for future research.
Similarly, even focusing on a single market, our results could vary over time. For example, early adopters of “Rent by Mail” subscriptions might be more interested in niche DVD titles than late adopters are. In this regard, we believe that our analysis of a mature market provides a more useful examination of the market-level impact of online commerce on product concentration than an analysis of a nascent market would. Although our period of analysis is too short to examine how the results change over time, this examination is also a potential avenue for future research.
To summarize, our results show that there is a change in consumption patterns caused by the characteristics of the Internet channel. As a consequence, our finding that online channels may shift DVD consumption away from blockbuster titles and toward more niche titles may have implications for movie studios and movie producers. Specifically, movie studios have typically faced a market where a small number of hits made up the vast majority of industry profits. Our results suggest that this historical pattern of highly concentrated transactions in a handful of titles might have been driven by the characteristics of the offline channel, and that studios may wish to shift their resources relatively toward more “long tail” titles as consumers move online. More research is needed to predict the degree to which Internet markets change the incentives of movie producers and allow for the production of more niche titles, and we believe this paper is a first step in that direction.

12 We also note that our results in the extended appendix show that our long tail effects are robust to excluding data for the top 5 or top 10 most popular titles (Appendix EA5).

Appendix: Sensitivity of Main Results to Redbox Kiosk Locations
In this appendix we use historical data on the location of Redbox kiosks as of April 28, 2010 (the end of our study period is April 29, 2010). We obtained these data from AggData LLC, a company that provides data to businesses and organizations. 13 … Resides” is 0.52 in Table 5 in the main text compared to 0.20 in Table A1).

Table notes: Standard errors in parentheses are clustered by household. The mean of the dependent variable is 0.26. * significant at 10%; ** significant at 5%; *** significant at 1%

Table notes: Standard errors in parentheses are clustered by household. The mean of the dependent variable is 0.20 in Column I, 0.39 in Column II, and 0.47 in Column III. * significant at 10%; ** significant at 5%; *** significant at 1%

Appendix EA1: Popularity Rankings Based on Online and Offline Information
In Section 3.1 we explained the convenience of computing rankings using information from both the online and offline channels in examining consumption patterns when consumers move from offline to online. Using our data on DVD rentals, we noted that the specific titles that are popular in the online channel are somewhat different than the titles that are popular in the offline channel. We also explained that the differences in the titles making up the rankings online versus offline may occur due to selection effects and other cross-channel differences on the demand-side or on the supply-side. However, partly because we do not observe inventory or title assortments online and offline, we have limited ability to measure the degree to which each factor contributes to these differences.
Although the focus of this paper is not on identifying the factors explaining the differences in online versus offline title popularity (nor is our focus on examining the specific demand or supply factors underlying the changes in the concentration of rental transactions when consumers move online), Table EA1 may provide some useful information regarding the selection of titles available from each channel.
Column I (II, III) in Table EA1 shows the offline rankings of titles that are ranked in the top 10 (50, 100) in the online channel. Consistent with Figure 1 in the main text, Column I indicates that a weekly average of 7.3 DVD titles that are top 10 in the online channel are also top 10 in the offline channel. Column I of Table EA1 shows that all titles that are top 10 offline are ranked among the top 5,000 online.
Column I (Column IV) of Table EA1 also shows that approximately two thirds (one third) of the 2.7 titles that are top 10 online (offline) but not offline (online) are ranked between the top 10 and top 20 offline (online). 14 While the different rankings online and offline may arise from selection effects or other demand differences online versus offline, they may also be explained by stock-outs for specific titles in the online or offline channels. For example, a top 10 title in the online channel may be ranked between top 10 and top 20 in the offline channel when there are fewer copies of this title in the offline channel leading to stock-outs at physical stores. In addition, a title may be available from some but not all brick-and-mortar stores; for example, a title may become available from various brick-and-mortar stores on different dates. Similarly, a top 10 title in the offline channel may be ranked between top 10 and top 20 online when there are stock-outs online.
Columns II and III of Table EA1 show that in any given week there are top 50 or top 100 titles online that are not ranked among the top 5,000 in the offline channel. Conversely, Columns V and VI of Table EA1 show that all top 50 and top 100 titles offline are ranked among the top 5,000 in the online channel.
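The cross-channel comparison summarized in Table EA1 amounts to ranking titles separately within each channel and counting how many titles the two top-N lists share. The sketch below shows that computation for a single week; the function names and the use of a flat list of rented title identifiers per channel are assumptions made for the example, and ties are broken alphabetically only to keep the output deterministic.

```python
from collections import Counter


def top_n(rentals, n):
    """Return the set of the n most-rented titles in a list of title ids."""
    counts = Counter(rentals)
    # Sort by rental count (descending), then by title id to break ties.
    ranked = sorted(counts, key=lambda title: (-counts[title], title))
    return set(ranked[:n])


def top_n_overlap(online_rentals, offline_rentals, n):
    """Number of titles that are in the top n of both channels."""
    return len(top_n(online_rentals, n) & top_n(offline_rentals, n))
```

Averaging `top_n_overlap(..., n=10)` across weeks would give a statistic comparable in spirit to the weekly average of 7.3 shared top 10 titles reported above.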

Appendix EA2: Online Connectedness and Store Closures
We collected data from various sources in order to investigate whether store closures are more likely to occur in places with higher online connectedness. We obtained data on the number of high-speed Internet service providers at the zip code level from the Federal … 15 … income by zip code using data from tax returns from the Internal Revenue Service. 16 We also collected data at the zip code level on average family size, median age, race, population, and area size from the 2010 U.S. Census. 17 Using our data on physical stores' locations and closing dates we constructed a dummy variable equal to one for stores that closed during our study period and zero for stores that remained open at the end of our study period. In Table EA2 we use a logit regression model to investigate the factors affecting the probability of store closures. These results show that the effect of the number of high-speed Internet providers on the likelihood of a store closure is statistically insignificant. This result may give more confidence regarding the exclusion restriction in our instrumental variable regressions. 18 Our finding in this appendix, that the number of high-speed Internet providers has no statistically significant effect on the likelihood of store closures, further supports the use of our instrumental variable.

15 We use data from June 2008 because the FCC discontinued the reporting of these data at the zip code level after this date. http://transition.fcc.gov/Bureaus/Common_Carrier/Reports/FCC-State_Link/IAD/hzip0608.pdf
16 http://www.irs.gov/uac/SOI-Tax-Stats---Individual-Income-Tax-Statistics---Free-ZIP-Code-data-(SOI)
17 http://factfinder2.census.gov/faces/nav/jsf/pages/index.xhtml
18 Even if increases in Internet penetration had caused the exit of physical stores, omitting Internet penetration from our regressions in the main text might be a concern only if changes in Internet penetration change the relative desires to rent popular versus non-popular DVDs. Our regressions in the main text include zip code specific time trends that may account for zip code specific trends in Internet penetration.
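The closure regression can be illustrated with a deliberately small logit fit. The sketch below is not the Table EA2 specification: it uses a single covariate (a hypothetical count of high-speed providers), simulated data, and a hand-written Newton-Raphson routine instead of a statistics package. When closures are generated independently of the provider count, the fitted slope should be close to zero, which is the pattern the appendix reports.

```python
import math
import random


def fit_logit(x, y, iters=25):
    """Fit P(y=1) = 1 / (1 + exp(-(a + b*x))) by Newton-Raphson; returns (a, b)."""
    a = b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient wrt intercept
            g1 += (yi - p) * xi     # gradient wrt slope
            h00 += w                # entries of X'WX (negative Hessian)
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (X'WX) * delta = gradient.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def simulate_closures(n, intercept, slope, seed):
    """Simulated stores: x is a provider count (0-5), y = 1 if the store closed."""
    rng = random.Random(seed)
    x = [rng.randint(0, 5) for _ in range(n)]
    y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(intercept + slope * xi))) else 0
         for xi in x]
    return x, y


# Closures unrelated to the number of providers (true slope = 0).
x, y = simulate_closures(n=2000, intercept=-0.85, slope=0.0, seed=1)
print(fit_logit(x, y))
```

A full analysis would add the demographic covariates listed above and report standard errors; the point of the sketch is only the shape of the test.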

Appendix EA3: Robustness
Tables EA3, EA4, and EA5 present regression results analogous to those in Tables 4, 5, and 6 in the main text, but without including Total DVD Rentals as a covariate in the regressions. The results from Tables EA3, EA4, and EA5 are similar to the results in Tables 4, 5, and 6 in the main text.

Table notes: The mean of the dependent variable is 0.15. * significant at 10%; ** significant at 5%; *** significant at 1%

Table notes: Standard errors in parentheses are clustered by household. The mean of the dependent variable is 0.20 in Column I, 0.39 in Column II, and 0.47 in Column III. * significant at 10%; ** significant at 5%; *** significant at 1%

Table notes: The mean of the dependent variable is 0.28 in Column I and 0.25 in Column II. * significant at 10%; ** significant at 5%; *** significant at 1%

… Table 4 in the main text, but limiting the sample to the observations used in the regressions where we have consumers' actual addresses (the observations used in Columns IV through VI of Table 6 in the main text). The results from Table 4 in the main text using the entire sample are similar to the results in Table EA6 using the restricted sample. 19

Table EA7 presents first stage regression results analogous to those in Column I of Table 5 in the main text, but limiting the sample to the observations used in the regressions where we have consumers' actual addresses (the observations used in Column II of Table 5 in the main text). The results in Column I of Table 5 in the main text using the entire sample are similar to the results in Table EA7 using the restricted sample. Table EA8 presents second stage regression results analogous to those in Columns I through III of Table 6 in the main text, but limiting the sample to the observations used in the regressions where we have consumers' actual addresses. The results in Columns I through III of Table 6 in the main text using the entire sample are similar to the results in …

19 Note that the number of observations in Table EA6 is similar but not identical to the number of observations in Columns IV to VI of Table 6 in the main text; the explanation is that centroids' latitudes and longitudes from the Census are missing for a few zip codes.

The focus of this paper is not on examining whether changes in rental concentrations are caused by demand versus supply side long-tail effects. However, in this section we attempt to provide some clues regarding the extent to which differences in title selection across channels cause a change in rental concentrations when consumers move online.
In Tables EA9, EA10, and EA11 we present regressions analogous to those in Tables 4, 5, and 6 in the main text, but controlling for the share of rentals taken by the titles below (and also above) the top 2,000 titles (the top 2,000 titles are more likely to be available from both channels). These regressions seek to account for the possibility that consumers may change the proportion of rentals of top 2,000 versus below top 2,000 titles as they move online. Specifically, Tables EA9 and EA11 report estimates of the following model: … the share of rentals of DVD titles below (and also above) the top 2,000 titles.

Table notes: Includes fixed effects for both weeks and individuals, and ZIP code-specific trends. Standard errors in parentheses are clustered by household. * significant at 10%; ** significant at 5%; *** significant at 1%
The results in Tables EA9 and EA11 are similar to those in Tables 4 and 6 in the main text, and show that the long tail effects persist when holding fixed the share of rentals of DVD titles below the top 2,000 titles, that is, when consumers do not change the proportion of rentals of top 2,000 versus below top 2,000 titles as they move online.
While the results from this section may suggest that the long tail effects that we find are not primarily caused by differences in the selection of titles available online versus offline, in this paper we do not focus on examining whether our results are caused by demand versus supply side effects. In the main text we listed several factors preventing us from separating demand and supply side effects (e.g., display of popular products at the brick-and-mortar channel versus the online channel; inventory and title selection online versus offline; online and offline stock-outs; the queue system for online consumption).

Table notes: Standard errors in parentheses are clustered by household. The mean of the dependent variable is 0.20 in Column I, 0.39 in Column II, and 0.47 in Column III. * significant at 10%; ** significant at 5%; *** significant at 1%

Table notes: The mean of the dependent variable is 0.28 in Column I and 0.25 in Column II. * significant at 10%; ** significant at 5%; *** significant at 1%

… Variables regression results, and in these tables Columns I through IV report estimates for DVD titles ranked between 6 and 10, 11 and 20, 21 and 50, and 51 and 100 respectively. When interpreting the size of the coefficients on the share of offline rentals it should be noted that the means of the dependent variables are different across columns and tables. For example, Column I of Table EA12 indicates that when consumers decrease the share of rentals made at physical stores from 0.28 to zero (when consumers move entirely to the online channel) they decrease the share of rentals of titles ranked between 6 and 10 by 4.0 percentage points (0.14 times 0.28). Noting that the mean of the dependent variable in Column I of Table EA9 is 0.069 then indicates that when consumers move entirely to the online channel there is a 58% decrease in the rental of titles ranked between 6 and 10 (0.040/0.069). Columns II through IV of Table EA12 indicate that when consumers decrease the share of rentals made at physical stores from 0.28 to zero they decrease the share of rentals of titles ranked between 11 and 20, 21 and 50, and 51 and 100 by 59%, 38%, and 23% respectively.
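The conversion from a coefficient to a percentage change used in the example above can be written out explicitly. The helper below is a hypothetical illustration that uses only the rounded figures quoted in the text (0.14, 0.28, and 0.069); because those inputs are rounded, it gives roughly 3.9 percentage points and 57%, slightly below the 4.0 points and 58% reported, which presumably rest on unrounded estimates.

```python
def effect_of_moving_online(coef, mean_offline_share, mean_dep_var):
    """Implied change when the offline share falls from its mean to zero.

    Returns (change in the rental share, change relative to the mean of the
    dependent variable). All inputs are proportions, not percentages.
    """
    change = coef * mean_offline_share
    return change, change / mean_dep_var


change, relative = effect_of_moving_online(0.14, 0.28, 0.069)
print(f"{change * 100:.1f} percentage points, {relative * 100:.0f}% of the mean")
```

The same function applies to the other columns once the corresponding coefficient and dependent-variable mean are substituted.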
The OLS regression results in Table EA12 show that, for each range of titles ranked among the top 100 that we considered, the share of fringe popular title rentals decreases and the share of niche title rentals increases when consumers move online. Conversely, in Tables EA15 and EA16, which present the Instrumental Variable second stage regression results, only the first two columns show that fringe popular titles lose share to niche titles when consumers move online. Columns I and II in Table EA15 (Columns I and II in Table EA16) indicate that when consumers decrease the share of rentals made at physical stores from 0.28 (0.25) 20 to zero they decrease the share of rentals of titles ranked between 6 and 10, and 11 and 20 by 59% and 24% respectively (71% and 37% respectively). Columns III and IV in Table EA15 … represent 21.3% of all rentals from the store, and the titles ranked between top 51 and top 100 take 7.0% of the rentals from the store; see Table 2 in the main text).
Because our focal company does not display or promote the very popular products as heavily in the online channel as it does in the brick-and-mortar channel, the results in this section may provide some indication that promotional effects and the ways that people search at physical stores versus online are an important mechanism underlying our results. In addition, stock-outs of popular titles are frequent both online and offline for our focal company, and long tail effects might result if stock-outs online were more frequent than at brick-and-mortar stores. However, to the extent that stock-outs from the online channel are concentrated around the top 5 or top 10 most popular titles in every week, the results in this section may suggest that the long tail effects that we find are not primarily due to stock-outs. 21

21 In addition, our analysis in footnote 7 in the main paper suggests that stock-outs are not more frequent online than offline.

Table notes: Standard errors in parentheses are clustered by household. The mean of the dependent variable is 0.069 in Column I, 0.080 in Column II, 0.109 in Column III, and 0.082 in Column IV. * significant at 10%; ** significant at 5%; *** significant at 1%

Table notes: Standard errors in parentheses are clustered by household. The mean of the dependent variable is 0.065 in Column I, 0.077 in Column II, 0.110 in Column III, and 0.084 in Column IV. * significant at 10%; ** significant at 5%; *** significant at 1%