The Agglomeration-Differentiation Tradeoff in ... - Yale University
The Agglomeration-Differentiation Tradeoff in Spatial Location Choice

Sumon Datta
Krannert School of Management
Purdue University
403 W. State Street
West Lafayette, IN 47907
Email: sdatta@purdue.edu
Phone: (765) 496-7747
Fax: (765) 494-9658

K. Sudhir
Yale School of Management
135 Prospect St, PO Box 208200
New Haven, CT 06520
Email: k.sudhir@yale.edu
Phone: (203) 432-3289
Fax: (203) 432-3003

June 2011

Abstract

Retailers often co-locate spatially to draw consumers, even though it increases price competition. The paper develops a structural model of entry and location choice that isolates the agglomeration benefit of co-location, after controlling for pure differentiation rationales for co-location such as (1) high demand and/or low cost at the location; (2) zoning restrictions and (3) format differentiation that minimizes the need for spatial differentiation. We augment entry and location choice data used in the literature with revenue and price data to help identify the agglomeration effect. We introduce a new approach to obtain zoning data across a large number of markets that should be of general interest for a large stream of spatial location applications. We find that agglomeration benefits explain a significant fraction of observed co-location. While zoning restrictions have little direct impact on co-location, in combination with the agglomeration benefit, they explain a surprisingly large fraction of observed co-location.

Keywords: Entry, Location Choice, Agglomeration, Differentiation, Zoning, Retail Competition, Store Format, Discrete Games, Multiple Equilibria, Structural Modeling

1. Introduction

Spatial clustering is a common phenomenon in many types of retail markets such as restaurants, automobile dealerships, electronics shops and bridal boutiques. That the phenomenon is well recognized in the popular imagination is seen in the popular labels for such retail clusters: hamburger alleys, restaurant rows, automobile malls etc. Consider for example, the retail locations of competing grocery stores. Figure 1 shows the distribution of distance between a grocery store and its nearest competitor in the three US states of New York, Pennsylvania and Ohio. Somewhat surprisingly, over 45% of stores are located within 0.5 miles of a competitor. What explains the high observed levels of co-location in grocery stores?

When a growing retailer embarks on a store expansion strategy it faces two key questions: (1) Should it enter a particular market (entry decision), and if so, (2) Where within the market should it locate the new store (location decision)? Economists have long recognized that locating close to a competitor could increase profits by increasing aggregate demand at the location even though the lack of spatial differentiation is likely to increase price competition (Marshall, 1920). This agglomeration-differentiation tradeoff or volume-price tradeoff is a central tradeoff in spatial location choice. Indeed, there is ample theoretical research (Varian 1980, Stahl 1982, Wolinsky 1983, Dudey 1990, Fischer and Harrington 1996, Bester 1998, Arentze et al., 2005, Konishi 2005) to suggest that agglomeration benefits can act as an incentive for firms to forego spatial differentiation.¹ But how can we measure the volume and price effects due to competitors?

¹ Some empirical evidence of the benefits of spatial co-location can be found in Fox et al. 2007 and Watson 2005. Vitorino (2011) finds evidence for inter-store spillovers in a particular kind of retail cluster - Shopping Malls. But in a mall setting, firms only make a strategic entry decision; they do not face the tradeoff of whether to co-locate or spatially differentiate with rivals.

A retailer could use detailed household level data on consumer store choices across several markets that vary in market characteristics (like population and income) and the number of stores of different formats, and their locations, to estimate the benefit of agglomeration (volume effect due to co-location) through a household level model of store choice.² Given these household level estimates, one can then solve for a competitive pricing equilibrium to identify the benefit of spatial differentiation (price effect due to differentiation). Such a method, however, tends to be impractical because such detailed household level data across multiple retailers are difficult to obtain and a household level analysis across markets is too onerous.

² For example, Fox et al. (2007) use data from a multi-outlet panel to study consumers’ shopping behavior and its impact on store revenues. However, their data are from a single major metropolitan market.

Another approach could be to use firm level data on revenues and prices of all stores across several markets. Assuming store locations as given, one could develop a consumer shopping behavior model to identify the benefit of agglomeration, coupled with a price competition model to identify the benefit of differentiation. This approach, however, could suffer from serious issues due to the endogeneity of market structure (i.e., number of firms that enter a market and their locations). For instance, a location with a high unobserved demand shock is likely to have higher revenues but the location is also likely to attract more firms. Not accounting for the endogeneity of market structure can give biased estimates of the parameters capturing the agglomeration benefit and the competitive interactions.

To infer the strategic interactions between firms, researchers in marketing and economics have adopted an alternative empirical approach that uses readily observed entry and location decisions of firms. The approach is built on the idea that firms take into account their competitors’ actions when making their decisions. Thus, by solving for the location choice game between firms we can infer the strategic interactions between firms. A vast majority of papers have used this approach to study firms’ entry decisions (e.g., Bresnahan and Reiss 1991; Berry 1992; Mazzeo 2002; Aguirregabiria and Mira 2007; Bajari et al., 2007; Vitorino 2011; Zhu, Singh and Manuszak 2009; Ciliberto and Tamer 2009).³ A smaller literature has analyzed the strategic location choice decisions (e.g., Seim 2006; Watson 2005; Orhun 2005; Zhu and Singh 2009), where firms not only decide whether to enter into a market but if they enter, where to locate and how far to locate from a competitor. These structural models of location choice use a reduced form profit function that allows latent profit in a location to depend on the number of competitors, and their distances from that location. As we can only make the inference that a firm’s chosen location must be more profitable in expectation than any alternative location, at best, we can only estimate the average ‘net effect’ of competitors on firm profit. A net negative effect is characterized as the competition effect and a decrease in the negative effect with the distance of the competitor is highlighted to emphasize only the benefit of spatial differentiation.

³ Some have addressed entry decisions of retail chains, considering how these chains build up their network (Jia 2008; Holmes 2008; Ellickson, Houghton, and Timmins 2008).

The reduced form approach is incapable of separating the ‘net effect’ of competitors into a volume effect and a price competition effect which can independently describe the agglomeration benefit from co-location and the benefit from spatial differentiation, respectively. In particular, if consumers are in fact attracted to a location with multiple competing stores then the demand at any location will be endogenous to a firm’s location choice decision and the decisions of competitors. The reduced form approach cannot distinguish such endogenous demand from the latent profit and is therefore unsuitable for studying firms’ agglomeration-differentiation tradeoff.

Also, a crucial challenge in disentangling the agglomeration-differentiation tradeoff is that observed co-location may be consistent with pure differentiation rationales. That is, even if there are no agglomeration benefits firms may still locate close to each other when (a) there is high demand at the location; (b) there is low cost at the location; (c) zoning regulations restrict retailers to setting up stores in very concentrated areas and (d) retailers can differentiate on other attributes or dimensions such as store formats, which lowers the need for spatial differentiation. Again, the existing structural models of firms’ entry and location choice decisions do not incorporate most of these features. In this paper we develop a comprehensive structural model that disentangles the agglomeration-differentiation tradeoff while simultaneously controlling for the alternative explanations for co-location.

We make three crucial contributions to the literature. First, we introduce a novel approach to obtain spatial zoning regulation data for any number of markets. Most towns and cities in the U.S. practice single-use zoning wherein locations with high population and income are often zoned as residential land where big-box retailers are not allowed to open stores. Previous studies wrongly viewed the absence of stores in such locations as a strategic choice of firms. Similarly, smaller and concentrated retail zones might force rivals to cluster together. But previous models would infer such clustering by rivals to be the result of low competition. Datta and Sudhir (2011) show the role of zoning in firms’ entry location and store format choice decisions, and the potential biases in inference that can result from ignoring zoning. Even though the critical importance of spatial zoning is fairly well-known, extant research has completely ignored this issue because zoning data are not available on a national scale across many markets. To control for spatial zoning regulations, we use a publicly available, digital dataset called National Land Cover Dataset (NLCD). In conjunction with Geographic Information System (GIS) tools such as ArcGIS and Google Earth, we can recover zoning data in any number of markets across the entire U.S. This is the first application of digital land cover data in Marketing and the approach should be of general interest for a large stream of spatial location applications.

Second, we decompose store profits into revenue and cost, and incorporate common unobserved demand and cost shocks: location-specific demand (cost) characteristics such as, say, traffic patterns (tax breaks), which may be common knowledge for firms but which are unobserved by the researcher. For this, we augment firms’ entry and location choice data with store revenue data.⁴ Extant structural models use a reduced form profit function that cannot discern whether a location was chosen because of high demand or because of low costs. Furthermore, when firms cluster in a location because the increased competition is more than offset by an unobserved positive revenue (negative cost) shock at the location, existing models would misattribute the co-location to low competition.⁵ In our approach, the portion of observed store revenue that is not explained by the observable demand factors or the observed market structure is attributed to an unobserved revenue shock at the location which is a draw from a distribution. Having accounted for revenue, we then identify the residual cost function through the latent profit function and the data on observed entry and location choice decisions of firms. Thus, we are able to infer how the observed market characteristics affect revenue and cost differently, which gives us better insights about the drivers of store location choice.

⁴ Recent research that has also used post-action performance data to gain richer insights about the drivers of firms’ strategic decisions includes Ellickson and Misra (2007) and Draganska et al. (2009).

⁵ Orhun (2005) attempts to control for location-specific common profit shocks. However, with only choice data, one can only model latent profits whose errors have to be normalized for estimation. For instance, Orhun (2005) assumed that the common profit shocks follow a standard normal distribution.

Third, we show how to disentangle the agglomeration-differentiation tradeoff by further decomposing store revenue into its components of consumer shopping location choice based volume and spatial competition based price. Specifically, for volume we model consumers’ shopping location choice, which incorporates the spatial configuration of firms around consumers, and we model price as a function of the spatial configuration of rivals around a store. Hence, the benefits of agglomeration are realized through increased volume potential while the benefits of spatial differentiation emerge from acquiring a greater share of that potential as a result of decreased price competition. As competitors affect both volume and price, a non-parametric identification of the two effects would require additional data on sales or prices. Without such data, one would have to rely on suitable functional form assumptions so that the locations of rivals affect store volume differently from the way they affect store prices. Hence we further augment our data with price data for a set of stores that belong to one store chain.

Finally, when different store formats specialize in different product categories or pricing strategies or services, the need for spatial differentiation may be lower. In the context of grocery stores, format types include Supermarkets, Superstores, Limited Assortment and Warehouse stores, Natural Foods stores, Food and Drug stores and Supercenters and Wholesale Clubs. Datta and Sudhir (2011) show that when zoning restrictions increase in a market the entering grocery retailers are more likely to exhibit greater diversity in their store formats as a means to mitigate the reduced scope for spatial differentiation. Hence, we control for format differentiation by accounting for the different store formats.

The empirical strategy to investigate firms’ entry and location decisions involves solving a choice game where firms’ strategies are interrelated. We estimate a static, structural simultaneous move game for firms’ entry and location choice decisions with incomplete information between firms.⁶ We use maximum likelihood estimation (MLE) to estimate the static discrete game.⁷ Estimation challenges include the possibility of multiple equilibria in the model, multiple equilibria in the data, and slow convergence or potential non-convergence of the MLE algorithm.⁸ We build on recent developments in the empirical literature to address each of these challenges and these are explained in detail in the estimation section.

⁶ We do not have store entry dates which are required to solve a dynamic choice game. However, our model can be extended to a dynamic set-up similar to Aguirregabiria and Vicentini (2006) who have proposed a dynamic model of an oligopoly industry characterized by spatial competition.

⁷ Alternatives to likelihood based approaches include method of moments (Thomadsen 2005; Draganska et al., 2009), minimum distance or asymptotic least square estimators (Pakes et al., 2007; Bajari et al., 2007; Pesendorfer and Schmidt-Dengler 2008) and maximum score estimators (Fox and Bajari 2010; Fox 2007; Ellickson et al., 2010).

⁸ See Aguirregabiria et al. (2008) for a discussion on the distinction between multiple equilibria in the model and multiple equilibria in the data.

Our estimates and counterfactual analysis show that the agglomeration effect is strong and explains a significant fraction of observed co-location of grocery stores across several markets. Surprisingly, zoning has little direct effect on co-location. But tighter zoning restrictions interact with the agglomeration effect to explain a surprisingly large fraction of observed co-location. We find that a small change in zoning can cause a discontinuous impact on the location pattern. The finding that zoning regulations and the agglomeration effect interact to shape market structure has important policy implications for local government bodies that make zoning decisions. It also highlights the value of a structural model in understanding how a small perturbation of market characteristics can cause strategic firms to respond in complex and nonlinear ways.

The rest of the discussion is organized as follows: Section 2 describes the model and estimation strategy. Section 3 describes the data and the approach for recovering spatial zoning data. Section 4 describes the estimates of the model. Section 5 presents the results of counterfactual simulations. Section 6 concludes with a summary of the findings and the limitations of this research.

2. Model and Estimation Strategy

2.1. A Comprehensive Model of Strategic Entry and Location Choice

The entry and location choice game involves a nested framework with two stages. In the first stage, each firm, i, decides whether or not to enter a market m (m = 1, 2,…, M). In the second stage, the entering firms simultaneously choose their respective store type or format, f (f = 1, 2,…, F) and the store location within the market.

For the purposes of illustration, imagine a square city with a grid of $L^m$ discrete blocks or ‘locations’ (Figure 2(a)). In extant models, firm i's payoff at each location, l (l = 1, 2,…, $L^m$), is modeled as a reduced-form function of the market characteristics at the location, $x_l$, the actions (entry and location choices) of all firms, $a = (a_i, a_{-i})$, and an idiosyncratic profit shock, $\varepsilon_{il}$, which is the firm’s private information and is known to rivals (and the researcher) only in distribution:

$$\pi^m_{ifl}(a_i) = \Pi^m_f(x_l, a) + \varepsilon_{il} \tag{1}$$

In this incomplete information setup, a firm cannot predict rivals’ discrete actions but it has rational beliefs about their strategies. For example, if firms are homogeneous, then each firm will make its decision based on its belief about the number of firms that would enter the market, $N^m$, and its belief that an entering rival will choose a particular location as represented by a vector of conditional location choice probabilities, $P^m = \{p_1, p_2, \dots, p_{L^m}\}$. For instance, the firm may have a belief that a rival, conditional on entry, will choose location ‘j’ with probability $p_j$. Hence, for homogeneous firms the expected profit at location l can be written as (after dropping subscript ‘f’ for format):

$$E[\pi^m_{il}(a_i)] = \Pi^m\left(x_l, E[N^m], P^m\right) + \varepsilon_{il} \tag{2}$$

We build on this popular modeling approach and introduce several new features. First, in the extant models, firms are allowed to consider all $L^m$ locations in the market so that each location has some positive probability of being chosen by a firm. However, since firms are not allowed to set up stores in residential locations, we use our zoning data to exclude such locations and concentrate only on a subset of potential retail locations, $l = \{1, 2, \dots, l^m\}$ (Figure 2(b)). Second, we break down the reduced-form profit into revenue and a cost multiplier.⁹ We allow both revenue and cost to include observed and unobserved (to the researcher) components. Third, instead of an idiosyncratic profit shock, we assume an idiosyncratic cost shock. Formally, we revise Equation (1) as follows:

$$\pi^m_{ifl}(a_i) = R_{fl}(x_l, a, \upsilon^r_l) \cdot C^m_{ifl}(x_l, \upsilon^c_l, \varepsilon_{il}) \tag{3}$$

where revenue has the following multiplicative form:

$$R_{fl}(x_l, a, \upsilon^r_l) = \hat{R}_{fl}(x_l, a) \cdot \upsilon^r_l \tag{4}$$

$\hat{R}_{fl}$ is the observed component of store revenue that is a function of the store format, f, the market characteristics at the location, $x_l$, and the actions (entry, location and format choices) of all firms, a. The unobserved component of revenue, $\upsilon^r_l$, is a common location-specific shock that is common knowledge for all firms at the time of entry. It accounts for location-specific demand characteristics such as traffic density that are unobserved by the researcher.

⁹ As described later (Equation 15), we will consider the log transformation of Equation (3) which will yield the familiar form: Profit = Revenue – Cost.

The cost multiplier in Equation (3) has the following multiplicative form:

$$C^m_{ifl}(x_l, \upsilon^c_l, \varepsilon_{il}) = \hat{C}_{fl}(x_l) \cdot \upsilon^c_l \cdot \exp(\xi^m) \cdot \exp(\varepsilon_{il}) \tag{5}$$

where the observed component, $\hat{C}_{fl}$, is a function of the store format, f, and the market characteristics at the location, $x_l$. The unobserved component of cost consists of three elements:

(a) A common location-specific shock, $\upsilon^c_l$, that is common knowledge for all firms at the time of entry. It accounts for location-specific cost characteristics such as commercial taxes that are unobserved by the researcher. Now, the common unobserved cost shock at a location is likely to be correlated with the common unobserved revenue shock at the location. We empirically check for this correlation through the following assumption about the distribution of the two shocks:

$$\begin{pmatrix} \ln(\upsilon^r_l) \\ \ln(\upsilon^c_l) \end{pmatrix} = \begin{pmatrix} \omega^r_l \\ \omega^c_l \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{bmatrix} \sigma_r^2 & \rho\sigma_r\sigma_c \\ \rho\sigma_r\sigma_c & \sigma_c^2 \end{bmatrix} \right) \tag{6}$$

(b) An overall market-specific attractiveness parameter, $\exp(\xi^m)$, that is common knowledge for all firms but is unobserved by the researcher.

(c) The firm’s idiosyncratic cost shock at the location, $\exp(\varepsilon_{il})$, that is the firm’s private information and known to rivals and the researcher only in distribution.
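
To make the role of each term concrete, the following is a minimal sketch (not the authors' estimation code) of how the correlated location shocks in Equation (6) could be simulated, and of how the multiplicative profit in Equations (3) to (5) becomes additive once logs are taken. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random

def draw_log_shocks(sigma_r, sigma_c, rho, rng):
    """Draw (omega_r, omega_c) = (ln v_r, ln v_c) from the bivariate normal in Equation (6)."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    omega_r = sigma_r * z1
    # Cholesky-style construction so that corr(omega_r, omega_c) equals rho
    omega_c = sigma_c * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return omega_r, omega_c

def log_profit(log_rev_hat, log_cost_hat, omega_r, omega_c, xi_m, eps_il):
    """Log of Equation (3) after substituting Equations (4) and (5)."""
    return log_rev_hat + omega_r + log_cost_hat + omega_c + xi_m + eps_il

def sample_correlation(xs, ys):
    """Plain Pearson correlation, written out to avoid external dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values, for illustration only
rng = random.Random(0)
draws = [draw_log_shocks(sigma_r=0.5, sigma_c=0.3, rho=0.4, rng=rng) for _ in range(20000)]
omega_r_draws = [d[0] for d in draws]
omega_c_draws = [d[1] for d in draws]
print(sample_correlation(omega_r_draws, omega_c_draws))  # should be near rho = 0.4
```

Because every unobserved term enters multiplicatively, the log of profit is a simple sum, which is what makes the log transformation mentioned in footnote 9 convenient.
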

Finally, to separate the agglomeration-differentiation effect, we decompose the observed component of revenue, $\hat{R}_{fl}$, into a consumer shopping location choice based volume ($v_{fl}$) and a competition effect based price index ($pr_{fl}$).

$$\hat{R}_{fl} = (v_{fl}) \cdot (pr_{fl}) \tag{7}$$

This decomposition of revenue will enable us to separate the volume and price effects of competitors and thus distinguish the benefits of agglomeration that increase volume, from the benefits of spatial differentiation that reduce price competition. We now describe the volume and price components of revenue.

2.1.1. Consumers’ Shopping Location Choice Based Volume

We have detailed information about consumers up to the Census Block Group (CBG) level. Hence, in what follows, we use demographic data at the CBG level and assume that consumers are located at the population density weighted center of their respective CBG. However, the model can easily be extended to a household level.

Consumers in each CBG, g, choose the store format and the retail location where they want to shop. They incur a travel cost ($Tvl_{gl}$) to go to a retail location l. This travel cost could be a non-linear function of the distance, $d_{gl}$, between the consumer’s location and the retail location. We also allow the travel cost to differ by the median household income of the CBG (med_hhI), the median age (med_age), and the minimum distance consumers have to travel before they can get to the nearest retail location (min_d). For instance, a consumer who is located deep within a residential zone and is far from the nearest retail location may be more willing to go to a store that is farther away than, say, a consumer who is close to several retail locations. Formally, the travel cost is given by:

$$Tvl_{gl} = \alpha_1 d_{gl} + \alpha_2 d_{gl}^2 + \left(\alpha_3\, med\_hhI_g + \alpha_4\, med\_age_g + \alpha_5\, min\_d_g\right) \cdot d_{gl} \tag{8}$$

A consumer who wants to buy, let’s say, groceries, may be attracted to a particular grocery store in location l if the location also consists of other commercial activities that cater to the consumer’s non-grocery needs (e.g., electronics and apparel stores). That is, there could also be economies of scope from one-stop shopping or multipurpose shopping ($\alpha_{MS}$). Hence, we account for the extent of commercial activity in the location ($comm_l$). In addition, if consumers expect low prices at the store then they may be even more likely to visit the store. To control for this price effect ($\alpha_{pr}$), we account for the price index of the store format, $pr_{fl}$. The price index specification is described in the next sub-section.
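
As a rough illustration of the travel-cost specification in Equation (8), the sketch below reads it as a quadratic in distance plus demographic terms that are interacted with distance. That grouping is an interpretation of the specification, and the function name, argument names and coefficient values are all hypothetical.

```python
def travel_cost(d_gl, med_hh_income, med_age, min_d, alpha):
    """Travel cost from CBG g to retail location l under one reading of Equation (8).

    alpha is a 5-tuple (alpha_1, ..., alpha_5); the demographic terms scale
    the per-mile cost, so they enter multiplied by the distance d_gl.
    """
    a1, a2, a3, a4, a5 = alpha
    per_mile_shift = a3 * med_hh_income + a4 * med_age + a5 * min_d
    return a1 * d_gl + a2 * d_gl ** 2 + per_mile_shift * d_gl

# Hypothetical coefficients: a negative alpha_5 means that consumers who live
# far from any retail location are less sensitive to each extra mile.
alpha = (1.0, -0.05, 0.002, 0.01, -0.1)

cost_near_retail = travel_cost(5.0, med_hh_income=60.0, med_age=40.0, min_d=0.5, alpha=alpha)
cost_deep_residential = travel_cost(5.0, med_hh_income=60.0, med_age=40.0, min_d=3.0, alpha=alpha)
print(cost_near_retail, cost_deep_residential)
```

With these made-up coefficients, the consumer located deep within a residential zone faces a lower cost for the same five-mile trip, which matches the intuition given in the text for including min_d.
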

Next, a consumer shopping for groceries likely frequents locations where multiple grocery stores are co-located (store agglomeration effect). Hence, we consider the effect of the total number of competing stores at the location, $N_l$. We also consider any scope economies of shopping within the grocery sector when grocery stores with different formats co-locate (format agglomeration effect). For instance, consumers may be more likely to visit a particular Food and Drug store when it is located close to a Supercenter. Hence, we use an indicator, $I^{OF}_{fl}$, for the presence of store formats other than the focal format f, and we also allow the format agglomeration effect to be format-specific. Finally, consumers may simply have a strong intrinsic preference ($\alpha_{f,Pref}$) for the store format f and they could also have an unobserved preference for the location, $\eta_{gl}$. Formally, for a consumer in CBG g, the utility of shopping in stores with format f in location l is:

$$U_{gfl} = \hat{U}_{gfl} + \eta_{gl}$$

and

$$\hat{U}_{gfl} = -Tvl_{gl} + \alpha_{MS}\, comm_l + \alpha_{pr} \ln(pr_{fl}) + \alpha_{SA} N_l + \alpha_{f,FA}\, I^{OF}_{fl} + \alpha_{f,Pref} \tag{9}$$
We assume i.i.d. Type 1 extreme value distribution for the preference shock so that the<br />
probability that a consumer in CBG g will shop in stores with format f in location l is given by<br />
the standard logit form:<br />
$$p^{csr}_{gfl} = \frac{\exp\left( \hat{U}_{gfl} \right)}{\sum_{f'=1}^{F} \sum_{j \in L^{csr}_g} \exp\left( \hat{U}_{gf'j} \right)} \tag{10}$$<br />
where, the superscript ‘csr’ for the probability denotes that this is the choice probability of<br />
consumers. We put a cap on consumers’ choice set by specifying that consumers may shop at<br />
any retail location within a radius, Rad. 10<br />
That is, we assume that Rad is the maximum distance<br />
that a consumer will travel for shopping, and so in Equation (10), $L^{csr}_g$ is the set of retail locations<br />
within the radius, Rad, from CBG g. Eventually, we estimate our model with different<br />
specifications for Rad in order to empirically infer the maximum distance that consumers are<br />
willing to travel.<br />
Note that this maximum travel distance automatically implies that the trade radius of a<br />
store (catchment area from where the store gets its customers) is Rad. Next, using consumers’<br />
per capita income as a proxy for their consumption capacity or their purchasing ability, we<br />
construct a metric called Customer Value ($CV_{fl}$) for measuring the net worth of the consumers<br />
who are attracted towards stores with format f in location l. For this, note that Equation (10) is<br />
also the share of consumers located in CBG g, who will shop in stores with format f in location l.<br />
We weight consumers’ choice probability by the number of such consumers (CBG population,<br />
$Pop_g$) and their per capita income ($PCI_g$). 11<br />
Then the customer value metric, $CV_{fl}$, is obtained by<br />
aggregating the influx of consumers from different CBGs around the location:<br />
$$CV_{fl} = \sum_{g \in L^{ret}_l} \left( p^{csr}_{gfl}\, Pop_g\, PCI_g \right) \tag{11}$$<br />
where, $L^{ret}_l$ is the set of CBGs that lie within the trade radius, Rad, of location l.<br />
We then transform this customer value metric into volume for stores with format f in<br />
location l:<br />
$$v_{fl} = \left[ CV_{fl} \right]^{\alpha_{f,V}} \tag{12}$$<br />
Hence, in our framework, volume is endogenous to firms’ actions (through $N_l$ and $I^{OF}_{fl}$)<br />
and it also depends on the market characteristics and consumer preferences.<br />
10 If we do not impose such a cap on the maximum distance then the estimation becomes very cumbersome and slow<br />
as our dataset consists of several large markets that consist of a large number of locations and CBGs.<br />
11 We use per capita income for convenience. Alternatively, one could, of course, use other better variables such as<br />
per capita expenditure on grocery.<br />
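The chain from Equation (10) to Equation (12) can be sketched in a few lines. This is only an illustrative sketch: the utilities, populations, incomes and the exponent `alpha_fv` below are hypothetical numbers, and an option that is missing from a CBG's dictionary stands in for a location outside the radius Rad.<br />

```python
import math

def choice_probabilities(u_hat):
    """Logit shares over (format, location) options for one CBG, as in Equation (10)."""
    u_max = max(u_hat.values())  # subtract the max for numerical stability
    weights = {key: math.exp(u - u_max) for key, u in u_hat.items()}
    total = sum(weights.values())
    return {key: w / total for key, w in weights.items()}

def customer_value(shares_by_cbg, pop, pci, option):
    """Equation (11): sum over CBGs of share * population * per capita income."""
    return sum(
        shares.get(option, 0.0) * pop[g] * pci[g]
        for g, shares in shares_by_cbg.items()
    )

def volume(cv, alpha_fv):
    """Equation (12): customer value raised to a format-specific power."""
    return cv ** alpha_fv

# Hypothetical data: two CBGs, one format, two locations within the radius.
u_hat = {
    "g1": {("supermarket", 1): 0.0, ("supermarket", 2): 0.0},
    "g2": {("supermarket", 1): 1.0, ("supermarket", 2): 0.0},
}
pop = {"g1": 1000, "g2": 2000}
pci = {"g1": 30.0, "g2": 20.0}  # per capita income, in thousands (assumed units)

shares_by_cbg = {g: choice_probabilities(u) for g, u in u_hat.items()}
cv = customer_value(shares_by_cbg, pop, pci, ("supermarket", 1))
v = volume(cv, alpha_fv=0.8)
```

Because the shares depend on $N_l$ and $I^{OF}_{fl}$ through the utilities, any change in rivals' locations feeds through to `cv` and `v`, which is the sense in which volume is endogenous to firms' actions.<br />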
2.1.2. Competitive Effect Based Price Index<br />
Firms would like to differentiate spatially from rivals to reduce price competition. We<br />
model the effect of competition on the price index of stores with format f that are in location l. 12<br />
We use a flexible, semi-parametric approach so that the competition effect is split differentially<br />
as a function of the store formats and distances of rivals from the location.<br />
Similar to Seim (2006), we divide the area around a location (up to the trade radius, Rad)<br />
into concentric circles or distance bands. 13<br />
All rivals of a particular format type that are on a<br />
distance band b (b = 1, 2,…, B) around location l are assumed to have the same effect on price.<br />
Formally, the price index of a store with format f that is in location l is given by:<br />
$$pr_{fl} = \beta_{f,pr} \times (x_l)^{\beta_x} \times \exp\left( \sum_{b} \beta_{f-fb} N_{fbl} + \sum_{b} \sum_{f' \neq f} \beta_{f'-fb} N_{f'bl} \right) \times \upsilon^{pr}_l$$<br />
$$\text{and } \ln(\upsilon^{pr}_l) = \omega^{pr}_l \sim N(0, \sigma^2_{pr}) \tag{13}$$<br />
where, $\beta_{f,pr}$ is a format-specific parameter which allows the intrinsic pricing ability of a store to<br />
differ by the store format. This intrinsic pricing ability of a store format could be due to format-<br />
specific differences in cost, efficiency, product mix, and service quality. However, we remain<br />
agnostic about the specific reasons. The second component on the right-hand side allows the<br />
pricing ability of firms to depend on exogenous observable location characteristics, $x_l$. In our<br />
application we use the per capita income of consumers within a 2 mile radius of the location to<br />
allow for price discrimination or the ability to sell premium products in affluent neighborhoods.<br />
12 We use sales-weighted prices across all categories in a store as the price index of the store.<br />
13 Alternatively, one could employ a continuous distance weighting approach as in Orhun (2005).<br />
The third component on the right-hand side of Equation 13 includes the intraformat<br />
competition effect and the interformat competition effect. For intraformat competition we<br />
consider the number of rivals that have the same format, f, as the focal firm and that are located<br />
in distance band b around location l ($N_{fbl}$). Here, $\beta_{f-fb}$ is the competitive effect of one such<br />
rival. For interformat competition, we consider the number of rivals that have a different format,<br />
f’ ($f' \neq f$), and $\beta_{f'-fb}$ is the competitive effect of one such f’-format rival. If the estimates<br />
reveal a weakening of the competitive effects at greater distance bands then that will indicate the<br />
benefits of spatial differentiation whereas, estimates of weaker interformat competitive effects<br />
relative to intraformat competitive effects within the same distance band will indicate the<br />
benefits of format differentiation.<br />
Finally, we introduce a common, location-specific price shock ($\upsilon^{pr}_l$) that is common<br />
knowledge for all firms at the time of entry but is unknown to the researcher. We assume that<br />
this price shock has a log-normal distribution.<br />
To summarize, like volume, the price index also depends on market characteristics and is<br />
endogenous to firms’ actions. As firms’ actions affect both volume and price, a non-parametric<br />
identification of the competition effects would require price data for all stores. But we only have<br />
price data for one store chain. Fortunately, this chain operates more than one store format and is<br />
present in most markets in our dataset, and, therefore, experiences large variations in the spatial<br />
distribution of market characteristics and rivals. Hence, despite its incomplete nature, the data<br />
partly assists identification. However, we also rely on our functional form assumption of how<br />
locations of rivals affect volume in a different way from how they affect prices.<br />
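As a rough illustration of how the distance-band specification in Equation (13) works, the sketch below evaluates the log price index for a store with and without a same-format rival in different bands. All parameter values and the format name `supercenter` are hypothetical, chosen only so that the competitive effect weakens with distance; they are not estimates from the paper.<br />

```python
import math

def log_price_index(beta_f_pr, beta_x, x_l, beta_intra, n_intra,
                    beta_inter, n_inter, omega_pr=0.0):
    """ln(pr_fl) implied by Equation (13).

    beta_intra[b], n_intra[b]: same-format effect and rival count in band b.
    beta_inter[f'][b], n_inter[f'][b]: the same for each other format f'.
    """
    competition = sum(b * n for b, n in zip(beta_intra, n_intra))
    for other_format, betas in beta_inter.items():
        competition += sum(b * n for b, n in zip(betas, n_inter[other_format]))
    return math.log(beta_f_pr) + beta_x * math.log(x_l) + competition + omega_pr

# Hypothetical parameters: three distance bands, effects weaken with distance.
beta_intra = [-0.10, -0.05, -0.02]
beta_inter = {"supercenter": [-0.06, -0.03, -0.01]}
no_other_formats = {"supercenter": [0, 0, 0]}

base = log_price_index(1.2, 0.3, 25.0, beta_intra, [0, 0, 0],
                       beta_inter, no_other_formats)
near = log_price_index(1.2, 0.3, 25.0, beta_intra, [1, 0, 0],
                       beta_inter, no_other_formats)
far = log_price_index(1.2, 0.3, 25.0, beta_intra, [0, 0, 1],
                      beta_inter, no_other_formats)
```

With these assumed values a rival in the nearest band lowers the log price index by more than the same rival in the farthest band, which is the pattern the text interprets as a benefit of spatial differentiation.<br />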
2.1.3. The Profit Function<br />
Conforming to the multiplicative specifications so far, the observed component of the cost<br />
multiplier, $\hat{C}_{fl}$ (Equation 5), is specified as:<br />
$$\hat{C}_{fl}(x_l) = \exp\left( \sum_{b=1}^{B} \gamma_{fbx}\, x_{bl} \right) \tag{14}$$<br />
where, $x_{bl}$ are the observed cost shifters at distance band b around location l and $\gamma_{fbx}$ are format<br />
and band specific cost parameters.<br />
Substituting the expressions for revenue and cost into our profit specification, Equation<br />
(3), then taking the log transformation, and after making some trivial sign reversals, we have an<br />
equation for the transformed profit function that is very similar to equation (1):<br />
$$\tilde{\pi}_{ifl} = \ln(\pi_{ifl}) = \ln(v_{fl}) + \ln(pr_{fl}) + \omega^{r}_l - \ln(\hat{C}_{fl}) + \omega^{c}_l + \xi^m + \varepsilon_{il} \tag{15}$$<br />
2.4 Equilibrium Choice Probabilities:<br />
Recall that the idiosyncratic cost shock, $\varepsilon_{il}$, is known to rivals only in distribution. Due<br />
to such incomplete information about rivals’ profits, a firm cannot exactly predict rivals’ discrete<br />
actions but it can have rational expectations about rivals’ strategies. Hence, for a given set of<br />
vectors of price, revenue and cost shocks across all locations ($\omega^{pr}, \omega^{r}, \omega^{c}$), firm i can form<br />
rational expectations about the number of firms that will enter the market, $N^m$, and the location<br />
and format choices of the ($N^m - 1$) entering rivals, $P^m = [P^m_1, P^m_2, \ldots, P^m_F]$. That is, corresponding<br />
to each format f (f’) firms we will have a vector of $l^m$ conditional choice probabilities (CCPs),<br />
$P^m_f = \{p_{f1}, p_{f2}, \ldots, p_{fl^m}\}$ ($P^m_{f'} = \{p_{f'1}, p_{f'2}, \ldots, p_{f'l^m}\}$). For instance, $p_{fj}$ ($p_{f'j}$) is a CCP of an f-<br />
format (f’-format) rival and it represents the focal firm’s belief that an f-format (f’-format) rival<br />
will choose location j when a total of $N^m$ firms enter the market.<br />
Based on these beliefs, we can obtain expressions for the total number of competing<br />
stores in a location ($E[N_l]$), the chance that there will be rivals with other formats in a location,<br />
($E[I^{OF}_{fl}]$) and the number of f-format (f’-format) rivals in distance band b, $E[N_{fbl}]$ ($E[N_{f'bl}]$)<br />
(Expressions for these expectations are shown in Appendix A). Consequently, given model<br />
parameters and the vectors of location-specific shocks, firms can derive the expected values of<br />
volume and price index which would then lead to the following expression for expected profit:<br />
$$E\left[ \tilde{\pi}_{ifl} \mid \omega^{pr}, \omega^{r}, \omega^{c} \right] = E\left[ \ln(v_{fl}) \right] + E\left[ \ln(pr_{fl}) \right] + \omega^{r}_l - \ln(\hat{C}_{fl}) + \omega^{c}_l + \xi^m + \varepsilon_{il} \tag{16}$$<br />
Since $v_{fl}$ is a highly non-linear function of $N_l$ and $I^{OF}_{fl}$, we will make the following<br />
simplifying assumption:<br />
$$E[\ln v_{kl}] = E\left[ \ln v_{kl}\left( X, E[N_l], E[I^{OF}_{fl}]; \alpha \right) \right] \tag{17}$$<br />
This expected volume can be calculated based on firms’ prediction of the outcome of<br />
consumers’ shopping behavior as Equation (9) transforms into:<br />
$$\hat{U}_{gfl} = -Tvl_{gl} + \alpha_{MS}\, comm_l + \alpha_{pr} E\left[ \ln(pr_{fl}) \right] + \alpha_{SA} E[N_l] + \alpha_{f,FA} E\left[ I^{OF}_{fl} \right] + \alpha_{f,Pref}. \tag{18}$$<br />
We also have:<br />
$$E\left[ \ln(pr_{fl}) \right] = \ln(\beta_{f,pr}) + \beta_x \ln(x_l) + \sum_{b} \beta_{f-fb} E[N_{fbl}] + \sum_{b} \sum_{f' \neq f} \beta_{f'-fb} E[N_{f'bl}] + \omega^{pr}_l \tag{19}$$<br />
Thus, the expected profit in equation (16) can be rewritten as a function of the<br />
equilibrium number of entrants in the market, $N^m$, the equilibrium location choice probabilities<br />
in the market for firms of all formats, $P^m$, the specific draws of price, revenue and cost shocks<br />
($\omega^{pr}, \omega^{r}, \omega^{c}$), and a set of model parameters, $\theta = \{\alpha, \beta, \gamma, \sigma, \rho\}$:<br />
$$E\left[ \tilde{\pi}_{ifl} \mid \omega^{pr}, \omega^{r}, \omega^{c} \right] = \hat{\pi}_{fl}\left( N^m, P^m, \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) + \xi^m + \varepsilon_{il} \tag{20}$$<br />
Note that $\xi^m$ is common for all locations in the market and therefore does not influence<br />
the location choice after firm i has decided to enter the market. Thus, if we assume that the<br />
idiosyncratic component, $\varepsilon_{il}$, has a Type 1 extreme value distribution that is independent across<br />
locations and firms then the conditional probability (conditional on entry) that an f-format firm<br />
chooses location l is given by the logit form:<br />
$$\psi_{fl}\left( N^m, P^m, \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) = \frac{\exp\left( \hat{\pi}_{fl}\left( N^m, P^m, \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) \right)}{\sum_{f'=1}^{F} \sum_{j=1}^{l^m} \exp\left( \hat{\pi}_{f'j}\left( N^m, P^m, \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) \right)} \tag{21}$$<br />
Integrating over the distributions of the common unobserved shocks, we have the<br />
location choice probability, conditional only on entry:<br />
$$\Psi^m\left( N^m, \theta \right) = \int\!\!\int\!\!\int \Psi^m\left( N^m, \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) g(\omega^{pr})\, f(\omega^{r}, \omega^{c})\, d\omega^{pr}\, d\omega^{r}\, d\omega^{c} \tag{22}$$<br />
In equilibrium firms’ beliefs must match with rivals’ strategies. So:<br />
$$P^m\left( N^m; \theta \right) = \Psi^m\left( N^m, P^m; \theta \right) \tag{23}$$<br />
This represents a system of equations that describes firms’ CCPs as the fixed point of a<br />
continuous mapping between firms’ strategies and their beliefs about rivals’ strategies. As the<br />
CCPs within market m must add up to 1, by Brouwer’s fixed point theorem, this system of<br />
equations has at least one solution or fixed point.<br />
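To make the fixed point concrete, here is a minimal sketch for a hypothetical single-format market. It assumes, purely for illustration, that expected log profit at location l equals a location attractiveness `base[l]` minus `delta` times the expected number of rivals at l, i.e. `(n_entrants - 1) * p[l]`; this stylized profit function, the parameter values and the plain successive-approximation solver are assumptions and are not the paper's specification or algorithm.<br />

```python
import math

def best_response_ccps(p, base, delta, n_entrants):
    """Logit location-choice probabilities given beliefs p about rivals."""
    # Expected profit at l falls with the expected number of rivals there.
    pi_hat = [b - delta * (n_entrants - 1) * pl for b, pl in zip(base, p)]
    pi_max = max(pi_hat)  # subtract the max for numerical stability
    weights = [math.exp(v - pi_max) for v in pi_hat]
    total = sum(weights)
    return [w / total for w in weights]

def solve_fixed_point(base, delta, n_entrants, tol=1e-12, max_iter=10_000):
    """Iterate beliefs -> best responses until beliefs reproduce themselves."""
    p = [1.0 / len(base)] * len(base)  # start from uniform beliefs
    for _ in range(max_iter):
        p_new = best_response_ccps(p, base, delta, n_entrants)
        if max(abs(a - b) for a, b in zip(p, p_new)) < tol:
            return p_new
        p = p_new
    raise RuntimeError("fixed-point iteration did not converge")

# Hypothetical market: three locations, three entrants, mild competition.
base = [1.0, 0.5, 0.0]
ccps = solve_fixed_point(base, delta=0.3, n_entrants=3)
```

Simple iteration works here because the assumed competition effect is small; with stronger strategic interaction the mapping need not be a contraction, and several fixed points may exist.<br />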
Next, we normalize the profit from not entering a market to one so that the log of profit is<br />
normalized to zero. Then the entry probability for a firm is given by the nested logit form:<br />
$$p\left( Entry \mid \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) = \frac{\exp(\xi^m) \times \sum_{f=1}^{F} \sum_{l=1}^{l^m} \exp\left( \hat{\pi}_{fl}\left( N^m, P^m, \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) \right)}{1 + \exp(\xi^m) \times \sum_{f=1}^{F} \sum_{l=1}^{l^m} \exp\left( \hat{\pi}_{fl}\left( N^m, P^m, \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) \right)} \tag{24}$$<br />
Hence, if there are, say, E potential retail entrants then the expected total number of<br />
entrants in market m is given by:<br />
$$N^m = E \times p(Entry) \tag{25}$$<br />
By exogenously fixing E, and by observing the actual number of entrants, $N^m$, the estimate for<br />
the market specific cost parameter is:<br />
$$\xi^m\left( \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) = \ln(N^m) - \ln(E - N^m) - \ln\left( \sum_{f=1}^{F} \sum_{l=1}^{l^m} \exp\left( \hat{\pi}_{fl}\left( N^m, P^m, \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) \right) \right) \tag{26}$$<br />
Again integrating over the distributions of the common unobserved shocks, we have:<br />
$$\xi^m(\theta) = \int\!\!\int\!\!\int \xi^m\left( \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) g(\omega^{pr})\, f(\omega^{r}, \omega^{c})\, d\omega^{pr}\, d\omega^{r}\, d\omega^{c} \tag{27}$$<br />
A simultaneous solution for Equations (23) and (27) gives the joint equilibrium<br />
predictions for the number of entrants, and the format and location decisions of those entrants.<br />
We assume that $\xi^m$ is i.i.d. across markets, and follows a normal distribution, $N(\mu, \sigma^2)$.<br />
Thus the probability that a total of $N^m$ firms enter the market is given by the p.d.f. of this normal<br />
distribution at the value obtained in Equation (27). Note that the value of $\xi^m$ adjusts to the size<br />
of E in relation to the outside option of no entry. Hence, although the size of E is not observed by<br />
the researcher, varying the size will have only a minuscule effect on our inferences about firms’<br />
strategies (See discussion in Seim (2006)).<br />
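The inversion behind Equation (26) follows directly from Equations (24) and (25): writing $S$ for the double sum of $\exp(\hat{\pi}_{fl})$, the condition $N^m / E = \exp(\xi^m) S / (1 + \exp(\xi^m) S)$ gives $\exp(\xi^m) S = N^m / (E - N^m)$. The sketch below checks this round trip numerically for one draw of the shocks; the profit values and the counts of entrants and potential entrants are hypothetical.<br />

```python
import math

def entry_probability(xi_m, pi_hat):
    """Equation (24) for one draw of the shocks; pi_hat lists pi_hat_fl over (f, l)."""
    inclusive = math.exp(xi_m) * sum(math.exp(v) for v in pi_hat)
    return inclusive / (1.0 + inclusive)

def implied_xi(n_entrants, n_potential, pi_hat):
    """Equation (26): the xi^m that makes E * p(Entry) equal the observed N^m."""
    s = sum(math.exp(v) for v in pi_hat)
    return math.log(n_entrants) - math.log(n_potential - n_entrants) - math.log(s)

# Hypothetical expected log profits over three (format, location) cells.
pi_hat = [0.2, -0.1, 0.4]

xi_50 = implied_xi(12, 50, pi_hat)    # E = 50 potential entrants
xi_100 = implied_xi(12, 100, pi_hat)  # E = 100 potential entrants
```

Doubling the assumed pool of potential entrants only shifts the implied $\xi^m$ downward; the predicted number of entrants is reproduced in both cases, which is the adjustment described in the text.<br />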
Next, note that for a given θ and $N^m$, we can get estimates of price and revenue when<br />
firms’ locations are set to be identical to the observed spatial configuration of stores in the data.<br />
We can compare these estimates with our price and revenue data and thus obtain the price and<br />
revenue shocks, ($\omega^{pr}_{obv} \mid \theta$, $\omega^{r}_{obv} \mid \theta$), for the set of chosen locations that correspond to the observed<br />
spatial configuration of stores in the data. These price and revenue shocks are included in the<br />
likelihood function:<br />
$$L(\Theta) = \prod_{m=1}^{M} \left[ \underbrace{\left\{ \prod_{f=1}^{F} \prod_{l=1}^{l^m} \left( \psi_{fl}\left( N^m, P^m; \theta \right) \right)^{I(fl)} \right\}}_{\text{Location Choice}} \times \underbrace{\prod \phi\left( \omega^{pr}_{obv} \mid \theta, 0, \sigma^2_{pr} \right)}_{\text{Price Data}} \times \underbrace{\prod \tilde{\phi}\left( \omega^{r}_{obv} \mid \theta, 0, \Sigma \right)}_{\text{Revenue Data}} \times \underbrace{\phi\left( \xi^m; \mu, \sigma^2 \right)}_{\text{Entry Choice}} \right]$$<br />
$$\text{s.t. } P^m\left( N^m; \theta \right) = \Psi^m\left( N^m, P^m; \theta \right), \ \forall m \tag{28}$$<br />
where, Θ is the set of all model parameters, $\Theta = \{\theta, \mu, \sigma^2\}$, and $I(fl)$ is an indicator that<br />
equals one if location l is chosen by an f-format firm, and is zero otherwise. $\phi$ is the pdf of a<br />
normal distribution whereas $\tilde{\phi}$ has been used to indicate the pdf of the marginal distribution of<br />
revenue shocks.<br />
2.2 Estimation Strategy<br />
2.2.1. Simplifying Restrictions<br />
In the generalized model specification the number of model parameters increases<br />
quadratically with the number of format types (F) due to the interformat and intraformat<br />
competition effects (Equation 13). The number of distance bands (B) around each location<br />
further multiplies the number of parameters. For instance, in our empirical application in this<br />
paper, we have six format types (F = 6). When we consider five 1-mile width distance bands<br />
around each location (B = 5), the number of competition effect parameters is 180 ($F^2 \times B = 6 \times 6 \times 5$).<br />
Also, the number of parameters for the observable component of cost (Equation 14) is<br />
proportional to F*B. Furthermore, we are also constrained by data for only a limited set of<br />
sample markets (small M). Hence, we make two restrictions in the model specification to reduce<br />
the model parameters to a manageable number.<br />
First, we assume that the competition effect between a pair of rivals is symmetric. That is,<br />
for any distance band, b, and for two rivals with formats f and f’, we assume $\beta_{f'-fb} = \beta_{f-f'b}$. In<br />
our empirical application for the grocery industry, this restriction implies that we treat the<br />
competition effect of a Supermarket on a Superstore to be the same as the competition effect of a<br />
Superstore on a Supermarket. Note however that we allow intraformat and inter-format effects to<br />
be heterogeneous. Therefore, (1) the competition effect between two Supermarkets can be<br />
different from that between two Superstores; and (2) the competition effect between, say, a<br />
Supermarket and a Supercenter can be different from that between a Superstore and a<br />
Supercenter.<br />
Second, we assume that the ratio between the competition effect from a rival at a<br />
particular distance band, b (b ≠ 1) and the competition effect from that rival in the first 0-1 mile<br />
distance band is a constant value ($\kappa_b$). That is, we have:<br />
$$\beta_{f-f2} = \beta_{f-f1}\,\kappa_2;\ \ \beta_{f-f3} = \beta_{f-f1}\,\kappa_3;\ \ \ldots;\ \ \beta_{f-fB} = \beta_{f-f1}\,\kappa_B$$<br />
$$\beta_{f'-f2} = \beta_{f'-f1}\,\kappa_2;\ \ \beta_{f'-f3} = \beta_{f'-f1}\,\kappa_3;\ \ \ldots;\ \ \beta_{f'-fB} = \beta_{f'-f1}\,\kappa_B \tag{29.1}$$<br />
(# competition effect parameters = (F*(F + 1) / 2) + (B − 1))<br />
Similarly, the impact of market characteristics on cost ($\gamma_{fbx}$) is allowed to be format-<br />
specific but we assume a constant ratio between the impact of a variable at a particular distance<br />
band to the impact in the first 0-1 mile distance band. The constant is specific to the variable and<br />
the particular distance band. For instance, suppose for cost (Equation 14) the coefficients of<br />
population and per capita income in different distance bands are denoted by $\gamma_{1fb}$ and $\gamma_{2fb}$,<br />
respectively; then the restriction implies:<br />
$$\gamma_{1f2} = \gamma_{1f1}\,\lambda_2;\ \ \gamma_{1f3} = \gamma_{1f1}\,\lambda_3;\ \ \ldots;\ \ \gamma_{1fB} = \gamma_{1f1}\,\lambda_B$$<br />
$$\gamma_{2f2} = \gamma_{2f1}\,\zeta_2;\ \ \gamma_{2f3} = \gamma_{2f1}\,\zeta_3;\ \ \ldots;\ \ \gamma_{2fB} = \gamma_{2f1}\,\zeta_B \tag{29.2}$$<br />
(# observable cost component parameters ∝ (F + B))<br />
Note that if we allow the ratios or the multipliers, $\kappa_b$, $\lambda_b$ and $\zeta_b$ to be format-specific<br />
then that is equivalent to directly estimating the format-specific coefficients, such as $\beta_{f-fb}$, $\gamma_{1fb}$ and $\gamma_{2fb}$. In our estimation, we do not impose any restrictions on the values that the multipliers<br />
can take at different distance bands. If these multipliers turn out to be decreasing with distance<br />
and less than one then that would imply that the impact of the variable weakens with distance. In<br />
particular, weakening of the competitive effects at greater distances would indicate the benefits<br />
of spatial differentiation.<br />
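The effect of the two restrictions on the number of competition-effect parameters can be checked with a short calculation using the counts stated in the text, $F^2 \times B$ without restrictions and $F(F+1)/2 + (B-1)$ with them. The function names are illustrative only.<br />

```python
def unrestricted_competition_params(n_formats, n_bands):
    """One effect per ordered format pair and distance band: F^2 * B."""
    return n_formats * n_formats * n_bands

def restricted_competition_params(n_formats, n_bands):
    """Symmetric band-1 effects, F*(F+1)/2, plus one multiplier per further band."""
    return n_formats * (n_formats + 1) // 2 + (n_bands - 1)

unrestricted = unrestricted_competition_params(6, 5)  # 180
restricted = restricted_competition_params(6, 5)      # 21 + 4 = 25
```

For the application's F = 6 and B = 5, the restrictions reduce the competition-effect parameters from 180 to 25.<br />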
2.2.2. Multiple Equilibria in the Model<br />
Estimation involves finding the equilibrium solution, $(P^*_{MLE}, \Theta^*_{MLE})$, which is the global<br />
optimum of Equation (28) where, $\Theta^*_{MLE}$ are the Maximum Likelihood Estimates (MLE) and<br />
$P^*_{MLE}$ are the corresponding equilibrium CCPs. Using a nested fixed-point (NFXP) approach for<br />
estimation is computationally demanding as it involves solving for the fixed-point of Equation<br />
(22) for each draw of $[\omega^{pr}, \omega^{r}, \omega^{c}]$ and at each step of the likelihood maximization. More<br />
importantly, NFXP suffers from the possibility of multiple equilibria in the model. Specifically,<br />
for a value of θ, if Equation (22) has multiple solutions for CCPs then the likelihood is not well<br />
defined. 14<br />
Researchers have, therefore, developed two-step estimation approaches that avoid<br />
these problems. In a two-step Pseudo Maximum Likelihood (PML) approach, the CCPs are<br />
estimated in a parametric or nonparametric first step and the parameter estimates are obtained by<br />
maximizing the resulting likelihood in the second step (Bajari et al., 2007). However, in most<br />
empirical contexts, consistent and precise first-stage estimates of CCPs are infeasible.<br />
A recursive extension of the PML, called the Nested Pseudo Likelihood (NPL) approach,<br />
addresses this problem at a relatively small additional computational cost (Aguirregabiria and<br />
Mira, 2007). 15<br />
14 One way to deal with this problem is to provide sufficient conditions that the parameters, θ, must satisfy to ensure<br />
a unique equilibrium (e.g., Seim, 2006; Zhu and Singh, 2009).<br />
15 Another application of the NPL approach for a static game can be found in Ellickson and Misra (2008).<br />
The standard NPL approach starts with an initial guess of the CCPs, and<br />
converges to an equilibrium solution in the limit. For example, in our case, we would start with<br />
initial guess values for firms’ beliefs about rivals’ CCPs, $P_0$. Then, using Equations (21) through<br />
(28) we would obtain the likelihood, $L(P_0, \Theta)$. Maximizing the likelihood would give the<br />
parameter estimates, $\Theta_1$, and new CCPs, $P_1$. This would constitute one iteration, and the new<br />
CCPs would be used for firms’ beliefs about rivals’ actions in the next iteration. The n th iteration<br />
of the standard NPL approach can be denoted by the following contraction mapping, M:<br />
$$(P_n, \Theta_n) = M(P_{n-1}) \ \text{ where, } \ \Theta_n = \arg\max_{\Theta} L(P_{n-1}, \Theta);\ \ P_n = \Psi(P_{n-1}, \Theta_n) \tag{30}$$<br />
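The NPL recursion in Equation (30) can be sketched on a deliberately small toy problem. The sketch assumes a hypothetical binary-choice game in which, in market m, a firm picks a focal location with probability logistic($\theta x_m - \delta P_m$), where $P_m$ is the belief that a rival picks the same location, $\delta$ is treated as known, and $\theta$ is the only parameter to be estimated. The data (`x`, `k`, `n`), the value of `delta` and the Newton solver are illustrative assumptions, not the paper's model or code.<br />

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def maximize_pseudo_likelihood(beliefs, x, k, n, delta, theta=0.0):
    """Newton steps on the concave pseudo log-likelihood, holding beliefs fixed."""
    for _ in range(100):
        grad = 0.0
        hess = 0.0
        for xm, km, nm, pm in zip(x, k, n, beliefs):
            lam = logistic(theta * xm - delta * pm)
            grad += xm * (km - nm * lam)
            hess -= xm * xm * nm * lam * (1.0 - lam)
        step = grad / hess
        theta -= step
        if abs(step) < 1e-12:
            break
    return theta

def npl(x, k, n, delta, p0, tol=1e-10, max_iter=500):
    """Alternate between estimating theta and updating the CCPs."""
    beliefs, theta = list(p0), 0.0
    for iteration in range(1, max_iter + 1):
        # Theta_n = argmax L(P_{n-1}, Theta)
        theta = maximize_pseudo_likelihood(beliefs, x, k, n, delta, theta)
        # P_n = Psi(P_{n-1}, Theta_n)
        updated = [logistic(theta * xm - delta * pm) for xm, pm in zip(x, beliefs)]
        if max(abs(a - b) for a, b in zip(updated, beliefs)) < tol:
            return updated, theta, iteration
        beliefs = updated
    raise RuntimeError("NPL iterations did not converge")

# Hypothetical data: three markets, k[m] of n[m] firms chose the focal location.
x = [0.5, 1.0, 1.5]
k = [40, 50, 62]
n = [100, 100, 100]
ccps, theta_hat, n_iter = npl(x, k, n, delta=1.0, p0=[0.5, 0.5, 0.5])
```

In this toy case the mapping is well behaved, so the iterations settle on a point where the CCPs reproduce themselves given the estimated parameter; nothing in the sketch rules out several such points in richer models.<br />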
For a graphical illustration of the NPL iterations, suppose that the set $(P, \Theta)$ could be<br />
‘collapsed’ onto one axis. In Figure 2(a) the X-axis corresponds to the vector $P_{n-1}$, the Y-axis<br />
corresponds to the set $(P_n, \Theta_n)$, and the solid curve represents the contraction mapping $M(P)$.<br />
The dotted lines represent the ‘track’ followed by the NPL iterations corresponding to a<br />
particular starting value, $P_0$. Note that a different starting value, $P'_0$, would result in a different<br />
track for the NPL iterations. With multiple iterations, if there is convergence, the contraction<br />
mapping would converge to an equilibrium solution or an NPL fixed point, $(P^*, \Theta^*)$. In Figure<br />
3(a), this is the point where $M(P)$ intersects the 45° line. Furthermore, if the fixed point is<br />
unique then it is, in fact, the global optimum, $(P^*_{MLE}, \Theta^*_{MLE})$.<br />
2.2.2. Multiple Equilibria in the Data<br />
The standard NPL approach, however, does not address the possibility of multiple<br />
equilibria in the data which is when the contraction mapping in Equation (31) does not have a<br />
unique NPL fixed point. The multiple equilibria or the multiple NPL fixed points are essentially<br />
the different ‘local optima’ of Equation (29). This is illustrated in Figure 3(b) where $M(P)$<br />
intersects the 45° line at multiple points. Consequently, the NPL iterations may potentially<br />
converge to a ‘local optimum’ and not the global optimum. Further, as the track followed by the<br />
NPL iterations depends on the starting value, $P_0$, different starting values would result in distinct<br />
tracks which could potentially converge to different ‘local optima’. One option is to spread the<br />
search for the global optimum over a wide range of the contraction mapping, $M(P)$, by using<br />
parallel-NPL where a large number of NPL algorithms, say, T, are run in parallel with different<br />
starting values. By thus following T distinct tracks for the NPL iterations, this approach, upon<br />
convergence, would give us a set of T fixed points, $[(P^{1*}, \Theta^{1*}); (P^{2*}, \Theta^{2*}); \ldots; (P^{T*}, \Theta^{T*})]$. 16<br />
However, it does not guarantee that this set will contain the global optimum, $(P^*_{MLE}, \Theta^*_{MLE})$.<br />
For a more efficient search of the global optimum, Aguirregabiria and Mira (2005)<br />
propose combining the parallel-NPL with a Genetic Algorithm (GA). GA is a search heuristic<br />
that mimics natural evolution processes such as ‘selection’, ‘crossover’ or ‘reproduction’ and<br />
‘mutation’, and can be used to obtain the global optimum of complex optimization problems.<br />
Combining the parallel-NPL with GA has two advantages: (1) The crossover and mutation<br />
steps spread the search for the global optimum over a much wider range of the contraction<br />
mapping than what is feasible with just the parallel-NPL, and (2) The selection step steers the<br />
tracks of the parallel-NPL iterations towards those regions of the contraction mapping that are<br />
more likely to contain the global optimum. 17<br />
16 Many of the fixed points may be identical.<br />
17 Su and Judd (2010) suggest using a Mathematical Programming with Equilibrium Constraints approach that finds<br />
the parameter estimates and the equilibrium CCPs simultaneously. However, like the parallel-NPL, this approach<br />
also relies on multiple runs with different starting values to find different equilibria. Hence, its ability to find the<br />
global optimum in problems that have a large action space (as in our entry and location choice problem) is unclear.<br />
In our estimation, we insert two GA steps after each iteration of the parallel-NPL. Note<br />
that after the n th iteration of the parallel-NPL, we will have T vectors of CCPs, $[P^1_n; P^2_n; \ldots; P^T_n]$.<br />
First, in a selection step, we evaluate each vector of CCPs by using a ‘fitness criterion’ where the<br />
CCPs that are likely to be closer to the global optimum are considered to be more fit. Analogous<br />
to the natural selection process <strong>in</strong> nature, the more fit CCPs are given a greater chance of<br />
survival and reproduction so that future search for the global optimum is concentrated <strong>in</strong> their<br />
neighborhood. This is done by drawing, with replacement, $T$ ‘mother’ CCPs,<br />
$[P_n^{1'}; P_n^{2'}; \ldots; P_n^{T'}]$, and $T$ ‘father’ CCPs, $[P_n^{1''}; P_n^{2''}; \ldots; P_n^{T''}]$, from the original set,<br />
$[P_n^1; P_n^2; \ldots; P_n^T]$, such that the more fit CCPs have a greater chance of getting selected.<br />
Next, each of the $T$ ‘couples’, $(P_n^{1'}, P_n^{1''}); (P_n^{2'}, P_n^{2''}); \ldots; (P_n^{T'}, P_n^{T''})$, goes through a<br />
crossover step to produce an ‘offspr<strong>in</strong>g’ that <strong>in</strong>herits the traits of both its parent CCPs. To the<br />
extent that both parents are likely to be fit, the result<strong>in</strong>g offspr<strong>in</strong>g also has a high chance of be<strong>in</strong>g<br />
fit. Hence, we obta<strong>in</strong> a new generation of T vectors of CCPs that are likely to be quite close to<br />
the global optimum. To further reduce the chances of missing the global optimum, some<br />
mutations may be introduced into the offspring so that the search continues to span a wide range<br />
of the contraction mapping. With multiple iterations of the parallel-NPL and GA steps, if there is<br />
convergence, we would obtain a set of T fixed points which would almost certainly contain the<br />
global optimum.<br />
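The selection, crossover and mutation steps described above can be illustrated with a minimal Python sketch. It is not the estimation code used in the paper: the function name `ga_step`, the convex-combination crossover, the uniform mutation noise and the positive `fitness` callable are all assumptions made for illustration, and each CCP vector is assumed to be a list of choice probabilities that sums to one.<br />

```python
import random


def ga_step(ccps, fitness, mutation_rate=0.05, mutation_scale=0.1, rng=None):
    """One GA step (selection, crossover, mutation) on T candidate CCP vectors.

    ccps    : list of T probability vectors (each a list summing to 1).
    fitness : hypothetical callable returning a positive score, higher = fitter.
    """
    rng = rng or random.Random()
    T = len(ccps)
    # Selection: fitter vectors are more likely to be drawn (with replacement).
    # The weights must be non-negative and not all zero.
    weights = [fitness(p) for p in ccps]
    mothers = rng.choices(ccps, weights=weights, k=T)
    fathers = rng.choices(ccps, weights=weights, k=T)

    offspring = []
    for mother, father in zip(mothers, fathers):
        # Crossover (assumed form): a random convex combination of the parents,
        # which keeps the child a valid probability vector.
        w = rng.random()
        child = [w * m + (1.0 - w) * f for m, f in zip(mother, father)]
        # Mutation (assumed form): occasional small perturbation, then renormalise.
        if rng.random() < mutation_rate:
            child = [max(x + rng.uniform(-mutation_scale, mutation_scale), 1e-9)
                     for x in child]
            total = sum(child)
            child = [x / total for x in child]
        offspring.append(child)
    return offspring


# Toy usage: four candidate CCP vectors over three locations.
population = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.25, 0.25, 0.5]]
target = [0.3, 0.3, 0.4]  # stands in for the unknown global optimum


def toy_fitness(p):
    return 1.0 / (1e-6 + sum((a - b) ** 2 for a, b in zip(p, target)))


next_generation = ga_step(population, toy_fitness, rng=random.Random(1))
```

In the procedure described in the text, each offspring vector would then be passed back to the next parallel-NPL iteration; the toy fitness here only stands in for whatever fitness criterion the estimation uses.<br />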
2.2.3. Convergence<br />
<strong>The</strong> algorithm may not converge to the global optimum if the contraction mapp<strong>in</strong>g does<br />
not have good local convergence properties around the global optimum. Intuitively, as shown <strong>in</strong><br />
Figure 3(c), convergence to a fixed po<strong>in</strong>t depends on the concavity or the convexity of the<br />
mapp<strong>in</strong>g <strong>in</strong> the neighborhood of that fixed po<strong>in</strong>t. Kasahara and Shimotsu (2008) recommend<br />
transforming the mapping by replacing $\Psi(P, \Theta)$ with the following log-linear combination of<br />
$\Psi(P, \Theta)$ and $P$:<br />
$$\Lambda(P, \Theta) = \left[\Psi(P, \Theta)\right]^{\delta} \left[P\right]^{1-\delta}; \quad \delta \in [0,1] \qquad (31)$$<br />
Note that $P = \Lambda(P, \Theta)$ and $P = \Psi(P, \Theta)$ have the same fixed-point solution(s). An<br />
appropriate value of δ can modify the concavity or convexity of the mapp<strong>in</strong>g such that the<br />
transformed mapp<strong>in</strong>g is Locally Contractive around the fixed po<strong>in</strong>t and will converge even if the<br />
orig<strong>in</strong>al mapp<strong>in</strong>g does not. 18<br />
F<strong>in</strong>ally, even when the mapp<strong>in</strong>g does converge, the rate of<br />
convergence could be extremely slow and may require a large number of iterations. To avoid<br />
this, Kasahara and Shimotsu (2008) propose the follow<strong>in</strong>g q-stage operator called q-NPL:<br />
$$\Lambda^q(P, \Theta) = \underbrace{\Lambda(\Lambda(\ldots\Lambda(\Lambda(P, \Theta), \Theta)\ldots, \Theta), \Theta)}_{q \text{ times}} \qquad (32)$$<br />
Again, $P = \Lambda^q(P, \Theta)$ and $P = \Psi(P, \Theta)$ have the same fixed-point solution(s). In<br />
addition, $\Lambda^q(P, \Theta)$ also has the locally contractive property of $\Lambda(P, \Theta)$. Hence, in our<br />
estimation, we replace the standard NPL operator, $\Psi$, with the Locally Contractive, q-NPL<br />
operator, $\Lambda^q$. The resulting parallel NPL iterations are then combined with GA as described<br />
above. This procedure searches efficiently over the space of possible equilibria and converges<br />
fast to a set of equilibria which almost certa<strong>in</strong>ly conta<strong>in</strong>s the global optimum. Details of the<br />
sequence of steps <strong>in</strong>volved <strong>in</strong> estimation are provided <strong>in</strong> Appendix A2.<br />
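To see why the log-linear transformation in Equation (31) can restore convergence, consider the following minimal Python sketch. The scalar logistic mapping `psi` and its parameter values are hypothetical and chosen only so that the fixed point is P = 0.5 and the slope of the mapping there is -2, which makes plain fixed-point iteration unstable. The functions `lam` and `lam_q` follow Equations (31) and (32); everything else is an illustrative assumption rather than the model's actual mapping.<br />

```python
import math


def psi(p, theta):
    """Toy scalar mapping (hypothetical); stands in for Psi(P, Theta)."""
    return 1.0 / (1.0 + math.exp(-(theta[0] + theta[1] * p)))


def lam(p, theta, delta):
    """Equation (31): Lambda = Psi(P, Theta)^delta * P^(1 - delta)."""
    return psi(p, theta) ** delta * p ** (1.0 - delta)


def lam_q(p, theta, delta, q):
    """Equation (32): apply Lambda q times, holding Theta fixed."""
    for _ in range(q):
        p = lam(p, theta, delta)
    return p


def iterate(p0, theta, delta, q=1, n_iter=200):
    """Iterate the q-stage operator n_iter times from the starting value p0."""
    p = p0
    for _ in range(n_iter):
        p = lam_q(p, theta, delta, q)
    return p


theta = (4.0, -8.0)  # fixed point at P = 0.5; slope of psi there is -2

# delta = 1 reproduces the untransformed mapping, which settles into a 2-cycle.
plain = iterate(0.2, theta, delta=1.0)

# delta = 0.3 makes the slope at the fixed point 1 - 3 * 0.3 = 0.1, so the
# transformed mapping is locally contractive and converges to 0.5.
relaxed = iterate(0.2, theta, delta=0.3, q=3)
```

In this toy case the transformed mapping has slope $\delta \Psi' + (1 - \delta)$ at the fixed point, so any $\delta$ strictly between 0 and 2/3 yields a local contraction. In the actual model the admissible range of $\delta$ depends on the Jacobian of $\Psi$ and has to be explored numerically.<br />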
2.3 Identification<br />
18 Kasahara and Shimotsu (2008) suggest the following procedure for selecting the value of $\delta$: Simulate a sequence<br />
$\{P_n\}_{n=0}^{N}$ by iterating the transformed mapping for different values of $\delta$, say for $\delta \in \{0.1, 0.2, \ldots, 0.9\}$. Then<br />
pick the value of $\delta$ that leads to the smallest value of the mean of<br />
P P<br />
P − P<br />
n+ 1 n −<br />
n N<br />
across n = 1,…, N.
Extant models of location choice use only entry and spatial location choice data. <strong>The</strong>y<br />
exploit the variation <strong>in</strong> exogenous market characteristics around a location and the number and<br />
geographical locations of rivals, <strong>in</strong> order to identify the effects of market characteristics and the<br />
nature of competition. Given the entry and location choice data, they can only make<br />
inferences that the level of profits where a firm locates is greater than in locations where it does<br />
not locate, conditional on what it expects competitors to do.<br />
However, for identify<strong>in</strong>g the agglomeration effect, we need to go beyond the profits and<br />
decompose the quantity (demand) enhanc<strong>in</strong>g effects of agglomeration and the marg<strong>in</strong> effects of<br />
differentiation. We augment extant models with revenue data and price data. <strong>The</strong> revenue data<br />
now helps isolate the cost impact from profits. <strong>The</strong> price data helps separate revenues <strong>in</strong>to its<br />
quantity and price components.<br />
Identify<strong>in</strong>g the quantity component helps to isolate the agglomeration benefit. We note<br />
that the price data we have is only from one cha<strong>in</strong> which has stores of different formats. Hence<br />
the competitive effect on price is identified non-parametrically only <strong>in</strong> areas where this cha<strong>in</strong><br />
locates its stores. We make the assumption that the price effect is identical across all stores of the<br />
same format to facilitate identification <strong>in</strong> other locations.<br />
3 Data<br />
3.1 Store Data and Sample Markets<br />
We <strong>in</strong>vestigate the spatial configuration of big-box grocery stores. We have store location<br />
(latitude and longitude), store format and weekly revenue data at the national level for the period<br />
2007-08 from Nielsen’s ‘Trade Dimensions’. For our analysis, we use average weekly store<br />
revenue data. In a different dataset, we have store location and store format data (but no revenue<br />
data) for the period 2000-01 and for a sample of local markets <strong>in</strong> the three states of New York,<br />
Pennsylvania and Ohio. This second dataset also has price <strong>in</strong>dex data for stores belong<strong>in</strong>g to one<br />
store cha<strong>in</strong>. 19<br />
In our price model, price data from a different time period can be used to estimate<br />
the competition parameters and the distribution of price shocks as long as we use the market<br />
configuration and market characteristics correspond<strong>in</strong>g to that period, and if we assume that the<br />
price shocks do not change over the seven year period. Hence, we comb<strong>in</strong>e the two data sets so<br />
that for a sample of markets we have the market configuration and revenue data for all stores <strong>in</strong><br />
one period (2008), and the market configuration and price data for one of the stores <strong>in</strong> many<br />
markets, but for a different period (2001). <strong>The</strong> data constra<strong>in</strong>t of hav<strong>in</strong>g prices for only one store<br />
cha<strong>in</strong> may appear as a serious weakness. However, as discussed above, it aids our identification.<br />
Also, it is <strong>in</strong>terest<strong>in</strong>g from a managerial perspective as it mimics a more realistic situation where<br />
firms are likely to have more <strong>in</strong>formation about themselves (own prices and own revenue) and<br />
relatively less <strong>in</strong>formation about rivals (only revenue <strong>in</strong>formation about rivals but no price <strong>in</strong>dex<br />
<strong>in</strong>formation).<br />
Among the markets for which we have price data, we select a sample of 98 fairly<br />
isolated, small and medium sized towns to avoid the problems associated with large markets and<br />
suburbs such as unclear market boundaries, cannibalization due to multiple stores of a firm <strong>in</strong> the<br />
same market, and complex sub-zon<strong>in</strong>g regulations. In 2008, our 98 sample markets had<br />
19 We have weekly product category-level price index data for a one year period for 27 grocery product categories<br />
and for each store that belongs to the store chain ($pr_{cts} = \sum_{\forall i \in c} \sum_{\forall u \in i} w_{ciuts} \cdot pr_{ciuts}$; where $w_{ciuts}$ is the revenue share<br />
of UPC, u, of item, i, within product category, c, for week t in store s). To construct store-level price indices we<br />
adopt an approach similar to Chevalier et al., (2003; p. 22). That is, we aggregate over the product categories and<br />
weeks to form a store-level price index ($pr_s = \sum_{c=1}^{27} \sum_{t=1}^{52} w_{cts} \cdot pr_{cts}$; where $w_{cts}$ is the dollar share of category c in<br />
week t in store s).<br />
altogether 438 big-box grocery stores. 20<br />
<strong>The</strong>se stores have been classified <strong>in</strong>to six format types<br />
(i.e., F = 6): Supermarkets (SM), Superstores (SS), Supercenters and Wholesale Clubs (SC),<br />
Limited Assortment and Warehouse stores (LA), Natural Foods stores (NF) and Food and Drug<br />
stores (FD). Table 1 provides a description of these store formats.<br />
3.2. Consumer and Retail Locations<br />
Data on market characteristics are obta<strong>in</strong>ed from the U.S. Census. Although detailed<br />
demographic data at a Census Block Group (CBG) level are available only for the year 2000, the<br />
U.S. Census provides annual census projections for the county level. Hence, we project the CBG<br />
level census data to their 2008 values by the proportion of change <strong>in</strong> the respective counties<br />
between 2000 and 2008. As we do not have <strong>in</strong>formation about consumers beyond the CBG level,<br />
we follow the convention <strong>in</strong> the literature and place consumers <strong>in</strong> a CBG at the population<br />
weighted center of the CBG. <strong>The</strong>se are our consumer locations.<br />
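The projection of CBG-level values to 2008 amounts to scaling each 2000 value by its county's growth ratio. A minimal sketch, with a hypothetical function name and made-up numbers, assuming every CBG in a county grows at the county-wide rate:<br />

```python
def project_cbg(cbg_value_2000, county_value_2000, county_value_2008):
    """Scale a CBG-level 2000 value by the county's 2000-to-2008 change."""
    return cbg_value_2000 * county_value_2008 / county_value_2000


# A CBG with 1,200 residents in a county that grew from 50,000 to 55,000.
population_2008 = project_cbg(1200, 50000, 55000)
```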
For the location choice game, we divide a market <strong>in</strong>to a uniform grid of discrete 1 sq.<br />
mile blocks or market locations. Our 98 sample markets have a total of 4,792 such locations. But<br />
zon<strong>in</strong>g regulations dictate which of these locations are available for big-box retailers. Below, we<br />
discuss our approach for identify<strong>in</strong>g these potential retail locations and their commercial<br />
centers. Just as consumers are placed at the population weighted center of CBGs, we place<br />
retailers with<strong>in</strong> a retail location at the commercial center of the location.<br />
Our concept of market locations deviates from the standard approach <strong>in</strong> earlier research<br />
that treats census divisions as market locations and places retail stores at the population weighted<br />
center along with consumers. <strong>The</strong> standard approach simplifies the data setup process but it has<br />
severe drawbacks: (1) <strong>The</strong> population weighted center of a census division is likely to be a<br />
20 A comparison of the market configurations between 2001 and 2008 showed that the number of stores <strong>in</strong> these<br />
markets <strong>in</strong>creased less than 10% from 399 to 438.<br />
residential zone so that placing retail stores there would confound the inclusion of zoning<br />
regulations; (2) Stores are rarely present <strong>in</strong> the <strong>in</strong>terior of a census division, rather, they are<br />
present on roads that border these census divisions; (3) Census divisions vary extensively <strong>in</strong> size<br />
so that, for large census divisions, stores may be located quite far from the center and also quite<br />
far from each other. Such artificial distortions <strong>in</strong> distances between rivals can be very damag<strong>in</strong>g<br />
for our application as we are <strong>in</strong>terested <strong>in</strong> expla<strong>in</strong><strong>in</strong>g co-location of rivals through consumers’<br />
will<strong>in</strong>gness to travel to such locations. Our concept of market locations not only allows us to<br />
<strong>in</strong>corporate spatial zon<strong>in</strong>g regulations but it also avoids major distortions of the distances<br />
between rivals and the distances of stores from population centers. 21<br />
We next describe the National Land Cover Dataset (NLCD) and discuss how it is used <strong>in</strong><br />
conjunction with Geographical Information System tools such as ArcGIS and Google Earth to<br />
recover the potential retail locations and their commercial centers.<br />
3.3 Spatial Zon<strong>in</strong>g Data<br />
The Multi-Resolution Land Characteristics Consortium, a group of several federal<br />
agencies, has created two NLCD datasets that provide consistent and accurate digital land-cover<br />
<strong>in</strong>formation for the coterm<strong>in</strong>ous U.S. <strong>The</strong> first national land-cover mapp<strong>in</strong>g project, NLCD 1992,<br />
was derived from the early to mid-1990s Landsat <strong>The</strong>matic Mapper satellite data. It applied a 21-<br />
class, geo-referenced, land-cover classification (see Vogelmann et al., 2001). <strong>The</strong> second project,<br />
NLCD 2001, updated the data for the year 2001 (see Homer et al., 2004). Both datasets have a<br />
spatial resolution of 30 meters. That is, every 30 meter by 30 meter area of land is classified as a specific<br />
land type (e.g., deciduous forest, grassland, open water, etc.) and is allocated one pixel point with<br />
21 In this paper, distance between two po<strong>in</strong>ts always refers to the great-circle distance.<br />
a distinct color code and the associated latitude and longitude. 22 Interestingly, the land type<br />
classifications include residential and commercial land. Residential land is further classified into<br />
low and high intensity residential land, and commercial land comprises highly developed<br />
areas that do not include residential areas. We use the NLCD data in the following three steps to<br />
identify the potential retail locations and their commercial centers.<br />
Step 1: Constructing Market Boundaries and Market Locations<br />
We use the data <strong>in</strong> NLCD 2001 to construct the market boundaries of our sample markets.<br />
<strong>The</strong> residential and commercial land area pixel po<strong>in</strong>ts <strong>in</strong> each market are projected on a map by<br />
us<strong>in</strong>g the ArcGIS software. This gives us the spatial area of <strong>in</strong>terest for a market. A simple visual<br />
<strong>in</strong>spection of the pixel density is used to construct the market boundaries where the pixels fade<br />
away (See Figure 4(a)). As our sample markets are reasonably isolated from other towns and<br />
cities, we can be flexible <strong>in</strong> choos<strong>in</strong>g the shape of their boundaries. A rectangular shape is<br />
preferred so that a market can be easily divided <strong>in</strong>to a uniform grid of discrete blocks or market<br />
locations. Thus, we construct imag<strong>in</strong>ary rectangular borders (L miles X H miles where L and H<br />
are <strong>in</strong>tegers that vary across markets) around the residential and commercial pixel po<strong>in</strong>ts of each<br />
market and then divide the market, specifically, <strong>in</strong>to 1 sq. mile locations (See Figure 4(b)).<br />
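The grid construction in Step 1 can be sketched as follows. For simplicity the sketch works in local coordinates measured in miles east and north of the south-west corner of the rectangular border; converting latitude and longitude to such coordinates is left out, and the function names are hypothetical.<br />

```python
def build_grid(length_miles, height_miles):
    """List the (column, row) indices of the 1 sq. mile cells in an L x H market."""
    return [(col, row) for row in range(height_miles) for col in range(length_miles)]


def locate(x_miles, y_miles, length_miles, height_miles):
    """Return the cell containing a point, or None if it lies outside the border.

    x_miles and y_miles are distances east and north of the south-west corner.
    """
    if not (0 <= x_miles < length_miles and 0 <= y_miles < height_miles):
        return None
    return (int(x_miles), int(y_miles))


cells = build_grid(4, 3)           # a 4 mile x 3 mile market has 12 locations
cell = locate(2.5, 0.2, 4, 3)      # falls in the third column, first row
```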
Step 2: Commercial Activity and Commercial Center <strong>in</strong> a Location<br />
<strong>The</strong> extent of commercial activity <strong>in</strong> a location (as def<strong>in</strong>ed above) could affect firms’<br />
profit <strong>in</strong> the location if consumers have a preference for multi-purpose shopp<strong>in</strong>g or one-stop<br />
shopp<strong>in</strong>g. For <strong>in</strong>stance, when shopp<strong>in</strong>g for groceries, consumers may like to comb<strong>in</strong>e their<br />
shopp<strong>in</strong>g trip with non-grocery purchases such as cloth<strong>in</strong>g and electronics so that locations with<br />
more retail bus<strong>in</strong>esses may be more attractive to firms. We isolate the NLCD 2001 pixel po<strong>in</strong>ts<br />
22<br />
A pixel po<strong>in</strong>t is one of the <strong>in</strong>dividual dots that make up a graphical image. Each pixel po<strong>in</strong>t comb<strong>in</strong>es red, green,<br />
and blue phosphors to create a specific color.
that correspond to commercial land with retail bus<strong>in</strong>esses (See Appendix C for technical details)<br />
and use the number of pixel po<strong>in</strong>ts <strong>in</strong> a location as a measure for the extent of commercial<br />
activity <strong>in</strong> that location. <strong>The</strong> mean of the latitudes and longitudes of the commercial land pixel<br />
po<strong>in</strong>ts <strong>in</strong> a location gives us the commercial center of the location (See Figure 4(c)). We place all<br />
retail stores with<strong>in</strong> a location at the commercial center of that location.<br />
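Both quantities in Step 2 are simple functions of the commercial pixel points that fall inside a location: their count measures commercial activity and their coordinate mean gives the commercial center. A minimal sketch with a hypothetical function name and toy coordinates:<br />

```python
def commercial_summary(pixels):
    """Summarise the commercial-land pixel points of one market location.

    pixels : list of (latitude, longitude) pairs.
    Returns (pixel count, commercial center); the center is None when the
    location has no commercial pixels.
    """
    count = len(pixels)
    if count == 0:
        return 0, None
    mean_lat = sum(lat for lat, _ in pixels) / count
    mean_lon = sum(lon for _, lon in pixels) / count
    return count, (mean_lat, mean_lon)


pixels = [(40.000, -80.000), (40.002, -80.004), (40.004, -80.002)]
activity, center = commercial_summary(pixels)
```

Averaging raw latitudes and longitudes is a reasonable approximation within a 1 sq. mile block, where the curvature of the Earth is negligible.<br />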
Step 3: Discern<strong>in</strong>g Potential Retail Locations from other Commercial Locations<br />
<strong>The</strong> market locations which conta<strong>in</strong> the commercial land pixel po<strong>in</strong>ts are the commercial<br />
locations and they constitute a very small share of all market locations. <strong>The</strong> locations without<br />
any commercial activity are mostly residential locations and some barren land. Hence, we<br />
account for residential zon<strong>in</strong>g by exclud<strong>in</strong>g locations that do not have any commercial land pixel<br />
po<strong>in</strong>ts. But even with<strong>in</strong> commercial locations, not all locations may be open to big-box retailers.<br />
For <strong>in</strong>stance, some commercial zones like, say, downtown areas, might only allow small<br />
bus<strong>in</strong>esses such as banks and restaurants. An obvious candidate for a potential retail location for<br />
big-box stores is any commercial location that has at least one big-box store which could be a<br />
grocery store or a non-grocery store. Hence, we project the locations on to Google Earth and use<br />
a tool called ‘Places Categories’ which shows the locations of various types of bus<strong>in</strong>esses <strong>in</strong> a<br />
region (See Figure 4(d)). We carefully comb through the commercial locations, and specifically<br />
check for the presence of major retail stores, major grocery stores and shopp<strong>in</strong>g centers to<br />
identify the commercial locations that have at least one big-box store.<br />
Now, the absence of big-box stores <strong>in</strong> a commercial location does not necessarily imply<br />
that such stores are not allowed <strong>in</strong> that location. In particular, a commercial location that is open<br />
to big-box stores may not have any such store if it is <strong>in</strong> an unfavorable or poor neighborhood and<br />
cannot support a big store. 23<br />
As we do not have a precise method for identify<strong>in</strong>g such locations,<br />
we use a stylized selection procedure. For each market, we f<strong>in</strong>d the m<strong>in</strong>imum value of the total<br />
<strong>in</strong>come of consumers with<strong>in</strong> a 2-mile radius of the commercial locations that have big-box<br />
stores. We use this m<strong>in</strong>imum as a benchmark for a commercial location <strong>in</strong> the market to be<br />
attractive enough to support at least one big-box retail store. That is, if a commercial location<br />
does not have any big-box store and the total <strong>in</strong>come of consumers with<strong>in</strong> a 2-mile radius of the<br />
location is less than the market benchmark then we presume that the absence of a big-box store is<br />
due to the unattractiveness of the location and not necessarily because of zon<strong>in</strong>g restrictions.<br />
Hence, a commercial location with no big-box store is still treated as a potential retail location<br />
when the follow<strong>in</strong>g condition is satisfied:<br />
$$\text{Income in 2-mile radius of a commercial location that has no big-box store} \;\le\; \min\left\{\text{Income in 2-mile radius of a commercial location that has a big-box store}\right\}$$<br />
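The selection rule above can be sketched in Python as follows. The data layout (dictionaries with `name`, `center` and `has_big_box` keys, and consumers as latitude, longitude and income triples) and the function names are assumptions for illustration. Distances use the haversine form of the great-circle distance with a mean Earth radius of 3,958.8 miles.<br />

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def income_within(center, consumers, radius=2.0):
    """Total income of consumer locations within `radius` miles of a center."""
    return sum(income for lat, lon, income in consumers
               if great_circle_miles(center[0], center[1], lat, lon) <= radius)


def potential_retail_locations(commercial_locations, consumers):
    """Keep locations with a big-box store, plus locations without one whose
    surrounding income does not exceed the market benchmark."""
    incomes = {loc["name"]: income_within(loc["center"], consumers)
               for loc in commercial_locations}
    benchmark = min(incomes[loc["name"]] for loc in commercial_locations
                    if loc["has_big_box"])
    return [loc["name"] for loc in commercial_locations
            if loc["has_big_box"] or incomes[loc["name"]] <= benchmark]


consumers = [(40.0, -80.0, 100.0), (40.0, -79.9, 50.0), (40.2, -80.0, 500.0)]
locations = [
    {"name": "A", "center": (40.0, -80.0), "has_big_box": True},
    {"name": "B", "center": (40.0, -79.9), "has_big_box": False},
    {"name": "C", "center": (40.2, -80.0), "has_big_box": False},
]
selected = potential_retail_locations(locations, consumers)
```

In this toy market, location B has no big-box store but sits in a low-income area, so its emptiness is attributed to unattractiveness and it remains a potential retail location. Location C is surrounded by high income yet has no big-box store, so it is excluded as presumably zoned against such stores.<br />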
To summarize, we use the NLCD data to construct market boundaries so that each market<br />
can be divided <strong>in</strong>to a grid of 1 sq. mile locations. <strong>The</strong>n the commercial land pixel po<strong>in</strong>ts are used<br />
to obta<strong>in</strong> the extent of commercial activity <strong>in</strong> a location and also to locate the commercial center<br />
of the location. Extant models that do not account for zoning assume that firms are allowed to<br />
set up stores in any market location. In contrast, we account for residential zoning by excluding<br />
locations that do not have any commercial land pixel points. Finally, we account for zoning<br />
regulations specifically against big-box retailers, within commercial locations, by defining<br />
potential retail locations as those commercial locations that either (1) have at least one big-box store,<br />
which may be a grocery or a non-grocery store, or (2) do not have a big-box store but are in a<br />
poor neighborhood whose income is at or below the market benchmark described above.<br />
23 Note that competition between stores <strong>in</strong> neighbor<strong>in</strong>g locations cannot expla<strong>in</strong> the absence of big-box stores <strong>in</strong> a<br />
location as we are consider<strong>in</strong>g big-box stores across any segment of the retail <strong>in</strong>dustry.<br />
4. Results<br />
<strong>The</strong> estimation results are presented <strong>in</strong> three parts. Table 2(a) presents the estimates for<br />
the consumer shopp<strong>in</strong>g location choice or the demand side of the model. Table 2(b) presents the<br />
results of the price <strong>in</strong>dex portion of the model. F<strong>in</strong>ally, the estimates of cost and unobserved<br />
shocks are presented <strong>in</strong> Table 2(c).<br />
The demand side estimates indicate that consumers experience a negative travel cost that<br />
is convex with respect to distance (the coefficient of $d_{gl}^2$ is positive and significant). 24<br />
Consumers who are far away from the nearest retail location (that is, when the value of $min\_d_g$<br />
is large) are more willing to travel long distances to get to a grocery store. Demographic<br />
characteristics seem to have very little explanatory power for consumers’ travel costs.<br />
<strong>The</strong> results show that consumers not only value economies of scope from the presence of<br />
other, non-grocery bus<strong>in</strong>esses at a location but they also value the agglomeration of multiple<br />
grocery stores at the location. The store agglomeration parameter ($\alpha_{SA}$ = 0.5342) is positive and<br />
significant, which suggests that consumers are more likely to visit locations with multiple grocery stores. The<br />
format agglomeration effect ($\alpha_{f,FA}$) is also positive, and significant for a few store formats<br />
(Supercenters, Limited Assortment stores, and Food and Drug stores). Hence, consumers are<br />
more likely to visit locations with multiple grocery stores when the cluster of stores consists of<br />
different formats. Consequently, strategic store and format agglomeration <strong>in</strong>crease consumers’<br />
propensity to shop at a location, thus <strong>in</strong>creas<strong>in</strong>g volume at that location.<br />
24 Compar<strong>in</strong>g the results (not presented here) with different specifications for the maximum distance that consumers<br />
may travel for shopp<strong>in</strong>g, Rad, suggested that a distance of 5 miles was sufficient. Rad values of 6 miles and above<br />
did not change parameter estimates or <strong>in</strong>crease the likelihood value significantly (vis-à-vis AIC and BIC criteria).<br />
On the other hand, Rad values of 4 miles and below resulted <strong>in</strong> significantly different estimates for some model<br />
parameters and also gave significantly smaller likelihood values.
F<strong>in</strong>ally, consumers have a high preference for Supercenters and for Food and Drug stores<br />
relative to the Supermarket format. Hence, consumers may be more will<strong>in</strong>g to travel long<br />
distances to get to such stores. Consumers have a relatively low preference for Limited<br />
Assortment stores. This could be because Limited Assortment stores generally carry more<br />
store-brand products and very few national brand products.<br />
The results of the price index portion of the model (Table 2(b)) show that the<br />
format-specific price constant, $\beta_{f,pr}$, is lowest for Limited Assortment stores and Supercenters. This is<br />
expected since the stores with these formats are typically EDLP stores or they offer relatively<br />
more store-brand products that have low prices. For the effect of competition, recall that we<br />
allowed for separate <strong>in</strong>tra-format and <strong>in</strong>ter-format competition, and we considered competition<br />
from rivals <strong>in</strong> various 1-mile width distance bands (B = 5 when Rad = 5 miles). Our results show<br />
that the competition effect decreases dramatically with distance.<br />
Not surpris<strong>in</strong>gly, <strong>in</strong>traformat competition is generally more severe than <strong>in</strong>terformat<br />
competition. <strong>The</strong> extent of <strong>in</strong>traformat competition is the highest between Food and Drug<br />
comb<strong>in</strong>ation stores, which is comparable to the competition between Supercenters. Superstores<br />
are also found to compete quite heavily with each other. Interest<strong>in</strong>gly, for some formats, the<br />
<strong>in</strong>terformat competition effect is found to be comparable to the <strong>in</strong>traformat competition effect.<br />
For <strong>in</strong>stance, the competition effect between Supermarkets and Superstores is quite comparable<br />
to that between two Supermarkets. <strong>The</strong> competition effect between Superstores and Food and<br />
Drug comb<strong>in</strong>ation stores also seems to be quite high. <strong>The</strong> results highlight the importance of<br />
account<strong>in</strong>g for format differentiation, <strong>in</strong> addition to spatial differentiation.<br />
To explore the value of separat<strong>in</strong>g the agglomeration-differentiation effects of rivals, we<br />
estimated a model that did not <strong>in</strong>corporate agglomeration benefits <strong>in</strong> the consumer model<br />
(Parameter estimates not shown). <strong>The</strong> results showed that the competition effects are biased<br />
downwards for all format types. <strong>The</strong> bias was more severe for <strong>in</strong>ter-format competition. Hence,<br />
not modeling the agglomeration-differentiation tradeoff can severely underestimate the<br />
competition intensity between stores with different formats. In retrospect, this is expected<br />
because without an agglomeration effect the model would misattribute observed co-location of<br />
stores <strong>in</strong> the data to low competition. S<strong>in</strong>ce the agglomeration benefit is higher when the cluster<br />
of stores has different formats, it is understandable that the <strong>in</strong>ter-format competition effect is<br />
more biased.<br />
F<strong>in</strong>ally, the estimates of cost and unobserved shocks (Table 2(c)) also give some<br />
<strong>in</strong>terest<strong>in</strong>g <strong>in</strong>sights. Although the Supercenter format enjoys a high preference from consumers,<br />
it also tends to <strong>in</strong>cur high costs <strong>in</strong> densely populated neighborhoods. We f<strong>in</strong>d a strong negative<br />
correlation between the location-specific cost shocks and demand shocks (-0.8932). This<br />
conforms to the <strong>in</strong>tuition that locations with high revenue potential are likely to be associated<br />
with high costs.<br />
5. Counterfactual Simulations<br />
We report two counterfactual simulations which help assess the relative importance of<br />
zoning and agglomeration effects. We consider four scenarios: (1) There are no<br />
zoning regulations in any market and consumers do not benefit from co-location of stores (i.e.,<br />
‘Neither Zoning nor Agglomeration’), (2) Markets have zoning regulations but there are no<br />
benefits from co-location (i.e., ‘Only Zoning; No Agglomeration’), (3) Agglomeration benefits<br />
exist but there are no zoning regulations (i.e., ‘No Zoning; Only Agglomeration’), and (4) Both<br />
zoning and agglomeration benefits exist. For the set of 98 sample markets we estimate the<br />
equilibrium CCPs under these alternative market conditions, assum<strong>in</strong>g that the equilibrium<br />
number of entrants remains unchanged (an appropriate change in the market-specific terms, $\xi^m$,<br />
would ensure this). We use the estimated model parameters and f<strong>in</strong>d the fixed po<strong>in</strong>t of the<br />
system of equations shown <strong>in</strong> Equation (23). For this, we use the NFXP approach.<br />
Figure 5(a) shows the distribution of inter-store distance across the 98 markets, under the<br />
first scenario of ‘Neither Zoning nor Agglomeration’. We see that only 28% of stores co-locate<br />
within 1 mile of each other.<sup>25</sup> This level of co-location may be due to concentration of high<br />
demand or low cost. When we turn on zoning (‘Only Zoning; No Agglomeration’), 32% of stores<br />
are located within 1 mile of each other (Figure 5(b)). This suggests that zoning may force firms<br />
to come a little closer to each other but it has very limited direct impact on their co-location<br />
behavior. With only agglomeration turned on (Figure 5(c), ‘No Zoning; Only Agglomeration’),<br />
43% of stores are located within 1 mile of each other, suggesting that agglomeration effects have a<br />
substantial impact on co-location. Interestingly, the interaction between zoning and<br />
agglomeration benefits is large: when both effects coexist, co-location increases<br />
to 60% (Figure 5(d), ‘Both Zoning and Agglomeration’). This is quite close to the amount of<br />
co-location that we observe in our sample data. Thus the impact of zoning on co-location is high<br />
only in the presence of agglomeration benefits. Why is there an interaction effect between zoning<br />
and agglomeration benefits?<br />
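One way to quantify the interaction is an additive decomposition of the four reported shares. This is an illustrative calculation rather than one reported in the paper: it treats the ‘Neither’ scenario as the baseline and defines the interaction as whatever remains after the two stand-alone effects are subtracted.<br />

```python
# Share of stores within 1 mile of a rival under each scenario (Figures 5(a)-5(d)).
share = {"neither": 28, "zoning_only": 32, "agglomeration_only": 43, "both": 60}

zoning_effect = share["zoning_only"] - share["neither"]
agglomeration_effect = share["agglomeration_only"] - share["neither"]
joint_effect = share["both"] - share["neither"]
interaction = joint_effect - zoning_effect - agglomeration_effect

print(zoning_effect, agglomeration_effect, joint_effect, interaction)  # 4 15 32 13
```

Under this decomposition, 13 of the 32 percentage points gained relative to the baseline come from the interaction, compared with 4 from zoning alone.<br />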
To understand this interaction, we perform our second counterfactual analysis in a<br />
hypothetical market where we gradually tighten the zoning restriction, which narrows the scope<br />
for spatial differentiation. For a set of four grocery stores, the optimal locations are shown in<br />
Figure 6. In the less restrictive zoning setting, we find that stores are located at the extremes of<br />
the commercial zone, suggesting that zoning restrictions constrain the extent of spatial<br />
differentiation in this market. When zoning is made more stringent, one would expect that stores<br />
would continue to be at the edges of the commercial zone. However, the optimal locations reveal<br />
a surprising pattern. When zoning is more restrictive, we find that some stores actually<br />
agglomerate. In retrospect, the logic is clear. When zoning is<br />
relaxed, stores can spread out, so the benefits of spatial differentiation are large<br />
enough to outweigh the agglomeration benefits. However, when zoning is very restrictive, firms cannot differentiate enough; this leads to<br />
a discontinuity where stores recognize that by co-locating they can gain agglomeration<br />
benefits that may outweigh the benefits from differentiation, which are constrained by<br />
the tight zoning regulations. This explains the high interaction effect of zoning and<br />
agglomeration that we find as we proceed from the scenario in Figure 5(a) to the scenario in<br />
Figure 5(d).<br />
25 For this counterfactual simulation, we count a store within 1 mi. of a rival as a co-located store. It is<br />
plausible that the two stores belong to two neighboring 1 sq. mi. block retail locations whose commercial centers<br />
happen to be within 1 mi. of each other.<br />
6. Conclusion<br />
The literature on retailer entry and location choices has thus far ignored the<br />
agglomeration-differentiation tradeoff. We developed a comprehensive static, structural,<br />
simultaneous-move game model of firm entry and location choice that disentangles this tradeoff<br />
while controlling for several alternative explanations for observed co-location. Taking advantage<br />
of a publicly available digital land cover database, NLCD, we are able to control for the effect of<br />
zoning on entry and location choices. To control for demand and cost based explanations for<br />
co-location, we decompose latent profits into revenue and cost and augment entry and location<br />
data with store revenue data. To separate the benefits of agglomeration from the benefits of<br />
spatial differentiation, we further decompose revenue into its components of consumer choice<br />
based volume and competition based price. We use recent advances in the empirical estimation<br />
literature of discrete games to address issues of multiple equilibria in the model and data as well<br />
as problems due to slow convergence of the estimation algorithm.<br />
The consumer and price model provided interesting insights about the differences in the<br />
agglomeration and competition effects across store formats. These results and the subsequent<br />
counterfactual analyses lead to the following takeaways: First, zoning, agglomeration effects,<br />
spatial differentiation and format differentiation are all key drivers of observed store location<br />
patterns. Second, zoning may force firms to locate closer than they would like, but it has<br />
little direct effect on co-location of stores. Finally, zoning interacts with agglomeration to drive<br />
observed co-location. The interaction between zoning and the agglomeration effect can have a<br />
discontinuous impact on the location pattern of stores. This highlights the value of a structural<br />
model in understanding how a small perturbation of market characteristics can cause strategic<br />
firms to respond in complex and nonlinear ways.<br />
We conclude with a discussion of some key limitations in this paper that warrant future<br />
research. First, our identification of the volume and price effects is partially aided by functional<br />
form assumptions for how locations of competitors affect volumes and prices differently. This is<br />
because we only have price information for a set of stores belonging to one store chain.<br />
Nonetheless, this is managerially interesting as it is closer to a realistic scenario where firms<br />
usually have more information about themselves than about others. Second, we treat the entry<br />
decision in a static equilibrium framework, even though a dynamic model may be more<br />
appropriate given that these decisions are made over time. Such a modeling approach requires<br />
better data (timing of entry and exits) as well as a richer modeling framework to solve the dynamic<br />
game. Finally, we have treated store entry decisions across markets as independent, unlike recent<br />
work by Jia (2008), who models the chain entry decision, taking into account the<br />
interdependence across markets. However, her modeling approach is restricted to a small number<br />
of competing chains and is hard to extend to our grocery market setting that involves a large<br />
number of players. These important issues await future research.<br />
Figure 1: Over 45% of big-box grocery stores are within 0.5 mi. of a rival store (Data for 3 U.S. states of NY, OH and PA)<br />
Figure 2(a): An illustrative square market with the geographical space discretized into square blocks or ‘locations’ (blocks numbered 1, 2, 3, …, $L_m$).<br />
Figure 2(b): Due to zoning regulations, firms can only choose among ‘potential retail locations’ (area in white; locations numbered 1, 2, 3, …, $l_m$).<br />
Figure 3(a): Graphical illustration of the standard NPL approach<br />
Figure 3(b): With multiple equilibria in the data, different starting values may give different solutions<br />
Figure 3(c): Depending on the local convergence properties, the contraction mapping may not converge to a fixed point<br />
Figure 4(a): Constructing market boundaries based on visual inspection of residential and commercial pixel density<br />
Figure 4(b): Dividing a rectangular market into a grid of 1 sq. mile blocks or discrete locations<br />
Figure 4(c): Using commercial land pixel data to obtain extent of commercial activity within a location and the commercial center of the location<br />
Figure 4(d): Using ‘Places of Interest’ in Google Earth to check for the presence of big-box stores in commercial locations<br />
Figure 5(a): Neither Zoning nor Agglomeration (28%)<br />
Figure 5(b): Only Zoning; No Agglomeration (32%)<br />
Figure 5(c): No Zoning; Only Agglomeration (43%)<br />
Figure 5(d): With Zoning and Agglomeration (60%)<br />
(Each panel is a histogram of inter-store distance with bins at 0.5, 1, 1.5, 2, 2.5, 3 and ‘More’; the percentage annotated on each panel is given in parentheses.)<br />
Notes: SM – Supermarket format; SS – Superstore format; LA – Limited Assortment format.<br />
Figure 6: Equilibrium Store Locations in a Simulated Market – Shrinking Retail Zone<br />
(Area in White represents retail locations)<br />
Store Format | Examples of Retailers<sup>26</sup> | Total Number of Stores in 98 Sample Markets | Maximum Number of Stores in a Market | Average Store Area (in sq. feet) | Average Annual Store Revenue from Grocery Sales (in $ M) | Average Ratio of Grocery Revenue to Total Store Revenue<br />
Supermarket (SM) | Hi-Low Food Stores, Price Chopper, Vons Market | 84 | 6 | 13,500 | 5.93 | 1<br />
Superstore (SS) | Jewel Food Store, BI-LO, Vons Market, Albertsons, Safeway | 69 | 4 | 35,500 | 15.24 | 1<br />
Ltd. Assort. (LA) | Save-A-Lot, Price Rite, Aldi, Smart & Final | 96 | 5 | 14,500 | 5.23 | 1<br />
Natural Food (NF) | Whole Foods, Trader Joes | 20 | 2 | 10,500 | 9.22 | 1<br />
Food + Drug (FD) | Jewel-Osco, Kroger, Albertsons, Safeway | 103 | 6 | 41,500 | 16.09 | 0.71<br />
Supercenter (SC) | Wal-Mart Supercenter, Super Target, Meijer, Sams Club, Costco | 66 | 3 | 163,000 | 51.84 | 0.62<br />
Table 1: Descriptive Statistics of Various Grocery Store Formats<br />
26 Some retailers have more than one type of store (e.g., Vons, Albertsons, and Safeway). We follow the format classification of individual stores<br />
provided by AC Nielsen.<br />
Variable<sup>27</sup> | Symbol | Supermarket (SM) | Superstore (SS) | Ltd. Assort. (LA) | Natural Food (NF) | Food + Drug (FD) | Supercenter (SC)<br />
Travel Cost ($Tvl_{gl}$): Distance | $d_{gl}$ | -0.1827<br />
Travel Cost ($Tvl_{gl}$): Distance squared | $d_{gl}^2$ | 0.9862***<br />
Travel Cost ($Tvl_{gl}$): median household income interaction | $med\_hhI_g * d_{gl}$ | 0.0436<br />
Travel Cost ($Tvl_{gl}$): median age interaction | $med\_age_g * d_{gl}$ | 0.0579<br />
Travel Cost ($Tvl_{gl}$): minimum distance interaction | $min\_d_g * d_{gl}$ | -0.1482*<br />
Price | $\ln(pr_{fl})$ | -0.4255<br />
Economies of Scope | $comm_l$ | 1.5468***<br />
Store Agglomeration | $N_l$ | 0.5342**<br />
Format Agglomeration | $I^{OF}_{fl}$ | 0.6329 | 0.4917 | 0.9458** | 1.2031 | 1.2974** | 0.8337**<br />
Format Preference | | -- | -0.3251 | -0.2735** | -0.5809 | 0.3683*** | 0.2196**<br />
Customer Value (Equation 12) | $CV_{fl}$ | 0.3932** | 0.4885 | 0.3826* | 0.5404 | 0.6391** | 0.5673<br />
(Rows showing one value report a single estimate rather than one per format.)<br />
27 Note: * : p < 0.1, ** : p < 0.05, *** : p < 0.01.<br />
Table 2(a): Consumers’ Shopping Location Choice Based Volume<br />
Variable<sup>28</sup> | Supermarket (SM) | Superstore (SS) | Ltd. Assort. (LA) | Natural Food (NF) | Food + Drug (FD) | Supercenter (SC)<br />
Format-specific Pricing Ability: Intrinsic Ability | 1.4468* | 1.7413* | 0.9685* | 1.2890 | 1.4963* | 1.1372*<br />
Per Capita Income in 2mi. radius ($x_l$) | 0.0861<br />
Competition Effect: SM; 0-1mi. coeff. | -1.4895***<br />
Competition Effect: SS; 0-1mi. coeff. | -1.1043** | -2.5038**<br />
Competition Effect: LA; 0-1mi. coeff. | -0.4712* | -0.5620* | -1.7921**<br />
Competition Effect: NF; 0-1mi. coeff. | -0.8095 | -1.2597 | -0.5328 | -3.6044<br />
Competition Effect: FD; 0-1mi. coeff. | -0.5991** | -1.6469** | -0.8129** | -1.3031 | -4.2357**<br />
Competition Effect: SC; 0-1mi. coeff. | -0.6344 | -0.3609* | -0.2315 | -0.3988 | -1.014* | -3.9460*<br />
Competition Effect: 1-2mi. multiplier ($\kappa_2$) | 0.5806***<br />
Competition Effect: 2-3mi. multiplier ($\kappa_3$) | 0.3247***<br />
Competition Effect: 3-4mi. multiplier ($\kappa_4$) | -0.0521<br />
Competition Effect: 4-5mi. multiplier ($\kappa_5$) | 0.0173<br />
(Rows showing one value report a single estimate; the 0-1mi. competition coefficients form a lower-triangular matrix over format pairs.)<br />
28 Note: * : p < 0.1, ** : p < 0.05, *** : p < 0.01.<br />
Table 2(b): Competition Based Price Index<br />
Variable<sup>29</sup> | Supermarket (SM) | Superstore (SS) | Ltd. Assort. (LA) | Natural Food (NF) | Food + Drug (FD) | Supercenter (SC)<br />
Cost Intercept | -- | -0.2451 | 0.0521 | -0.4489 | 0.3902 | -0.5813**<br />
Commercial Activity ($comm_l$) | -0.0774 | 0.2917* | -0.3265** | 0.0592 | -0.1999* | 0.3711**<br />
Population: 0-1mi. coefficient | -0.5996 | -0.6233** | -0.2908* | -0.1370 | -0.4391** | -1.2028**<br />
Population: 1-2mi. multiplier | 0.7503**<br />
Population: 2-3mi. multiplier | 0.3132*<br />
Population: 3-4mi. multiplier | 0.0830<br />
Population: 4-5mi. multiplier | -0.1114<br />
Per Capita Income: 0-1mi. coefficient | -0.7830 | -0.5548* | -1.2619* | 0.3012 | -0.5693** | -0.8896*<br />
Per Capita Income: 1-2mi. multiplier | 0.2815*<br />
Per Capita Income: 2-3mi. multiplier | -0.1002<br />
Per Capita Income: 3-4mi. multiplier | -0.0036<br />
Per Capita Income: 4-5mi. multiplier | 0.0416<br />
Common Unobserved Location-Level Shocks: Std., Price Shock ($\sigma_p$) | 0.7783<br />
Common Unobserved Location-Level Shocks: Std., Revenue Shock ($\sigma_r$) | 1.0928**<br />
Common Unobserved Location-Level Shocks: Std., Cost Shock ($\sigma_c$) | 1.6041*<br />
Common Unobserved Location-Level Shocks: Revenue-Cost Corr. ($\rho$) | 0.8932**<br />
Common Unobserved Market-level Cost: $\mu(\xi)$ | -3.2962***<br />
Common Unobserved Market-level Cost: $\sigma(\xi)$ | 1.3901***<br />
(Rows showing one value report a single estimate rather than one per format.)<br />
29 Note: * : p < 0.1, ** : p < 0.05, *** : p < 0.01.<br />
Table 2(c): Cost and Common Unobserved Components<br />
Appendix A<br />
Expected total number of competing stores in a location:<br />
$$E[N_l] = N^m \sum_{f} p_{fl} \tag{A.1}$$<br />
Expectation that a location will have stores with other formats besides format-f:<br />
$$E\left[I^{OF}_{fl}\right] = 1 - \text{prob}(\text{location has only format-}f\text{ stores}) = 1 - p_{fl} \prod_{f' \neq f} \left(1 - p_{f'l}\right) \tag{A.2}$$<br />
Expected number of format-f’ rivals in distance band b around a format-f store that is in location<br />
l (interformat competition):<br />
$$E\left[N_{f'bl}\right] = \left(N^m \sum_{j \in l_b} p_{f'j}\right); \quad f' \neq f \tag{A.3}$$<br />
where $l_b$ is the set of locations in distance band b around location l.<br />
Expected number of format-f rivals in distance band b around a format-f store that is in location l<br />
(intraformat competition):<br />
When accounting for the number of rivals with the same format, we need to discount the<br />
choice probability of the focal firm, conditional on its decision to enter the market:<br />
$$E\left[N_{fbl}\right] = \left(N^m \sum_{j \in l_b} p_{fj}\right) - \left(1 \times \sum_{j \in l_b} \left(p_{fj} \mid f \text{ Enters } m\right)\right) \tag{A.4}$$<br />
Note that the probability that an f-format firm enters the market is simply $\sum_{l=1}^{l_m} p_{fl}$. Hence,<br />
the probability $\left(p_{fj} \mid f \text{ Enters } m\right)$ is given by $p_{fj} / \sum_{l=1}^{l_m} p_{fl}$ and Equation (A.4) can be rewritten as:<br />
$$E\left[N_{fbl}\right] = \left(N^m \sum_{j \in l_b} p_{fj}\right) - \left(1 \times \sum_{j \in l_b} \frac{p_{fj}}{\sum_{l=1}^{l_m} p_{fl}}\right) \tag{A.5}$$<br />
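The expectations in Equations A.1 to A.5 can be sketched numerically as follows. The CCP array, the number of entrants and the distance band are made-up inputs, the function names are hypothetical, and the intraformat term reads the conditional probability as $p_{fj}$ divided by the format's total entry probability.<br />

```python
import numpy as np


def expected_total_stores(p, n_entrants, loc):
    """A.1: expected number of stores of any format in location `loc`."""
    return n_entrants * p[:, loc].sum()


def prob_other_formats_present(p, fmt, loc):
    """A.2 as written: 1 - p_fl * product over f' != f of (1 - p_f'l)."""
    others = np.delete(p[:, loc], fmt)
    return 1.0 - p[fmt, loc] * np.prod(1.0 - others)


def expected_interformat_rivals(p, n_entrants, rival_fmt, band):
    """A.3: expected number of format-f' stores across the band's locations."""
    return n_entrants * p[rival_fmt, band].sum()


def expected_intraformat_rivals(p, n_entrants, fmt, band):
    """A.5: own-format count minus the focal firm's conditional probability
    of being in the band, given that it enters the market."""
    in_band = p[fmt, band].sum()
    return n_entrants * in_band - in_band / p[fmt, :].sum()


# Made-up CCPs for 2 formats (rows) and 3 locations (columns); entries sum to 1.
ccps = np.array([[0.1, 0.2, 0.1],
                 [0.3, 0.2, 0.1]])
band = [0, 1]   # hypothetical set l_b of locations in one distance band
```

With four entrants, for example, `expected_intraformat_rivals(ccps, 4, 0, band)` subtracts the focal firm's conditional share (0.3 / 0.4) from the unconditional expected count (4 × 0.3).<br />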
Appendix B<br />
Step 0: Initial Population:<br />
Generate a set of T vectors of starting values for retailers’ beliefs about rivals’ CCPs for<br />
location choices, $\left[P_0^1; P_0^2; \ldots; P_0^T\right]$. Also, create an initial guess for the parameter vector,<br />
$\theta = \left(\{\alpha, \beta, \gamma, \sigma, \rho\}\right)$.<br />
Step 1: Locally Contractive, q-NPL Iteration:<br />
For the likelihood maximization, set up an internal loop to do the following for each of the T<br />
CCP vectors:<br />
Given the current parameter values, pick a large number of Halton draws of price,<br />
revenue and cost shocks for all retail locations. Obtain the location choice probabilities<br />
(Equations 20 - 22) and the market specific cost parameters (Equations 26 - 27). Next, calculate<br />
the price indices of firms, sans the unobserved component, for the chosen locations and with the<br />
observed configuration of stores. Compare the price estimates with the price data to obtain the<br />
price shocks at the chosen locations of the store chain for which we have price data, $\left(\omega^{pr}_{obv} \mid \theta\right)$.<br />
Also, calculate the revenues of stores, sans the unobserved component, for the chosen locations<br />
and with the observed store configuration. Compare the revenue estimates with the revenue data<br />
for all stores to obtain the revenue shocks of firms in their chosen locations, $\left(\omega^{r}_{obv} \mid \theta\right)$. We now<br />
have all the components of the likelihood function.<sup>30</sup><br />
Maximize the pseudo likelihood (Equation 28) to obtain a set of T vectors of parameter<br />
estimates, $\Theta_n^t = \arg\max_{\Theta} \left(L\left(P_{n-1}^t, \Theta\right)\right)$, and a new population of CCPs using the q-NPL operator:<br />
$\hat{P}_n^t = \Lambda^q\left(P_{n-1}^t, \Theta_n^t\right)$.<br />
Within each market, normalize the CCPs for each store format so that the CCPs of all formats<br />
add up to one. Essentially, for each format f and market location l, we have:<br />
$$\hat{P}_{fln}^t = \frac{\Lambda^q\left(P_{fl,n-1}^t, \Theta_n^t\right)}{\sum_{f=1}^{F} \sum_{l=1}^{l_m} \Lambda^q\left(P_{fl,n-1}^t, \Theta_n^t\right)} \tag{B.1}$$<br />
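The normalization in Equation B.1 amounts to dividing each element of a market's operator output by the sum over all formats and locations. A minimal sketch, with a hypothetical function name and made-up numbers standing in for the q-NPL operator output:<br />

```python
import numpy as np


def normalize_market_ccps(raw):
    """Rescale one market's F x L array so that all entries sum to one (B.1)."""
    raw = np.asarray(raw, dtype=float)
    total = raw.sum()
    if total <= 0:
        raise ValueError("CCPs must have a positive sum")
    return raw / total


# Hypothetical operator output for 2 formats (rows) and 3 locations (columns).
operator_output = np.array([[0.2, 0.4, 0.2],
                            [0.6, 0.4, 0.2]])
normalized = normalize_market_ccps(operator_output)
```

The rescaling leaves the relative sizes of the entries unchanged, so the ranking of format-location pairs is preserved.<br />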
30 We acknowledge that there is a potential selection bias because we only observe revenue data for locations that<br />
were chosen. Ellickson and Misra (2007) propose a selection correction function in their application where<br />
supermarkets choose from one of three pricing strategies. However, their approach suffers from a curse of<br />
dimensionality in cases where the cardinality of firms’ action space is large, as is the case for firms choosing from<br />
multiple locations within a market.<br />
Step 2: Selection of Parents - Based on their fitness, draw, with replacement, T ‘mother’ CCP<br />
vectors and T ‘father’ CCP vectors from the set, $\left[\hat{P}_n^1; \hat{P}_n^2; \ldots; \hat{P}_n^T\right]$, and form couples or Parents.<br />
CCPs with high likelihood values, $L\left(\hat{P}_n^t, \Theta_n^t\right)$, and those closer to convergence (absolute value<br />
of $\left(\hat{P}_n^t - P_{n-1}^t\right)$ closer to zero) are considered more fit to continue. In our problem, we use the<br />
following fitness criterion:<br />
$$h\left(\hat{P}_n^t\right) = \lambda_1 \ln\left[L\left(\hat{P}_n^t, \Theta_n^t\right)\right] - \lambda_2 \left\lVert \hat{P}_n^t - P_{n-1}^t \right\rVert \tag{B.2}$$<br />
where $\lambda_1$ and $\lambda_2$ are small positive constants. The t-th CCP vector gets selected with the<br />
probability:<br />
$$S^t = \frac{\exp\left(h\left(\hat{P}_n^t\right)\right)}{\sum_{j=1}^{T} \exp\left(h\left(\hat{P}_n^j\right)\right)} \tag{B.3}$$<br />
Now, we have the set of couples:<br />
$$\left[\left(\hat{P}_n^{1'}, \hat{P}_n^{1''}\right); \left(\hat{P}_n^{2'}, \hat{P}_n^{2''}\right); \ldots; \left(\hat{P}_n^{T'}, \hat{P}_n^{T''}\right)\right]$$<br />
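The selection step in Equations B.2 and B.3 can be sketched as a softmax over fitness values followed by sampling with replacement. The values of $\lambda_1$ and $\lambda_2$, the use of the Euclidean norm for the convergence penalty, the function names and the toy inputs are all assumptions made for illustration.<br />

```python
import numpy as np


def fitness(log_lik, p_hat, p_prev, lam1=0.01, lam2=0.01):
    """B.2: reward high log-likelihood, penalize distance from the previous CCPs."""
    return lam1 * log_lik - lam2 * np.linalg.norm(p_hat - p_prev)


def selection_probabilities(h):
    """B.3: softmax of the fitness values (shifted by the max for stability)."""
    h = np.asarray(h, dtype=float)
    weights = np.exp(h - h.max())
    return weights / weights.sum()


def draw_couples(h, rng):
    """Draw T mothers and T fathers with replacement and pair them up."""
    probs = selection_probabilities(h)
    n_vectors = len(probs)
    mothers = rng.choice(n_vectors, size=n_vectors, replace=True, p=probs)
    fathers = rng.choice(n_vectors, size=n_vectors, replace=True, p=probs)
    return list(zip(mothers, fathers))


# Toy population of T = 3 candidate CCP vectors with made-up log-likelihoods.
log_liks = [-120.0, -100.0, -110.0]
p_prev = np.array([0.5, 0.3, 0.2])
candidates = [np.array([0.5, 0.3, 0.2]),
              np.array([0.4, 0.4, 0.2]),
              np.array([0.6, 0.2, 0.2])]
h = [fitness(ll, p_hat, p_prev) for ll, p_hat in zip(log_liks, candidates)]
probs = selection_probabilities(h)
couples = draw_couples(h, np.random.default_rng(0))
```

Each couple is a pair of indices into the population, so the same candidate can serve as a parent more than once.<br />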
Step 3: Crossover and Mutation – Obtain an offspring from each couple as follows:<br />
$$P_n^t = D \bullet \left(\hat{P}_n^{t'} + Z_n \bullet \delta_n \bullet \hat{P}_n^{t'}\right) + (1 - D) \bullet \left(\hat{P}_n^{t''} + Z_n \bullet \delta_n \bullet \hat{P}_n^{t''}\right) \tag{B.4}$$<br />
where D is a vector of indicators for the identity of the parent who provides each element of the<br />
CCPs. Its elements are i.i.d. with $\Pr(D_j = 1) = 0.5$ for the j-th element. $Z_n$ is another vector of<br />
indicators for the identity of the elements of the CCPs which undergo mutation. Its elements are<br />
also i.i.d. with $\Pr(Z_{jn} = 1) = 0.5^n$.<br />
Hence, with multiple iterations, as we get closer to the<br />
global optimum, we allow the amount of mutations to reduce to zero. Finally, $\delta_n$ is a vector<br />
whose elements represent the magnitude of a mutation. It is also defined such that its elements<br />
go to zero with multiple iterations. Specifically, we use: $\delta_{jn} \sim U\left(-0.5^n, 0.5^n\right)$.<br />
As with Step 1, within each market, again normalize the CCPs so that the CCPs of all<br />
formats add up to one. Now, we have the new set of CCPs, $\left[P_n^1; P_n^2; \ldots; P_n^T\right]$.<br />
Iterate Steps 1-3 until the set of CCPs converges.<br />
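A minimal sketch of the crossover and mutation step in Equation B.4 follows. It assumes that the mutation probability and the mutation bound both shrink geometrically as 0.5 raised to the iteration number n, which is one reading of the text; the function name and the toy parents are hypothetical.<br />

```python
import numpy as np


def crossover_and_mutate(mother, father, n, rng):
    """Build one offspring CCP vector from a couple at iteration n (B.4)."""
    shrink = 0.5 ** n                              # assumed decay with n
    d = rng.random(mother.shape) < 0.5             # D: which parent supplies each element
    z = rng.random(mother.shape) < shrink          # Z_n: which elements mutate
    delta = rng.uniform(-shrink, shrink, size=mother.shape)  # mutation size
    from_mother = mother + z * delta * mother
    from_father = father + z * delta * father
    child = np.where(d, from_mother, from_father)
    return child / child.sum()                     # renormalize within the market


# Toy parents over three format-location cells (made-up values).
rng = np.random.default_rng(1)
mother = np.array([0.5, 0.3, 0.2])
father = np.array([0.2, 0.3, 0.5])
child = crossover_and_mutate(mother, father, n=1, rng=rng)
late_child = crossover_and_mutate(mother, mother, n=60, rng=rng)
```

Because the mutation bound never exceeds 0.5, every mutated element stays positive, and at late iterations an offspring of identical parents is essentially a copy of them.<br />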
Appendix C<br />
The following steps explain the technical operations involved in extracting commercial<br />
land use pixel point data from NLCD. This is the authors’ original approach; a more<br />
efficient approach may exist.<br />
1. Open NLCD data in ArcGIS<br />
2. Zoom in to the market area of interest and select the data frame for further processing<br />
3. Change coordinate system to WGS 1984<br />
4. Reclassify the raster data to show only commercial land pixel points<br />
5. Convert the reclassified raster data into Point Features and save as a Shapefile<br />
6. Convert the saved Shapefile into a kml file using shp2kml software. The kml file can be<br />
opened in Google Earth (GE), allowing us to see the pixel point data on GE<br />
7. Make a copy of the saved kml file and change its extension from “.kml” to “.xml”. This xml file<br />
can be opened in Excel and the spreadsheet will show the coordinates (latitude and longitude)<br />
of each pixel point, which may be used for further analysis<br />
8. The count of these pixel points within each 1 sq. mi. block market location gives the measure<br />
of the intensity of commercial activity in the location, and the mean of the coordinates of<br />
the pixel points within the location gives the commercial center of the location<br />
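The last step above can be sketched in code. This is a minimal illustration that assumes the pixel coordinates have already been converted from latitude and longitude to planar distances in miles from the market's south-west corner; that conversion, the function name and the sample points are assumptions rather than part of the procedure described.<br />

```python
import math
from collections import defaultdict


def summarize_blocks(points, block_size=1.0):
    """Count commercial pixel points per square block and average their coordinates.

    points: iterable of (x, y) positions in miles (assumed planar coordinates).
    Returns {(col, row): {"count": n, "center": (mean_x, mean_y)}}.
    """
    sums = defaultdict(lambda: [0, 0.0, 0.0])      # count, sum of x, sum of y
    for x, y in points:
        key = (math.floor(x / block_size), math.floor(y / block_size))
        entry = sums[key]
        entry[0] += 1
        entry[1] += x
        entry[2] += y
    return {key: {"count": n, "center": (sx / n, sy / n)}
            for key, (n, sx, sy) in sums.items()}


# Three made-up commercial pixel points falling in two adjacent 1 sq. mi. blocks.
pixels = [(0.2, 0.2), (0.4, 0.6), (1.5, 0.5)]
blocks = summarize_blocks(pixels)
```

Here the block indexed (0, 0) contains two points and the block indexed (1, 0) contains one; the count plays the role of commercial intensity and the mean position plays the role of the commercial center.<br />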
In their classification of land types, NLCD 2001 combines high density residential land<br />
and commercial land but NLCD 1992 separates them. Hence, we match the two data sets using<br />
ArcGIS software to separate the pixel data for all residential land areas from land areas with<br />
commercial activity in 2001. We are able to do this separation because land areas which were<br />
high density residential in 1992 are unlikely to convert to commercial land areas by 2001, and<br />
vice versa. In the rare instances where an area that was low-density residential in 1992 was<br />
classified as commercial land in the 2001 data, we do a quick visual inspection of the<br />
geographical area using Google Earth to confirm whether that area is truly commercial land or<br />
has converted into high density residential land.<br />
References<br />
Aguirregabiria, V., and P. Mira (2005), “A Genetic Algorithm for the Structural Estimation of<br />
Games with Multiple Equilibria,” Working Paper.<br />
Aguirregabiria, V., and G. Vicentini (2006), “Dynamic Spatial Competition between Multi-Store<br />
Firms,” Working Paper.<br />
Aguirregabiria, V., and P. Mira (2007), “Sequential Estimation of Dynamic Discrete Games,”<br />
Econometrica, 75, 1, 1-53.<br />
Aguirregabiria, V., P. Bajari, M. Draganska, L. Einav, D. Horsky, S. Misra, S. Narayanan, Y.<br />
Orhun, P. Reiss, K. Seim, V. Singh, R. Thomadsen, and T. Zhu (2008), “Discrete Choice<br />
Models with Strategic Interactions,” Marketing Letters, 19, 399-416.<br />
Arentze, T.A., O. H. Oppewal, and H.J.P. Timmermans (2005), “A Multipurpose Shopping Trip<br />
Model to Assess Retail Agglomeration Effects,” Journal of Marketing Research, 42<br />
(February), 109-115.<br />
Bajari, P., Benkard, L., and Levin, J. (2007), “Estimating Dynamic Models of Imperfect<br />
Competition,” Econometrica, 75, 5, 1331-1370.<br />
Berry, S. (1992), “Estimation of a Model of Entry in the Airline Industry,” Econometrica, 60,<br />
889-917.<br />
Berry, S., and E. Tamer (2006), “Identification in Models of Oligopoly Entry,” Advances in<br />
Economics and Econometrics: Theory and Applications, Ninth World Congress, Vol. II.<br />
Bester, H. (1998), “Quality Uncertainty Mitigates Product Differentiation,” RAND Journal of<br />
Economics, 29 (Winter), 828-844.<br />
Bresnahan, T., and P. Reiss (1991), “Entry and Competition in Concentrated Markets,” Journal<br />
of Political Economy, 99, 977-1009.<br />
Chan, T. Y., V. Padmanabhan, and P. B. Seetharaman (2007), “An Econometric Model of<br />
Location and Pricing in the Gasoline Market,” Journal of Marketing Research, 44, 4,<br />
622-635.<br />
Ciliberto, F. and E. Tamer (2009), “Market Structure and Multiple Equilibria in Airline<br />
Markets,” Econometrica, 77, 6, 1791-1828.<br />
Datta S. and K. Sudhir (2011), “Does Reducing Spatial Differentiation Increase Product<br />
Differentiation? Effects of Zoning on Retail Entry and Format Variety,” forthcoming in<br />
Quantitative Marketing and Economics.<br />
Draganska, M., Mazzeo, M., and Seim, K. (2009), “Beyond Plain Vanilla: Modeling Joint<br />
Product Assortment and Pricing Decisions,” Quantitative Marketing and Economics, 7, 2,<br />
105-146.<br />
Duan, A.J. and C. F. Mela (2009), “The Role of Spatial Demand on Outlet Location and<br />
Pricing,” Journal of Marketing Research, 46, 2, 260-278.<br />
Dudey, M. (1990), “Competition by Choice: <strong>The</strong> Effect of Consumer Search on Firm Location<br />
Decisions,” <strong>The</strong> American Economic Review, 80 (5), 1092-1104.<br />
Ellickson, P. B., and S. Misra (2011), “Enrich<strong>in</strong>g Interactions: Incorporat<strong>in</strong>g Revenue and Cost<br />
Data <strong>in</strong>to Static Discrete Games,” Quantitative Market<strong>in</strong>g and Economics,<br />
(Forthcom<strong>in</strong>g).<br />
Ellickson, P. B. and S. Misra (2008), “Supermarket Pric<strong>in</strong>g Strategies,” Market<strong>in</strong>g Science, 27,<br />
5, 811 – 828.<br />
Fischer, J.H., and J.E. Har<strong>in</strong>gton (1996), “Product Variety and Firm <strong>Agglomeration</strong>,” RAND<br />
Journal of Economics, 27, 281-309.<br />
Fox, J. (2007), “Semiparametric Estimation of Mult<strong>in</strong>omial Discrete Choice Models Us<strong>in</strong>g a<br />
Subset of Choices,” RAND Journal of Economics, 38, 4, 1002 - 1019.<br />
Fox, E. J., S. Postrel and A. McLaughl<strong>in</strong> (2007), “<strong>The</strong> Impact of Retail Location on Retailer<br />
Revenues: An Empirical Investigation,” Work<strong>in</strong>g paper<br />
Holmes, T. (2008), “<strong>The</strong> Diffusion of Wal-Mart and Economies of Density,” forthcom<strong>in</strong>g <strong>in</strong><br />
Econometrica.<br />
Homer, C., C. Huang, L. Yang, B. Wylie, and M. Coan (2004), “Development of a 2001<br />
National Land-Cover Database for the United States,” Photogrammetric Eng<strong>in</strong>eer<strong>in</strong>g &<br />
Remote Sens<strong>in</strong>g, 70, 7, 829 – 840.<br />
Jia, P. (2008), “What Happens When Wal-Mart Comes to Town: An Empirical Analysis of the<br />
Discount Retail<strong>in</strong>g Industry,” Econometrica, 76, 6, 1263 - 1316.<br />
Kasahara, H. and K. Shimotsu (2008), “Sequential Estimation of Structural Models with a Fixed<br />
Po<strong>in</strong>t Constra<strong>in</strong>t,” Work<strong>in</strong>g Paper.<br />
Konishi, H. (2005), “Concentration of Compet<strong>in</strong>g Retail Stores,” Journal of Urban Economics,<br />
58, 488-512.<br />
Mazzeo, M. (2002), “Product Choice and Oligopoly Market Structure,” RAND Journal of<br />
Economics, 33, 221-242.<br />
Orhun, Y. (2005), “Spatial differentiation <strong>in</strong> the Supermarket Industry,” Work<strong>in</strong>g Paper.<br />
Pakes, A., M. Ostrovsky, and S. Berry (2007), “Simple Estimators for the Parameters of Discrete<br />
Dynamic Games, with Entry/Exit Examples,” RAND Journal of Economics, 38, 373 - 399.<br />
Pesendorfer, M., and P. Schmidt-Dengler (2008), “Asymptotic Least Squares Estimators for<br />
Dynamic Games,” Review of Economic Studies, 75, 901-928.<br />
Seim, K. (2006), “An Empirical Model of Firm Entry with Endogenous Product-Type Choices,”<br />
RAND Journal of Economics, 37 (3), 619-640.<br />
Shlay, A. B. and P. H. Rossi (1982), “Keeping up the Neighborhood: Estimating Net Effects of<br />
Zoning,” American Sociological Review, 46, 703-719.<br />
Stahl, K. (1982), “Differentiated Products, Consumer Search, and Locational Oligopoly,” The<br />
Journal of Industrial Economics, 31 (1-2), 97-113.<br />
Su, C., and K. L. Judd (2010), “Constrained Optimization Approaches to Estimation of<br />
Structural Models,” Working Paper.<br />
Thomadsen, R. (2007), “Product Positioning and Competition: The Role of Location in the Fast<br />
Food Industry,” Marketing Science, 26, 6, 792 – 804.<br />
Varian, H. R. (1980), “A Model of Sales,” American Economic Review, 70, 651-659.<br />
Vitorino, M. A. (2011), “Empirical Entry Games with Complementarities: An Application to the<br />
Shopping Center Industry,” Working Paper.<br />
Vogelmann, J.E., S.M. Howard, L. Yang, C.R. Larson, B.K. Wylie, and J.N. Van Driel (2001),<br />
“Completion of the 1990’s National Land Cover Data Set for the Conterminous United<br />
States,” Photogrammetric Engineering & Remote Sensing, 67, 6, 650 – 662.<br />
Watson, R. (2005), “Entry and Location Choice in Eyewear Retailing,” mimeo, University of<br />
Texas-Austin.<br />
Wernerfelt, B. (1994), “Selling Formats for Search Goods,” Marketing Science, 13 (3), 298-309.<br />
Wolinsky, A. (1983), “Retail Trade Concentration Due to Consumers’ Imperfect Information,”<br />
The Bell Journal of Economics, 14 (1), 275-282.<br />
Zhu, T., V. Singh, and M. Manuszak (2009), “Market Structure and Competition in the Retail<br />
Discount Industry,” Journal of Marketing Research, 46, 4, 453-466.<br />
Zhu, T. and V. Singh (2009), “Spatial Competition with Endogenous Location Choices: An<br />
Application to Discount Retailing,” Quantitative Marketing and Economics, 7, 1, 1 - 35.<br />