
The Agglomeration-Differentiation Tradeoff in Spatial Location Choice<br />

Sumon Datta<br />

Krannert School of Management<br />

Purdue University<br />

403 W. State Street<br />

West Lafayette, IN 47907<br />

Email: sdatta@purdue.edu<br />

Phone: (765) 496-7747<br />

Fax: (765) 494-9658<br />

K. Sudhir<br />

Yale School of Management<br />

135 Prospect St, PO Box 208200<br />

New Haven, CT 06520<br />

Email: k.sudhir@yale.edu<br />

Phone: (203) 432-3289<br />

Fax: (203) 432-3003<br />

June 2011


Abstract<br />

Retailers often co-locate spatially to draw consumers, even though co-location increases price competition. The paper develops a structural model of entry and location choice that isolates the agglomeration benefit of co-location, after controlling for pure differentiation rationales for co-location such as (1) high demand and/or low cost at the location; (2) zoning restrictions; and (3) format differentiation that minimizes the need for spatial differentiation. We augment the entry and location choice data used in the literature with revenue and price data to help identify the agglomeration effect. We introduce a new approach to obtaining zoning data across a large number of markets that should be of general interest for a large stream of spatial location applications. We find that agglomeration benefits explain a significant fraction of observed co-location. While zoning restrictions have little direct impact on co-location, in combination with the agglomeration benefit they explain a surprisingly large fraction of observed co-location.<br />

Keywords: Entry, Location Choice, Agglomeration, Differentiation, Zoning, Retail Competition, Store Format, Discrete Games, Multiple Equilibria, Structural Modeling


1. Introduction<br />

Spatial clustering is a common phenomenon in many types of retail markets such as restaurants, automobile dealerships, electronics shops and bridal boutiques. That the phenomenon is well recognized in the popular imagination is seen in the popular labels for such retail clusters: hamburger alleys, restaurant rows, automobile malls, etc. Consider, for example, the retail locations of competing grocery stores. Figure 1 shows the distribution of distance between a grocery store and its nearest competitor in the three US states of New York, Pennsylvania and Ohio. Somewhat surprisingly, over 45% of stores are located within 0.5 miles of a competitor. What explains the high observed levels of co-location in grocery stores?<br />

When a growing retailer embarks on a store expansion strategy, it faces two key questions: (1) Should it enter a particular market (entry decision), and if so, (2) where within the market should it locate the new store (location decision)? Economists have long recognized that locating close to a competitor could increase profits by increasing aggregate demand at the location even though the lack of spatial differentiation is likely to increase price competition (Marshall, 1920). This agglomeration-differentiation tradeoff, or volume-price tradeoff, is a central tradeoff in spatial location choice. Indeed, there is ample theoretical research (Varian 1980, Stahl 1982, Wolinsky 1983, Dudey 1990, Fischer and Harrington 1996, Bester 1998, Arentze et al., 2005, Konishi 2005) to suggest that agglomeration benefits can act as an incentive for firms to forgo spatial differentiation.<sup>1</sup> But how can we measure the volume and price effects due to competitors?<br />

<sup>1</sup> Some empirical evidence of the benefits of spatial co-location can be found in Fox et al. 2007 and Watson 2005. Vitorino (2011) finds evidence for inter-store spillovers in a particular kind of retail cluster: shopping malls. But in a mall setting, firms only make a strategic entry decision; they do not face the tradeoff of whether to co-locate with or spatially differentiate from rivals.<br />


A retailer could use detailed household level data on consumer store choices across several markets that vary in market characteristics (like population and income) and the number of stores of different formats, and their locations, to estimate the benefit of agglomeration (volume effect due to co-location) through a household level model of store choice.<sup>2</sup> Given these household level estimates, one can then solve for a competitive pricing equilibrium to identify the benefit of spatial differentiation (price effect due to differentiation). Such a method, however, tends to be impractical because such detailed household level data across multiple retailers are difficult to obtain and a household level analysis across markets is too onerous.<br />

Another approach could be to use firm level data on revenues and prices of all stores across several markets. Assuming store locations as given, one could develop a consumer shopping behavior model to identify the benefit of agglomeration, coupled with a price competition model to identify the benefit of differentiation. This approach, however, could suffer from serious issues due to the endogeneity of market structure (i.e., number of firms that enter a market and their locations). For instance, a location with a high unobserved demand shock is likely to have higher revenues but the location is also likely to attract more firms. Not accounting for the endogeneity of market structure can give biased estimates of the parameters capturing the agglomeration benefit and the competitive interactions.<br />

To infer the strategic interactions between firms, researchers in marketing and economics have adopted an alternative empirical approach that uses readily observed entry and location decisions of firms. The approach is built on the idea that firms take into account their competitors’ actions when making their decisions. Thus, by solving for the location choice game between firms we can infer the strategic interactions between firms. A vast majority of papers have used this approach to study firms’ entry decisions (e.g., Bresnahan and Reiss 1991; Berry 1992; Mazzeo 2002; Aguirregabiria and Mira 2007; Bajari et al., 2007; Vitorino 2011; Zhu, Singh and Manuszak 2009; Ciliberto and Tamer 2009).<sup>3</sup> A smaller literature has analyzed the strategic location choice decisions (e.g., Seim 2006; Watson 2005; Orhun 2005; Zhu and Singh 2009), where firms decide not only whether to enter a market but also, if they enter, where to locate and how far to locate from a competitor. These structural models of location choice use a reduced form profit function that allows latent profit in a location to depend on the number of competitors and their distances from that location. As we can only make the inference that a firm’s chosen location must be more profitable in expectation than any alternative location, at best we can only estimate the average ‘net effect’ of competitors on firm profit. A net negative effect is characterized as the competition effect, and a decrease in the negative effect with the distance of the competitor is highlighted to emphasize only the benefit of spatial differentiation.<br />

The reduced form approach is incapable of separating the ‘net effect’ of competitors into a volume effect and a price competition effect which can independently describe the agglomeration benefit from co-location and the benefit from spatial differentiation, respectively. In particular, if consumers are in fact attracted to a location with multiple competing stores then the demand at any location will be endogenous to a firm’s location choice decision and the decisions of competitors. The reduced form approach cannot distinguish such endogenous demand from the latent profit and is therefore unsuitable for studying firms’ agglomeration-differentiation tradeoff.<br />

Also, a crucial challenge in disentangling the agglomeration-differentiation tradeoff is that observed co-location may be consistent with pure differentiation rationales. That is, even if there are no agglomeration benefits, firms may still locate close to each other when (a) there is high demand at the location; (b) there is low cost at the location; (c) zoning regulations restrict retailers to setting up stores in very concentrated areas; and (d) the need for spatial differentiation is lower because retailers can differentiate on other attributes or dimensions such as store formats. Again, the existing structural models of firms’ entry and location choice decisions do not incorporate most of these features. In this paper we develop a comprehensive structural model that disentangles the agglomeration-differentiation tradeoff while simultaneously controlling for the alternative explanations for co-location.<br />

<sup>2</sup> For example, Fox et al. (2007) use data from a multi-outlet panel to study consumers’ shopping behavior and its impact on store revenues. However, their data are from a single major metropolitan market.<br />

<sup>3</sup> Some have addressed entry decisions of retail chains, considering how these chains build up their network (Jia 2008; Holmes 2008; Ellickson, Houghton, and Timmins 2008).<br />

We make three crucial contributions to the literature. First, we introduce a novel approach to obtaining spatial zoning regulation data for any number of markets. Most towns and cities in the U.S. practice single-use zoning, wherein locations with high population and income are often zoned as residential land where big-box retailers are not allowed to open stores. Previous studies wrongly viewed the absence of stores in such locations as a strategic choice of firms. Similarly, smaller and concentrated retail zones might force rivals to cluster together, but previous models would infer such clustering by rivals to be the result of low competition. Datta and Sudhir (2011) demonstrate the role of zoning in firms’ entry, location and store format choice decisions, and the potential biases in inference that can result from ignoring zoning. Even though the critical importance of spatial zoning is fairly well known, extant research has completely ignored this issue because zoning data are not available on a national scale across many markets. To control for spatial zoning regulations, we use a publicly available digital dataset called the National Land Cover Dataset (NLCD). In conjunction with Geographic Information System (GIS) tools such as ArcGIS and Google Earth, we can recover zoning data in any number of markets across the entire U.S. This is the first application of digital land cover data in marketing, and the approach should be of general interest for a large stream of spatial location applications.<br />

Second, we decompose store profits into revenue and cost, and incorporate common unobserved demand and cost shocks: location-specific demand (cost) characteristics such as, say, traffic patterns (tax breaks), which may be common knowledge for firms but which are unobserved by the researcher. For this, we augment firms’ entry and location choice data with store revenue data.<sup>4</sup> Extant structural models use a reduced form profit function that cannot discern whether a location was chosen because of high demand or because of low costs. Furthermore, when firms cluster in a location because the increased competition is more than offset by an unobserved positive revenue (negative cost) shock at the location, existing models would misattribute the co-location to low competition.<sup>5</sup> In our approach, the portion of observed store revenue that is not explained by the observable demand factors or the observed market structure is attributed to an unobserved revenue shock at the location, which is a draw from a distribution. Having accounted for revenue, we then identify the residual cost function through the latent profit function and the data on observed entry and location choice decisions of firms. Thus, we are able to infer how the observed market characteristics affect revenue and cost differently, which gives us better insights into the drivers of store location choice.<br />

Third, we show how to disentangle the agglomeration-differentiation tradeoff by further decomposing store revenue into two components: volume, based on consumers’ shopping location choice, and price, based on spatial competition. Specifically, for volume we model consumers’ shopping location choice, which incorporates the spatial configuration of firms around consumers, and we model price as a function of the spatial configuration of rivals around a store. Hence, the benefits of agglomeration are realized through increased volume potential while the benefits of spatial differentiation emerge from acquiring a greater share of that potential as a result of decreased price competition. As competitors affect both volume and price, a non-parametric identification of the two effects would require additional data on sales or prices. Without such data, one would have to rely on suitable functional form assumptions so that the locations of rivals affect store volume differently from the way they affect store prices. Hence we further augment our data with price data for a set of stores that belong to one store chain.<br />

<sup>4</sup> Some recent research that has also used post-action performance data to gain richer insights about the drivers of firms’ strategic decisions includes Ellickson and Misra (2007) and Draganska et al. (2009).<br />

<sup>5</sup> Orhun (2005) attempts to control for location-specific common profit shocks. However, with only choice data, one can only model latent profits whose errors have to be normalized for estimation. For instance, Orhun (2005) assumed that the common profit shocks follow a standard normal distribution.<br />

Finally, when different store formats specialize in different product categories or pricing strategies or services, the need for spatial differentiation may be lower. In the context of grocery stores, format types include Supermarkets, Superstores, Limited Assortment and Warehouse stores, Natural Foods stores, Food and Drug stores, and Supercenters and Wholesale Clubs. Datta and Sudhir (2011) show that when zoning restrictions increase in a market, the entering grocery retailers are more likely to exhibit greater diversity in their store formats as a means to mitigate the reduced scope for spatial differentiation. Hence, we control for format differentiation by accounting for the different store formats.<br />

The empirical strategy to investigate firms’ entry and location decisions involves solving a choice game where firms’ strategies are interrelated. We estimate a static, structural simultaneous move game for firms’ entry and location choice decisions with incomplete information between firms.<sup>6</sup> We estimate the static discrete game by maximum likelihood (MLE).<sup>7</sup> Estimation challenges include the possibility of multiple equilibria in the model, multiple equilibria in the data, and slow convergence or potential non-convergence of the MLE algorithm.<sup>8</sup> We build on recent developments in the empirical literature to address each of these challenges, and these are explained in detail in the estimation section.<br />

<sup>6</sup> We do not have store entry dates, which are required to solve a dynamic choice game. However, our model can be extended to a dynamic set-up similar to Aguirregabiria and Vicentini (2006), who have proposed a dynamic model of an oligopoly industry characterized by spatial competition.<br />

Our estimates and counterfactual analysis show that the agglomeration effect is strong and explains a significant fraction of observed co-location of grocery stores across several markets. Surprisingly, zoning has little direct effect on co-location. But tighter zoning restrictions interact with the agglomeration effect to explain a large fraction of observed co-location. We find that a small change in zoning can cause a discontinuous impact on the location pattern. The finding that zoning regulations and the agglomeration effect interact to shape market structure has important policy implications for local government bodies that make zoning decisions. It also highlights the value of a structural model in understanding how a small perturbation of market characteristics can cause strategic firms to respond in complex and nonlinear ways.<br />

The rest of the discussion is organized as follows: Section 2 describes the model and estimation strategy. Section 3 describes the data and the approach for recovering spatial zoning data. Section 4 describes the estimates of the model. Section 5 presents the results of counterfactual simulations. Section 6 concludes with a summary of the findings and the limitations of this research.<br />

<sup>7</sup> Alternatives to likelihood based approaches include method of moments (Thomadsen 2005; Draganska et al., 2009), minimum distance or asymptotic least squares estimators (Pakes et al., 2007; Bajari et al., 2007; Pesendorfer and Schmidt-Dengler 2008) and maximum score estimators (Fox and Bajari 2010; Fox 2007; Ellickson et al., 2010).<br />

<sup>8</sup> See Aguirregabiria et al. (2008) for a discussion of the distinction between multiple equilibria in the model and multiple equilibria in the data.<br />

2. Model and Estimation Strategy<br />


2.1. A Comprehensive Model of Strategic Entry and Location Choice<br />

The entry and location choice game involves a nested framework with two stages. In the first stage, each firm, i, decides whether or not to enter a market m (m = 1, 2,…, M). In the second stage, the entering firms simultaneously choose their respective store type or format, f (f = 1, 2,…, F), and the store location within the market.<br />

For the purposes of illustration, imagine a square city with a grid of \(L^m\) discrete blocks or ‘locations’ (Figure 2(a)). In extant models, firm i's payoff at each location, l (l = 1, 2,…, \(L^m\)), is modeled as a reduced-form function of the market characteristics at the location, \(x_l\), the actions (entry and location choices) of all firms, \(a = (a_i, a_{-i})\), and an idiosyncratic profit shock, \(\varepsilon_{il}\), which is the firm’s private information and is known to rivals (and the researcher) only in distribution:<br />

\[ \pi^m_{ifl}(a_i) = \Pi^m_f(x_l, a) + \varepsilon_{il} \tag{1} \]

In this incomplete information setup, a firm cannot predict rivals’ discrete actions but it has rational beliefs about their strategies. For example, suppose firms are homogeneous; then each firm will make its decision based on its belief about the number of firms that would enter the market, \(N^m\), and its belief that an entering rival will choose a particular location, as represented by a vector of conditional location choice probabilities, \(P^m = \{p_1, p_2, \ldots, p_{L^m}\}\). For instance, the firm may have a belief that a rival, conditional on entry, will choose location ‘j’ with probability \(p_j\). Hence, for homogeneous firms the expected profit at location l can be written as (after dropping subscript ‘f’ for format):<br />

\[ E\left[\pi^m_{il}(a_i)\right] = \Pi^m\left(x_l, E\left[N^m\right], P^m\right) + \varepsilon_{il} \tag{2} \]

We build on this popular modeling approach and introduce several new features. First, in the extant models, firms are allowed to consider all \(L^m\) locations in the market so that each location has some positive probability of being chosen by a firm. However, since firms are not allowed to set up stores in residential locations, we use our zoning data to exclude such locations and concentrate only on a subset of potential retail locations, \(l \in \{1, 2, \ldots, l^m\}\) (Figure 2(b)). Second, we break down the reduced-form profit into revenue and a cost multiplier.<sup>9</sup> We allow both revenue and cost to include observed and unobserved (to the researcher) components. Third, instead of an idiosyncratic profit shock, we assume an idiosyncratic cost shock. Formally, we revise Equation (1) as follows:<br />

\[ \pi^m_{ifl}(a_i) = R_{fl}\left(x_l, a, \upsilon^r_l\right) \cdot C^m_{ifl}\left(x_l, \upsilon^c_l, \varepsilon_{il}\right) \tag{3} \]

where revenue has the following multiplicative form:<br />

\[ R_{fl}\left(x_l, a, \upsilon^r_l\right) = \hat{R}_{fl}(x_l, a) \cdot \upsilon^r_l \tag{4} \]

\(\hat{R}_{fl}\) is the observed component of store revenue that is a function of the store format, f, the market characteristics at the location, \(x_l\), and the actions (entry, location and format choices) of all firms, a. The unobserved component of revenue, \(\upsilon^r_l\), is a common location-specific shock that is common knowledge for all firms at the time of entry. It accounts for location-specific demand characteristics such as traffic density that are unobserved by the researcher.<br />

The cost multiplier in Equation (3) has the following multiplicative form:<br />

\[ C^m_{ifl}\left(x_l, \upsilon^c_l, \varepsilon_{il}\right) = \hat{C}_{fl}(x_l) \cdot \upsilon^c_l \cdot \exp\left(\xi^m\right) \cdot \exp\left(\varepsilon_{il}\right) \tag{5} \]

<sup>9</sup> As described later (Equation 15), we will consider the log transformation of Equation (3), which will yield the familiar form: Profit = Revenue - Cost.<br />


where the observed component, \(\hat{C}_{fl}\), is a function of the store format, f, and the market characteristics at the location, \(x_l\). The unobserved component of cost consists of three elements:<br />

(a) A common location-specific shock, \(\upsilon^c_l\), that is common knowledge for all firms at the time of entry. It accounts for location-specific cost characteristics such as commercial taxes that are unobserved by the researcher. Now, the common unobserved cost shock at a location is likely to be correlated with the common unobserved revenue shock at the location. We empirically check for this correlation through the following assumption about the distribution of the two shocks:<br />

\[ \begin{pmatrix} \ln \upsilon^r_l \\ \ln \upsilon^c_l \end{pmatrix} = \begin{pmatrix} \omega^r_l \\ \omega^c_l \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{bmatrix} \sigma_r^2 & \rho\sigma_r\sigma_c \\ \rho\sigma_r\sigma_c & \sigma_c^2 \end{bmatrix} \right) \tag{6} \]

(b) An overall market-specific attractiveness parameter, \(\exp(\xi^m)\), that is common knowledge for all firms but is unobserved by the researcher.<br />

(c) The firm’s idiosyncratic cost shock at the location, \(\exp(\varepsilon_{il})\), that is the firm’s private information and known to rivals and the researcher only in distribution.<br />
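As an illustration of how the multiplicative revenue and cost pieces fit together, the following Python sketch draws the two location-level shocks under the bivariate normal assumption on their logs and assembles the log payoff implied by multiplying observed revenue, observed cost multiplier, and the unobserved terms. It is a minimal sketch, not the authors' estimation code: the function names, the parameter values, and the use of NumPy are assumptions made for illustration, and the sign convention that turns the cost multiplier into the "Revenue - Cost" form of Equation (15) is not shown in this section.

```python
import numpy as np


def draw_location_shocks(n_locations, sigma_r, sigma_c, rho, seed=0):
    """Draw (upsilon_r, upsilon_c) for each location.

    Assumes the logs of the two shocks are jointly normal with mean zero,
    standard deviations sigma_r and sigma_c, and correlation rho.
    The function name and signature are hypothetical.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([
        [sigma_r ** 2, rho * sigma_r * sigma_c],
        [rho * sigma_r * sigma_c, sigma_c ** 2],
    ])
    omega = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n_locations)
    # Exponentiate the log shocks to obtain the multiplicative shocks.
    return np.exp(omega[:, 0]), np.exp(omega[:, 1])


def log_payoff(r_hat, c_hat, ups_r, ups_c, xi_m, eps_il):
    """Log of (r_hat * ups_r) * (c_hat * ups_c * exp(xi_m) * exp(eps_il)).

    r_hat and c_hat stand for the observed revenue and observed cost
    multiplier; their functional forms are not specified here.
    """
    return (np.log(r_hat) + np.log(ups_r)
            + np.log(c_hat) + np.log(ups_c)
            + xi_m + eps_il)


# Illustrative values only; none of these numbers come from the paper.
ups_r, ups_c = draw_location_shocks(n_locations=5, sigma_r=0.5, sigma_c=0.3, rho=0.4)
print(log_payoff(r_hat=2.0, c_hat=0.5, ups_r=ups_r, ups_c=ups_c, xi_m=0.1, eps_il=0.0))
```

Because every term enters multiplicatively, the log payoff is additive in the observed and unobserved components, which is what makes the later log transformation convenient.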

Finally, to separate the agglomeration and differentiation effects, we decompose the observed component of revenue, \(\hat{R}_{fl}\), into a consumer shopping location choice based volume (\(v_{fl}\)) and a competition effect based price index (\(\mathit{pr}_{fl}\)).<br />

\[ \hat{R}_{fl} = \left(v_{fl}\right) \cdot \left(\mathit{pr}_{fl}\right) \tag{7} \]

This decomposition of revenue will enable us to separate the volume and price effects of competitors and thus distinguish the benefits of agglomeration that increase volume from the benefits of spatial differentiation that reduce price competition. We now describe the volume and price components of revenue.<br />

2.1.1. Consumers’ Shopping Location Choice Based Volume<br />


We have detailed information about consumers up to the Census Block Group (CBG) level. Hence, in what follows, we use demographic data at the CBG level and assume that consumers are located at the population density weighted center of their respective CBG. However, the model can easily be extended to a household level.<br />
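To make the idea of a population weighted center concrete, the sketch below computes one from a set of sub-area points and their populations. This is only an illustration under stated assumptions: the paper does not describe how the centers are computed, the function name is hypothetical, and the coordinates are assumed to be planar (projected) so that a simple weighted average is meaningful.

```python
def population_weighted_center(points, populations):
    """Return the population-weighted average of (x, y) points.

    points: list of (x, y) tuples in a planar coordinate system (assumption).
    populations: non-negative weights, one per point (assumption).
    """
    if len(points) != len(populations):
        raise ValueError("points and populations must have the same length")
    total = float(sum(populations))
    if total <= 0:
        raise ValueError("total population must be positive")
    x = sum(p * px for (px, _), p in zip(points, populations)) / total
    y = sum(p * py for (_, py), p in zip(points, populations)) / total
    return x, y


# Two hypothetical sub-areas: the center is pulled toward the more populous one.
print(population_weighted_center([(0.0, 0.0), (2.0, 0.0)], [1, 3]))
```

With raw latitude and longitude the same average is only an approximation, which is one reason a projected coordinate system is assumed here.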

Consumers in each CBG, g, choose the store format and the retail location where they want to shop. They incur a travel cost (\(\mathit{Tvl}_{gl}\)) to go to a retail location l. This travel cost could be a non-linear function of the distance, \(d_{gl}\), between the consumer’s location and the retail location. We also allow the travel cost to differ by the median household income of the CBG (med_hhI), the median age (med_age), and the minimum distance consumers have to travel before they can get to the nearest retail location (min_d). For instance, a consumer who is located deep within a residential zone and is far from the nearest retail location may be more willing to go to a store that is farther away than, say, a consumer who is close to several retail locations. Formally, the travel cost is given by:<br />

\[ \mathit{Tvl}_{gl} = \alpha_1 d_{gl} + \alpha_2 d_{gl}^2 + \left(\alpha_3\, \mathit{med\_hhI}_g + \alpha_4\, \mathit{med\_age}_g + \alpha_5\, \mathit{min\_d}_g\right) \cdot d_{gl} \tag{8} \]

A consumer who wants to buy, let’s say, groceries may be attracted to a particular grocery store in location l if the location also hosts other commercial activities that cater to the consumer’s non-grocery needs (e.g., electronics and apparel stores). That is, there could also be economies of scope from one-stop shopping or multipurpose shopping (\(\alpha_{MS}\)). Hence, we account for the extent of commercial activity in the location (\(\mathit{comm}_l\)). In addition, if consumers expect low prices at the store then they may be even more likely to visit the store. To control for this price effect (\(\alpha_{pr}\)), we account for the price index of the store format, \(\mathit{pr}_{fl}\). The price index specification is described in the next sub-section.<br />
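The travel cost specification can be sketched in a few lines of Python. The sketch assumes that the three demographic terms (median household income, median age, minimum distance to a retail location) shift the per-mile cost, that is, that they enter interacted with distance; this grouping is a reading of the specification, and the function name, argument names, and coefficient values are hypothetical.

```python
def travel_cost(d_gl, med_hhi_g, med_age_g, min_d_g, alpha):
    """Travel cost from CBG g to retail location l.

    alpha is a sequence (a1, a2, a3, a4, a5). The demographic terms are
    assumed to multiply distance, so they change the slope of the cost
    in distance rather than adding a fixed amount.
    """
    a1, a2, a3, a4, a5 = alpha
    slope_shifter = a3 * med_hhi_g + a4 * med_age_g + a5 * min_d_g
    return a1 * d_gl + a2 * d_gl ** 2 + slope_shifter * d_gl


# Illustrative coefficients only: a negative a5 lowers the per-mile cost for
# consumers who live far from any retail location.
alpha = (1.0, 0.5, 0.0, 0.0, -0.1)
near = travel_cost(d_gl=2.0, med_hhi_g=0.0, med_age_g=0.0, min_d_g=0.0, alpha=alpha)
far = travel_cost(d_gl=2.0, med_hhi_g=0.0, med_age_g=0.0, min_d_g=1.0, alpha=alpha)
print(near, far)
```

Under this reading, a negative coefficient on the minimum-distance term captures the intuition in the text that consumers deep inside a residential zone are more willing to travel to a store that is farther away.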


Next, a consumer shopping for groceries likely frequents locations where multiple grocery stores are collocated (store agglomeration effect). Hence, we consider the effect of the total number of competing stores at the location, $N_l$. We also consider any scope economies of shopping within the grocery sector when grocery stores with different formats collocate (format agglomeration effect). For instance, consumers may be more likely to visit a particular Food and Drug store when it is located close to a Supercenter. Hence, we use an indicator, $I^{OF}_{fl}$, for the presence of store formats other than the focal format $f$, and we also allow the format agglomeration effect to be format-specific. Finally, consumers may simply have a strong intrinsic preference ($\alpha_{f,Pref}$) for the store format $f$, and they could also have an unobserved preference for the location, $\eta_{gl}$. Formally, for a consumer in CBG $g$, the utility of shopping in stores with format $f$ in location $l$ is:

$$U_{gfl} = \hat{U}_{gfl} + \eta_{gl}$$

$$\text{and} \quad \hat{U}_{gfl} = -Tvl_{gl} + \alpha_{MS}\, comm_l + \alpha_{pr} \ln(pr_{fl}) + \alpha_{SA} N_l + \alpha_{f,FA}\, I^{OF}_{fl} + \alpha_{f,Pref} \tag{9}$$

We assume an i.i.d. Type 1 extreme value distribution for the preference shock so that the probability that a consumer in CBG $g$ will shop in stores with format $f$ in location $l$ is given by the standard logit form:

$$p^{csr}_{gfl} = \frac{\exp\left( \hat{U}_{gfl} \right)}{\sum_{f'=1}^{F} \sum_{j \in L^{csr}_g} \exp\left( \hat{U}_{gf'j} \right)} \tag{10}$$

where the superscript ‘csr’ for the probability denotes that this is the choice probability of consumers. We put a cap on consumers’ choice set by specifying that consumers may shop at any retail location within a radius, $Rad$.¹⁰ That is, we assume that $Rad$ is the maximum distance that a consumer will travel for shopping, and so in Equation (10), $L^{csr}_g$ is the set of retail locations within the radius, $Rad$, from CBG $g$. Eventually, we estimate our model with different specifications for $Rad$ in order to empirically infer the maximum distance that consumers are willing to travel.
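As a rough illustration of Equations (8) to (10), the sketch below computes the travel cost and the capped logit choice probabilities for a single CBG. The function names, the data layout and all numbers are assumptions made for this example only; they are not taken from the paper's estimation code.

```python
import math

def travel_cost(d, med_hhi, med_age, min_d, alpha):
    """Equation (8) for one CBG-location pair; alpha = (a1, ..., a5)."""
    a1, a2, a3, a4, a5 = alpha
    return a1 * d + a2 * d ** 2 + a3 * med_hhi + a4 * med_age + a5 * min_d * d

def choice_probabilities(u_hat, dist, rad):
    """Equation (10) for one CBG.

    u_hat[f][l] is the deterministic utility of format f at location l and
    dist[l] is the distance to location l. Locations beyond rad are left out
    of the choice set and receive probability zero.
    """
    in_set = [dl <= rad for dl in dist]
    if not any(in_set):
        raise ValueError("no retail location within the radius")
    # Subtract the largest utility before exponentiating, for numerical stability.
    u_max = max(u for row in u_hat for u, ok in zip(row, in_set) if ok)
    weights = [[math.exp(u - u_max) if ok else 0.0
                for u, ok in zip(row, in_set)] for row in u_hat]
    total = sum(map(sum, weights))
    return [[w / total for w in row] for row in weights]

# Toy example: one CBG, two formats, three locations (all numbers are made up).
dist = [1.0, 4.0, 9.0]
alpha = (0.5, 0.01, 0.0, 0.0, -0.02)
tvl = [travel_cost(dl, med_hhi=50.0, med_age=35.0, min_d=1.0, alpha=alpha)
       for dl in dist]
other = [[0.2, 0.5, 0.1],   # remaining terms of Equation (9), by format and location
         [0.4, 0.1, 0.3]]
u_hat = [[o - t for o, t in zip(row, tvl)] for row in other]
p = choice_probabilities(u_hat, dist, rad=5.0)
```

With a radius of 5, the third location (distance 9) drops out of the choice set, so both formats at that location get probability zero and the remaining probabilities sum to one.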

Note that this maximum travel distance automatically implies that the trade radius of a store (the catchment area from where the store gets its customers) is $Rad$. Next, using consumers’ per capita income as a proxy for their consumption capacity or their purchasing ability, we construct a metric called Customer Value ($CV_{fl}$) for measuring the net worth of the consumers who are attracted towards stores with format $f$ in location $l$. For this, note that Equation (10) is also the share of consumers located in CBG $g$ who will shop in stores with format $f$ in location $l$. We weigh consumers’ choice probability by the number of such consumers (CBG population, $Pop_g$) and their per capita income ($PCI_g$).¹¹ Then the customer value metric, $CV_{fl}$, is obtained by aggregating the influx of consumers from different CBGs around the location:

$$CV_{fl} = \sum_{g \in L^{ret}_l} \left( p^{csr}_{gfl}\, Pop_g\, PCI_g \right) \tag{11}$$

where $L^{ret}_l$ is the set of CBGs that lie within the trade radius, $Rad$, of location $l$.

We then transform this customer value metric into volume for stores with format $f$ in location $l$:

$$v_{fl} = \left[ CV_{fl} \right]^{\alpha_{f,V}} \tag{12}$$

Hence, in our framework, volume is endogenous to firms’ actions (through $N_l$ and $I^{OF}_{fl}$) and it also depends on the market characteristics and consumer preferences.

10 If we do not impose such a cap on the maximum distance then the estimation becomes very cumbersome and slow as our dataset consists of several large markets that consist of a large number of locations and CBGs.

11 We use per capita income for convenience. Alternatively, one could, of course, use better variables such as per capita expenditure on grocery.
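Equations (11) and (12) can be sketched in a few lines. The helper names and the example figures below are hypothetical, and the exponent value is chosen only to make the arithmetic easy to follow.

```python
def customer_value(p_csr, pop, pci):
    """Equation (11): sum over the CBGs inside the trade radius of location l.

    p_csr[g] is the share of CBG g that shops at format f in location l,
    pop[g] is the CBG population and pci[g] its per capita income.
    """
    return sum(p * n * y for p, n, y in zip(p_csr, pop, pci))

def volume(cv, alpha_fv):
    """Equation (12): volume as a power transform of customer value."""
    return cv ** alpha_fv

# Two CBGs within the trade radius (made-up numbers).
cv = customer_value(p_csr=[0.30, 0.10], pop=[1200, 800], pci=[25_000.0, 40_000.0])
v = volume(cv, alpha_fv=0.5)
```

Because the choice shares depend on $N_l$ and $I^{OF}_{fl}$ through Equation (9), any change in rivals' locations feeds into `cv` and therefore into `v`.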

2.1.2. Competitive Effect Based Price Index

Firms would like to differentiate spatially from rivals to reduce price competition. We model the effect of competition on the price index of stores with format $f$ that are in location $l$.¹² We use a flexible, semi-parametric approach so that the competition effect is split differentially as a function of the store formats and distances of rivals from the location.

Similar to Seim (2006), we divide the area around a location (up to the trade radius, $Rad$) into concentric circles or distance bands.¹³ All rivals of a particular format type that are on a distance band $b$ ($b = 1, 2, \ldots, B$) around location $l$ are assumed to have the same effect on price. Formally, the price index of a store with format $f$ that is in location $l$ is given by:

$$pr_{fl} = \beta_{f,pr} \cdot (x_l)^{\beta_x} \cdot \exp\left( \sum_{b} \beta_{f-fb} N_{fbl} + \sum_{b} \sum_{f' \neq f} \beta_{f'-fb} N_{f'bl} \right) \cdot \upsilon^{pr}_l$$

$$\text{and} \quad \ln\left( \upsilon^{pr}_l \right) = \omega^{pr}_l \sim N\left( 0, \sigma^2_{pr} \right) \tag{13}$$

where $\beta_{f,pr}$ is a format-specific parameter which allows the intrinsic pricing ability of a store to differ by the store format. This intrinsic pricing ability of a store format could be due to format-specific differences in cost, efficiency, product mix, and service quality. However, we remain agnostic about the specific reasons. The second component on the right-hand side allows the pricing ability of firms to depend on exogenous observable location characteristics, $x_l$. In our application we use the per capita income of consumers within a 2 mile radius of the location to allow for price discrimination or the ability to sell premium products in affluent neighborhoods.

The third component on the right-hand side of Equation 13 includes the intraformat competition effect and the interformat competition effect. For intraformat competition we consider the number of rivals that have the same format, $f$, as the focal firm and that are located in distance band $b$ around location $l$ ($N_{fbl}$). Here, $\beta_{f-fb}$ is the competitive effect of one such rival. For interformat competition, we consider the number of rivals that have a different format, $f'$ ($f' \neq f$), and $\beta_{f'-fb}$ is the competitive effect of one such $f'$-format rival. If the estimates reveal a weakening of the competitive effects at greater distance bands then that will indicate the benefits of spatial differentiation, whereas estimates of weaker interformat competitive effects relative to intraformat competitive effects within the same distance band will indicate the benefits of format differentiation.

Finally, we introduce a common, location-specific price shock ($\upsilon^{pr}_l$) that is common knowledge for all firms at the time of entry but is unknown to the researcher. We assume that this price shock has a log-normal distribution.

To summarize, like volume, the price index also depends on market characteristics and is endogenous to firms’ actions. As firms’ actions affect both volume and price, a non-parametric identification of the competition effects would require price data for all stores. But we only have price data for one store chain. Fortunately, this chain operates more than one store format and is present in most markets in our dataset, and, therefore, experiences large variations in the spatial distribution of market characteristics and rivals. Hence, despite its incomplete nature, the data partly assists identification. However, we also rely on our functional form assumption of how locations of rivals affect volume in a different way from how they affect prices.

12 We use sales-weighted prices across all categories in a store as the price index of the store.

13 Alternatively, one could employ a continuous distance weighting approach as in Orhun (2005).
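The multiplicative form of Equation (13) can be illustrated with a small sketch. The function signature, the format label and every coefficient below are assumptions for illustration; in particular, the negative competition effects that fade with distance are a hypothetical pattern, not estimates from the paper.

```python
import math

def price_index(beta_f_pr, x_l, beta_x, beta_same, n_same,
                beta_other, n_other, omega_pr=0.0):
    """Equation (13) for a focal store of format f in location l.

    beta_same[b], n_same[b]: per-rival effect and count of same-format rivals
    in distance band b. beta_other[fp][b], n_other[fp][b]: the same quantities
    for each other format fp. omega_pr is the log of the price shock.
    """
    log_comp = sum(b * n for b, n in zip(beta_same, n_same))
    for fp in beta_other:
        log_comp += sum(b * n for b, n in zip(beta_other[fp], n_other[fp]))
    return beta_f_pr * x_l ** beta_x * math.exp(log_comp + omega_pr)

# Two distance bands and one other format (all numbers are made up).
beta_same = [-0.05, -0.02]
beta_other = {"supercenter": [-0.03, -0.01]}
no_other = {"supercenter": [0, 0]}
base = price_index(1.0, 30.0, 0.1, beta_same, [0, 0], beta_other, no_other)
near = price_index(1.0, 30.0, 0.1, beta_same, [1, 0], beta_other, no_other)
far = price_index(1.0, 30.0, 0.1, beta_same, [0, 1], beta_other, no_other)
```

Under these assumed coefficients a same-format rival in the nearest band lowers the price index more than the same rival one band farther out, which is the pattern the text describes as evidence of the benefits of spatial differentiation.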


2.1.3. The Profit Function

Conforming to the multiplicative specifications so far, the observed component of the cost multiplier, $\hat{C}_{fl}$ (Equation 5), is specified as:

$$\hat{C}_{fl}(x_l) = \exp\left( \sum_{b=1}^{B} \gamma_{fbx}\, x_{bl} \right) \tag{14}$$

where $x_{bl}$ are the observed cost shifters at distance band $b$ around location $l$ and $\gamma_{fbx}$ are format- and band-specific cost parameters.

Substituting the expressions for revenue and cost into our profit specification, Equation (3), then taking the log transformation, and after making some trivial sign reversals, we have an equation for the transformed profit function that is very similar to equation (1):

$$\pi_{ifl} = \ln\left( \pi_{ifl} \right) = \ln\left( v_{fl} \right) + \ln\left( pr_{fl} \right) + \omega^{r}_l - \ln\left( \hat{C}_{fl} \right) + \omega^{c}_l + \xi^m + \varepsilon_{il} \tag{15}$$

2.4 Equilibrium Choice Probabilities:

Recall that the idiosyncratic cost shock, $\varepsilon_{il}$, is known to rivals only in distribution. Due to such incomplete information about rivals’ profits, a firm cannot exactly predict rivals’ discrete actions but it can have rational expectations about rivals’ strategies. Hence, for a given set of vectors of price, revenue and cost shocks across all locations ($\omega^{pr}, \omega^{r}, \omega^{c}$), firm $i$ can form rational expectations about the number of firms that will enter the market, $N^m$, and the location and format choices of the ($N^m - 1$) entering rivals, $P^m = \left[ P^m_1, P^m_2, \ldots, P^m_F \right]$. That is, corresponding to each format $f$ ($f'$) firms we will have a vector of $l^m$ conditional choice probabilities (CCPs), $P^m_f = \{ p_{f1}, p_{f2}, \ldots, p_{fl^m} \}$ ($P^m_{f'} = \{ p_{f'1}, p_{f'2}, \ldots, p_{f'l^m} \}$). For instance, $p_{fj}$ ($p_{f'j}$) is a CCP of an $f$-format ($f'$-format) rival and it represents the focal firm’s belief that an $f$-format ($f'$-format) rival will choose location $j$ when a total of $N^m$ firms enter the market.

Based on these beliefs, we can obtain expressions for the total number of competing stores in a location ($E[N_l]$), the chance that there will be rivals with other formats in a location ($E\left[ I^{OF}_{fl} \right]$), and the number of $f$-format ($f'$-format) rivals in distance band $b$, $E[N_{fbl}]$ ($E[N_{f'bl}]$). (Expressions for these expectations are shown in Appendix A.) Consequently, given model parameters and the vectors of location-specific shocks, firms can derive the expected values of volume and price index which would then lead to the following expression for expected profit:

$$E\left[ \pi_{ifl} \mid \omega^{pr}, \omega^{r}, \omega^{c} \right] = E\left[ \ln\left( v_{fl} \right) \right] + E\left[ \ln\left( pr_{fl} \right) \right] + \omega^{r}_l - \ln\left( \hat{C}_{fl} \right) + \omega^{c}_l + \xi^m + \varepsilon_{il} \tag{16}$$

Since $v_{fl}$ is a highly non-linear function of $N_l$ and $I^{OF}_{fl}$, we will make the following simplifying assumption:

$$E\left[ \ln\left( v_{fl} \right) \right] = E\left[ \ln\left( v_{fl}\left( X, E[N_l], E\left[ I^{OF}_{fl} \right]; \alpha \right) \right) \right] \tag{17}$$

This expected volume can be calculated based on firms’ prediction of the outcome of consumers’ shopping behavior as Equation (9) transforms into:

$$\hat{U}_{gfl} = -Tvl_{gl} + \alpha_{MS}\, comm_l + \alpha_{pr} E\left[ \ln\left( pr_{fl} \right) \right] + \alpha_{SA} E[N_l] + \alpha_{f,FA} E\left[ I^{OF}_{fl} \right] + \alpha_{f,Pref} \tag{18}$$

We also have:

$$E\left[ \ln\left( pr_{fl} \right) \right] = \ln\left( \beta_{f,pr} \right) + \beta_x \ln\left( x_l \right) + \sum_{b} \beta_{f-fb} E\left[ N_{fbl} \right] + \sum_{b} \sum_{f' \neq f} \beta_{f'-fb} E\left[ N_{f'bl} \right] + \omega^{pr}_l \tag{19}$$

Thus, the expected profit in equation (16) can be rewritten as a function of the equilibrium number of entrants in the market, $N^m$, the equilibrium location choice probabilities in the market for firms of all formats, $P^m$, the specific draws of price, revenue and cost shocks ($\omega^{pr}, \omega^{r}, \omega^{c}$), and a set of model parameters, $\theta = \{ \alpha, \beta, \gamma, \sigma, \rho \}$:

$$E\left[ \pi_{ifl} \mid \omega^{pr}, \omega^{r}, \omega^{c} \right] = \hat{\pi}_{fl}\left( N^m, P^m, \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) + \xi^m + \varepsilon_{il} \tag{20}$$

Note that $\xi^m$ is common for all locations in the market and therefore does not influence the location choice after firm $i$ has decided to enter the market. Thus, if we assume that the idiosyncratic component, $\varepsilon_{il}$, has a Type 1 extreme value distribution that is independent across locations and firms then the conditional probability (conditional on entry) that an $f$-format firm chooses location $l$ is given by the logit form:

$$\psi_{fl}\left( N^m, P^m, \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) = \frac{\exp\left( \hat{\pi}_{fl}\left( N^m, P^m, \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) \right)}{\sum_{f'=1}^{F} \sum_{j=1}^{l^m} \exp\left( \hat{\pi}_{f'j}\left( N^m, P^m, \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) \right)} \tag{21}$$

Integrating over the distributions of the common unobserved shocks, we have the location choice probability, conditional only on entry:

$$\Psi\left( N^m, P^m, \theta \right) = \iiint \Psi\left( N^m, P^m, \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) g\left( \omega^{pr} \right) f\left( \omega^{r}, \omega^{c} \right) d\omega^{pr}\, d\omega^{r}\, d\omega^{c} \tag{22}$$

In equilibrium firms’ beliefs must match with rivals’ strategies. So:

$$P^m\left( N^m; \theta \right) = \Psi\left( N^m, P^m; \theta \right) \tag{23}$$

This represents a system of equations that describes firms’ CCPs as the fixed point of a continuous mapping between firms’ strategies and their beliefs about rivals’ strategies. As the CCPs within market $m$ must add up to 1, the mapping takes the probability simplex (a compact and convex set) into itself, and so, by Brouwer’s fixed point theorem, this system of equations has at least one solution or fixed point.
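To make the fixed-point condition in Equation (23) concrete, the sketch below solves a deliberately simplified version by successive approximation: one format, three locations, no unobserved shocks, and a toy profit that falls linearly in the expected number of rivals at a location. All of these simplifications, and the function names, are assumptions for illustration. Brouwer's theorem only guarantees that a fixed point exists; it does not guarantee that this simple iteration converges in general.

```python
import math

def best_response(p, base, delta, n_rivals):
    """Logit location-choice probabilities given beliefs p about rivals.

    Toy log profit at location l: base[l] + delta * n_rivals * p[l], where
    n_rivals * p[l] is the expected number of rivals choosing l.
    """
    pi_hat = [b + delta * n_rivals * pl for b, pl in zip(base, p)]
    top = max(pi_hat)
    w = [math.exp(x - top) for x in pi_hat]
    s = sum(w)
    return [x / s for x in w]

def solve_ccp(base, delta, n_rivals, tol=1e-12, max_iter=10_000):
    """Iterate beliefs -> best response until the two coincide."""
    p = [1.0 / len(base)] * len(base)   # start from uniform beliefs
    for _ in range(max_iter):
        p_new = best_response(p, base, delta, n_rivals)
        if max(abs(a - b) for a, b in zip(p, p_new)) < tol:
            return p_new
        p = p_new
    raise RuntimeError("fixed-point iteration did not converge")

p_star = solve_ccp(base=[1.0, 0.5, 0.0], delta=-0.3, n_rivals=4)
```

At `p_star`, a firm that believes its rivals play `p_star` finds it optimal to play `p_star` itself, which is the sense in which beliefs match strategies.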

Next, we normalize the profit from not entering a market to one so that the log of profit is normalized to zero. Then the entry probability for a firm is given by the nested logit form:

$$p\left( Entry \mid \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) = \frac{\exp\left( \xi^m \right) \cdot \sum_{f=1}^{F} \sum_{l=1}^{l^m} \exp\left( \hat{\pi}_{fl}\left( N^m, P^m, \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) \right)}{1 + \exp\left( \xi^m \right) \cdot \sum_{f=1}^{F} \sum_{l=1}^{l^m} \exp\left( \hat{\pi}_{fl}\left( N^m, P^m, \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) \right)} \tag{24}$$

Hence, if there are, say, $E$ potential retail entrants then the expected total number of entrants in market $m$ is given by:

$$N^m = E \cdot p\left( Entry \right) \tag{25}$$

By exogenously fixing $E$, and by observing the actual number of entrants, $N^m$, the estimate for the market specific cost parameter is:

$$\xi^m\left( \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) = \ln\left( N^m \right) - \ln\left( E - N^m \right) - \ln\left( \sum_{f=1}^{F} \sum_{l=1}^{l^m} \exp\left( \hat{\pi}_{fl}\left( N^m, P^m, \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) \right) \right) \tag{26}$$

Again integrating over the distributions of the common unobserved shocks, we have:

$$\xi^m\left( \theta \right) = \iiint \xi^m\left( \omega^{pr}, \omega^{r}, \omega^{c}, \theta \right) g\left( \omega^{pr} \right) f\left( \omega^{r}, \omega^{c} \right) d\omega^{pr}\, d\omega^{r}\, d\omega^{c} \tag{27}$$

A simultaneous solution for Equations (23) and (27) gives the joint equilibrium predictions for the number of entrants, and the format and location decisions of those entrants. We assume that $\xi^m$ is i.i.d. across markets, and follows a normal distribution, $N(\mu, \sigma^2)$. Thus the probability that a total of $N^m$ firms enter the market is given by the p.d.f. of this normal distribution at the value obtained in Equation (27). Note that the value of $\xi^m$ adjusts to the size of $E$ in relation to the outside option of no entry. Hence, although the size of $E$ is not observed by the researcher, varying the size will have only a minuscule effect on our inferences about firms’ strategies (see the discussion in Seim (2006)).
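Equation (26) is simply Equation (24) solved for $\xi^m$ after substituting $p(Entry) = N^m / E$ from Equation (25). The sketch below checks this round trip numerically for a single draw of the shocks; the function names and the log-profit values are made up for the example.

```python
import math

def entry_probability(xi, pi_hat):
    """Equation (24); pi_hat is a flat list of log profits over formats and locations."""
    a = math.exp(xi) * sum(math.exp(x) for x in pi_hat)
    return a / (1.0 + a)

def implied_xi(n_entrants, n_potential, pi_hat):
    """Equation (26): the xi for which E * p(Entry) equals the observed N."""
    inclusive = math.log(sum(math.exp(x) for x in pi_hat))
    return math.log(n_entrants) - math.log(n_potential - n_entrants) - inclusive

pi_hat = [0.4, -0.2, 0.1, -0.5]                 # made-up log profits
xi = implied_xi(n_entrants=12, n_potential=50, pi_hat=pi_hat)
n_check = 50 * entry_probability(xi, pi_hat)    # recovers 12 up to rounding
```

Changing `n_potential` shifts `xi` by a constant for the market but leaves the ratios of the `exp(pi_hat)` terms, and hence the location choice probabilities, untouched.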

Next, note that for a given $\theta$ and $N^m$, we can get estimates of price and revenue when firms’ locations are set to be identical to the observed spatial configuration of stores in the data. We can compare these estimates with our price and revenue data and thus obtain the price and revenue shocks, $\left( \omega^{pr}_{obv} \mid \theta,\ \omega^{r}_{obv} \mid \theta \right)$, for the set of chosen locations that correspond to the observed spatial configuration of stores in the data. These price and revenue shocks are included in the likelihood function:

$$L\left( \Theta \right) = \prod_{m=1}^{M} \left[ \underbrace{\left\{ \prod_{f=1}^{F} \prod_{l=1}^{l^m} \left( \psi_{fl}\left( N^m, P^m; \theta \right) \right)^{I(fl)} \right\}}_{\text{Location Choice}} \cdot \underbrace{\prod \phi\left( \omega^{pr}_{obv} \mid \theta, 0, \sigma^2_{pr} \right)}_{\text{Price Data}} \cdot \underbrace{\prod \varphi\left( \omega^{r}_{obv} \mid \theta, 0, \Sigma \right)}_{\text{Revenue Data}} \cdot \underbrace{\phi\left( \xi^m; \mu, \sigma^2 \right)}_{\text{Entry Choice}} \right]$$

$$\text{s.t.} \quad P^m\left( N^m; \theta \right) = \Psi\left( N^m, P^m; \theta \right), \quad \forall m \tag{28}$$

where $\Theta$ is the set of all model parameters, $\Theta = \{ \theta, \mu, \sigma^2 \}$, and $I(fl)$ is an indicator that equals one if location $l$ is chosen by an $f$-format firm, and is zero otherwise. $\phi$ is the pdf of a normal distribution whereas $\varphi$ has been used to indicate the pdf of the marginal distribution of revenue shocks.

2.2 Estimation Strategy

2.2.1. Simplifying Restrictions


In the generalized model specification the number of model parameters grows quadratically with the number of format types ($F$) due to the interformat and intraformat competition effects (Equation 13). The number of distance bands ($B$) around each location further multiplies the number of parameters. For instance, in our empirical application in this paper, we have six format types ($F = 6$). When we consider five 1-mile width distance bands around each location ($B = 5$), the number of competition effect parameters is 180 ($F^2 \times B = 6 \times 6 \times 5$). Also, the number of parameters for the observable component of cost (Equation 14) is proportional to $F \times B$. Furthermore, we are also constrained by data for only a limited set of sample markets (small $M$). Hence, we make two restrictions in the model specification to reduce the model parameters to a manageable number.

First, we assume that the competition effect between a pair of rivals is symmetric. That is, for any distance band, $b$, and for two rivals with formats $f$ and $f'$, we assume $\beta_{f'-fb} = \beta_{f-f'b}$. In our empirical application for the grocery industry, this restriction implies that we treat the competition effect of a Supermarket on a Superstore to be the same as the competition effect of a Superstore on a Supermarket. Note however that we allow intraformat and interformat effects to be heterogeneous. Therefore, (1) the competition effect between two Supermarkets can be different from that between two Superstores; and (2) the competition effect between, say, a Supermarket and a Supercenter can be different from that between a Superstore and a Supercenter.

Second, we assume that the ratio between the competition effect from a rival at a particular distance band, $b$ ($b \neq 1$), and the competition effect from that rival in the first 0-1 mile distance band is a constant value ($\kappa_b$). That is, we have:


$$\beta_{f-f2} = \beta_{f-f1}\,\kappa_2; \quad \beta_{f-f3} = \beta_{f-f1}\,\kappa_3; \quad \ldots; \quad \beta_{f-fB} = \beta_{f-f1}\,\kappa_B$$

$$\beta_{f'-f2} = \beta_{f'-f1}\,\kappa_2; \quad \beta_{f'-f3} = \beta_{f'-f1}\,\kappa_3; \quad \ldots; \quad \beta_{f'-fB} = \beta_{f'-f1}\,\kappa_B \tag{29.1}$$

(# competition effect parameters $= \left( F(F+1)/2 \right) + (B-1)$)

Similarly, the impacts of market characteristics on cost ($\gamma_{fbx}$) are allowed to be format-specific but we assume a constant ratio between the impact of a variable at a particular distance band and the impact in the first 0-1 mile distance band. The constant is specific to the variable and the particular distance band. For instance, suppose for cost (Equation 14) the coefficients of population and per capita income in different distance bands are denoted by $\gamma_{1fb}$ and $\gamma_{2fb}$, respectively; then the restriction implies:

$$\gamma_{1f2} = \gamma_{1f1}\,\lambda_2; \quad \gamma_{1f3} = \gamma_{1f1}\,\lambda_3; \quad \ldots; \quad \gamma_{1fB} = \gamma_{1f1}\,\lambda_B$$

$$\gamma_{2f2} = \gamma_{2f1}\,\zeta_2; \quad \gamma_{2f3} = \gamma_{2f1}\,\zeta_3; \quad \ldots; \quad \gamma_{2fB} = \gamma_{2f1}\,\zeta_B \tag{29.2}$$

(# observable cost component parameters $\propto (F + B)$)

Note that if we allow the ratios or the multipliers, $\kappa_b$, $\lambda_b$ and $\zeta_b$, to be format-specific then that is equivalent to directly estimating the format-specific coefficients, such as $\beta_{f-fb}$, $\gamma_{1fb}$ and $\gamma_{2fb}$. In our estimation, we do not impose any restrictions on the values that the multipliers can take at different distance bands. If these multipliers turn out to be decreasing with distance and less than one then that would imply that the impact of the variable weakens with distance. In particular, weakening of the competitive effects at greater distances would indicate the benefits of spatial differentiation.
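The effect of the two restrictions on the number of competition parameters can be checked with a few lines of arithmetic. The helper names and the illustrative ratio values below are assumptions; only the counting formulas come from the text.

```python
def unrestricted_competition_params(n_formats, n_bands):
    """One effect per ordered format pair and distance band: F^2 * B."""
    return n_formats ** 2 * n_bands

def restricted_competition_params(n_formats, n_bands):
    """Symmetric first-band effects plus one ratio per additional band."""
    return n_formats * (n_formats + 1) // 2 + (n_bands - 1)

def band_effects(beta_band1, kappa):
    """Equation (29.1): effects in bands 1..B from the first-band effect
    and the ratios kappa_2, ..., kappa_B."""
    return [beta_band1] + [beta_band1 * k for k in kappa]

full = unrestricted_competition_params(6, 5)      # 180
reduced = restricted_competition_params(6, 5)     # 25
effects = band_effects(-0.05, [0.6, 0.4, 0.2, 0.1])
```

For the application's six formats and five bands, the restrictions cut the competition block from 180 parameters to 25, while `band_effects` shows how a full set of band-specific effects is still recovered from one first-band effect and the shared ratios.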

2.2.2. Multiple Equilibria in the Model

Estimation involves finding the equilibrium solution, $\left( P^*_{MLE}, \Theta^*_{MLE} \right)$, which is the global optimum of Equation (28) where $\Theta^*_{MLE}$ are the Maximum Likelihood Estimates (MLE) and $P^*_{MLE}$ are the corresponding equilibrium CCPs. Using a nested fixed-point (NFXP) approach for estimation is computationally demanding as it involves solving for the fixed-point of Equation (22) for each draw of $\left[ \omega^{pr}, \omega^{r}, \omega^{c} \right]$ and at each step of the likelihood maximization. More importantly, NFXP suffers from the possibility of multiple equilibria in the model. Specifically, for a value of $\theta$, if Equation (22) has multiple solutions for CCPs then the likelihood is not well defined.¹⁴ Researchers have, therefore, developed two-step estimation approaches that avoid these problems. In a two-step Pseudo Maximum Likelihood (PML) approach, the CCPs are estimated in a parametric or nonparametric first step and the parameter estimates are obtained by maximizing the resulting likelihood in the second step (Bajari et al., 2007). However, in most empirical contexts, consistent and precise first-stage estimates of CCPs are infeasible.

A recursive extension of the PML, called the Nested Pseudo Likelihood (NPL) approach, addresses this problem at a relatively small additional computational cost (Aguirregabiria and Mira, 2007).¹⁵ The standard NPL approach starts with an initial guess of the CCPs, and converges to an equilibrium solution in the limit. For example, in our case, we would start with initial guess values for firms’ beliefs about rivals’ CCPs, $P_0$. Then, using Equations (21) through (28) we would obtain the likelihood, $L\left( P_0, \Theta \right)$. Maximizing the likelihood would give the parameter estimates, $\Theta_1$, and new CCPs, $P_1$. This would constitute one iteration, and the new CCPs would be used for firms’ beliefs about rivals’ actions in the next iteration. The $n^{th}$ iteration of the standard NPL approach can be denoted by the following contraction mapping, $M$:

$$\left( P_n, \Theta_n \right) = M\left( P_{n-1} \right) \quad \text{where} \quad \Theta_n = \arg\max_{\Theta} L\left( P_{n-1}, \Theta \right); \quad P_n = \Psi\left( P_{n-1}, \Theta_n \right) \tag{30}$$

14 One way to deal with this problem is to provide sufficient conditions that the parameters, $\theta$, must satisfy to ensure a unique equilibrium (e.g., Seim, 2006; Zhu and Singh, 2009).

15 Another application of the NPL approach for a static game can be found in Ellickson and Misra (2008).

For a graphical illustration of the NPL iterations, suppose that the set $\left( P, \Theta \right)$ could be ‘collapsed’ onto one axis. In Figure 2(a) the X-axis corresponds to the vector $P_{n-1}$, the Y-axis corresponds to the set $\left( P_n, \Theta_n \right)$, and the solid curve represents the contraction mapping $M(P)$. The dotted lines represent the ‘track’ followed by the NPL iterations corresponding to a particular starting value, $P_0$. Note that a different starting value, $P'_0$, would result in a different track for the NPL iterations. With multiple iterations, if there is convergence, the contraction mapping would converge to an equilibrium solution or an NPL fixed point, $\left( P^*, \Theta^* \right)$. In Figure 3(a), this is the point where $M(P)$ intersects the 45° line. Furthermore, if the fixed point is unique then it is, in fact, the global optimum, $\left( P^*_{MLE}, \Theta^*_{MLE} \right)$.

2.2.2. Multiple Equilibria in the Data

The standard NPL approach, however, does not address the possibility of multiple equilibria in the data, which arises when the contraction mapping in Equation (30) does not have a unique NPL fixed point. The multiple equilibria or the multiple NPL fixed points are essentially the different ‘local optima’ of Equation (28). This is illustrated in Figure 3(b) where $M(P)$ intersects the 45° line at multiple points. Consequently, the NPL iterations may potentially converge to a ‘local optimum’ and not the global optimum. Further, as the track followed by the NPL iterations depends on the starting value, $P_0$, different starting values would result in distinct tracks which could potentially converge to different ‘local optima’. One option is to spread the search for the global optimum over a wide range of the contraction mapping, $M(P)$, by using parallel-NPL where a large number of NPL algorithms, say, $T$, are run in parallel with different starting values. By thus following $T$ distinct tracks for the NPL iterations, this approach, upon convergence, would give us a set of $T$ fixed points, $\left[ \left( P^{1*}, \Theta^{1*} \right); \left( P^{2*}, \Theta^{2*} \right); \ldots; \left( P^{T*}, \Theta^{T*} \right) \right]$.¹⁶ However, it does not guarantee that this set will contain the global optimum, $\left( P^*_{MLE}, \Theta^*_{MLE} \right)$.
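The dependence on starting values can be seen in a one-dimensional stand-in for the mapping $M(P)$. The logistic mapping below is purely hypothetical: it was chosen because it crosses the 45° line three times, and it is not the NPL operator of the model. Several iteration tracks are run from different starting values, in the spirit of the parallel-NPL idea, and the distinct limits are collected.

```python
import math

def mapping(p, steepness=8.0):
    """A made-up scalar mapping with three fixed points, two of them stable."""
    return 1.0 / (1.0 + math.exp(-steepness * (p - 0.5)))

def iterate_to_fixed_point(p0, tol=1e-12, max_iter=1000):
    """Follow one iteration track from the starting value p0."""
    p = p0
    for _ in range(max_iter):
        p_next = mapping(p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    raise RuntimeError("no convergence")

def parallel_runs(starts, digits=6):
    """Run one track per starting value and collect the distinct limits."""
    limits = [iterate_to_fixed_point(p0) for p0 in starts]
    return sorted({round(p, digits) for p in limits})

fixed_points = parallel_runs([0.1, 0.3, 0.45, 0.55, 0.7, 0.9])
```

Tracks that start below 0.5 end at the low fixed point and those that start above it end at the high one, so six runs return only two distinct limits. The unstable fixed point at 0.5 is never reached from these starting values, which mirrors the caveat that a set of converged runs need not contain every equilibrium.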

For a more efficient search of the global optimum, Aguirregabiria and Mira (2005) propose combining the parallel-NPL with a Genetic Algorithm (GA). GA is a search heuristic that mimics natural evolution processes such as ‘selection’, ‘crossover’ or ‘reproduction’ and ‘mutation’, and can be used to obtain the global optimum of complex optimization problems. Combining the parallel-NPL with GA has two advantages: (1) the crossover and mutation steps spread the search for the global optimum over a much wider range of the contraction mapping than what is feasible with just the parallel-NPL, and (2) the selection step steers the tracks of the parallel-NPL iterations towards those regions of the contraction mapping that are more likely to contain the global optimum.¹⁷

In our estimation, we <strong>in</strong>sert two GA steps after each iteration of the parallel-NPL. Note<br />

that after the $n$th iteration of the parallel-NPL, we will have $T$ vectors of CCPs, $\left[P_n^{1}; P_n^{2}; \ldots; P_n^{T}\right]$.<br />

First, <strong>in</strong> a selection step, we evaluate each vector of CCPs by us<strong>in</strong>g a ‘fitness criterion’ where the<br />

16 Many of the fixed po<strong>in</strong>ts may be identical.<br />

17 Su and Judd (2010) suggest us<strong>in</strong>g a Mathematical Programm<strong>in</strong>g with Equilibrium Constra<strong>in</strong>ts approach that f<strong>in</strong>ds<br />

the parameter estimates and the equilibrium CCPs simultaneously. However, like the parallel-NPL, this approach<br />

also relies on multiple runs with different start<strong>in</strong>g values to f<strong>in</strong>d different equilibria. Hence, its ability to f<strong>in</strong>d the<br />

global optimum <strong>in</strong> problems that have a large action space (as <strong>in</strong> our entry and location choice problem) is unclear.


CCPs that are likely to be closer to the global optimum are considered to be more fit. Analogous<br />

to the natural selection process <strong>in</strong> nature, the more fit CCPs are given a greater chance of<br />

survival and reproduction so that future search for the global optimum is concentrated <strong>in</strong> their<br />

neighborhood. This is done by drawing, with replacement, $T$ ‘mother’ CCPs, $\left[P_n^{1'}; P_n^{2'}; \ldots; P_n^{T'}\right]$,<br />

and $T$ ‘father’ CCPs, $\left[P_n^{1''}; P_n^{2''}; \ldots; P_n^{T''}\right]$, from the original set, $\left[P_n^{1}; P_n^{2}; \ldots; P_n^{T}\right]$, such that the<br />

more fit CCPs have a greater chance of getting selected.<br />

Next, each of the $T$ ‘couples’, $\left[(P_n^{1'}, P_n^{1''}); (P_n^{2'}, P_n^{2''}); \ldots; (P_n^{T'}, P_n^{T''})\right]$, goes through a<br />

crossover step to produce an ‘offspr<strong>in</strong>g’ that <strong>in</strong>herits the traits of both its parent CCPs. To the<br />

extent that both parents are likely to be fit, the result<strong>in</strong>g offspr<strong>in</strong>g also has a high chance of be<strong>in</strong>g<br />

fit. Hence, we obta<strong>in</strong> a new generation of T vectors of CCPs that are likely to be quite close to<br />

the global optimum. To further reduce the chances of miss<strong>in</strong>g the global optimum, some<br />

mutations may be implanted into the offspring so that the search continues to span a wide range<br />

of the contraction mapp<strong>in</strong>g. With multiple iterations of the parallel-NPL and GA steps, if there is<br />

convergence, we would obta<strong>in</strong> a set of T fixed po<strong>in</strong>ts which almost certa<strong>in</strong>ly would conta<strong>in</strong> the<br />

global optimum.<br />
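The selection, crossover and mutation steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the fitness scores, the convex-combination crossover, and the mutation rule (including the `rate` and `scale` parameters) are all assumptions chosen so that each offspring remains a valid probability vector.<br />

```python
import random


def select_parents(ccps, fitness, rng):
    """Draw len(ccps) parents with replacement; fitter CCPs are more likely to be drawn."""
    return rng.choices(ccps, weights=fitness, k=len(ccps))


def crossover(mother, father, rng):
    """Assumed crossover rule: a random convex combination of the two parents."""
    w = rng.random()
    return [w * m + (1.0 - w) * f for m, f in zip(mother, father)]


def mutate(ccp, rng, rate=0.1, scale=0.05):
    """With probability `rate`, perturb the vector and renormalize so it sums to one."""
    if rng.random() >= rate:
        return ccp
    noisy = [max(p + rng.uniform(-scale, scale), 1e-12) for p in ccp]
    total = sum(noisy)
    return [p / total for p in noisy]


def ga_step(ccps, fitness, rng):
    """One GA pass: select T mothers and T fathers, then cross and mutate each couple."""
    mothers = select_parents(ccps, fitness, rng)
    fathers = select_parents(ccps, fitness, rng)
    return [mutate(crossover(m, f, rng), rng) for m, f in zip(mothers, fathers)]


rng = random.Random(0)
population = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.3, 0.3, 0.4]]  # T = 3 toy CCP vectors
fitness = [3.0, 1.0, 2.0]  # hypothetical fitness scores
next_generation = ga_step(population, fitness, rng)
```

In the estimation, a pass of this kind would sit between successive parallel-NPL iterations, with the fitness scores recomputed each time.<br />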

2.2.3. Convergence<br />

<strong>The</strong> algorithm may not converge to the global optimum if the contraction mapp<strong>in</strong>g does<br />

not have good local convergence properties around the global optimum. Intuitively, as shown <strong>in</strong><br />

Figure 3(c), convergence to a fixed po<strong>in</strong>t depends on the concavity or the convexity of the<br />

mapp<strong>in</strong>g <strong>in</strong> the neighborhood of that fixed po<strong>in</strong>t. Kasahara and Shimotsu (2008) recommend<br />

transforming the mapping by replacing $\Psi(P, \Theta)$ with the following log-linear combination of<br />

$\Psi(P, \Theta)$ and $P$:<br />

$$\Lambda(P, \Theta) = \left[\Psi(P, \Theta)\right]^{\delta} \left[P\right]^{1-\delta}; \quad \delta \in [0,1] \qquad (31)$$<br />

Note that $P = \Lambda(P, \Theta)$ and $P = \Psi(P, \Theta)$ have the same fixed-point solution(s). An<br />

appropriate value of δ can modify the concavity or convexity of the mapp<strong>in</strong>g such that the<br />

transformed mapp<strong>in</strong>g is Locally Contractive around the fixed po<strong>in</strong>t and will converge even if the<br />

orig<strong>in</strong>al mapp<strong>in</strong>g does not. 18<br />

F<strong>in</strong>ally, even when the mapp<strong>in</strong>g does converge, the rate of<br />

convergence could be extremely slow and may require a large number of iterations. To avoid<br />

this, Kasahara and Shimotsu (2008) propose the follow<strong>in</strong>g q-stage operator called q-NPL:<br />

$$\Lambda^{q}(P, \Theta) = \underbrace{\Lambda(\Lambda(\ldots\Lambda(P, \Theta), \Theta, \ldots, \Theta), \Theta)}_{q \text{ times}} \qquad (32)$$<br />

Again, $P = \Lambda^{q}(P, \Theta)$ and $P = \Psi(P, \Theta)$ have the same fixed-point solution(s). In<br />

addition, $\Lambda^{q}(P, \Theta)$ also has the locally contractive property of $\Lambda(P, \Theta)$. Hence, in our<br />

estimation, we replace the standard NPL operator, Ψ , with the Locally Contractive, q-NPL<br />

operator, $\Lambda^{q}$. The resulting parallel NPL iterations are then combined with GA as described<br />

above. This procedure searches efficiently over the space of possible equilibria and converges<br />

fast to a set of equilibria which almost certa<strong>in</strong>ly conta<strong>in</strong>s the global optimum. Details of the<br />

sequence of steps <strong>in</strong>volved <strong>in</strong> estimation are provided <strong>in</strong> Appendix A2.<br />
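To illustrate why the log-linear transformation helps, the sketch below iterates a toy scalar mapping whose fixed point at 0.5 is unstable under plain iteration. The function `psi`, its steepness `k`, the starting value and the choice of $\delta = 0.3$ are hypothetical and stand in for the NPL operator with the parameters held fixed; they are not taken from the paper.<br />

```python
import math


def psi(p, k=10.0):
    """Toy mapping with fixed point p = 0.5; its slope there is -k/4, so |slope| > 1."""
    return 1.0 / (1.0 + math.exp(k * (p - 0.5)))


def lam(p, delta):
    """Log-linear combination: Lambda = Psi(p)**delta * p**(1 - delta)."""
    return psi(p) ** delta * p ** (1.0 - delta)


def lam_q(p, delta, q):
    """q-stage operator: apply Lambda q times in a row."""
    for _ in range(q):
        p = lam(p, delta)
    return p


def iterate(p0, delta, q=1, steps=200):
    """Run `steps` fixed-point iterations of the q-stage operator from p0."""
    p = p0
    for _ in range(steps):
        p = lam_q(p, delta, q)
    return p


plain = iterate(0.2, delta=1.0)                     # delta = 1 recovers the original mapping
damped = iterate(0.2, delta=0.3)                    # transformed mapping
damped_q = iterate(0.2, delta=0.3, q=3, steps=20)   # same total work in fewer outer steps
```

With $\delta = 1$ the iterates settle into a two-point cycle far from 0.5, whereas with $\delta = 0.3$ the slope of the transformed mapping at the fixed point is $0.3 \times (-2.5) + 0.7 = -0.05$, so the iteration contracts quickly.<br />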

2.3 Identification<br />

18 Kasahara and Shimotsu (2008) suggest the following procedure for selecting the value of $\delta$: Simulate a sequence<br />

$\{P_n\}_{n=0}^{N}$ by iterating the transformed mapping for different values of $\delta$, say for $\delta \in \{0.1, 0.2, \ldots, 0.9\}$. Then<br />

pick the value of δ that leads to the smallest value of the mean of<br />

P P<br />

P − P<br />

n+ 1 n −<br />

n N<br />

across n = 1,…, N.


Extant models of location choice use only entry and spatial location choice data. <strong>The</strong>y<br />

exploit the variation <strong>in</strong> exogenous market characteristics around a location and the number and<br />

geographical locations of rivals, <strong>in</strong> order to identify the effects of market characteristics and the<br />

nature of competition. Given the entry and location choice data, they can only<br />

infer that the level of profits where a firm locates is greater than in locations where it does<br />

not locate, conditional on what it expects competitors to do.<br />

However, for identify<strong>in</strong>g the agglomeration effect, we need to go beyond the profits and<br />

decompose the quantity (demand) enhanc<strong>in</strong>g effects of agglomeration and the marg<strong>in</strong> effects of<br />

differentiation. We augment extant models with revenue data and price data. <strong>The</strong> revenue data<br />

now helps isolate the cost impact from profits. The price data helps separate revenues into their<br />

quantity and price components.<br />

Identify<strong>in</strong>g the quantity component helps to isolate the agglomeration benefit. We note<br />

that the price data we have is only from one cha<strong>in</strong> which has stores of different formats. Hence<br />

the competitive effect on price is identified non-parametrically only <strong>in</strong> areas where this cha<strong>in</strong><br />

locates its stores. We make the assumption that the price effect is identical across all stores of the<br />

same format to facilitate identification <strong>in</strong> other locations.<br />

3 Data<br />

3.1 Store Data and Sample Markets<br />

We <strong>in</strong>vestigate the spatial configuration of big-box grocery stores. We have store location<br />

(latitude and longitude), store format and weekly revenue data at the national level for the period<br />

2007-08 from Nielsen’s ‘Trade Dimensions’. For our analysis, we use average weekly store<br />

revenue data. In a different dataset, we have store location and store format data (but no revenue<br />

data) for the period 2000-01 and for a sample of local markets <strong>in</strong> the three states of New York,<br />



Pennsylvania and Ohio. This second dataset also has price <strong>in</strong>dex data for stores belong<strong>in</strong>g to one<br />

store cha<strong>in</strong>. 19<br />

In our price model, price data from a different time period can be used to estimate<br />

the competition parameters and the distribution of price shocks as long as we use the market<br />

configuration and market characteristics correspond<strong>in</strong>g to that period, and if we assume that the<br />

price shocks do not change over the seven-year period. Hence, we combine the two data sets so<br />

that for a sample of markets we have the market configuration and revenue data for all stores <strong>in</strong><br />

one period (2008), and the market configuration and price data for one of the stores <strong>in</strong> many<br />

markets, but for a different period (2001). <strong>The</strong> data constra<strong>in</strong>t of hav<strong>in</strong>g prices for only one store<br />

cha<strong>in</strong> may appear as a serious weakness. However, as discussed above, it aids our identification.<br />

Also, it is <strong>in</strong>terest<strong>in</strong>g from a managerial perspective as it mimics a more realistic situation where<br />

firms are likely to have more <strong>in</strong>formation about themselves (own prices and own revenue) and<br />

relatively less <strong>in</strong>formation about rivals (only revenue <strong>in</strong>formation about rivals but no price <strong>in</strong>dex<br />

<strong>in</strong>formation).<br />

Among the markets for which we have price data, we select a sample of 98 fairly<br />

isolated, small and medium sized towns to avoid the problems associated with large markets and<br />

suburbs such as unclear market boundaries, cannibalization due to multiple stores of a firm <strong>in</strong> the<br />

same market, and complex sub-zon<strong>in</strong>g regulations. In 2008, our 98 sample markets had<br />

19 We have weekly product category-level price index data for a one-year period for 27 grocery product categories<br />

and for each store that belongs to the store chain ($pr_{cts} = \sum_{\forall i \in c} \sum_{\forall u \in i} w_{ciuts} \cdot pr_{ciuts}$, where $w_{ciuts}$ is the revenue share<br />

of UPC, $u$, of item, $i$, within product category, $c$, for week $t$ in store $s$). To construct store-level price indices we<br />

adopt an approach similar to Chevalier et al. (2003; p. 22). That is, we aggregate over the product categories and<br />

weeks to form a store-level price index ($pr_{s} = \sum_{c=1}^{27} \sum_{t=1}^{52} w_{cts} \cdot pr_{cts}$, where $w_{cts}$ is the dollar share of category $c$ in<br />

week $t$ in store $s$).<br />



altogether 438 big-box grocery stores. 20<br />

<strong>The</strong>se stores have been classified <strong>in</strong>to six format types<br />

(i.e., F = 6): Supermarkets (SM), Superstores (SS), Supercenters and Wholesale Clubs (SC),<br />

Limited Assortment and Warehouse stores (LA), Natural Foods stores (NF) and Food and Drug<br />

stores (FD). Table 1 provides a description of these store formats.<br />
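The store-level price index described in footnote 19 is a share-weighted sum of category-week price indices. The sketch below shows that aggregation for one store. It assumes that the weights are each category-week's share of the store's total revenue (so they sum to one); the record layout and the sample numbers are hypothetical.<br />

```python
def store_price_index(records):
    """Aggregate category-week price indices into one store-level index.

    records: list of (category, week, revenue, price_index) tuples for a single store.
    The weight of each record is its share of the store's total revenue (an assumption).
    """
    total_revenue = sum(rev for _, _, rev, _ in records)
    return sum((rev / total_revenue) * pr for _, _, rev, pr in records)


# Hypothetical data: two categories, up to two weeks each.
records = [
    ("dairy", 1, 200.0, 1.10),
    ("dairy", 2, 300.0, 1.00),
    ("snacks", 1, 500.0, 0.90),
]
index = store_price_index(records)  # 0.2 * 1.10 + 0.3 * 1.00 + 0.5 * 0.90
```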

3.2. Consumer and Retail Locations<br />

Data on market characteristics are obta<strong>in</strong>ed from the U.S. Census. Although detailed<br />

demographic data at a Census Block Group (CBG) level are available only for the year 2000, the<br />

U.S. Census provides annual census projections for the county level. Hence, we project the CBG<br />

level census data to their 2008 values by the proportion of change <strong>in</strong> the respective counties<br />

between 2000 and 2008. As we do not have <strong>in</strong>formation about consumers beyond the CBG level,<br />

we follow the convention <strong>in</strong> the literature and place consumers <strong>in</strong> a CBG at the population<br />

weighted center of the CBG. <strong>The</strong>se are our consumer locations.<br />
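The projection of CBG-level data to 2008 values amounts to scaling each 2000 value by the growth ratio of its county. A minimal sketch, with hypothetical function and argument names and made-up numbers:<br />

```python
def project_cbg(value_2000, county_2000, county_2008):
    """Scale a CBG-level 2000 value by the proportional change in its county."""
    return value_2000 * county_2008 / county_2000


# Hypothetical example: a CBG with 1,200 residents in a county that grew from 50,000 to 55,000.
cbg_population_2008 = project_cbg(1200.0, 50000.0, 55000.0)
```

This applies the same growth rate to every CBG within a county, which is the approximation the text describes.<br />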

For the location choice game, we divide a market <strong>in</strong>to a uniform grid of discrete 1 sq.<br />

mile blocks or market locations. Our 98 sample markets have a total of 4,792 such locations. But<br />

zon<strong>in</strong>g regulations dictate which of these locations are available for big-box retailers. Below, we<br />

discuss our approach for identify<strong>in</strong>g these potential retail locations and their commercial<br />

centers. Just as consumers are placed at the population weighted center of CBGs, we place<br />

retailers with<strong>in</strong> a retail location at the commercial center of the location.<br />

Our concept of market locations deviates from the standard approach <strong>in</strong> earlier research<br />

that treats census divisions as market locations and places retail stores at the population weighted<br />

center along with consumers. <strong>The</strong> standard approach simplifies the data setup process but it has<br />

severe drawbacks: (1) <strong>The</strong> population weighted center of a census division is likely to be a<br />

20 A comparison of the market configurations between 2001 and 2008 showed that the number of stores <strong>in</strong> these<br />

markets <strong>in</strong>creased less than 10% from 399 to 438.<br />



residential zone so that placing retail stores there would confound the inclusion of zoning<br />

regulations; (2) Stores are rarely present <strong>in</strong> the <strong>in</strong>terior of a census division, rather, they are<br />

present on roads that border these census divisions; (3) Census divisions vary extensively <strong>in</strong> size<br />

so that, for large census divisions, stores may be located quite far from the center and also quite<br />

far from each other. Such artificial distortions <strong>in</strong> distances between rivals can be very damag<strong>in</strong>g<br />

for our application as we are <strong>in</strong>terested <strong>in</strong> expla<strong>in</strong><strong>in</strong>g co-location of rivals through consumers’<br />

will<strong>in</strong>gness to travel to such locations. Our concept of market locations not only allows us to<br />

<strong>in</strong>corporate spatial zon<strong>in</strong>g regulations but it also avoids major distortions of the distances<br />

between rivals and the distances of stores from population centers. 21<br />
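Since distances in the paper are great-circle distances (footnote 21), the distance between a consumer location and a retail location can be computed with the haversine formula. The sketch below is one standard way to do this; the Earth radius of 3958.8 miles is an assumed mean value, and the paper does not say which formula or radius was used.<br />

```python
import math


def great_circle_miles(lat1, lon1, lat2, lon2, radius_miles=3958.8):
    """Haversine great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_miles * math.asin(math.sqrt(a))


one_degree = great_circle_miles(0.0, 0.0, 1.0, 0.0)  # one degree of latitude
same_point = great_circle_miles(41.3, -72.9, 41.3, -72.9)
```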

We next describe the National Land Cover Dataset (NLCD) and discuss how it is used <strong>in</strong><br />

conjunction with Geographical Information System tools such as ArcGIS and Google Earth to<br />

recover the potential retail locations and their commercial centers.<br />

3.3 Spatial Zon<strong>in</strong>g Data<br />

The Multi-Resolution Land Characteristics Consortium, a group of several federal<br />

agencies, has created two NLCD datasets that provide consistent and accurate digital land-cover<br />

<strong>in</strong>formation for the coterm<strong>in</strong>ous U.S. <strong>The</strong> first national land-cover mapp<strong>in</strong>g project, NLCD 1992,<br />

was derived from the early to mid-1990s Landsat <strong>The</strong>matic Mapper satellite data. It applied a 21-<br />

class, geo-referenced, land-cover classification (see Vogelmann et al., 2001). <strong>The</strong> second project,<br />

NLCD 2001, updated the data for the year 2001 (see Homer et al., 2004). Both datasets have a<br />

spatial resolution of 30 meters. That is, every 30 m × 30 m cell of land is classified as a specific<br />

land type (e.g., deciduous forest, grassland, open water, etc.) and is allocated one pixel po<strong>in</strong>t with<br />

21 In this paper, distance between two po<strong>in</strong>ts always refers to the great-circle distance.<br />



a distinct color code and the associated latitude and longitude. 22<br />

Interestingly, the land type<br />

classifications include residential and commercial land. Residential land is further classified into<br />

low and high intensity residential land, and commercial land comprises highly developed<br />

areas that do not include residential areas. We use the NLCD data in the following three steps to<br />

identify the potential retail locations and their commercial centers.<br />

Step 1: Constructing Market Boundaries and Market Locations<br />

We use the data <strong>in</strong> NLCD 2001 to construct the market boundaries of our sample markets.<br />

<strong>The</strong> residential and commercial land area pixel po<strong>in</strong>ts <strong>in</strong> each market are projected on a map by<br />

us<strong>in</strong>g the ArcGIS software. This gives us the spatial area of <strong>in</strong>terest for a market. A simple visual<br />

<strong>in</strong>spection of the pixel density is used to construct the market boundaries where the pixels fade<br />

away (See Figure 4(a)). As our sample markets are reasonably isolated from other towns and<br />

cities, we can be flexible <strong>in</strong> choos<strong>in</strong>g the shape of their boundaries. A rectangular shape is<br />

preferred so that a market can be easily divided <strong>in</strong>to a uniform grid of discrete blocks or market<br />

locations. Thus, we construct imag<strong>in</strong>ary rectangular borders (L miles X H miles where L and H<br />

are <strong>in</strong>tegers that vary across markets) around the residential and commercial pixel po<strong>in</strong>ts of each<br />

market and then divide the market, specifically, <strong>in</strong>to 1 sq. mile locations (See Figure 4(b)).<br />

Step 2: Commercial Activity and Commercial Center <strong>in</strong> a Location<br />

<strong>The</strong> extent of commercial activity <strong>in</strong> a location (as def<strong>in</strong>ed above) could affect firms’<br />

profit <strong>in</strong> the location if consumers have a preference for multi-purpose shopp<strong>in</strong>g or one-stop<br />

shopp<strong>in</strong>g. For <strong>in</strong>stance, when shopp<strong>in</strong>g for groceries, consumers may like to comb<strong>in</strong>e their<br />

shopp<strong>in</strong>g trip with non-grocery purchases such as cloth<strong>in</strong>g and electronics so that locations with<br />

more retail bus<strong>in</strong>esses may be more attractive to firms. We isolate the NLCD 2001 pixel po<strong>in</strong>ts<br />

22<br />

A pixel po<strong>in</strong>t is one of the <strong>in</strong>dividual dots that make up a graphical image. Each pixel po<strong>in</strong>t comb<strong>in</strong>es red, green,<br />

and blue phosphors to create a specific color.


that correspond to commercial land with retail bus<strong>in</strong>esses (See Appendix C for technical details)<br />

and use the number of pixel po<strong>in</strong>ts <strong>in</strong> a location as a measure for the extent of commercial<br />

activity <strong>in</strong> that location. <strong>The</strong> mean of the latitudes and longitudes of the commercial land pixel<br />

po<strong>in</strong>ts <strong>in</strong> a location gives us the commercial center of the location (See Figure 4(c)). We place all<br />

retail stores with<strong>in</strong> a location at the commercial center of that location.<br />
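Step 2 reduces to a grouped count and a grouped mean over the commercial pixel points. The sketch below assumes each pixel has already been assigned to a 1 sq. mile location; the location identifiers, the tuple layout and the coordinates are hypothetical.<br />

```python
from collections import defaultdict


def commercial_summary(pixels):
    """Summarize commercial pixels by location.

    pixels: iterable of (location_id, lat, lon) for commercial land pixel points.
    Returns, per location, the pixel count (extent of commercial activity) and the
    mean latitude and longitude (commercial center).
    """
    groups = defaultdict(list)
    for loc, lat, lon in pixels:
        groups[loc].append((lat, lon))
    summary = {}
    for loc, pts in groups.items():
        n = len(pts)
        center = (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)
        summary[loc] = {"activity": n, "center": center}
    return summary


pixels = [("L1", 40.0, -75.0), ("L1", 40.2, -75.2), ("L2", 41.0, -76.0)]
summary = commercial_summary(pixels)
```

Locations that never appear in `summary` have no commercial pixels, which corresponds to the residential-zoning exclusion discussed in Step 3.<br />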

Step 3: Discern<strong>in</strong>g Potential Retail Locations from other Commercial Locations<br />

<strong>The</strong> market locations which conta<strong>in</strong> the commercial land pixel po<strong>in</strong>ts are the commercial<br />

locations and they constitute a very small share of all market locations. <strong>The</strong> locations without<br />

any commercial activity are mostly residential locations and some barren land. Hence, we<br />

account for residential zon<strong>in</strong>g by exclud<strong>in</strong>g locations that do not have any commercial land pixel<br />

po<strong>in</strong>ts. But even with<strong>in</strong> commercial locations, not all locations may be open to big-box retailers.<br />

For <strong>in</strong>stance, some commercial zones like, say, downtown areas, might only allow small<br />

bus<strong>in</strong>esses such as banks and restaurants. An obvious candidate for a potential retail location for<br />

big-box stores is any commercial location that has at least one big-box store which could be a<br />

grocery store or a non-grocery store. Hence, we project the locations on to Google Earth and use<br />

a tool called ‘Places Categories’ which shows the locations of various types of bus<strong>in</strong>esses <strong>in</strong> a<br />

region (See Figure 4(d)). We carefully comb through the commercial locations, and specifically<br />

check for the presence of major retail stores, major grocery stores and shopp<strong>in</strong>g centers to<br />

identify the commercial locations that have at least one big-box store.<br />

Now, the absence of big-box stores <strong>in</strong> a commercial location does not necessarily imply<br />

that such stores are not allowed <strong>in</strong> that location. In particular, a commercial location that is open<br />

to big-box stores may not have any such store if it is <strong>in</strong> an unfavorable or poor neighborhood and<br />



cannot support a big store. 23<br />

As we do not have a precise method for identify<strong>in</strong>g such locations,<br />

we use a stylized selection procedure. For each market, we f<strong>in</strong>d the m<strong>in</strong>imum value of the total<br />

<strong>in</strong>come of consumers with<strong>in</strong> a 2-mile radius of the commercial locations that have big-box<br />

stores. We use this m<strong>in</strong>imum as a benchmark for a commercial location <strong>in</strong> the market to be<br />

attractive enough to support at least one big-box retail store. That is, if a commercial location<br />

does not have any big-box store and the total <strong>in</strong>come of consumers with<strong>in</strong> a 2-mile radius of the<br />

location is less than the market benchmark then we presume that the absence of a big-box store is<br />

due to the unattractiveness of the location and not necessarily because of zon<strong>in</strong>g restrictions.<br />

Hence, a commercial location with no big-box store is still treated as a potential retail location<br />

when the follow<strong>in</strong>g condition is satisfied:<br />

$$\text{Income in 2-mile radius of a commercial location that has no big-box store} \;\le\; \min\left\{\text{Income in 2-mile radius of a commercial location that has a big-box store}\right\}$$<br />
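The selection rule above can be expressed compactly for a single market. In this sketch the dictionary keys and the sample incomes are hypothetical, and the market is assumed to contain at least one commercial location with a big-box store (otherwise the benchmark is undefined).<br />

```python
def potential_retail_locations(locations):
    """Return ids of potential retail locations among one market's commercial locations.

    locations: list of dicts with keys 'id', 'has_big_box' and 'income_2mi'
    (total consumer income within a 2-mile radius).
    """
    # Market benchmark: lowest surrounding income among locations that host a big-box store.
    benchmark = min(l["income_2mi"] for l in locations if l["has_big_box"])
    return [
        l["id"]
        for l in locations
        if l["has_big_box"] or l["income_2mi"] <= benchmark
    ]


market = [
    {"id": "A", "has_big_box": True, "income_2mi": 50.0},
    {"id": "B", "has_big_box": True, "income_2mi": 80.0},
    {"id": "C", "has_big_box": False, "income_2mi": 40.0},  # unattractive, so kept as potential
    {"id": "D", "has_big_box": False, "income_2mi": 90.0},  # attractive but empty, so presumed zoned out
]
selected = potential_retail_locations(market)
```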

To summarize, we use the NLCD data to construct market boundaries so that each market<br />

can be divided <strong>in</strong>to a grid of 1 sq. mile locations. <strong>The</strong>n the commercial land pixel po<strong>in</strong>ts are used<br />

to obta<strong>in</strong> the extent of commercial activity <strong>in</strong> a location and also to locate the commercial center<br />

of the location. Extant models that do not account for zoning assume that firms are allowed to<br />

set up stores in any market location. In contrast, we account for residential zoning by excluding<br />

locations that do not have any commercial land pixel points. Finally, we account for zoning<br />

regulations particularly against big-box retailers, within commercial locations, by defining<br />

potential retail locations as those commercial locations that either (1) have at least one big-box store,<br />

which is either a grocery or a non-grocery store, or (2) do not have a big-box store and are in a<br />

poor neighborhood which is below the market benchmark as described above.<br />

23 Note that competition between stores <strong>in</strong> neighbor<strong>in</strong>g locations cannot expla<strong>in</strong> the absence of big-box stores <strong>in</strong> a<br />

location as we are consider<strong>in</strong>g big-box stores across any segment of the retail <strong>in</strong>dustry.<br />



4. Results<br />

<strong>The</strong> estimation results are presented <strong>in</strong> three parts. Table 2(a) presents the estimates for<br />

the consumer shopp<strong>in</strong>g location choice or the demand side of the model. Table 2(b) presents the<br />

results of the price <strong>in</strong>dex portion of the model. F<strong>in</strong>ally, the estimates of cost and unobserved<br />

shocks are presented <strong>in</strong> Table 2(c).<br />

<strong>The</strong> demand side estimates <strong>in</strong>dicate that consumers experience a negative travel cost that<br />

is convex with respect to distance (the coefficient of $d_{gl}^{2}$ is positive and significant). 24<br />

Consumers who are far away from the nearest retail location (that is, when the value of $min\_d_{g}$<br />

is large) are more willing to travel long distances to get to a grocery store. Demographic<br />

characteristics seem to have very little explanatory power for consumers’ travel costs.<br />

<strong>The</strong> results show that consumers not only value economies of scope from the presence of<br />

other, non-grocery bus<strong>in</strong>esses at a location but they also value the agglomeration of multiple<br />

grocery stores at the location. The store agglomeration parameter ($\alpha_{SA} = 0.5342$) is positive and<br />

significant, which suggests that consumers are more likely to visit locations with multiple grocery stores. The<br />

format agglomeration effect ($\alpha_{f,FA}$) is also positive, and it is significant for a few store formats<br />

(Supercenters, Limited Assortment stores, and Food and Drug stores). Hence, consumers are<br />

more likely to visit locations with multiple grocery stores when the cluster of stores consists of<br />

different formats. Consequently, strategic store and format agglomeration <strong>in</strong>crease consumers’<br />

propensity to shop at a location, thus <strong>in</strong>creas<strong>in</strong>g volume at that location.<br />

24 Compar<strong>in</strong>g the results (not presented here) with different specifications for the maximum distance that consumers<br />

may travel for shopp<strong>in</strong>g, Rad, suggested that a distance of 5 miles was sufficient. Rad values of 6 miles and above<br />

did not change parameter estimates or <strong>in</strong>crease the likelihood value significantly (vis-à-vis AIC and BIC criteria).<br />

On the other hand, Rad values of 4 miles and below resulted <strong>in</strong> significantly different estimates for some model<br />

parameters and also gave significantly smaller likelihood values.


F<strong>in</strong>ally, consumers have a high preference for Supercenters and for Food and Drug stores<br />

relative to the Supermarket format. Hence, consumers may be more will<strong>in</strong>g to travel long<br />

distances to get to such stores. Consumers have a relatively low preference for Limited<br />

Assortment stores. This could be because Limited Assortment stores generally carry more store-<br />

brand products and very few national brand products.<br />

The results of the price index portion of the model (Table 2(b)) show that the format-<br />

specific price constant, $\beta_{f,pr}$, is lowest for Limited Assortment stores and Supercenters. This is<br />

expected since the stores with these formats are typically EDLP stores or they offer relatively<br />

more store-brand products that have low prices. For the effect of competition, recall that we<br />

allowed for separate <strong>in</strong>tra-format and <strong>in</strong>ter-format competition, and we considered competition<br />

from rivals <strong>in</strong> various 1-mile width distance bands (B = 5 when Rad = 5 miles). Our results show<br />

that the competition effect decreases dramatically with distance.<br />

Not surpris<strong>in</strong>gly, <strong>in</strong>traformat competition is generally more severe than <strong>in</strong>terformat<br />

competition. <strong>The</strong> extent of <strong>in</strong>traformat competition is the highest between Food and Drug<br />

comb<strong>in</strong>ation stores, which is comparable to the competition between Supercenters. Superstores<br />

are also found to compete quite heavily with each other. Interest<strong>in</strong>gly, for some formats, the<br />

<strong>in</strong>terformat competition effect is found to be comparable to the <strong>in</strong>traformat competition effect.<br />

For <strong>in</strong>stance, the competition effect between Supermarkets and Superstores is quite comparable<br />

to that between two Supermarkets. <strong>The</strong> competition effect between Superstores and Food and<br />

Drug comb<strong>in</strong>ation stores also seems to be quite high. <strong>The</strong> results highlight the importance of<br />

account<strong>in</strong>g for format differentiation, <strong>in</strong> addition to spatial differentiation.<br />

To explore the value of separat<strong>in</strong>g the agglomeration-differentiation effects of rivals, we<br />

estimated a model that did not <strong>in</strong>corporate agglomeration benefits <strong>in</strong> the consumer model<br />



(Parameter estimates not shown). <strong>The</strong> results showed that the competition effects are biased<br />

downwards for all format types. <strong>The</strong> bias was more severe for <strong>in</strong>ter-format competition. Hence,<br />

not modeling the agglomeration-differentiation tradeoff can severely underestimate the<br />

competition intensity between stores with different formats. In retrospect, this is expected<br />

because without an agglomeration effect the model would misattribute observed co-location of<br />

stores <strong>in</strong> the data to low competition. S<strong>in</strong>ce the agglomeration benefit is higher when the cluster<br />

of stores has different formats, it is understandable that the <strong>in</strong>ter-format competition effect is<br />

more biased.<br />

F<strong>in</strong>ally, the estimates of cost and unobserved shocks (Table 2(c)) also give some<br />

<strong>in</strong>terest<strong>in</strong>g <strong>in</strong>sights. Although the Supercenter format enjoys a high preference from consumers,<br />

it also tends to <strong>in</strong>cur high costs <strong>in</strong> densely populated neighborhoods. We f<strong>in</strong>d a strong negative<br />

correlation between the location-specific cost shocks and demand shocks (-0.8932). This<br />

conforms to the <strong>in</strong>tuition that locations with high revenue potential are likely to be associated<br />

with high costs.<br />

5. Counterfactual Simulations<br />

We report two counterfactual simulations which help assess the relative importance of zoning and agglomeration effects. In the first, we consider four alternative scenarios: (1) There are no zoning regulations in any market and consumers do not benefit from co-location of stores (i.e., ‘Neither Zoning nor Agglomeration’), (2) Markets have zoning regulations but there are no benefits from co-location (i.e., ‘Only Zoning; No Agglomeration’), (3) Agglomeration benefits exist but there are no zoning regulations (i.e., ‘No Zoning; Only Agglomeration’), and (4) Both zoning and agglomeration benefits exist. For the set of 98 sample markets we estimate the equilibrium CCPs under these alternative market conditions, assuming that the equilibrium number of entrants remains unchanged (an appropriate change in the market-specific terms, $\xi^m$, would ensure this). We use the estimated model parameters and find the fixed point of the system of equations shown in Equation (23). For this, we use the NFXP approach.<br />
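To make the idea of a fixed point in choice probabilities concrete, here is a minimal Python sketch. It is not the system in Equation (23): the logit payoff form and the names `base`, `gamma` and `n_rivals` are illustrative assumptions. It iterates beliefs about rivals' location probabilities until the implied choice probabilities reproduce those beliefs.

```python
import numpy as np

def solve_ccp_fixed_point(base, gamma, n_rivals, tol=1e-10,
                          max_iter=10_000, damping=0.5):
    """Iterate p -> softmax(base + gamma * n_rivals * p) until it stops moving.

    base     : baseline attractiveness of each location (hypothetical values)
    gamma    : net effect of expected rivals in the same location
               (negative: competition dominates; positive: agglomeration)
    n_rivals : number of rival entrants (held fixed, as in the text)
    """
    base = np.asarray(base, dtype=float)
    p = np.full(base.shape, 1.0 / base.size)       # start from uniform beliefs
    for _ in range(max_iter):
        v = base + gamma * n_rivals * p             # expected payoff given beliefs
        v = v - v.max()                             # guard against overflow
        p_new = np.exp(v) / np.exp(v).sum()         # logit choice probabilities
        if np.max(np.abs(p_new - p)) < tol:
            return p_new
        p = damping * p_new + (1.0 - damping) * p   # damped update for stability
    raise RuntimeError("fixed-point iteration did not converge")

ccp = solve_ccp_fixed_point(base=[1.0, 0.5, 0.0], gamma=-0.8, n_rivals=3)
print(ccp)
```

The damping factor is a design choice for this toy mapping; a counterfactual would re-solve the same kind of fixed point after switching the zoning or agglomeration terms on or off.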

Figure 5(a) shows the distribution of inter-store distance across the 98 markets, under the first scenario of ‘Neither Zoning nor Agglomeration’. We see that only 28% of stores co-locate within 1 mile of each other.<sup>25</sup> This level of co-location may be due to concentration of high demand or low cost. When we turn on zoning (‘Only Zoning; No Agglomeration’), 32% of stores are located within 1 mile of each other (Figure 5(b)). This suggests that zoning may force firms to come a little closer to each other but it has very limited direct impact on their co-location behavior. With only agglomeration turned on (Figure 5(c) - ‘No Zoning; Only Agglomeration’), 43% of stores are located within 1 mile of each other, suggesting that agglomeration effects have a substantial impact on co-location. Interestingly, the interaction between zoning and agglomeration benefits is large: when both effects coexist, co-location increases to 60% (Figure 5(d) – ‘Both Zoning and Agglomeration’). This is quite close to the amount of co-location that we observe in our sample data. Thus the impact of zoning on co-location is high only in the presence of agglomeration benefits. Why is there an interaction effect between zoning and agglomeration benefits?<br />
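The size of the interaction can be read off the four reported shares with simple arithmetic. The short Python sketch below assumes an additive decomposition in percentage points, which is an illustrative convention and not something the model imposes.

```python
# Share of stores within 1 mile of a rival (percent), as reported in the text.
share = {
    "neither": 28,
    "zoning_only": 32,
    "agglomeration_only": 43,
    "both": 60,
}

base = share["neither"]
zoning_alone = share["zoning_only"] - base                # effect of zoning by itself
agglomeration_alone = share["agglomeration_only"] - base  # effect of agglomeration by itself
joint = share["both"] - base                              # effect of both together
interaction = joint - (zoning_alone + agglomeration_alone)

print(zoning_alone, agglomeration_alone, joint, interaction)
```

Under this decomposition, zoning alone adds 4 points and agglomeration alone adds 15, yet together they add 32, leaving 13 points attributable to their interaction.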

To understand this interaction, we perform our second counterfactual analysis in a hypothetical market where we gradually tighten the zoning restriction, which restricts the scope for spatial differentiation. For a set of four grocery stores, the optimal locations are shown in Figure 6. In the less restrictive zoning setting, we find that stores are located at the extremes of the commercial zone, suggesting that zoning restrictions constrain the extent of spatial differentiation in this market. When zoning is made more stringent, one would expect that stores would continue to be at the edges of the commercial zone. However, the optimal locations reveal a surprising pattern. When zoning is more restrictive, we find that some stores actually agglomerate. In retrospect, we can understand why this happens. When zoning is relaxed, stores can spread out enough for the benefits of spatial differentiation to be large. However, when zoning is very restrictive, firms cannot differentiate enough; this leads to a discontinuity where stores now recognize that by co-locating they can gain agglomeration benefits, which may outweigh the benefits from differentiation that are constrained by the tight zoning regulations. This explains the high interaction effect of zoning and agglomeration that we find as we proceed from the scenario in Figure 5(a) to the scenario in Figure 5(d).<br />

25 For this counterfactual simulation, we count a store within 1 mi. of a rival as a co-located store. It is plausible that the two stores belong to two neighboring 1 sq. mi. retail location blocks whose commercial centers happen to be within 1 mi. of each other.<br />
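The discontinuity can be illustrated with a deliberately stylized toy in Python. This is not the paper's equilibrium model: the payoff function, the parameters `t`, `a` and `r`, and the single "separation" choice for two symmetric stores are all hypothetical. Each store earns a differentiation benefit that grows with the distance to its rival and an agglomeration benefit that is only available when the stores are close.

```python
import numpy as np

def best_separation(zone_width, t=0.5, a=1.0, r=0.5, n_grid=401):
    """Separation between two stores that maximizes a toy per-store payoff.

    zone_width : width of the retail zone (maximum feasible separation)
    t          : differentiation benefit per unit of distance (hypothetical)
    a          : agglomeration benefit when fully co-located (hypothetical)
    r          : distance at which the agglomeration benefit vanishes
    """
    d = np.linspace(0.0, zone_width, n_grid)               # feasible separations
    payoff = t * d + a * np.clip(1.0 - d / r, 0.0, None)   # differentiation + agglomeration
    return float(d[np.argmax(payoff)])

# Shrink the retail zone and watch the preferred separation.
for z in (4.0, 3.0, 2.5, 1.5, 1.0):
    print(z, best_separation(z))
```

In this toy, stores sit at the edges of the zone while it is wide, and the preferred separation jumps to zero once the zone is too narrow for differentiation to beat the agglomeration benefit, which mirrors the qualitative pattern described above.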

6. Conclusion<br />

The literature on retailer entry and location choices has thus far ignored the agglomeration-differentiation tradeoff. We developed a comprehensive static, structural, simultaneous move game model of firm entry and location choice that disentangles this tradeoff while controlling for several alternative explanations for observed co-location. Taking advantage of a publicly available digital land cover database, NLCD, we are able to control for the effect of zoning on entry and location choices. To control for demand and cost based explanations for co-location, we decompose latent profits into revenue and cost and augment entry and location data with store revenue data. To separate the benefits of agglomeration from the benefits of spatial differentiation, we further decompose revenue into its components of consumer choice based volume and competition based price. We use recent advances in the empirical estimation literature of discrete games to address issues of multiple equilibria in the model and data as well as problems due to slow convergence of the estimation algorithm.<br />

The consumer and price model provided interesting insights about the differences in the agglomeration and competition effects across store formats. These results and the subsequent counterfactual analyses lead to the following takeaways: First, zoning, agglomeration effects, spatial differentiation and format differentiation are all key drivers of observed store location patterns. Second, zoning may force firms to locate closer than they would like but it has little direct effect on co-location of stores. Finally, zoning interacts with agglomeration to drive observed co-location. The interaction between zoning and the agglomeration effect can have a discontinuous impact on the location pattern of stores. This highlights the value of a structural model in understanding how a small perturbation of market characteristics can cause strategic firms to respond in complex and nonlinear ways.<br />

We conclude with a discussion of some key limitations of this paper that warrant future research. First, our identification of the volume and price effects is partially aided by functional form assumptions for how locations of competitors affect volumes and prices differently. This is because we only have price information for a set of stores belonging to one store chain. Nonetheless, this is managerially interesting as it is closer to a realistic scenario where firms usually have more information about themselves than about others. Second, we treat the entry decision in a static equilibrium framework, even though a dynamic model may be more appropriate given that these decisions are made over time. Such a modeling approach requires better data (timing of entry and exits) as well as a richer modeling framework to solve the dynamic game. Finally, we have treated store entry decisions across markets as independent, unlike recent work by Jia (2008), who models the chain entry decision, taking into account the interdependence across markets. However, her modeling approach is restricted to a small number of competing chains and is hard to extend to our grocery market setting that involves a large number of players. These important issues await future research.<br />



Figure 1: Over 45% of big-box grocery stores are within 0.5 mi. of a rival store (Data for 3 U.S. states of NY, OH and PA)<br />

Figure 2(a): An illustrative square market with the geographical space discretized into square blocks or ‘locations’ (blocks labeled 1, 2, 3, …, $L_m$).<br />

Figure 2(b): Due to zoning regulations, firms can only choose among ‘potential retail location’ (Area in white; locations labeled 1, 2, 3, …, $l_m$).<br />


Figure 3(a): Graphical illustration of the standard NPL approach<br />

Figure 3(b): With multiple equilibria in the data, different starting values may give different solutions<br />

Figure 3(c): Depending on the local convergence properties, the contraction mapping may not converge to a fixed point<br />


Figure 4(a): Constructing market boundaries based on visual inspection of residential and commercial pixel density<br />

Figure 4(b): Dividing a rectangular market into a grid of 1 sq. mile blocks or discrete locations<br />

Figure 4(c): Using commercial land pixel data to obtain extent of commercial activity within a location and the commercial center of the location<br />

Figure 4(d): Using ‘Places of Interest’ in Google Earth to check for the presence of big-box stores in commercial locations<br />


[Figure 5: four histograms of inter-store distance; horizontal-axis bins 0.5, 1, 1.5, 2, 2.5, 3, More. The panels are annotated with the share of stores within 1 mile of a rival.]<br />

Figure 5(a): Neither Zoning nor Agglomeration (28%)<br />

Figure 5(b): Only Zoning; No Agglomeration (32%)<br />

Figure 5(c): No Zoning; Only Agglomeration (43%)<br />

Figure 5(d): With Zoning and Agglomeration (60%)<br />


Figure 6: Equilibrium Store Locations in a Simulated Market – Shrinking Retail Zone (Area in White represents retail locations)<br />

Notes: SM – Supermarket format; SS – Superstore format; LA – Limited Assortment format.<br />


| Store Format | Examples of Retailers (note 26) | Total Number of Stores in 98 Sample Markets | Maximum Number of Stores in a Market | Average Store Area (in sq. feet) | Average Annual Store Revenue from Grocery Sales (in $ M) | Average Ratio of Grocery Revenue to Total Store Revenue |
|---|---|---|---|---|---|---|
| Supermarket (SM) | Hi-Low Food Stores, Price Chopper, Vons Market | 84 | 6 | 13,500 | 5.93 | 1 |
| Superstore (SS) | Jewel Food Store, BI-LO, Vons Market, Albertsons, Safeway | 69 | 4 | 35,500 | 15.24 | 1 |
| Ltd. Assort. (LA) | Save-A-Lot, Price Rite, Aldi, Smart & Final | 96 | 5 | 14,500 | 5.23 | 1 |
| Natural Food (NF) | Whole Foods, Trader Joes | 20 | 2 | 10,500 | 9.22 | 1 |
| Food + Drug (FD) | Jewel-Osco, Kroger, Albertsons, Safeway | 103 | 6 | 41,500 | 16.09 | 0.71 |
| Supercenter (SC) | Wal-Mart Supercenter, Super Target, Meijer, Sams Club, Costco | 66 | 3 | 163,000 | 51.84 | 0.62 |

Table 1: Descriptive Statistics of Various Grocery Store Formats<br />

26 Some retailers have more than one type of stores (e.g., Vons, Albertsons, and Safeway). We follow the format classification of individual stores provided by AC Nielsen.<br />


| Group | Variable (note 27) | All formats | Supermarket (SM) | Superstore (SS) | Ltd. Assort. (LA) | Natural Food (NF) | Food + Drug (FD) | Supercenter (SC) |
|---|---|---|---|---|---|---|---|---|
| Travel Cost ($Tvl_{gl}$) | Distance ($d_{gl}$) | -0.1827 | | | | | | |
| | Distance$^2$ ($d_{gl}^2$) | 0.9862*** | | | | | | |
| | $med\_hhI_g \cdot d_{gl}$ | 0.0436 | | | | | | |
| | $med\_age_g \cdot d_{gl}$ | 0.0579 | | | | | | |
| | $min\_d_g \cdot d_{gl}$ | -0.1482* | | | | | | |
| Price | $\ln(pr_{fl})$ | -0.4255 | | | | | | |
| Economies of Scope | $comm_l$ | 1.5468*** | | | | | | |
| Store Agglomeration | $N_l$ | 0.5342** | | | | | | |
| Format Agglomeration | $I^{OF}_{fl}$ | | 0.6329 | 0.4917 | 0.9458** | 1.2031 | 1.2974** | 0.8337** |
| Format Preference | | | -- | -0.3251 | -0.2735** | -0.5809 | 0.3683*** | 0.2196** |
| Customer Value (Equation 12) | $CV_{fl}$ | | 0.3932** | 0.4885 | 0.3826* | 0.5404 | 0.6391** | 0.5673 |

27 Note: * : p < 0.1, ** : p < 0.05, *** : p < 0.01; All significant estimates in bold.<br />

Table 2(a): Consumers’ Shopping Location Choice Based Volume<br />


| Group | Variable (note 28) | All formats | Supermarket (SM) | Superstore (SS) | Ltd. Assort. (LA) | Natural Food (NF) | Food + Drug (FD) | Supercenter (SC) |
|---|---|---|---|---|---|---|---|---|
| Format-specific Pricing Ability | Intrinsic Ability | | 1.4468* | 1.7413* | 0.9685* | 1.2890 | 1.4963* | 1.1372* |
| | Per Capita Income in 2mi. radius ($x_l$) | 0.0861 | | | | | | |
| Competition Effect | SM; 0-1mi. coeff. | | -1.4895*** | | | | | |
| | SS; 0-1mi. coeff. | | -1.1043** | -2.5038** | | | | |
| | LA; 0-1mi. coeff. | | -0.4712* | -0.5620* | -1.7921** | | | |
| | NF; 0-1mi. coeff. | | -0.8095 | -1.2597 | -0.5328 | -3.6044 | | |
| | FD; 0-1mi. coeff. | | -0.5991** | -1.6469** | -0.8129** | -1.3031 | -4.2357** | |
| | SC; 0-1mi. coeff. | | -0.6344 | -0.3609* | -0.2315 | -0.3988 | -1.014* | -3.9460* |
| | 1-2mi. multiplier ($\kappa_2$) | 0.5806*** | | | | | | |
| | 2-3mi. multiplier ($\kappa_3$) | 0.3247*** | | | | | | |
| | 3-4mi. multiplier ($\kappa_4$) | -0.0521 | | | | | | |
| | 4-5mi. multiplier ($\kappa_5$) | 0.0173 | | | | | | |

28 Note: * : p < 0.1, ** : p < 0.05, *** : p < 0.01; All significant estimates in bold.<br />

Table 2(b): Competition Based Price Index<br />


| Group | Variable (note 29) | All formats | Supermarket (SM) | Superstore (SS) | Ltd. Assort. (LA) | Natural Food (NF) | Food + Drug (FD) | Supercenter (SC) |
|---|---|---|---|---|---|---|---|---|
| | Cost Intercept | | -- | -0.2451 | 0.0521 | -0.4489 | 0.3902 | -0.5813** |
| | Commercial Activity ($comm_l$) | | -0.0774 | 0.2917* | -0.3265** | 0.0592 | -0.1999* | 0.3711** |
| Population | 0-1mi. coefficient | | -0.5996 | -0.6233** | -0.2908* | -0.1370 | -0.4391** | -1.2028** |
| | 1-2mi. multiplier | 0.7503** | | | | | | |
| | 2-3mi. multiplier | 0.3132* | | | | | | |
| | 3-4mi. multiplier | 0.0830 | | | | | | |
| | 4-5mi. multiplier | -0.1114 | | | | | | |
| Per Capita Income | 0-1mi. coefficient | | -0.7830 | -0.5548* | -1.2619* | 0.3012 | -0.5693** | -0.8896* |
| | 1-2mi. multiplier | 0.2815* | | | | | | |
| | 2-3mi. multiplier | -0.1002 | | | | | | |
| | 3-4mi. multiplier | -0.0036 | | | | | | |
| | 4-5mi. multiplier | 0.0416 | | | | | | |
| Common Unobserved Location-Level Shocks | Std., Price Shock: $\sigma_p$ | 0.7783 | | | | | | |
| | Std., Revenue Shock: $\sigma_r$ | 1.0928** | | | | | | |
| | Std., Cost Shock: $\sigma_c$ | 1.6041* | | | | | | |
| | Revenue-Cost Corr.: $\rho$ | 0.8932** | | | | | | |
| Common Unobserved Market-level Cost | $\mu(\xi)$ | -3.2962*** | | | | | | |
| | $\sigma(\xi)$ | 1.3901*** | | | | | | |

29 Note: * : p < 0.1, ** : p < 0.05, *** : p < 0.01; All significant estimates in bold.<br />

Table 2(c): Cost and Common Unobserved Components<br />


Appendix A<br />

Expected total number of competing stores in a location:<br />

$$E\left[N_l\right] = N^m \sum_{f} p_{fl} \qquad \text{(A.1)}$$

Expectation that a location will have stores with other formats besides format-f:<br />

$$E\left[I^{OF}_{fl}\right] = 1 - \text{prob}\left(\text{location has only format-}f\text{ stores}\right) = 1 - p_{fl} \prod_{f' \neq f} \left(1 - p_{f'l}\right) \qquad \text{(A.2)}$$

Expected number of format-f’ rivals in distance band b around a format-f store that is in location l (interformat competition):<br />

$$E\left[N_{f'bl}\right] = \left(N^m \sum_{j \in l_b} p_{f'j}\right); \quad f' \neq f \qquad \text{(A.3)}$$

where $l_b$ is the set of locations in distance band b around location l.<br />

Expected number of format-f rivals in distance band b around a format-f store that is in location l (intraformat competition):<br />

When accounting for the number of rivals with the same format, we need to discount the choice probability of the focal firm, conditional on its decision to enter the market:<br />

$$E\left[N_{fbl}\right] = \left(N^m \sum_{j \in l_b} p_{fj}\right) - \left(1 \cdot \sum_{j \in l_b} \left(p_{fj} \mid f \text{ Enters } m\right)\right) \qquad \text{(A.4)}$$

Note that the probability that a f-format firm enters the market is simply $\sum_{l=1}^{l_m} p_{fl}$. Hence, the probability $\left(p_{fj} \mid f \text{ Enters } m\right)$ is given by $p_{fj} \big/ \sum_{l=1}^{l_m} p_{fl}$ and Equation (A.4) can be rewritten as:<br />

$$E\left[N_{fbl}\right] = \left(N^m \sum_{j \in l_b} p_{fj}\right) - \left(1 \cdot \sum_{j \in l_b} \frac{p_{fj}}{\sum_{l=1}^{l_m} p_{fl}}\right) \qquad \text{(A.5)}$$
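As a rough numerical illustration of Equations (A.1), (A.3) and (A.5), the Python sketch below computes the three expectations from a small matrix of choice probabilities. The array layout (`p[f, l]` for format f and location l), the function names and the example numbers are assumptions made for the sketch, not part of the paper.

```python
import numpy as np

def expected_stores_at(p, n_entrants, l):
    """(A.1): expected number of competing stores in location l."""
    return n_entrants * p[:, l].sum()

def expected_other_format_rivals(p, n_entrants, f_rival, band):
    """(A.3): expected number of format-f' rivals in the band's locations."""
    return n_entrants * p[f_rival, band].sum()

def expected_same_format_rivals(p, n_entrants, f, band):
    """(A.5): same-format rivals in the band, net of the focal firm itself."""
    prob_enter = p[f, :].sum()                 # probability an f-format firm enters
    own_share = p[f, band].sum() / prob_enter  # focal firm's conditional probability
    return n_entrants * p[f, band].sum() - own_share

# Made-up CCPs for 2 formats (rows) and 3 locations (columns); 4 entrants.
p = np.array([[0.2, 0.1, 0.1],
              [0.3, 0.2, 0.1]])
band = [1, 2]   # hypothetical set of locations in one distance band around location 0

print(expected_stores_at(p, 4, l=0))
print(expected_other_format_rivals(p, 4, f_rival=1, band=band))
print(expected_same_format_rivals(p, 4, f=0, band=band))
```

With these made-up inputs the three expectations work out to 2.0, 1.2 and 0.3 (up to floating-point rounding).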


Appendix B<br />

Step 0: Initial Population:<br />

Generate a set of T vectors of starting values for retailers’ beliefs about rivals’ CCPs for location choices, $\left[P_0^1; P_0^2; \ldots; P_0^T\right]$. Also, create an initial guess for the parameter vector, $\theta = \left(\{\alpha, \beta, \gamma, \sigma, \rho\}\right)$.<br />

Step 1: Locally Contractive, q-NPL Iteration:<br />

For the likelihood maximization, set up an internal loop to do the following for each of the T CCP vectors:<br />

Given the current parameter values, pick a large number of Halton draws of price, revenue and cost shocks for all retail locations. Obtain the location choice probabilities (Equations 20 - 22) and the market specific cost parameters (Equations 26 - 27). Next, calculate the price indices of firms, sans the unobserved component, for the chosen locations and with the observed configuration of stores. Compare the price estimates with the price data to obtain the price shocks at the chosen locations of the store chain for which we have price data, $\left(\omega^{pr}_{obv} \mid \theta\right)$. Also, calculate the revenues of stores, sans the unobserved component, for the chosen locations and with the observed store configuration. Compare the revenue estimates with the revenue data for all stores to obtain the revenue shocks of firms in their chosen locations, $\left(\omega^{r}_{obv} \mid \theta\right)$. We now have all the components of the likelihood function.<sup>30</sup><br />

Maximize the pseudo likelihood (Equation 28) to obtain a set of T vectors of parameter estimates: $\Theta_n^t = \arg\max_{\Theta} \left(L\left(P_{n-1}^t, \Theta\right)\right)$, and a new population of CCPs using the q-NPL operator: $\hat{P}_n^t = \Lambda^q\left(P_{n-1}^t, \Theta_n^t\right)$.<br />

Within each market, normalize the CCPs for each store format so that the CCPs of all formats add up to one. Essentially, for each format f, and market location l, we have:<br />

$$\hat{P}_{fln}^t = \frac{\Lambda^q\left(P_{fln-1}^t, \Theta_n^t\right)}{\sum_{f=1}^{F} \sum_{l=1}^{l_m} \Lambda^q\left(P_{fln-1}^t, \Theta_n^t\right)} \qquad \text{(B.1)}$$

30 We acknowledge that there is a potential selection bias because we only observe revenue data for locations that were chosen. Ellickson and Misra (2007) propose a selection correction function in their application where supermarkets choose from one of three pricing strategies. However, their approach suffers from a curse of dimensionality in cases where the cardinality of firms’ action space is large, as is the case for firms choosing from multiple locations within a market.<br />

Step 2: Selection of Parents - Based on their fitness, draw, with replacement, T ‘mother’ CCP vectors and T ‘father’ CCP vectors from the set, $\left[\hat{P}_n^1; \hat{P}_n^2; \ldots; \hat{P}_n^T\right]$, and form couples or Parents. CCPs with high likelihood values, $L\left(\hat{P}_n^t, \Theta_n^t\right)$, and those closer to convergence (absolute value of $\left(\hat{P}_n^t - P_{n-1}^t\right)$ closer to zero) are considered more fit to continue. In our problem, we use the following fitness criterion:<br />

$$h\left(\hat{P}_n^t\right) = \lambda_1 \ln\left[L\left(\hat{P}_n^t, \Theta_n^t\right)\right] - \lambda_2 \left\lVert \hat{P}_n^t - P_{n-1}^t \right\rVert \qquad \text{(B.2)}$$

where $\lambda_1$ and $\lambda_2$ are small positive constants. The t-th CCP vector gets selected with the probability:<br />

$$S^t = \frac{\exp\left(h\left(\hat{P}_n^t\right)\right)}{\sum_{j=1}^{T} \exp\left(h\left(\hat{P}_n^j\right)\right)} \qquad \text{(B.3)}$$

Now, we have the set of couples: $\left[\left(\hat{P}_n^{1'}, \hat{P}_n^{1''}\right); \left(\hat{P}_n^{2'}, \hat{P}_n^{2''}\right); \ldots; \left(\hat{P}_n^{T'}, \hat{P}_n^{T''}\right)\right]$<br />

Step 3: Crossover and Mutation – Obtain an offspring from each couple as follows:<br />

$$P_n^t = D \bullet \left(\hat{P}_n^{t'} + Z_n \bullet \delta_n \bullet \hat{P}_n^{t'}\right) + \left(1 - D\right) \bullet \left(\hat{P}_n^{t''} + Z_n \bullet \delta_n \bullet \hat{P}_n^{t''}\right) \qquad \text{(B.4)}$$

where D is a vector of indicators for the identity of the parent who provides each element of the CCPs. Its elements are i.i.d. with $\Pr\left(D_j = 1\right) = 0.5$ for the j-th element. $Z_n$ is another vector of indicators for the identity of the elements of the CCPs which undergo mutation. Its elements are also i.i.d. with $\Pr\left(Z_{jn} = 1\right) = 0.5^n$. Hence, with multiple iterations, as we get closer to the global optimum, we allow the amount of mutations to reduce to zero. Finally, $\delta_n$ is a vector whose elements represent the magnitude of a mutation. It is also defined such that its elements go to zero with multiple iterations. Specifically, we use: $\delta_{jn} \in U\left(-0.5^n, 0.5^n\right)$.<br />

As with Step 1, within each market, again normalize the CCPs so that the CCPs of all formats add up to one. Now, we have the new set of CCPs, $\left[P_n^1; P_n^2; \ldots; P_n^T\right]$.<br />

Iterate Steps 1-3 until the set of CCPs converges.
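The selection, crossover and mutation steps can be sketched in a few lines of Python. This is only an illustration of Steps 2 and 3 under stated assumptions: the likelihood values and convergence distances are random placeholders (Step 1, the q-NPL update, is not implemented), a single CCP vector stands in for one market, and the shrinking mutation rate is read as $0.5^n$.

```python
import numpy as np

def selection_probs(log_lik, dist, lam1=0.1, lam2=0.1):
    """Fitness (B.2) and selection probabilities (B.3) for T candidate CCP vectors."""
    h = lam1 * log_lik - lam2 * dist     # high likelihood and small change are fitter
    w = np.exp(h - h.max())              # subtract max for numerical stability
    return w / w.sum()

def crossover_mutate(mother, father, n, rng):
    """One offspring as in (B.4), followed by renormalization to sum to one."""
    D = rng.random(mother.shape) < 0.5                # which parent supplies each element
    Z = rng.random(mother.shape) < 0.5 ** n           # which elements mutate
    delta = rng.uniform(-0.5 ** n, 0.5 ** n, size=mother.shape)  # mutation size
    child = np.where(D, mother * (1 + Z * delta), father * (1 + Z * delta))
    return child / child.sum()

rng = np.random.default_rng(0)
T, K = 4, 6                                           # population size, CCP vector length
population = rng.dirichlet(np.ones(K), size=T)        # placeholder CCP vectors
log_lik = rng.normal(-100.0, 5.0, size=T)             # placeholder log-likelihoods
dist = rng.random(T)                                  # placeholder convergence distances

S = selection_probs(log_lik, dist)
mothers = population[rng.choice(T, size=T, p=S)]      # draw with replacement
fathers = population[rng.choice(T, size=T, p=S)]
children = np.array([crossover_mutate(m, f, n=1, rng=rng)
                     for m, f in zip(mothers, fathers)])
```

Because both the mutation probability and the mutation range shrink geometrically in the iteration index `n`, later generations are perturbed less and less.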


Appendix C<br />

The following steps explain the technical operations involved in extracting commercial land use pixel point data from NLCD. This is the authors’ original approach. However, a more efficient approach may be possible.<br />

1. Open NLCD data in ArcGIS<br />

2. Zoom in to the market area of interest and select the data frame for further processing<br />

3. Change coordinate system to WGS 1984<br />

4. Reclassify the raster data to show only commercial land pixel points<br />

5. Convert the reclassified raster data into Point Features and save as a Shapefile<br />

6. Convert the saved Shapefile into a kml file using shp2kml software. The kml file can be opened in Google Earth (GE), allowing us to see the pixel point data on GE<br />

7. Make a copy of the saved kml file and rename the file from “.kml” to “.xml”. This xml file can be opened in Excel and the spreadsheet will show the coordinates (latitude and longitude) of each pixel point, which may be used for further analysis<br />

8. The count of these pixel points within each 1 sq. mi. block market location gives the measure for the intensity of commercial activity in the location and the mean of the coordinates of the pixel points within the location gives the commercial center of the location<br />
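Step 8 can be sketched in Python as follows. The sketch assumes the pixel coordinates have already been converted to miles measured from the market's south-west corner (the steps above produce latitude and longitude, so that conversion is left out), and the function name and example points are hypothetical.

```python
import numpy as np

def summarize_blocks(x, y, block_size=1.0):
    """Count commercial pixel points per square block and find each block's center.

    x, y       : pixel-point coordinates in miles from the market's SW corner
    block_size : side of a block in miles (1 sq. mi. blocks by default)
    Returns {(row, col): {"count": int, "center": (x_mean, y_mean)}}.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cols = np.floor(x / block_size).astype(int)    # block index along x
    rows = np.floor(y / block_size).astype(int)    # block index along y

    points_by_block = {}
    for r, c, xi, yi in zip(rows, cols, x, y):
        points_by_block.setdefault((int(r), int(c)), []).append((xi, yi))

    summary = {}
    for key, pts in points_by_block.items():
        center = np.mean(pts, axis=0)              # mean coordinates = commercial center
        summary[key] = {"count": len(pts),
                        "center": (float(center[0]), float(center[1]))}
    return summary

blocks = summarize_blocks(x=[0.2, 0.4, 1.5], y=[0.1, 0.3, 0.5])
print(blocks)
```

In this example the first block holds two points centered near (0.3, 0.2) and the neighboring block holds one point at (1.5, 0.5); blocks with no commercial pixels simply do not appear in the result.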

In their classification of land types, NLCD 2001 combines high density residential land and commercial land but NLCD 1992 separates them. Hence, we match the two data sets using ArcGIS software to separate the pixel data for all residential land areas from land areas with commercial activity in 2001. We are able to do this separation because land areas which were high density residential in 1992 are unlikely to convert to commercial land areas by 2001, and vice versa. In the rare instances where an area that was low-density residential in 1992 was classified as commercial land in the 2001 data, we do a quick visual inspection of the geographical area using Google Earth to confirm whether that area is truly commercial land or has converted into high density residential land.<br />


References<br />

Aguirregabiria, V., and P. Mira (2005), “A Genetic Algorithm for the Structural Estimation of<br />

Games with Multiple Equilibria,” Work<strong>in</strong>g Paper.<br />

Aguirregabiria, V., and G. Vicent<strong>in</strong>i (2006), “Dynamic Spatial Competition between Multi-Store<br />

Firms,” Work<strong>in</strong>g Paper.<br />

Aguirregabiria, V., and P. Mira (2007), “Sequential Estimation of Dynamic Discrete Games,”<br />

Econometrica, 75, 1, 1 - 53.<br />

Aguirregabiria, V., P. Bajari, M. Draganska, L. E<strong>in</strong>av, D. Horsky, S. Misra, S. Narayanan, Y.<br />

Orhun, P. Reiss, K. Seim, V. S<strong>in</strong>gh, R. Thomadsen, and T. Zhu (2008), “Discrete Choice<br />

Models with Strategic Interactions,” Market<strong>in</strong>g Letters, 19, 399-416.<br />

Arentze, T.A., O. H. Oppewal, and H.J.P. Timmermans (2005), “A Multipurpose Shopp<strong>in</strong>g Trip<br />

model to Assess Retail <strong>Agglomeration</strong> Effects,” Journal of Market<strong>in</strong>g Research, 42<br />

(February), 109-115.<br />

Bajari, P., Benkard, L., and Lev<strong>in</strong>, J. (2007), “Estimat<strong>in</strong>g Dynamic Models of Imperfect<br />

Competition,” Econometrica 75, 5, 1331 - 1370.<br />

Berry, S. (1992), “Estimation of a Model of Entry <strong>in</strong> the Airl<strong>in</strong>e Industry,” Econometrica, 60.<br />

889 – 917.<br />

Berry, S., and E. Tamer (2006): “Identification <strong>in</strong> Models of Oligopoly Entry,” Advances <strong>in</strong><br />

Economics and Econometrics: <strong>The</strong>ory and Applications, N<strong>in</strong>th World Congress, Vol. II.<br />

Bester, H. (1998), “Quality Uncerta<strong>in</strong>ty Mitigates Product <strong>Differentiation</strong>,” RAND Journal of<br />

Economics, 29 (W<strong>in</strong>ter), 828-844.<br />

Bresnahan, T., and P. Reiss (1991), “Entry and Competition <strong>in</strong> Concentrated Markets,” Journal<br />

of Political Economy, 99, 977-1009.<br />

Chan, T. Y., V. Padmanabhan, and P. B. Seetharaman (2007), “An Econometric Model of<br />

Location and Pric<strong>in</strong>g <strong>in</strong> the Gasol<strong>in</strong>e Market,” Journal of Market<strong>in</strong>g Research, 44, 4, 622<br />

- 635.<br />

Ciliberto, F. and E. Tamer (2009), “Market Structure and Multiple Equilibria <strong>in</strong> Airl<strong>in</strong>e<br />

Markets,” Econometrica, 77, 6, 1791-1828.<br />

Datta S. and K. Sudhir (2011), “Does Reduc<strong>in</strong>g Spatial <strong>Differentiation</strong> Increase Product<br />

<strong>Differentiation</strong>? Effects of Zon<strong>in</strong>g on Retail Entry and Format Variety,” forthcom<strong>in</strong>g <strong>in</strong><br />

Quantitative Market<strong>in</strong>g and Economics.<br />

Draganska, M., Mazzeo, M., and Seim, K. (2009), “Beyond Pla<strong>in</strong> Vanilla: Model<strong>in</strong>g Jo<strong>in</strong>t<br />

Product Assortment and Pric<strong>in</strong>g Decisions,” Quantitative Market<strong>in</strong>g and Economics, 7, 2,<br />

105 - 146.<br />

Duan, A.J. and C. F. Mela (2009), “<strong>The</strong> Role of Spatial Demand on Outlet Location and<br />

Pric<strong>in</strong>g,” Journal of Market<strong>in</strong>g Research, 46, 2, 260 – 278.<br />

56


Dudey, M. (1990), “Competition by Choice: <strong>The</strong> Effect of Consumer Search on Firm Location<br />

Decisions,” <strong>The</strong> American Economic Review, 80 (5), 1092-1104.<br />

Ellickson, P. B., and S. Misra (2011), “Enrich<strong>in</strong>g Interactions: Incorporat<strong>in</strong>g Revenue and Cost<br />

Data <strong>in</strong>to Static Discrete Games,” Quantitative Market<strong>in</strong>g and Economics,<br />

(Forthcom<strong>in</strong>g).<br />

Ellickson, P. B. and S. Misra (2008), “Supermarket Pricing Strategies,” Marketing Science, 27, 5, 811-828.<br />

Fischer, J.H., and J.E. Harrington (1996), “Product Variety and Firm Agglomeration,” RAND Journal of Economics, 27, 281-309.<br />

Fox, J. (2007), “Semiparametric Estimation of Multinomial Discrete Choice Models Using a Subset of Choices,” RAND Journal of Economics, 38, 4, 1002-1019.<br />

Fox, E. J., S. Postrel and A. McLaughlin (2007), “The Impact of Retail Location on Retailer Revenues: An Empirical Investigation,” Working Paper.<br />

Holmes, T. (2008), “The Diffusion of Wal-Mart and Economies of Density,” forthcoming in Econometrica.<br />

Homer, C., C. Huang, L. Yang, B. Wylie, and M. Coan (2004), “Development of a 2001 National Land-Cover Database for the United States,” Photogrammetric Engineering & Remote Sensing, 70, 7, 829-840.<br />

Jia, P. (2008), “What Happens When Wal-Mart Comes to Town: An Empirical Analysis of the Discount Retailing Industry,” Econometrica, 76, 6, 1263-1316.<br />

Kasahara, H. and K. Shimotsu (2008), “Sequential Estimation of Structural Models with a Fixed Point Constraint,” Working Paper.<br />

Konishi, H. (2005), “Concentration of Competing Retail Stores,” Journal of Urban Economics, 58, 488-512.<br />

Mazzeo, M. (2002), “Product Choice and Oligopoly Market Structure,” RAND Journal of Economics, 33, 221-242.<br />

Orhun, Y. (2005), “Spatial Differentiation in the Supermarket Industry,” Working Paper.<br />

Pakes, A., M. Ostrovsky, and S. Berry (2007), “Simple Estimators for the Parameters of Discrete Dynamic Games, with Entry/Exit Examples,” RAND Journal of Economics, 38, 373-399.<br />

Pesendorfer, M., and Schmidt-Dengler, P. (2008), “Asymptotic Least Squares Estimators for Dynamic Games,” Review of Economic Studies, 75, 901-928.<br />

Seim, K. (2006), “An Empirical Model of Firm Entry with Endogenous Product-Type Choices,” RAND Journal of Economics, 37 (3), 619-640.<br />

Shlay, A. B. and P. H. Rossi (1982), “Keeping up the Neighborhood: Estimating Net Effects of Zoning,” American Sociological Review, 46, 703-719.<br />

Stahl, K. (1982), “Differentiated Products, Consumer Search, and Locational Oligopoly,” The Journal of Industrial Economics, 31 (1-2), 97-113.<br />

Su, C., and K. L. Judd (2010), “Constrained Optimization Approaches to Estimation of Structural Models,” Working Paper.<br />

Thomadsen, R. (2007), “Product Positioning and Competition: The Role of Location in the Fast Food Industry,” Marketing Science, 26, 6, 792-804.<br />

Varian, H. R. (1980), “A Model of Sales,” American Economic Review, 70, 651-659.<br />

Vitorino, M. A. (2011), “Empirical Entry Games with Complementarities: An Application to the Shopping Center Industry,” Working Paper.<br />

Vogelmann, J.E., S.M. Howard, L. Yang, C.R. Larson, B.K. Wylie, and J.N. Van Driel (2001), “Completion of the 1990’s National Land Cover Data Set for the Conterminous United States,” Photogrammetric Engineering & Remote Sensing, 67, 6, 650-662.<br />

Watson, R. (2005), “Entry and Location Choice in Eyewear Retailing,” mimeo., University of Texas-Austin.<br />

Wernerfelt, B. (1994), “Selling Formats for Search Goods,” Marketing Science, 13 (3), 298-309.<br />

Wolinsky, A. (1983), “Retail Trade Concentration Due to Consumers’ Imperfect Information,” The Bell Journal of Economics, 14 (1), 275-282.<br />

Zhu, T., V. Singh, and M. Manuszak (2009), “Market Structure and Competition in the Retail Discount Industry,” Journal of Marketing Research, 46, 4, 453-466.<br />

Zhu, T. and V. Singh (2009), “Spatial Competition with Endogenous Location Choices: An Application to Discount Retailing,” Quantitative Marketing and Economics, 7, 1, 1-35.<br />
