data sets each transaction is labeled with codes identifying the members of the exchange who made the transaction. In most cases the member is acting as a broker, i.e. they are handling a trade for another institution that is not a member of the exchange. In other cases the member may trade for their own account. Thus a single membership code may lump together trades from many different institutions, originated by many different trading strategies. As a consequence several hidden orders from different trading accounts may be active with the same broker at the same time, making it impossible to be certain that all orders are correctly identified.

The detection problem is aided by the fact that the size of hidden orders has a heavy-tailed distribution, and large hidden orders can cause dramatic changes in the rate at which a given membership code participates in trades to either buy or to sell. The method we use was originally developed for detection of local stationary regions in physiological time series [28]. As we use it here, it looks for time intervals in the time series of orders coming through a given membership code when the firm acts as a net buyer or seller at an approximately constant rate. As in reference [27] we interpret these series of trades as belonging to hidden orders.
This method can never be perfectly accurate, but a variety of tests performed in reference [27] suggest that the reconstruction is good enough to recover the most important statistical properties of hidden orders reasonably well. An important advantage of this approach is that we are able to study large samples of hidden orders coming from the whole market rather than only a subset belonging to a specific institution.

The paper is organized as follows. In Section II we discuss our data sets, the algorithm for detecting hidden orders and the investigated variables. In Section III we discuss the statistical properties of the variables characterizing the hidden orders. In Sections IV and V we present our empirical results on the impact of hidden orders and on their trading profile, respectively. Section VI concludes.

II. DATA AND INVESTIGATED VARIABLES

Our databases contain the on-book (SETS) market transactions of the London Stock Exchange (LSE) from January 2002 to December 2004 and the electronic open-book market (SIBE) of the Spanish Stock Exchange (BME, Bolsas y Mercados Españoles) from January 2001 to December 2004.
Roughly 62% of the transactions at the LSE are executed in the open book market and roughly 90% of the transactions at the BME are executed in the electronic market.

We have initially considered a subset consisting of the most heavily traded stocks in the two markets, 74 stocks traded in the LSE and 23 stocks traded in the BME. For both markets we have considered exchange members who made at least one trade per day for at least 200 trading days per year and with a minimum of 1000 transactions per year. This filter yielded approximately 60 exchange member firms per stock. We then applied the algorithm for detecting hidden orders described in Ref. [27], which we have already discussed, to identify hidden orders that consist of at least ten transactions. It is worth noting that the detected patches are not necessarily composed of the same type of trades (buy or sell) but that at least 75% of the transacted volume in the patch must have the same sign. The algorithm detected 90,393 hidden orders in the LSE and 55,309 in the BME.

This study is based entirely on trades that take place through a continuous double auction.
“Continuous” refers to the fact that trading takes place continuously and asynchronously, and “double” to the fact that both buyers and sellers are allowed to place and cancel orders at any time. There are two fundamentally different ways to execute an order in such a market. One is to use a limit order, in which an order is placed inside the order book, which is essentially a list of unexecuted orders at different prices. The other is to place a market order, which we define as any order that results in an immediate transaction. Every transaction involves a market order transacting against a limit order. A given real order might act as both, e.g. part of it might result in an immediate transaction and part of it might be left in the order book. We only consider transactions, so in the example above we would treat the first part as a market order and treat the second part as a limit order, but the second part will enter our analysis only if it eventually results in a transaction. The LSE database allows us to identify whether the initiator of the transaction was the buyer or the seller.
For BME this information is not available and we infer it with the Lee and Ready algorithm [29].

A hidden order is characterized by several variables. These are
• The execution time $T$ (in seconds) of the hidden order, measured as the trading time interval between the first and the last transaction of the hidden order.
• The number $N$ of transactions of the hidden order. We consider hidden orders of length $N > 10$.
• The volume $V$ of the hidden order defined as
$$V = \sum_{j=1}^{N} v_j , \tag{1}$$
where $v_j$ is the signed volume of each transaction of the hidden order. For buy trades $v_j > 0$ and for sell trades $v_j < 0$. We consider the hidden order to be a buy order if $V > 0$ and a sell order if $V < 0$. The buying/selling nature of a hidden order is thus encoded in its sign, $\epsilon = \mathrm{sign}(V)$. The volume is the product of the number of shares times the price and is measured in Pounds (LSE) or in Euro (BME).
• The volume fraction of market orders $f_{mo}$. A hidden order can be implemented with very different liquidity strategies, i.e. with different compositions of market and limit orders. In order to quantify this we define the fraction (in volume) of market orders within a hidden order as
$$f_{mo} = \frac{\sum_{j=1}^{N} |v_{j,mo}|}{\sum_{j=1}^{N} |v_j|} , \tag{2}$$
where $v_{j,mo}$ is the traded volume at each transaction done through market orders. Values of $f_{mo}$ close to zero mean that the broker completed the hidden order by using mainly limit orders, while values of $f_{mo}$ close to one imply the broker used mainly market orders during the execution of the hidden order.
• The participation rate $\alpha$ of a hidden order defined as
$$\alpha = \frac{\sum_{i=1}^{N} |v_i|}{V_M} , \tag{3}$$
where $V_M$ is the unsigned volume of the stock traded in the market concurrently with the hidden order. Values of $\alpha$ close to zero imply the hidden order was negligible compared to the activity in the market, while values of $\alpha$ close to one mean that most of the activity in the market came from the transactions of the hidden order.

Summarizing, we expect that the market impact of a hidden order is a function
$$r = f(N, V, T, f_{mo}) , \tag{4}$$
plus possibly other variables specific to the stock, such as the participation rate, the capitalization, the volatility, or the spread.
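The variables defined in Eqs. (1)-(3) can be computed directly from the list of trades attributed to a hidden order. The following Python fragment is a minimal sketch, not the authors' code: the `Trade` record, the function `hidden_order_variables` and the argument `market_volume` are hypothetical names introduced for illustration. It assumes that $V_M$ counts all unsigned volume traded in the stock while the hidden order is active, including the hidden order's own trades.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    """One transaction attributed to a hidden order (hypothetical record)."""
    time: float             # trading time of the transaction, in seconds
    signed_volume: float    # shares times price; > 0 for a buy, < 0 for a sell
    is_market_order: bool   # True if the member's side was a market order


def hidden_order_variables(trades, market_volume):
    """Return T, N, V, sign, f_mo and alpha for one hidden order.

    market_volume is V_M, the unsigned volume traded in the stock
    concurrently with the hidden order (assumed to be supplied by the caller).
    """
    if not trades:
        raise ValueError("a hidden order needs at least one trade")
    if market_volume <= 0:
        raise ValueError("market_volume must be positive")

    ordered = sorted(trades, key=lambda t: t.time)

    # Execution time: interval between the first and the last transaction.
    T = ordered[-1].time - ordered[0].time
    # Number of transactions.
    N = len(ordered)
    # Eq. (1): signed volume of the hidden order, and its sign.
    V = sum(t.signed_volume for t in ordered)
    sign = (V > 0) - (V < 0)

    unsigned_total = sum(abs(t.signed_volume) for t in ordered)
    if unsigned_total == 0:
        raise ValueError("total traded volume is zero")
    unsigned_market = sum(
        abs(t.signed_volume) for t in ordered if t.is_market_order
    )

    # Eq. (2): volume fraction executed through market orders.
    f_mo = unsigned_market / unsigned_total
    # Eq. (3): participation rate relative to concurrent market volume.
    alpha = unsigned_total / market_volume

    return {"T": T, "N": N, "V": V, "sign": sign, "f_mo": f_mo, "alpha": alpha}
```

For brevity, a toy order with four trades of signed volume 100, 200, -50 and 150, of which only the second is a limit order, gives $V = 400$, $f_{mo} = 300/500$ and, for `market_volume=2000`, $\alpha = 500/2000$. Hidden orders in the analysis itself have $N > 10$.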
We will now try to simplify this functional dependence by understanding some of the relationships between the dependent variables and by conditioning on some of them in our analysis. Note that in all the analyses and figures we compute error bars as standard errors. It should be borne in mind that this procedure underestimates the errors, because of the heavy tails of the fluctuations and possible long-memory properties of the data.

III. STATISTICAL PROPERTIES OF HIDDEN ORDERS

We investigate the statistical properties of the variables characterizing hidden orders. Ref. [27] considered a set of the 3 most capitalized stocks traded at the BME and studied the probability distribution of the variables characterizing the hidden orders and the scaling relations between these variables. In Ref. [27] no restriction on the length or on the fraction of market orders was set on the hidden orders. The authors of Ref. [27] found that the distribution of hidden order size is fat tailed and consistent with a distribution with infinite variance. They also showed that this broad distribution is due to a heterogeneity of scales among different brokerage firms rather than to the heterogeneity of scales within the hidden orders of each brokerage firm.
By using Principal Component Analysis (PCA) on the logarithm of the variables characterizing the hidden orders, it was found that $N$, $V$ and $T$ are related through scaling relationships
$$N \sim V^{g_1} , \quad T \sim V^{g_2} , \quad N \sim T^{g_3} , \tag{5}$$
where $g_1 \simeq 1$, $g_2 \simeq 2$ and $g_3 \simeq 0.66$ for 3 highly capitalized stocks in the BME and including all hidden orders.

We repeat the two-dimensional PCA of Ref. [27] on our much larger data set. Figure 1 shows the value of the three exponents for all the stocks as a function of the number of hidden orders per year. We observe that for stocks with a small number of hidden orders the heterogeneity in the value of the exponents is quite large, while, as the number of hidden orders detected by the algorithm increases, the exponent estimates become less noisy and tend to converge to similar values. Moreover, for BME stocks there is a clear trend of the exponents as a function of the number of hidden orders. In order to measure market impact in a statistically reliable way, we pool together data from different stocks. We therefore need a homogeneous sample of stocks. To this end, in the following analysis we restrict our dataset to those stocks for which our algorithm detects at least 250 orders per year.
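As an illustration of how a scaling exponent such as $g_1$ in Eq. (5) can be read off a two-dimensional PCA, the following Python fragment computes the slope of the first principal axis of the point cloud $(\log x, \log y)$. It is a minimal sketch and not the code of Ref. [27]: the function name `pca_scaling_exponent` is hypothetical, and it uses the closed-form eigen-decomposition of the 2×2 covariance matrix rather than a general PCA routine.

```python
import math


def pca_scaling_exponent(x, y):
    """Estimate g in y ~ x**g from a two-dimensional PCA of (log x, log y).

    x and y must be positive (e.g. |V| and N of each hidden order).
    Returns (g, fraction of variance explained by the first component).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    log_x = [math.log(value) for value in x]
    log_y = [math.log(value) for value in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n

    # Entries of the sample covariance matrix of the log-variables.
    sxx = sum((a - mean_x) ** 2 for a in log_x) / (n - 1)
    syy = sum((b - mean_y) ** 2 for b in log_y) / (n - 1)
    sxy = sum((a - mean_x) * (b - mean_y)
              for a, b in zip(log_x, log_y)) / (n - 1)
    if sxy == 0.0:
        raise ValueError("log-variables are uncorrelated; slope is undefined")

    # Largest eigenvalue of [[sxx, sxy], [sxy, syy]].
    root = math.sqrt((sxx - syy) ** 2 + 4.0 * sxy ** 2)
    largest = 0.5 * (sxx + syy + root)

    # The matching eigenvector is (sxy, largest - sxx); its slope is g.
    exponent = (largest - sxx) / sxy
    explained = largest / (sxx + syy)
    return exponent, explained
```

On noise-free input such as $N = 2\,|V|^{0.8}$ the function returns an exponent of 0.8, up to floating-point rounding, with all of the variance explained. Unlike an ordinary least-squares fit of $\log y$ on $\log x$, the principal axis treats the two variables symmetrically.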
The stocks satisfying this criterion are TEF, SAN and BBVA (as in [27]), together with REP, ELE, IBE, POP and ALT for the BME market, and AZN, BSY, CCH, DVR, GUS, KEL, PO, PSON, SIG, TATE and TSCO for the LSE market. Moreover, in this paper we will focus mainly on short hidden orders, considering the set of hidden orders of time duration $T$ shorter than one trading day. The reason for this choice, detailed below, is to obtain stable statistical averages for the market impact. Applying these two restrictions, we obtain a final dataset that contains 14,655 hidden orders in the BME and 11,165 orders for the LSE (see Table I).

We repeat the two-dimensional PCA of Ref. [27] on the pooled set of hidden orders from different stocks. We find for the BME market the following exponents
$$g_1 = 0.81 \ (0.79;\ 0.82) , \tag{6}$$
$$g_2 = 1.57 \ (1.43;\ 1.72) , \tag{7}$$
$$g_3 = 0.67 \ (0.65;\ 0.68) , \tag{8}$$
where quantities in parentheses are 95% confidence intervals obtained through bootstrapping the data. These relations explain 83%, 61% and 80%, respectively, of the variance observed in the data. For the LSE dataset we get
$$g_1 = 0.99 \ (0.98;\ 1.01) , \tag{9}$$
$$g_2 = 2.41 \ (2.29;\ 2.52) , \tag{10}$$
$$g_3 = 0.58 \ (0.57;\ 0.59) , \tag{11}$$