This methodology is less subjective than Teh et al. (2006) and Fox et al. (2011) because the hierarchical structure is robust to prior elicitation. It also facilitates the mixing of the Markov chain by shrinking $\lambda$ to a reasonable region, so that a new regime can easily be born if the data imply a structural change.

2.2 Prior of P

The infinite-dimensional transition matrix $P$ consists of an infinite number of infinite-dimensional row vectors $\pi_j$, where $j = 1, 2, \cdots$. Each $\pi_j = (\pi_{j1}, \pi_{j2}, \cdots)$ represents a probability measure on the natural numbers, namely, $\Pr(A \mid \pi_j) = \sum_{i=1}^{\infty} \pi_{ji} 1_{i \in A}$. By definition, we should have $\pi_{ji} \geq 0$ for each $j$ and $i$, and $\sum_{i=1}^{\infty} \pi_{ji} = 1$ for each $j$. The prior of $P$ is set as

$$\pi_0 \sim \mathrm{SBP}(\gamma), \tag{9}$$

$$\pi_j \mid \pi_0 \sim \mathrm{DP}\left(c, (1 - \rho)\pi_0 + \rho\delta_j\right). \tag{10}$$

$\pi_0$ is a random probability measure on the natural numbers and is drawn from a stick breaking process (SBP). [9] It serves as a hierarchical parameter of all $\pi_j$'s. Conditional on $\pi_0$, each $\pi_j$ is drawn from a Dirichlet process (DP) with concentration parameter $c$ and shape parameter $(1 - \rho)\pi_0 + \rho\delta_j$. [10] The aforementioned constraints on the $\pi_j$'s are automatically satisfied by this prior.

From (10), the shape parameter, $(1 - \rho)\pi_0 + \rho\delta_j$, is an infinite discrete distribution and represents the mean of $\pi_j$ by the definition of the DP. It is a convex combination of the hierarchical distribution $\pi_0$ and a degenerate distribution at integer $j$, $\delta_j$ [11], with $\rho \in [0, 1]$. The hierarchical distribution $\pi_0$ creates a common shape for each $\pi_j$, and $\delta_j$ reflects the prior belief of regime persistence. By construction, conditional on $\pi_0$ and $\rho$, the mean of the transition matrix $P$ is a convex combination of two infinite-dimensional matrices, expressed by

$$E(P \mid \pi_0, \rho) = (1 - \rho) \cdot \begin{bmatrix} \pi_{01} & \pi_{02} & \pi_{03} & \cdots \\ \pi_{01} & \pi_{02} & \pi_{03} & \cdots \\ \pi_{01} & \pi_{02} & \pi_{03} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix} + \rho \cdot \begin{bmatrix} 1 & 0 & 0 & \cdots \\ 0 & 1 & 0 & \cdots \\ 0 & 0 & 1 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}.$$

The above conditional mean of $P$ shows that the self-transition probability increases as $\rho$ approaches 1. In the rest of the paper, $\rho$ is referred to as the sticky coefficient. It is introduced to the iHMM for two reasons. First, empirical evidence shows that regime persistence is a salient feature of many macroeconomic and financial variables. The sticky coefficient explicitly embeds this feature into the prior. Second, a finite hidden Markov model usually has a small number of regimes, which guarantees that each regime can have a reasonable amount of data. The infinite hidden Markov model, however, may assign each observation to its own distinct regime. This phenomenon is called state saturation; it is uninformative and harmful to forecasting. The sticky coefficient shrinks the over-dispersed regime allocation towards a coherent one and hence avoids the state saturation problem.

[9] The stick breaking process generates a probability measure over the natural numbers. Each number is associated with a non-zero probability. For a probability measure $p \equiv (p_1, p_2, \cdots) \sim \mathrm{SBP}(\gamma)$, where $\gamma$ is a positive scalar which controls the concentration of a random probability measure, $p_i$ is the probability associated with integer $i$, with $i = 1, 2, \cdots$. Appendix A.2 provides a detailed explanation of this process.

[10] A Dirichlet process is a distribution over discrete distributions. It has two parameters: the shape parameter and the concentration parameter. The shape parameter is a probability measure and controls the centre of the random samples, which is analogous to the mean of a distribution. The concentration parameter is a positive scalar and controls the tightness of a random draw, which is analogous to the inverse of the variance of a distribution. Appendix A.1 provides a detailed discussion of this process.

[11] $\delta_j$ is a probability measure with $\delta_j(A) = 1$ if $j \in A$ and $\delta_j(A) = 0$ otherwise.

In summary, the iHMM is comprised of (1) and (2), in which (1) takes the form of (3) for bubble detection and estimation.
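The prior in (9)-(10) and the conditional mean of P given above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation. It assumes a finite truncation of the state space and the standard stick breaking construction with Beta(1, γ) breaks (the paper's own construction is in Appendix A.2, which is not reproduced in this section). Under truncation, the DP in (10) reduces to a finite Dirichlet distribution. All function names and parameter values are hypothetical, and states are indexed from 0 in the code.

```python
import random


def stick_breaking(gamma, truncation, rng):
    """Truncated draw of pi_0 ~ SBP(gamma).

    Assumes the standard construction v_i ~ Beta(1, gamma) and
    pi_0i = v_i * prod_{l<i} (1 - v_l); the leftover stick is assigned
    to the last state so that the weights sum to one.
    """
    weights = []
    remaining = 1.0
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, gamma)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


def sticky_shape(pi0, j, rho):
    """Shape parameter (1 - rho) * pi_0 + rho * delta_j for row j (0-based)."""
    return [(1.0 - rho) * p + (rho if i == j else 0.0) for i, p in enumerate(pi0)]


def prior_mean_transition(pi0, rho):
    """E(P | pi_0, rho): row j is the shape parameter of pi_j."""
    return [sticky_shape(pi0, j, rho) for j in range(len(pi0))]


def draw_row(pi0, j, c, rho, rng):
    """Truncated draw of pi_j | pi_0 as Dirichlet(c * shape).

    Uses normalised Gamma variates; requires 0 <= rho < 1 and pi0 > 0
    so that every Dirichlet parameter is strictly positive.
    """
    alphas = [c * s for s in sticky_shape(pi0, j, rho)]
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


if __name__ == "__main__":
    rng = random.Random(0)  # illustrative seed
    pi0 = stick_breaking(gamma=2.0, truncation=8, rng=rng)
    for rho in (0.0, 0.5, 0.9):
        mean_p = prior_mean_transition(pi0, rho)
        # Self-transition probabilities of the first three regimes.
        print(rho, [round(mean_p[j][j], 3) for j in range(3)])
    print([round(x, 3) for x in draw_row(pi0, j=0, c=10.0, rho=0.9, rng=rng)])
```

Raising `rho` moves mass onto the diagonal of the prior mean, which is the regime persistence the sticky coefficient is meant to encode, while a larger `c` makes each drawn row sit closer to its shape parameter.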
(4)-(8) comprise the hierarchical prior for $\Theta$, and (9)-(10) comprise the hierarchical prior for $P$.

3 Estimation, Dating Algorithm and Model Comparison

3.1 Estimation

The posterior sampling is based on a Markov Chain Monte Carlo (MCMC) method. Fox et al. (2011) show that the block sampler which approximates the iHMM with truncation is more