13.07.2015 Views

Bayesian analysis of ordinal survey data using the Dirichlet process ...


our rationale for the Uniform(0, 6) prior distribution is based on the observed response $X_{ij}$ being constrained to the five-point scale. It is thought that $X_{ij}$ represents the rounded score of the continuous latent variable $Y_{ij}$, whose mean is $\mu_j$ when $b_i = 1$ and $a_i = 0$. For the personality parameters $a_i$ and $b_i$, the prior assignment is based on the supposition that there are classes of personalities. We therefore consider the Dirichlet process

$$(a_i, b_i)' \overset{\text{iid}}{\sim} G, \qquad G \sim \mathrm{DP}\big(\alpha,\ \text{tr-Normal}_2(\mu_G, \Sigma_G)\big) \tag{5}$$

for $i = 1, \ldots, n$. The specification in (5) states that $(a_i, b_i)$ arises from a distribution $G$, but $G$ itself arises from a distribution of distributions known as the Dirichlet process. The Dirichlet process in (5) consists of the concentration parameter $\alpha$ and the baseline distribution $\text{tr-Normal}_2(\mu_G, \Sigma_G)$, where tr-Normal refers to the truncated bivariate Normal whose second component $b_i$ is constrained to be positive. The baseline distribution serves as an initial guess of the distribution of $(a_i, b_i)$, and the concentration parameter determines our confidence in the baseline distribution, with large values of $\alpha > 0$ corresponding to greater degrees of belief. Prior distributions can be assigned to the hyperparameters in (5).
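To make the specification in (5) concrete, the following is a minimal sketch of drawing $(a_i, b_i)$, $i = 1, \ldots, n$, from a Dirichlet process prior by the Pólya urn (Chinese restaurant) scheme. It is not the fitting procedure of the paper. The function names, the rejection sampler for the truncated Normal, the fixed value of $\alpha$ and the numerical hyperparameters are all assumptions chosen for illustration.

```python
import numpy as np


def draw_base(rng, mu, cov):
    """Draw one (a, b) from a bivariate Normal truncated to b > 0.

    Rejection sampling is an illustrative choice, not the paper's method.
    """
    while True:
        a, b = rng.multivariate_normal(mu, cov)
        if b > 0:
            return a, b


def polya_urn_sample(n, alpha, mu, cov, seed=0):
    """Draw n personality pairs (a_i, b_i) from a DP(alpha, base) prior.

    Returns the distinct atoms (one row per cluster) and, for each
    respondent i, the index of the atom that respondent shares.
    """
    rng = np.random.default_rng(seed)
    atoms = []   # distinct (a, b) values drawn so far
    counts = []  # number of respondents sharing each atom
    labels = np.empty(n, dtype=int)
    for i in range(n):
        # Reuse atom k with probability counts[k] / (alpha + i);
        # draw a fresh atom from the base with probability alpha / (alpha + i).
        probs = np.array(counts + [alpha], dtype=float) / (alpha + i)
        k = rng.choice(len(probs), p=probs)
        if k == len(atoms):
            atoms.append(draw_base(rng, mu, cov))
            counts.append(1)
        else:
            counts[k] += 1
        labels[i] = k
    return np.array(atoms), labels


if __name__ == "__main__":
    # Illustrative hyperparameters (assumed): alpha fixed rather than given a prior.
    mu_G = np.array([0.0, 1.0])
    Sigma_G = np.diag([1.0, 0.5])
    atoms, labels = polya_urn_sample(n=200, alpha=2.0, mu=mu_G, cov=Sigma_G)
    print("distinct personality types:", len(atoms))
    print("cluster sizes:", np.bincount(labels))
```

Because respondents reuse previously drawn atoms, many of the $n$ pairs coincide exactly, which is how the prior induces clusters. Larger values of $\alpha$ produce more distinct atoms, so the draws look more like independent samples from the baseline distribution.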
Our analyses involving course evaluation surveys on a five-point scale give sensible results with $\alpha \sim \text{Uniform}(0.4, 10)$ (Ohlssen, Sharples and Spiegelhalter, 2007), $\mu_G = (0, 1)'$ and $\Sigma_G = (\sigma_{ij})$ where $\sigma_{11} = 1.0$, $\sigma_{22} = 0.5$ and $\sigma_{ij} = 0$ for $i \neq j$. Note that the choices of 1.0 and 0.5 are sufficiently diffuse over the range of the parameters $a_i$ and $b_i$. The key aspect of the Dirichlet process in our application is that the personality parameters $(a_i, b_i)$ have support on a discrete space, and this enables the clustering of personality types. An advantage of the Dirichlet process approach is that clustering is carried out implicitly within the framework of the model, and the number of component clusters need not be specified in advance. Once a theoretical curiosity, the Dirichlet process and its extensions are finding diverse application areas in nonparametric modelling (e.g. Qi, Paisley and Carin 2007,
