A Marginalisation Paradox Example Dennis Prangle

28th October 2009

Overview

Bayesian inference recap
Example of error due to a marginalisation paradox
(Very) rough overview of general issues

Part I Bayesian Inference

Bayesian Inference

Prior distribution on parameters θ: p(θ)
Model for the data X: f(X|θ)
Posterior distribution (using Bayes' theorem):

  f(θ|X) = p(θ)f(X|θ) / ∫ p(θ)f(X|θ) dθ

n.b. p(θ) only needed up to proportionality
Bayesian inference performed using computational Monte Carlo methods (e.g. MCMC)
Typically the normalisation constant is also not needed, as ratios are used
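As a minimal sketch of the recap above: on a grid, the posterior is just the (possibly unnormalised) prior times the likelihood, normalised numerically. The coin-flip model, data, and grid below are illustrative assumptions, not part of the talk.

```python
import numpy as np

# Hypothetical example: posterior for a coin's success probability theta.
# Prior: uniform on (0, 1), specified only up to proportionality;
# likelihood: 7 successes in 10 trials, binomial constants dropped.
theta = np.linspace(0.001, 0.999, 999)          # grid over parameter space
dtheta = theta[1] - theta[0]
prior = np.ones_like(theta)                     # p(theta), up to proportionality
lik = theta**7 * (1 - theta)**3                 # f(X | theta), constants dropped
unnorm = prior * lik                            # numerator of Bayes' theorem
posterior = unnorm / (unnorm.sum() * dtheta)    # divide by the integral

print(theta[np.argmax(posterior)])              # posterior mode, near 7/10
```

Dropping constants from both prior and likelihood leaves the result unchanged, which is exactly why only ratios are needed in practice.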

Improper Prior

A probability density p(θ) (roughly speaking!) satisfies:

  1. p(θ) ≥ 0
  2. ∫ p(θ) dθ = 1

An improper prior doesn't require condition 2
Instead can have ∫ p(θ) dθ = ∞
Example: p(θ) = 1, the "improper uniform"
Sometimes used to represent prior ignorance
Resulting posterior often a proper distribution ⇒ meaningful conclusions (. . . or are they?!)
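A quick numerical sketch of the last point: with an improper uniform prior and a normal likelihood, the unnormalised posterior still has finite mass, so it can be normalised into a proper distribution. The data, seed, and grid (standing in for the real line) are assumptions for illustration.

```python
import numpy as np

# Improper uniform prior p(theta) = 1 on the real line, normal likelihood.
rng = np.random.default_rng(0)
x = rng.normal(2.0, 1.0, size=20)               # hypothetical data, X_i ~ N(theta, 1)

theta = np.linspace(-10, 14, 4801)              # wide grid standing in for the real line
dtheta = theta[1] - theta[0]
loglik = np.array([-0.5 * np.sum((x - t)**2) for t in theta])
unnorm = 1.0 * np.exp(loglik - loglik.max())    # improper prior contributes the factor 1
mass = unnorm.sum() * dtheta                    # finite => posterior can be normalised
posterior = unnorm / mass

# Analytically the posterior is N(xbar, 1/n), so the mode sits at the sample mean
print(theta[np.argmax(posterior)], x.mean())
```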

Part II Example: Tuberculosis in San Francisco

Background: Tuberculosis

Tuberculosis is an infectious disease spread by bacteria
Epidemiological interest lies in estimating rates of transmission and recovery
Conjectured that data on bacterial mutation provides information → more accurate inference

Background: Paper

Tanaka et al (2006) investigated a Tuberculosis outbreak in San Francisco in 1991/2
473 samples of Tuberculosis bacteria taken at a particular date
Genotyped according to a particular genetic marker
Samples split into clusters which share the same genotype

  Cluster size    Number of clusters
       1                282
       2                 20
       3                 13
       4                  4
       5                  2
       8                  1
      10                  1
      15                  1
      23                  1
      30                  1

Model: Underlying disease process

Assume initially there is one case
3 event types: birth, death, mutation (→ new genotype)
Suppose there are N cases at some time:
  Rate of births: αN
  Rate of deaths: δN
  Rate of mutations: θN
This defines a continuous time Markov process model
We don't care about event times (no timing data) so can reduce to a discrete time Markov process

Model: Producing data

Run the disease process until there are 10,000 cases
(If the disease dies out, rerun)
Take a simple random sample of 473 cases
Convert to data on genotype frequencies
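The two slides above can be sketched as a short simulation. Since N cancels from the event probabilities, each discrete-time step picks birth, death, or mutation in proportion to α, δ, θ. The rate values and seed below are illustrative assumptions, not the paper's estimates.

```python
import random
from collections import Counter

# Illustrative rates (assumptions, not fitted values from Tanaka et al)
alpha, delta, theta = 1.0, 0.2, 0.2
target_n, sample_n = 10_000, 473

def simulate(rng):
    """Run the discrete-time birth/death/mutation process to target_n cases."""
    while True:                            # rerun if the disease dies out
        cases = [0]                        # genotype label of each current case
        next_genotype = 1
        total = alpha + delta + theta      # N cancels from event probabilities
        while 0 < len(cases) < target_n:
            i = rng.randrange(len(cases))
            u = rng.random() * total
            if u < alpha:                  # birth: copy an existing genotype
                cases.append(cases[i])
            elif u < alpha + delta:        # death: swap-remove a case
                cases[i] = cases[-1]
                cases.pop()
            else:                          # mutation: brand-new genotype
                cases[i] = next_genotype
                next_genotype += 1
        if cases:
            return cases

rng = random.Random(1)
population = simulate(rng)
sample = rng.sample(population, sample_n)      # simple random sample of 473
clusters = Counter(Counter(sample).values())   # cluster size -> number of clusters
print(sorted(clusters.items()))
```

This produces data in the same cluster-size format as the table above, ready to compare against the observed frequencies.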

Prior

Some information on θ from previous studies
Prior distribution N(0.198, 0.06735²) chosen
Corresponding density denoted p(θ)

Ignorance for other parameters

Proposed (improper) overall prior:

  p(α, δ, θ) = p(θ)  if 0 < δ < α
             = 0     otherwise

Motivation:
Marginal for θ is p(θ)
Marginal for (α, δ) is improper uniform:

  1  if 0 < δ < α
  0  otherwise

Restriction α > δ ⇒ zero prior probability on parameters where the epidemic usually dies out

Results

See Tanaka et al paper
Note change from prior

Parameter Redundancy

All parameters are proportional to rates
Multiplying all by a constant affects only the rate of events
But this is irrelevant to our model
Model is over-parameterised: (α, δ, θ) and (kα, kδ, kθ) give the same likelihood

Reparameterisation

Reparameterise to:
  a = α/(α + δ + θ)
  d = δ/(α + δ + θ)
  θ = θ

Motivation:
Keep θ as we have prior info for it
a and d tell us everything about relative rates
Only θ has info on absolute rates . . .
. . . and θ has info on absolute rates only

Parameter constraints: α, δ, θ > 0 ⇒ a, d, θ > 0 and a + d < 1
Requirement α > δ in prior ⇒ a > d
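A small sketch of this reparameterisation and its inverse (solving the three equations for α and δ), using hypothetical rate values. It also checks the over-parameterisation point: scaling all rates by k leaves (a, d) unchanged.

```python
# Map from rates to the relative parameterisation
def to_relative(alpha, delta, theta):
    s = alpha + delta + theta
    return alpha / s, delta / s, theta          # (a, d, theta)

# Inverse map: note 1 - a - d = theta / (alpha + delta + theta)
def to_rates(a, d, theta):
    s = 1.0 - a - d
    return a * theta / s, d * theta / s, theta  # (alpha, delta, theta)

alpha, delta, theta = 1.0, 0.2, 0.2             # hypothetical rates
a, d, t = to_relative(alpha, delta, theta)
alpha2, delta2, theta2 = to_rates(a, d, t)      # round trip recovers the rates
print((a, d, t), (alpha2, delta2, theta2))

# Scaled rates (k alpha, k delta, k theta) give the same (a, d)
print(to_relative(2.0, 0.4, 0.4)[:2])
```

The constraints carry over as stated: a, d > 0, a + d < 1, and α > δ gives a > d.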

Paradox (intuitive)

In the new parameterisation, θ is equivalent to absolute rate info
But the data has no information on absolute rates
So the (marginal) θ posterior should equal the prior?????

Analytic Results 1: Jacobian

Recall:
  a = α/(α + δ + θ)
  d = δ/(α + δ + θ)
  θnew = θ

Solve to give:
  α = a·θnew/(1 − a − d)
  δ = d·θnew/(1 − a − d)
  θ = θnew

Differentiate for the Jacobian:

                      [ θnew(1 − d)   a·θnew        a(1 − a − d)  ]
  J = (1 − a − d)⁻² · [ d·θnew        θnew(1 − a)   d(1 − a − d)  ]
                      [ 0             0             (1 − a − d)²  ]

  |J| = θnew² (1 − a − d)⁻³
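The determinant formula above can be checked numerically: differentiate the inverse map (a, d, θnew) → (α, δ, θ) by central finite differences and compare the determinant with θnew²(1 − a − d)⁻³. The test point is arbitrary.

```python
import numpy as np

def rates(p):
    """Inverse map from (a, d, theta_new) to (alpha, delta, theta)."""
    a, d, theta = p
    s = 1.0 - a - d
    return np.array([a * theta / s, d * theta / s, theta])

p = np.array([0.5, 0.2, 0.3])                   # arbitrary test point (a, d, theta_new)
eps = 1e-6
J = np.empty((3, 3))
for j in range(3):                              # central finite differences, column j
    dp = np.zeros(3)
    dp[j] = eps
    J[:, j] = (rates(p + dp) - rates(p - dp)) / (2 * eps)

a, d, theta = p
print(np.linalg.det(J), theta**2 * (1 - a - d)**-3)  # the two should agree
```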

Analytic Results 2: Reparameterised prior

Recall: p(α, δ, θ) = p(θ) I[0 < δ < α]   (where p(θ) is a normal pdf)

Then:
  p(a, d, θnew) = p(θ) I[0 < δ < α] · |J|
                = θnew² p(θnew) I[0 < d < a] (1 − a − d)⁻³

Analytic Results 3: Posterior

Recall the likelihood depends on (a, d) only, i.e. f(X|α, δ, θ) = f(X|a, d)

So the posterior is:

  π(a, d, θnew) ∝ θnew² p(θnew) I[0 < d < a] (1 − a − d)⁻³ f(X|a, d)

If this is proper, then the posterior marginal for θ is:

  π(θnew) ∝ θnew² p(θnew)

Matches results graph
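The effect of that θnew² factor is easy to see numerically: tilting the N(0.198, 0.06735²) prior by θ² pushes the mode upward (analytically, to (μ + √(μ² + 8σ²))/2 ≈ 0.236), even though no data about θ was used. The grid below is an assumption for illustration.

```python
import numpy as np

mu, sigma = 0.198, 0.06735                      # prior for theta from the talk
theta = np.linspace(1e-4, 0.6, 6000)
prior = np.exp(-0.5 * ((theta - mu) / sigma)**2)   # p(theta), up to a constant
tilted = theta**2 * prior                          # "posterior" theta^2 p(theta)
tilted /= tilted.sum() * (theta[1] - theta[0])     # normalise on the grid

# Mode shifts from ~0.198 to ~0.236 purely from the reparameterisation factor
print(theta[np.argmax(prior)], theta[np.argmax(tilted)])
```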

Paradox and explanation

The prior was constructed to have marginal p(θ)
The data contain no information on θ
But we have shown that the posterior acts like ∝ θ² p(θ)
(easy to falsely conclude that the change is due to data)

PARADOX

The problem is that marginal distributions are not well defined for improper priors
i.e. ∫ p(α, δ, θ) dα dδ is not a pdf (integral not 1)
Attempting to normalise gives ∞/∞ problems

Prior didn’t really have claimed marginal

Practical resolution

The prior aimed to combine ignorance on α, δ with prior knowledge on θ
In the (a, d, θ) reparameterisation, the range of (a, d) is finite
Combine p(θ) with a uniform marginal on (a, d) using independence
For this parameterisation this does give a proper prior
So the priors are well defined
(side issue: is uniform the best representation of ignorance?)
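A sketch of this proper prior: θ drawn from N(0.198, 0.06735²), independently of (a, d) uniform on the finite triangle 0 < d < a, a + d < 1 (here sampled by simple rejection; the seed and sample size are assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_prior(n):
    """Draw n points from the proper prior: uniform (a, d) x normal theta."""
    out = []
    while len(out) < n:                         # rejection sampling on the triangle
        a, d = rng.uniform(0, 1, size=2)
        if 0 < d < a and a + d < 1:
            out.append((a, d, rng.normal(0.198, 0.06735)))
    return np.array(out)

draws = sample_prior(10_000)
print(draws[:, 2].mean())        # theta marginal really is p(theta), mean ~0.198
```

Because this prior integrates to 1, its marginals are well defined, and the marginal for θ genuinely is p(θ), unlike in the improper construction.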

Part III Marginalisation Paradoxes: theory

Subjective Bayes viewpoint

Priors should represent prior beliefs
Only a probability distribution represents beliefs coherently
Therefore don't use improper priors
(this is the resolution used earlier)

Objective Bayes viewpoint

Conclusions shouldn't depend on subjective beliefs (c.f. frequentist analysis)
Instead use objective reference priors
Lots of theory for choosing these
These will often be improper (e.g. the Jeffreys prior)
So marginalisation paradoxes are a real issue

The marginalisation paradox

Well-known Bayesian inference paradox
From Dawid, Stone, Zidek (RSS B 1973; read paper)
For models with a particular structure . . .
. . . there are two marginalisation approaches to Bayesian inference
For improper priors, these typically do not agree
Large literature; claims of resolution, but these are not fully acknowledged
Is my example a special case of this?

Part IV Conclusion

Conclusion

Be wary of marginalisation issues for improper priors!

Bibliography

A. P. Dawid, M. Stone, and J. V. Zidek. Marginalization paradoxes in Bayesian and structural inference. JRSS(B), 35:189–233, 1973.

Mark M. Tanaka, Andrew R. Francis, Fabio Luciani, and S. A. Sisson. Using Approximate Bayesian Computation to Estimate Tuberculosis Transmission Parameters from Genotype Data. Genetics, 173:1511–1520, 2006.
