Three-Stage Semi-parametric Estimation of T-Copulas: Asymptotics, Finite-Samples Properties and Computational Aspects Dean Fantazzini Moscow School of Economics
EEA-ESEM 2008 Milan: Thursday August 28
Overview of the Presentation
• Semi-Parametric Copula Estimators: A Review
• The Three-Stage KME-CML Method
• Asymptotic Properties Of The Three-Stage Method
• Finite-Sample Properties And Computational Aspects
• Conclusions
Semi-Parametric Copula Estimators: A Review

Genest et al. (1995) were the first to analyze the semi-parametric estimation of a bivariate copula with i.i.d. observations. Their Canonical Maximum Likelihood (CML) method differs from full Maximum Likelihood methods in that no assumption is made about the parametric form of the marginal distributions. Consider a multivariate random sample represented by the time series $X = (x_{1t}, \ldots, x_{nt})$, $t = 1, \ldots, T$, and let $f_h$ be the density of the joint distribution of $X$. Then, by Sklar's theorem (1959),

$$ f_h(x_1, \ldots, x_n; \alpha_1, \ldots, \alpha_n, \gamma) = c(F_1(x_1; \alpha_1), \ldots, F_n(x_n; \alpha_n); \gamma) \cdot \prod_{i=1}^{n} f_i(x_i; \alpha_i) \qquad (1) $$

where $f_i$ is the univariate density of the marginal distribution $F_i$, $c$ is the copula density, $\alpha_i$, $i = 1, \ldots, n$, is the vector of parameters of the marginal distribution $F_i$, and $\gamma$ is the vector of the copula parameters.
The CML estimation process is performed in two steps:

Definition 1.1 (CML copula estimation).
1. Transform the dataset $(x_{1t}, x_{2t}, \ldots, x_{nt})$, $t = 1, \ldots, T$, into uniform variates $(\hat{u}_{1t}, \hat{u}_{2t}, \ldots, \hat{u}_{nt})$ using the empirical distribution functions $F_{iT}(\cdot)$, defined as follows:

$$ F_{iT}(x_i) = \frac{1}{T} \sum_{t=1}^{T} \mathbf{1}\{x_{it} \le x_i\}, \quad i = 1, \ldots, n \qquad (2) $$

where $\mathbf{1}\{\cdot\}$ is the indicator function.
2. Estimate the copula parameters by maximizing the log-likelihood:

$$ \hat{\gamma}_{CML} = \arg\max_{\gamma} \sum_{t=1}^{T} \log c(F_{1T}(x_{1t}), \ldots, F_{nT}(x_{nt}); \gamma) \qquad (3) $$
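Step 1 of the CML definition can be sketched in a few lines of Python. This is a minimal illustration, not code from the presentation: the function name `pseudo_observations` and the `rescale` option are hypothetical. Dividing the ranks by T + 1 instead of T is an assumption added here so that no transformed value equals exactly 1, which would map to an infinite quantile in a later likelihood step.

```python
import numpy as np


def pseudo_observations(x, rescale=True):
    """Map each column of x (shape T x n) to the unit interval via its empirical CDF.

    rescale=False gives F_iT(x_it) = rank / T, as in equation (2).
    rescale=True divides by T + 1 instead (an assumption of this sketch),
    so that the largest observation is not mapped to exactly 1.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim != 2:
        raise ValueError("x must have shape (T, n)")
    T, n = x.shape
    ranks = np.empty((T, n))
    for i in range(n):
        sorted_col = np.sort(x[:, i])
        # number of observations <= x_it, i.e. T * F_iT(x_it)
        ranks[:, i] = np.searchsorted(sorted_col, x[:, i], side="right")
    return ranks / (T + 1) if rescale else ranks / T
```

The resulting array can then be passed to any copula log-likelihood in place of the unknown marginal distribution functions.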
Under some regularity conditions, Genest et al. (1995) show that the semiparametric estimator $\hat{\gamma}_{CML}$ (for a bivariate copula) has the following asymptotic distribution:

$$ \sqrt{T}\,(\hat{\gamma}_{CML} - \gamma_0) \xrightarrow{d} N\left(0, \frac{\sigma^2}{h^2}\right) \qquad (4) $$

where

$$ \sigma^2 = \mathrm{var}\left[l_\gamma(F_1(X_1), F_2(X_2); \gamma) + W_1(X_1) + W_2(X_2)\right] $$

$$ W_i(X_i) = \int \mathbf{1}\{F_i(X_i) \le u_i\}\, l_{\gamma,i}(u_1, u_2; \gamma)\, c(u_1, u_2; \gamma)\, du_1\, du_2, \quad i = 1, 2 $$

$$ h = -E\left[l_{\gamma,\gamma}(F_1(X_1), F_2(X_2); \gamma)\right] $$

Here $l = \log c$, and subscripts denote partial derivatives with respect to $\gamma$ or $u_i$. Upon integrating by parts with respect to $u_i$ ($i = 1, 2$), $W_i(X_i)$ also has the alternative expression

$$ W_i(X_i) = -\int \mathbf{1}\{F_i(X_i) \le u_i\}\, l_\gamma(u_1, u_2; \gamma)\, l_i(u_1, u_2; \gamma)\, c(u_1, u_2; \gamma)\, du_1\, du_2 $$
The Three-Stage KME-CML Method

Since the seminal work by Genest et al. (1995), it has become common practice to use semi-parametric methods with high-dimensional elliptical Student's T copulas as well (see, e.g., Cherubini et al. (2004) and McNeil et al. (2005)):

$$ c(t_\upsilon(x_1), \ldots, t_\upsilon(x_n)) = |\Sigma|^{-1/2}\, \frac{\Gamma\left(\frac{\upsilon+n}{2}\right)}{\Gamma\left(\frac{\upsilon}{2}\right)} \left[\frac{\Gamma\left(\frac{\upsilon}{2}\right)}{\Gamma\left(\frac{\upsilon+1}{2}\right)}\right]^{n} \frac{\left(1 + \frac{\zeta' \Sigma^{-1} \zeta}{\upsilon}\right)^{-\frac{\upsilon+n}{2}}}{\prod_{i=1}^{n} \left(1 + \frac{\zeta_i^2}{\upsilon}\right)^{-\frac{\upsilon+1}{2}}} $$

where $\zeta = (\zeta_1, \ldots, \zeta_n)'$ and $\zeta_i = t_\upsilon^{-1}(u_i)$ is the Student's t quantile of the $i$-th uniform argument $u_i = t_\upsilon(x_i)$.

In particular, after the marginal empirical distribution functions are computed in a first stage, the correlation matrix is estimated in a second stage using a method-of-moments estimator based on Kendall's tau, while the degrees of freedom are estimated in a third stage by Maximum Likelihood. Despite the widespread use of this procedure, its asymptotic and finite-sample properties have not yet been developed.
Definition 1.2 (Kendall's tau). Let $(X_1, X_2)$ and $(\tilde{X}_1, \tilde{X}_2)$ be two independent and identically distributed random vectors. The population version of Kendall's tau $\tau(X_1, X_2)$ is (see Kruskal (1958)):

$$ \tau(X_1, X_2) = E\left[\mathrm{sign}\left((X_1 - \tilde{X}_1)(X_2 - \tilde{X}_2)\right)\right] \qquad (5) $$

Kendall's tau can also be expressed in terms of copulas, which simplifies the calculations (see, e.g., Nelsen (1999), p. 127):

$$ \tau(X_1, X_2) = 4 \int_0^1 \int_0^1 C(u_1, u_2)\, dC(u_1, u_2) - 1 \qquad (6) $$

Lindskog, McNeil, and Schmock (2002) proved that Kendall's tau for elliptical distributions is given by

$$ \tau(X_1, X_2) = \frac{2}{\pi} \arcsin \rho_{X_1 X_2} \qquad (7) $$

where $\rho_{X_1 X_2}$ is the copula correlation parameter.
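Relation (7) can be inverted in closed form, which is what makes a moment-based estimate of the correlation possible. A short worked check follows; the numerical values are illustrative and are not taken from the presentation.

$$ \rho_{X_1 X_2} = \sin\left(\frac{\pi}{2}\, \tau(X_1, X_2)\right), \qquad \tau = \tfrac{1}{3} \;\Rightarrow\; \rho = \sin\left(\tfrac{\pi}{6}\right) = 0.5, \qquad \tau = \tfrac{1}{2} \;\Rightarrow\; \rho = \sin\left(\tfrac{\pi}{4}\right) \approx 0.707 $$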
The Kendall's tau moment estimator can now be defined:

Definition 1.3 (Copula estimation with Kendall's tau). Consider the population version of Kendall's τ in (5) and its relationship with the copula parameters in (6), and build a moment function of the type

$$ E\left[\psi(X_1, X_2; \gamma_0)\right] = 0 \qquad (8) $$

We can then construct an empirical estimate of the Kendall's tau pairwise correlation matrix and use relationship (6) to infer an estimate of the relevant copula parameters. This is a method-of-moments estimate because the true moment (5) is replaced by its empirical analogue,

$$ \binom{T}{2}^{-1} \sum_{1 \le t < s \le T} \mathrm{sign}\left((x_{1,t} - x_{1,s})(x_{2,t} - x_{2,s})\right) \qquad (9) $$

and (6) is then used to estimate the copula parameters.
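The empirical analogue in (9) is simple enough to write out directly. The sketch below is a plain NumPy illustration (the function name `sample_kendall_tau` is hypothetical); it applies no correction for ties, which is an assumption that matches (9) as written.

```python
import numpy as np


def sample_kendall_tau(x1, x2):
    """Sample Kendall's tau: average sign of concordance over all pairs t < s."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    if x1.ndim != 1 or x1.shape != x2.shape:
        raise ValueError("x1 and x2 must be 1-D arrays of equal length")
    T = x1.size
    if T < 2:
        raise ValueError("at least two observations are required")
    # sign((x1_t - x1_s) * (x2_t - x2_s)) for every ordered pair (t, s)
    s1 = np.sign(x1[:, None] - x1[None, :])
    s2 = np.sign(x2[:, None] - x2[None, :])
    upper = np.triu_indices(T, k=1)  # keep only pairs with t < s
    return float((s1 * s2)[upper].sum() / (T * (T - 1) / 2))
```

The pairwise sign matrices need memory of order T squared, which is fine for the sample sizes discussed here but not for very long series.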
There are cases in which the copula parameter vector contains different kinds of parameters and only some of them can be expressed as a function of Kendall's tau. This is the case for the T-copula. In this situation, Bouyé et al. (2001) and McNeil et al. (2005) have suggested the following estimation procedure:

Definition 1.4 (Three-stage KME-CML copula estimation).
1. Transform the dataset $(x_{1t}, x_{2t}, \ldots, x_{nt})$ into uniform variates $(F_{1T}(x_{1t}), F_{2T}(x_{2t}), \ldots, F_{nT}(x_{nt}))$ using the empirical distribution function.
2. Collect all pairwise estimates of the sample Kendall's tau given by (9) in an empirical Kendall's tau matrix $\hat{R}^{\tau}$ defined by $\hat{R}^{\tau}_{jk} = \tau(F_{jT}(X_j), F_{kT}(X_k))$, and then construct the correlation matrix using the relationship $\hat{\Sigma}_{j,k} = \sin\left(\frac{\pi}{2} \hat{R}^{\tau}_{j,k}\right)$, where the estimated parameters are the $q = n(n-1)/2$ correlations $[\hat{\rho}_1, \ldots, \hat{\rho}_q]'$.
⇒ Since there is no guarantee that this componentwise transformation of the empirical Kendall's tau matrix is positive definite, $\hat{\Sigma}$ can be adjusted, when needed, to obtain a positive definite matrix using a procedure such as the eigenvalue method of Rousseeuw and Molenberghs (1993) or other methods.
3. Look for the CML estimator of the degrees of freedom $\hat{\nu}_{CML}$ by maximizing the log-likelihood function of the T-copula density:

$$ \hat{\nu}_{CML} = \arg\max_{\nu} \sum_{t=1}^{T} \log c_{T\text{-}copula}\left(F_{1T}(x_{1t}), \ldots, F_{nT}(x_{nt}); \hat{\Sigma}, \nu\right) \qquad (10) $$
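The three stages can be put together in a short Python sketch using NumPy and SciPy. It is an illustration under stated assumptions, not the author's implementation: all function names are hypothetical, the ranks are divided by T + 1 rather than T to keep the pseudo-observations strictly below 1, the search interval for ν is an arbitrary choice, and `eigenvalue_fix` is only a simple repair in the spirit of the eigenvalue method (eigenvalues floored at a small positive value, then rescaled to a unit diagonal).

```python
import numpy as np
from scipy import optimize, special, stats


def kendall_tau_correlation(u):
    """Stage 2: pairwise Kendall's tau mapped to correlations by sin(pi/2 * tau)."""
    n = u.shape[1]
    sigma = np.eye(n)
    for j in range(n):
        for k in range(j + 1, n):
            # SciPy computes tau-b, which coincides with (9) when there are no ties
            tau, _ = stats.kendalltau(u[:, j], u[:, k])
            sigma[j, k] = sigma[k, j] = np.sin(0.5 * np.pi * tau)
    return sigma


def eigenvalue_fix(sigma, eps=1e-6):
    """Floor the eigenvalues at eps (assumed value) and rescale to a unit diagonal."""
    vals, vecs = np.linalg.eigh(sigma)
    if vals.min() >= eps:
        return sigma
    fixed = (vecs * np.maximum(vals, eps)) @ vecs.T
    d = np.sqrt(np.diag(fixed))
    return fixed / np.outer(d, d)


def t_copula_loglik(nu, u, sigma):
    """Sum over t of the log T-copula density at the pseudo-observations u."""
    n = u.shape[1]
    z = stats.t.ppf(u, df=nu)  # zeta_it = t_nu^{-1}(u_it)
    _, logdet = np.linalg.slogdet(sigma)
    quad = np.einsum("ti,ij,tj->t", z, np.linalg.inv(sigma), z)
    const = (special.gammaln((nu + n) / 2) - special.gammaln(nu / 2)
             + n * (special.gammaln(nu / 2) - special.gammaln((nu + 1) / 2))
             - 0.5 * logdet)
    log_c = (const
             - 0.5 * (nu + n) * np.log1p(quad / nu)
             + 0.5 * (nu + 1) * np.log1p(z ** 2 / nu).sum(axis=1))
    return float(log_c.sum())


def fit_kme_cml(x, nu_bounds=(2.0, 100.0)):
    """Return (sigma_hat, nu_hat) from the three-stage procedure."""
    x = np.asarray(x, dtype=float)
    T = x.shape[0]
    u = stats.rankdata(x, axis=0) / (T + 1)              # stage 1
    sigma = eigenvalue_fix(kendall_tau_correlation(u))   # stage 2
    res = optimize.minimize_scalar(                      # stage 3
        lambda nu: -t_copula_loglik(nu, u, sigma),
        bounds=nu_bounds, method="bounded")
    return sigma, float(res.x)
```

Because the third stage is a one-dimensional search over ν with the correlation matrix held fixed, a bounded scalar optimizer is enough here; a full ML fit would instead have to search over all correlations and ν jointly.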
Asymptotic Properties Of The Three-Stage Method
The second step in Definition 1.4 corresponds to a method-of-moments estimation based on q moments and Kendall's tau rank correlations estimated with empirical distribution functions. We can therefore build a q × 1 moments vector ψ for the parameter vector $\theta_0 = [\rho_1, \ldots, \rho_q]'$ as reported below:

$$ E\left[\psi(F_1(X_1), \ldots, F_n(X_n); \theta_0)\right] = \begin{pmatrix} E\left[\psi_1(F_1(X_1), F_2(X_2); \rho_1)\right] \\ \vdots \\ E\left[\psi_q(F_{n-1}(X_{n-1}), F_n(X_n); \rho_q)\right] \end{pmatrix} = 0 \qquad (11) $$
Then these theorems follow:

Theorem 1.1 (Consistency of $\hat{\theta}$). Assume that $(x_{1t}, \ldots, x_{nt})$ are i.i.d. random variables with dependence structure given by $c(u_{1,t}, \ldots, u_{n,t}; \Sigma_0, \nu_0)$. Suppose that
(i) the parameter space $\Theta$ is a compact subset of $\mathbb{R}^q$,
(ii) the q-variate moment vector $\psi(F_1(X_1), \ldots, F_n(X_n); \theta)$ is continuous in $\theta$ for all $X_i$,
(iii) $\psi(F_1(X_1), \ldots, F_n(X_n); \theta)$ is measurable in $X_i$ for all $\theta$ in $\Theta$,
(iv) $E\left[\psi(F_1(X_1), \ldots, F_n(X_n); \theta)\right] \ne 0$ for all $\theta \ne \theta_0$ in $\Theta$,
(v) $E\left[\sup_{\theta \in \Theta} \|\psi(F_1(X_1), \ldots, F_n(X_n); \theta)\|\right] < \infty$.
Then $\hat{\theta} \xrightarrow{p} \theta_0$ as $T \to \infty$.

Theorem 1.2 (Consistency of $\hat{\nu}_{CML}$). Let the assumptions of the previous theorem hold, as well as the regularity conditions reported in Proposition A.1 in Genest et al. (1995). Then $\hat{\nu}_{CML} \xrightarrow{p} \nu_0$ as $T \to \infty$.
Asymptotic normality is not straightforward, since we use a 3-step procedure in which a different kind of estimation is performed at the second and third stages.
⇒ A possible solution is to consider the CML used in the 3rd stage as a special method-of-moments estimator. Just note that the CML estimator is defined by setting the derivative of the log-likelihood function with respect to the degrees of freedom equal to zero:

$$ \frac{\partial l(\cdot; \nu)}{\partial \nu} = \sum_{t=1}^{T} l_\nu\left(F_{1T}(x_{1,t}), \ldots, F_{nT}(x_{n,t}); \hat{\Sigma}, \hat{\nu}\right) = 0 \qquad (12) $$

Dividing both sides by T yields the definition of the method-of-moments estimator:

$$ \frac{1}{T} \sum_{t=1}^{T} l_\nu\left(F_{1T}(x_{1,t}), \ldots, F_{nT}(x_{n,t}); \hat{\Sigma}, \hat{\nu}\right) = \frac{1}{T} \sum_{t=1}^{T} \psi_\nu\left(F_{1T}(x_{1,t}), \ldots, F_{nT}(x_{n,t}); \hat{\Sigma}, \hat{\nu}\right) = 0 $$
Let us define the sample moments vector $\Psi_{KME-CML}$ for the parameter vector $\hat{\Xi} = [\hat{\rho}_1, \ldots, \hat{\rho}_q, \hat{\nu}]'$ as follows:

$$ \Psi_{KME-CML}\left(F_{1T}(x_{1,t}), \ldots, F_{nT}(x_{n,t}); \hat{\Xi}\right) = \begin{pmatrix} \frac{1}{T} \sum_{t=1}^{T} \psi_1\left(F_{1T}(x_{1,t}), F_{2T}(x_{2,t}); \hat{\rho}_1\right) \\ \vdots \\ \frac{1}{T} \sum_{t=1}^{T} \psi_q\left(F_{n-1,T}(x_{n-1,t}), F_{nT}(x_{n,t}); \hat{\rho}_q\right) \\ \frac{1}{T} \sum_{t=1}^{T} \psi_\nu\left(F_{1T}(x_{1,t}), \ldots, F_{nT}(x_{n,t}); \hat{\Sigma}, \hat{\nu}\right) \end{pmatrix} = 0 $$
Let us also define the population moments vector, with a correction to take the non-parametric estimation of the marginals into account, together with its variance (see Genest et al. (1995), § 4):

$$ \Delta_0 = \begin{pmatrix} \psi_1(F_1(X_1), F_2(X_2); \rho_1) \\ \vdots \\ \psi_q(F_{n-1}(X_{n-1}), F_n(X_n); \rho_q) \\ \psi_\nu(F_1(X_1), \ldots, F_n(X_n); \Sigma_0, \nu_0) + \sum_{i=1}^{n} W_{i,\nu}(X_i) \end{pmatrix} \qquad (13) $$

$$ \Upsilon_0 \equiv \mathrm{var}[\Delta_0] = E\left[\Delta_{KME-CML}\, \Delta_{KME-CML}'\right] \qquad (14) $$

where

$$ W_{i,\nu}(X_i) = \int \mathbf{1}\{F_i(X_i) \le u_i\}\, \frac{\partial^2}{\partial \nu\, \partial u_i} \log c(u_1, \ldots, u_n)\, dC(u_1, \ldots, u_n) \qquad (15) $$

⇒ Note that the population moments used to estimate the correlations are not affected by the empirical distribution functions of the marginals, since Kendall's tau is invariant under strictly increasing marginal transformations.
Theorem 1.3 (Asymptotic distribution of the 3-stage KME-CML method). Let the assumptions of the previous theorems hold. Assume further that $\partial \Psi_{KME-CML}(\cdot; \Xi) / \partial \Xi'$ is $O(1)$ and uniformly negative definite, while $\Upsilon_0$ is $O(1)$ and uniformly positive definite. Then the three-stage KME-CML estimator is asymptotically normal:

$$ \sqrt{T}\,(\hat{\Xi} - \Xi_0) \xrightarrow{d} N\left(0,\; E\left[\frac{\partial \Psi_{KME-CML}}{\partial \Xi'}\right]^{-1} \Upsilon_0\; E\left[\frac{\partial \Psi_{KME-CML}}{\partial \Xi'}\right]^{-1\prime}\right) \qquad (16) $$

Theorem 1.4 (Asymptotic distribution of the 3-stage KME-CML method for multivariate heteroscedastic time series models). Let the regularity conditions (i)-(v) reported in Theorem 1.1 hold, together with conditions A.1 and A.9 in Gunky et al. (2007). Then the three-stage KME-CML estimator is asymptotically normal as defined in (16).
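Once estimates of the two ingredients of (16) are available, turning them into standard errors is a matter of a few matrix operations. The sketch below assumes the user already has an estimate of the expected Jacobian of the moment vector and of Υ0; how these are obtained is not shown, and the function names `sandwich_covariance` and `wald_interval` are hypothetical.

```python
import numpy as np


def sandwich_covariance(jacobian, upsilon, T):
    """Finite-sample covariance implied by (16): A^{-1} Upsilon A^{-1}' / T."""
    A = np.asarray(jacobian, dtype=float)
    U = np.asarray(upsilon, dtype=float)
    A_inv = np.linalg.inv(A)
    return A_inv @ U @ A_inv.T / T


def wald_interval(estimate, cov, z=1.96):
    """Normal-approximation interval; z = 1.96 gives a nominal 95% level."""
    estimate = np.asarray(estimate, dtype=float)
    se = np.sqrt(np.diag(cov))
    return estimate - z * se, estimate + z * se
```

Intervals built this way are the kind of object whose empirical coverage a Monte Carlo study can check against the nominal 95% level.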
Finite-Sample Properties And Computational Aspects

We consider the following possible DGPs:
1. Bivariate Student's T copula, with ρ ∈ {−0.9, ..., 0.9} (step 0.1) and ν ∈ {3, ..., 30} (step 1). We consider two sample sizes: n = 50 and n = 500.
2. Ten variables with a multivariate Student's T copula, with the copula correlation matrix equal to:

    1     -0.15 -0.15 -0.15 -0.15 -0.14 -0.09 -0.03  0.05  0.13
   -0.15  1     -0.15 -0.15 -0.15 -0.13 -0.08 -0.02  0.06  0.14
   -0.15 -0.15  1     -0.15 -0.15 -0.12 -0.07 -0.01  0.07  0.15
   -0.15 -0.15 -0.15  1     -0.15 -0.11 -0.06  0.01  0.08  0.15
   -0.15 -0.15 -0.15 -0.15  1     -0.10 -0.05  0.02  0.09  0.15
   -0.14 -0.13 -0.12 -0.11 -0.10  1     -0.04  0.03  0.10  0.15
   -0.09 -0.08 -0.07 -0.06 -0.05 -0.04  1      0.04  0.11  0.15
   -0.03 -0.02 -0.01  0.01  0.02  0.03  0.04  1      0.12  0.15
    0.05  0.06  0.07  0.08  0.09  0.10  0.11  0.12  1      0.15
    0.13  0.14  0.15  0.15  0.15  0.15  0.15  0.15  0.15  1
⇒ We choose this correlation matrix because its lowest eigenvalue is very close to zero (0.0786), which allows us to study the effect that the eigenvalue method of Rousseeuw and Molenberghs (1993) has on the limiting distribution of the KME-CML estimator. Furthermore, we consider ν ∈ {3, ..., 30} (step 1), as well as two sample sizes: n = 50 and n = 500.
3. Ten variables with a multivariate Student's T copula, with the copula correlation matrix equal to:

   1    0.21 0.33 0.22 0.36 0.30 0.37 0.34 0.31 0.47
   0.21 1    0.20 0.15 0.27 0.18 0.18 0.31 0.20 0.21
   0.33 0.20 1    0.16 0.32 0.28 0.40 0.33 0.17 0.42
   0.22 0.15 0.16 1    0.20 0.16 0.18 0.20 0.27 0.20
   0.36 0.27 0.32 0.20 1    0.32 0.33 0.55 0.33 0.35
   0.30 0.18 0.28 0.16 0.32 1    0.28 0.32 0.26 0.31
   0.37 0.18 0.40 0.18 0.33 0.28 1    0.35 0.23 0.40
   0.34 0.31 0.33 0.20 0.55 0.32 0.35 1    0.31 0.35
   0.31 0.20 0.17 0.27 0.33 0.26 0.23 0.31 1    0.30
   0.47 0.21 0.42 0.20 0.35 0.31 0.40 0.35 0.30 1

This is the correlation matrix of the returns of the first 10 stocks belonging to the Dow Jones Industrial Index, observed between 18/11/1988 and 20/11/2003. Furthermore, we consider ν ∈ {3, ..., 30} (step 1), as well as two sample sizes: n = 50 and n = 500.
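For readers who want to reproduce a design of this kind, the sketch below shows one standard way to draw from a Student's T copula (a correlated normal vector divided by the square root of an independent scaled chi-square variable, then mapped through the univariate t distribution function), together with a helper for inspecting the smallest eigenvalue of a correlation matrix. The names `SIGMA_DGP2`, `smallest_eigenvalue` and `simulate_t_copula` are hypothetical, and nothing here is taken from the author's simulation code; the matrix is the one listed for the second DGP.

```python
import numpy as np
from scipy import stats

# Correlation matrix of the second DGP, copied from the list of DGPs above
SIGMA_DGP2 = np.array([
    [1.00, -0.15, -0.15, -0.15, -0.15, -0.14, -0.09, -0.03, 0.05, 0.13],
    [-0.15, 1.00, -0.15, -0.15, -0.15, -0.13, -0.08, -0.02, 0.06, 0.14],
    [-0.15, -0.15, 1.00, -0.15, -0.15, -0.12, -0.07, -0.01, 0.07, 0.15],
    [-0.15, -0.15, -0.15, 1.00, -0.15, -0.11, -0.06, 0.01, 0.08, 0.15],
    [-0.15, -0.15, -0.15, -0.15, 1.00, -0.10, -0.05, 0.02, 0.09, 0.15],
    [-0.14, -0.13, -0.12, -0.11, -0.10, 1.00, -0.04, 0.03, 0.10, 0.15],
    [-0.09, -0.08, -0.07, -0.06, -0.05, -0.04, 1.00, 0.04, 0.11, 0.15],
    [-0.03, -0.02, -0.01, 0.01, 0.02, 0.03, 0.04, 1.00, 0.12, 0.15],
    [0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.11, 0.12, 1.00, 0.15],
    [0.13, 0.14, 0.15, 0.15, 0.15, 0.15, 0.15, 0.15, 0.15, 1.00],
])


def smallest_eigenvalue(corr):
    """Smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(np.asarray(corr, dtype=float)).min())


def simulate_t_copula(corr, nu, size, rng=None):
    """Draw `size` observations from a T-copula with correlation `corr` and d.o.f. `nu`."""
    rng = np.random.default_rng() if rng is None else rng
    corr = np.asarray(corr, dtype=float)
    chol = np.linalg.cholesky(corr)  # requires a positive definite matrix
    z = rng.standard_normal((size, corr.shape[0])) @ chol.T
    w = rng.chisquare(nu, size=size) / nu
    x = z / np.sqrt(w)[:, None]      # multivariate t draws
    return stats.t.cdf(x, df=nu)     # uniform margins, T-copula dependence
```

Calling `smallest_eigenvalue(SIGMA_DGP2)` gives a value that can be compared with the lowest eigenvalue quoted for this matrix, and `simulate_t_copula(SIGMA_DGP2, nu=5, size=50)` would produce one small-sample replication of that design.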
The estimators considered are:
- the 3-stage KME-CML method;
- the Maximum Likelihood estimator computed with given marginals, in order to assess the loss in efficiency associated with the absence of knowledge of the marginals.
We also considered the 2-stage CML method, which delivered results in between those of the KME-CML and ML methods, as expected.
Coverage rate (95%) for ρ and ν. % of convergence failures. (Bivariate T-copula estimated with the KME-CML method)
Coverage rate (95%) for ρ and ν. % of convergence failures. (Bivariate T-copula estimated with the ML method)
Coverage rate (95%) for ρ and ν. (Ill-specified ten-variate T-copula estimated with the KME-CML method)
Coverage rate (95%) for ρ and ν. (Ill-specified ten-variate T-copula estimated with the ML method)
% of convergence failures. % of times when the correlation matrix was not positive definite. Mean / Median biases (in %), and R-RMSE of ν. (Ill-specified ten-variate T-copula estimated with the KME-CML method)
% of convergence failures. % of times when the correlation matrix was not positive definite. Mean / Median biases (in %), and R-RMSE of ν. (Ill-specified ten-variate T-copula estimated with the ML method)
Coverage rate (95%) for ρ and ν. (Dow-Jones returns, ten-variate T-copula estimated with the KME-CML method)
Coverage rate (95%) for ρ and ν. (Dow-Jones returns, ten-variate T-copula estimated with the ML method)
% of convergence failures. % of times when the correlation matrix was not positive definite. Mean / Median biases (in %), and R-RMSE of ν. (Dow-Jones returns, ten-variate T-copula estimated with the KME-CML method)
% of convergence failures. % of times when the correlation matrix was not positive definite. Mean / Median biases (in %), and R-RMSE of ν. (Dow-Jones returns, ten-variate T-copula estimated with the ML method)
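The summary statistics named in the captions above could be computed from a set of Monte Carlo replications roughly as follows. The presentation does not spell out its formulas, so two assumptions are made here and should be checked against the underlying paper: biases "in %" are taken relative to the true value, and R-RMSE is read as the root mean squared relative error. The function names are hypothetical.

```python
import numpy as np


def coverage_rate(estimates, std_errors, true_value, z=1.96):
    """Share of replications whose normal-approximation interval contains the truth."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    lower = estimates - z * std_errors
    upper = estimates + z * std_errors
    return float(np.mean((lower <= true_value) & (true_value <= upper)))


def bias_summary(estimates, true_value):
    """Mean and median relative bias (in %) and relative RMSE (assumed definitions)."""
    rel_err = (np.asarray(estimates, dtype=float) - true_value) / true_value
    return {
        "mean_bias_pct": float(100 * rel_err.mean()),
        "median_bias_pct": float(100 * np.median(rel_err)),
        "r_rmse": float(np.sqrt(np.mean(rel_err ** 2))),
    }
```

With a nominal 95% level, a well-behaved estimator should give a coverage rate near 0.95; values far below that signal standard errors that are too small or a biased point estimate.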
Conclusions
• We examined the asymptotics and the finite-sample properties of a recent semi-parametric estimation method used in the financial literature with the multivariate Student's T-copula.
• We found that the KME-CML estimator was more efficient and less biased than the one-stage ML estimator for small samples and t-copulas with low degrees of freedom ν.
• For small samples and high ν, the numerical maximization of the log-likelihood failed to converge much more often for the ML method than for the KME-CML method.
• Yet, while the 95% coverage rates of the ML estimates of ν did not show any particular bias or trend, those of the KME-CML estimates were very low when ν was close to 30 and the correlations were not too strong.
• However, this drop in the coverage rates for ν was large only with bivariate t-copulas, while it was much smaller with ten-variate copulas.
• The coverage rates for the correlations were quite close to the nominal level.
• Finally, we found that the eigenvalue method of Rousseeuw and Molenberghs (1993) has to be used to obtain a positive definite correlation matrix only when dealing with very small samples (n < 100) and when the correlation matrix of the true underlying process has its lowest eigenvalue close to zero.
• This fix induces a positive mean bias in the estimate of ν, but its effects on the coverage rates are rather limited. Moreover, the number of times this method has to be used decreases quickly as ν increases.