Transformation Kernel Density Estimation With Applications

Gerhard KOEKEMOER and Jan W.H. SWANEPOEL

One of the main objectives of this article is to derive efficient nonparametric estimators for an unknown density $f_X$. It is well known that the ordinary kernel density estimator has, despite several good properties, some serious drawbacks. For example, it suffers from boundary bias and it also exhibits spurious bumps in the tails. We propose a semiparametric transformation kernel density estimator to overcome these defects. It is based on a new semiparametric transformation function that transforms data to normality. A generalized bandwidth adaptation procedure is also developed. It is found that the newly proposed semiparametric transformation kernel density estimator performs well for unimodal, low, and high kurtosis densities. Moreover, it detects and estimates densities with excessive curvature (e.g., modes and valleys) more effectively than existing procedures. In conclusion, practical examples based on real-life data are presented.

Key Words: Adaptive bandwidth selection; Normality; Semiparametric.

1. INTRODUCTION

Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed (iid) continuous random variables having a density function $f_X$. The standard kernel density estimator is defined as

$$\hat{f}_X(x; h) = \frac{1}{n} \sum_{i=1}^{n} k_h(x - X_i), \tag{1.1}$$

where $k_h(\cdot) = k(\cdot/h)/h$ is the so-called kernel (or weight) function and $h$ is the smoothing parameter or bandwidth. We assume that $k$ is a density function that is symmetric about zero. Two of the major drawbacks of $\hat{f}_X(x; h)$ are boundary bias and spurious bumps in the tails (see Section 2). The standard transformation kernel density estimator serves as an automatic solution to both drawbacks. In addition, the transformation kernel density estimator can detect density curvature more effectively. The use of transformations in kernel density estimation has been proposed by Devroye and Györfi (1985), Silverman (1986), and Wand et al. (1991). Related work was presented by Park et al. (1992), Ruppert and Wand (1992), Marron and Ruppert (1994), Ruppert and Cline (1994), Hössjer and Ruppert (1995), Yang and Marron (1999), Markovitch and Krieger (2000), and Bolancé et al. (2003).

Gerhard Koekemoer is Senior Lecturer, Department of Statistics, North-West University, Potchefstroom, South Africa (E-mail: [email protected]). Jan W.H. Swanepoel is Professor, Department of Statistics, North-West University, Potchefstroom, South Africa (E-mail: [email protected]).

© 2008 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America. Journal of Computational and Graphical Statistics, Volume 17, Number 3, Pages 750–769. DOI: 10.1198/106186008X318585

Let $g_\lambda(x)$ be a monotonic increasing transformation function with transformation parameter $\lambda$. Also, let $Y_i = g_\lambda(X_i)$, $i = 1, \ldots, n$, be the transformed data. The ordinary transformation kernel density estimator (henceforth referred to as TKDE) is given by

$$\hat{f}_X(x; h, \lambda) = g'_\lambda(x)\, \hat{f}_Y(g_\lambda(x); h) = \frac{1}{n} \sum_{i=1}^{n} g'_\lambda(x)\, k_h\{g_\lambda(x) - g_\lambda(X_i)\}. \tag{1.2}$$
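To make the two estimators concrete, the following is a minimal sketch of (1.1) and (1.2) with a standard normal kernel. The logarithmic transformation, the bandwidth value, and all function names are assumptions chosen for illustration; they are not the transformation or bandwidth selector used in this article.

```python
import math
import random

def gaussian_kernel(u):
    """Standard normal density, used as the kernel k."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, data, h):
    """Ordinary kernel density estimate (1.1) at the point x."""
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (len(data) * h)

def tkde(x, data, h, g, g_prime):
    """Transformation KDE (1.2): g'(x) times the KDE of the transformed data at g(x)."""
    transformed = [g(xi) for xi in data]
    return g_prime(x) * kde(g(x), transformed, h)

def trapezoid(values, grid):
    """Trapezoidal rule for values tabulated on grid."""
    return sum(0.5 * (values[j] + values[j + 1]) * (grid[j + 1] - grid[j])
               for j in range(len(grid) - 1))

random.seed(0)
sample = [random.lognormvariate(0.0, 1.0) for _ in range(200)]  # positive, skewed data
h = 0.4                                     # illustrative bandwidth on the log scale
grid = [0.02 * j for j in range(1, 3001)]   # x from 0.02 to 60
density = [tkde(x, sample, h, math.log, lambda t: 1.0 / t) for x in grid]
area = trapezoid(density, grid)
print(round(area, 3))  # should be close to 1
```

Because the estimate is built on the log scale and mapped back, it places no mass below zero, which is the mechanism by which a transformation removes boundary bias for positive data.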

In this article the implementation of a TKDE is discussed, based on the semiparametric transformation function proposed by Koekemoer and Swanepoel (2008). A review of ordinary kernel density estimation and the semiparametric transformation function is given in Sections 2 and 3, respectively. A general bandwidth adaptation scheme is introduced in Section 4. A brief literature study of related work in the context of the TKDE is given in Section 5, which also contains the proposed semiparametric TKDE. Empirical studies are presented in Section 6 consisting of Monte Carlo simulations and three real-life applications. Section 7 summarizes our main conclusions.

2. STANDARD KERNEL DENSITY ESTIMATION

In this section a brief discussion of the ordinary kernel density estimator is given with regard to issues such as the efficiency measure, choice of an appropriate kernel function, choice of the bandwidth, boundary bias, and spurious bumps in the tails. For a more comprehensive discussion the reader is referred to Silverman (1986), Wand and Jones (1995), and Koekemoer (2004).

2.1 EFFICIENCY MEASURE

The functional $R(f_X'') = \int \{f_X''(x)\}^2\, dx$ is a measure of total curvature of $f_X$. For target densities $f_X$ with “sharp” features such as high skewness or several modes, $|f_X''(x)|$ will take on relatively large values, resulting in a large value of $R(f_X'')$. For densities without such features $R(f_X'')$ is smaller, and such densities are hence easier to estimate. It is easily verified (Wand and Jones 1995) that

$$D(f_X) := \sigma_x^5 R(f_X'')$$

is a scale invariant (difficulty) measure, where $\sigma_x^2$ is the variance of $f_X$. Terrell (1990) showed that, for a fixed variance, $R(f_X'')$ is minimized by the beta-density given by

$$f^*(x) = \frac{35}{32}(1 - x^2)^3, \quad -1 \le x \le 1.$$

Hence, this density is the easiest to estimate using ordinary kernel density estimation. An efficiency measure for kernel estimation is therefore (Wand and Jones 1995)

$$\mathrm{Eff}(f_X) = \left\{\frac{D(f^*)}{D(f_X)}\right\}^{1/4}.$$
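As a numerical check of these definitions, the sketch below evaluates $R(f'')$, $D(f)$, and the efficiency for the standard normal density by Simpson's rule. The helper names and the integration settings are assumptions of this illustration; the second derivatives are written out by hand.

```python
import math

def simpson(func, a, b, pieces=2000):
    """Composite Simpson rule on [a, b] with an even number of subintervals."""
    step = (b - a) / pieces
    total = func(a) + func(b)
    for j in range(1, pieces):
        total += (4.0 if j % 2 else 2.0) * func(a + j * step)
    return total * step / 3.0

def normal_second_derivative(x):
    # phi''(x) = (x^2 - 1) * phi(x) for the standard normal density phi
    return (x * x - 1.0) * math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def beta_second_derivative(x):
    # f*(x) = (35/32)(1 - x^2)^3, so f*''(x) = -(105/16)(1 - x^2)(1 - 5 x^2)
    return -(105.0 / 16.0) * (1.0 - x * x) * (1.0 - 5.0 * x * x)

# Total curvature R(f'') of each density.
r_normal = simpson(lambda x: normal_second_derivative(x) ** 2, -10.0, 10.0)
r_beta = simpson(lambda x: beta_second_derivative(x) ** 2, -1.0, 1.0)

# Difficulty D(f) = sigma^5 R(f''); the variance of f* is 1/9, that of phi is 1.
d_normal = 1.0 ** 5 * r_normal
d_beta = (1.0 / 3.0) ** 5 * r_beta

efficiency_normal = (d_beta / d_normal) ** 0.25
print(round(r_beta, 4), round(efficiency_normal, 3))
```

The same routine can be pointed at any density whose second derivative is available, which gives a quick way to rank target densities by difficulty.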


It can be shown that the normal density has an efficiency of 0.908. This motivates a transformation to normality in the context of the TKDE.

2.2 CHOICE OF KERNEL FUNCTION AND SMOOTHING PARAMETER

It is well known that the choice of the kernel function $k$ is not crucial. For this reason the standard normal density is used in this article, since it simplifies computation. The choice of $h$ is of far greater importance. There exists an extensive literature on the selection of the optimal data-based smoothing parameter. The reader is referred to Rudemo (1982) and Bowman (1984) (least-squares cross-validation); Scott and Terrell (1987) (biased cross-validation); Müller (1985), Staniswalis (1989), and Hall et al. (1992) (smoothed cross-validation); and Chiu (1991a), Chiu (1991b), and Chiu (1992) (characteristic function cross-validation). However, the procedure proposed by Sheather and Jones (1991) is used in this article. For a more extensive discussion concerning the methods described above, the reader is referred to Wand and Jones (1995) and Koekemoer (1999). Simulation and comparative studies are found in Park and Marron (1990), Park and Turlach (1992), Cao et al. (1994), Loader (1995), Jones et al. (1996), and Chiu (1996).

2.3 BOUNDARY BIAS

Next, the behavior of the kernel density estimator is explored near the boundary of the domain of the random variable of interest, provided that the latter is naturally bounded from below, from above, or from both sides. Broadly speaking, the kernel estimator has to find a compromise between estimating the two distinct values of $f$ on either side of the boundary. The reader is referred to Silverman (1986) and Wand and Jones (1995) for clarifying discussions surrounding boundary bias. Since the location of the boundary is usually known, $\hat{f}_X(x; h)$ can be adapted to achieve better performance in the boundary vicinity. There is an extensive literature on how to correct this boundary effect. The simplest procedure is to normalize $\hat{f}_X(x; h)$ by dividing by $\int_{(x-b)/h}^{(x-a)/h} k(z)\, dz$, where $a$ and $b$ are the boundaries of the domain. This yields consistency near the boundary, but still results in $O(h)$ bias there. Schuster (1985), Silverman (1986), and Cline and Hart (1991) discussed the ordinary reflection method, which amounts to adding more data points near the boundary. Some authors proposed the use of so-called boundary kernels. One family of boundary kernels was given by Gasser and Müller (1979). Another boundary kernel that can be used was given by Zhang et al. (1999). Zhang and Karunamuni (1998) proposed to vary the bandwidth in the boundary region. Zhang et al. (1999) proposed a more advanced reflection technique where a transformation is used to generate pseudo-data beyond the endpoints of the support of the density. Another pseudo-data method estimator is that of Cowling and Hall (1996). Jones and Foster (1996) proposed a nonnegative adaptation estimator. Alternative boundary correction procedures were proposed by, among others, Burnham et al. (1980), Marron and Ruppert (1994), Hjort (1996), and Alberts and Karunamuni (2003).

In Section 5 the transformation method is employed to combat boundary bias. The input variable (with bounded support) is transformed to a new variable with unbounded support. The density of this variable is then estimated and transformed back to the original scale. Hence, transformation to a normal density is suggested as a natural solution for boundary bias.

2.4 SPURIOUS BUMPS IN THE TAILS

The kernel density estimator defined in (1.1), evaluated at a specific point $x$, is the average of $n$ normal density functions evaluated at $x$, each with mean $X_i$ and standard deviation $h$, if the standard normal kernel is used. The weight contribution of the latter is approximately zero for each $X_i$ that is further than four bandwidths away from $x$. Consequently, it is found that in regions where data are scarce only a limited number of kernel functions contribute to the kernel estimator. Hence, spurious bumps occur in these regions, usually the tail regions of a distribution. One natural way to deal with the occurrence of spurious bumps is to increase the bandwidth. However, it is known that an increased bandwidth results in a density estimate with increased bias, hence an oversmoothed estimate. Abramson (1982) suggested using larger bandwidths in regions of low density. Hall and Davison (1997) proposed a variable-bandwidth method that uses approximately equal amounts of information to estimate the density at all points.

Alternatively, the transformation kernel density estimator may be used to address the problem of spurious bumps. Recall that the TKDE in (1.2) is given by

$$\hat{f}_X(x; h, \lambda) = \frac{1}{n} \sum_{i=1}^{n} g'_\lambda(x)\, k_h\{g_\lambda(x) - g_\lambda(X_i)\}.$$

By Taylor's theorem we have that $g_\lambda(x) - g_\lambda(X_i) \approx g'_\lambda(x)(x - X_i)$. Hence,

$$\hat{f}_X(x; h, \lambda) \approx \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h/g'_\lambda(x)}\, k\left(\frac{x - X_i}{h/g'_\lambda(x)}\right). \tag{2.1}$$

This shows that the TKDE with a bandwidth $h$ is similar to the so-called balloon kernel estimator with bandwidth $h/g'_\lambda(x)$. The general balloon density estimator uses a different bandwidth for each estimation point $x$ and it can lead to improvement over kernel density estimators using global bandwidth choices. The balloon estimator is more flexible and better able to model complex (multimodal) densities (see Terrell and Scott 1992, and the references therein).

From the discussion presented above it is clear that the TKDE addresses both the issues of boundary bias and spurious bumps in the tails in a natural and automatic manner. Section 3 is devoted to transformation functions, since identifying the correct transformation is essential for the success of the TKDE.
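The balloon interpretation in (2.1) can be explored numerically. The sketch below assumes $g(x) = \log x$, so that the effective bandwidth $h/g'(x)$ equals $hx$ and grows in the right tail where data are scarce. The function names, the simulated lognormal sample, and the value of $h$ are assumptions of this illustration only.

```python
import math
import random

def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def tkde_log(x, data, h):
    """TKDE (1.2) with g(x) = log(x), so that g'(x) = 1/x."""
    return sum(gaussian_kernel((math.log(x) - math.log(xi)) / h)
               for xi in data) / (len(data) * h * x)

def effective_bandwidth(x, h):
    """Bandwidth h / g'(x) of the balloon estimator in (2.1); equals h * x for the log."""
    return h / (1.0 / x)

def balloon(x, data, h):
    """Right-hand side of (2.1): a KDE whose bandwidth depends on the point x."""
    bw = effective_bandwidth(x, h)
    return sum(gaussian_kernel((x - xi) / bw) for xi in data) / (len(data) * bw)

random.seed(0)
sample = [random.lognormvariate(0.0, 1.0) for _ in range(300)]
h = 0.2
for x in (0.5, 1.0, 3.0, 8.0):
    # columns: x, effective bandwidth, TKDE value, balloon approximation
    print(x, round(effective_bandwidth(x, h), 3),
          round(tkde_log(x, sample, h), 4), round(balloon(x, sample, h), 4))
```

The two estimates agree exactly when the evaluation point coincides with a single data point, and differ elsewhere only through the higher-order Taylor terms dropped in (2.1).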

3. A SEMIPARAMETRIC TRANSFORMATION FUNCTION

Parametric transformation of data to normality has received much attention in the literature. The reader is referred to Johnson (1949), Tukey (1957), Box and Cox (1964), Manley (1976), John and Draper (1980), Bickel and Doksum (1981), Burdige et al. (1988), Yang and Marron (1999), Yeo and Johnson (2000), and Ruppert and Wand (1992) for transformation functions that can be applied to transform data to normality. Sakia (1992) and Koekemoer and Swanepoel (2008) provided an overview of existing procedures.

To successfully transform data to normality, a black box of possible pilot transformations is used (see Table 1). A detailed discussion of these transformations and their parameter restrictions was given by Koekemoer and Swanepoel (2008). For this black box to be exhaustive in the number of shapes that the transformations can possess, the input data are standardized so that transformations that change shape will be included automatically. The method of standardizing data used in this article is given by

$$Z_x = \frac{X - \hat{\mu}_x}{\hat{\sigma}_x},$$

where $\hat{\mu}_x$ is the sample median of the data $X_1, \ldots, X_n$ and

$$\hat{\sigma}_x = \min\left\{ s_x,\ \frac{\hat{q}_3 - \hat{q}_1}{\Phi^{-1}(3/4) - \Phi^{-1}(1/4)} \right\}, \tag{3.1}$$

where $\hat{q}_1$ and $\hat{q}_3$ are the first and third sample quartiles, respectively, and $s_x^2$ is the usual unbiased sample variance. Hence, the mapping used is the following: $X \to Z \to Y$, where one of the pilot transformations from the black box, defined in Table 1, is applied to transform the data from $Z$ to $Y$. After these two initial transformations a nonparametric transformation function (defined in (3.2) and (3.3) below) is applied to the $Y$-values.

The profile maximum likelihood method as well as the two methods proposed by Koekemoer and Swanepoel (2008), namely the minimum residual and minimum distance methods, are used to estimate the transformation parameter $\lambda$. The Shapiro–Wilk (1965) test is generally accepted to be the standard test for normality of data. Hence, the transformation yielding $Y$-data with the highest Shapiro–Wilk test statistic value is selected as the “best” pilot transformation from the black box.

The resulting semiparametric transformation function to normality proposed by Koekemoer and Swanepoel (2008), henceforth denoted by $\hat{g}$, is defined by

$$\hat{g}(t) = \Phi^{-1}\left\{ \hat{F}_Y(t; \hat{h}) \right\}, \tag{3.2}$$

where $\hat{F}_Y(\cdot; \hat{h})$ is an ordinary kernel estimate of the distribution function $F_Y(\cdot)$, that is,

$$\hat{F}_Y(t; \hat{h}) = \frac{1}{n} \sum_{i=1}^{n} K\left(\frac{t - Y_i}{\hat{h}}\right), \tag{3.3}$$

and $\hat{h}$ is a suitable bandwidth selector with $K$ the distribution function corresponding to $k$. The bandwidth selector proposed by Polansky and Baker (2000) is used throughout. The reader is referred to Koekemoer and Swanepoel (2008) for a complete Monte Carlo simulation study that evaluates the performance of the semiparametric transformation function, the parameter estimation techniques, as well as the transformation selection procedure.
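The standardization in (3.1) and the transformation in (3.2) and (3.3) can be sketched with the Python standard library alone. The quartile definition (the default of `statistics.quantiles`), the bandwidth constant, the clipping guard, and the function names are assumptions of this illustration; the article uses the Polansky and Baker (2000) selector, and it applies $\hat{g}$ to pilot-transformed data rather than directly to the standardized data as done here.

```python
import random
from statistics import NormalDist, median, quantiles, stdev

STD_NORMAL = NormalDist()

def robust_scale(data):
    """Scale estimate (3.1): the smaller of the sample sd and the normalized IQR."""
    q1, _, q3 = quantiles(data, n=4)   # assumed quartile definition
    iqr_scale = (q3 - q1) / (STD_NORMAL.inv_cdf(0.75) - STD_NORMAL.inv_cdf(0.25))
    return min(stdev(data), iqr_scale)

def standardize(data):
    """Center at the sample median and divide by the robust scale."""
    center, scale = median(data), robust_scale(data)
    return [(x - center) / scale for x in data]

def kernel_cdf(t, data, h):
    """Kernel distribution function estimate (3.3) with K the normal cdf."""
    return sum(STD_NORMAL.cdf((t - y) / h) for y in data) / len(data)

def g_hat(t, data, h):
    """Transformation to normality (3.2)."""
    u = kernel_cdf(t, data, h)
    u = min(max(u, 1e-12), 1.0 - 1e-12)   # guard: inv_cdf requires 0 < u < 1
    return STD_NORMAL.inv_cdf(u)

random.seed(0)
raw = [random.expovariate(1.0) for _ in range(200)]
z = standardize(raw)
h = 0.9 * len(z) ** (-0.2)   # illustrative bandwidth only
transformed = [g_hat(v, z, h) for v in z]
print(round(median(z), 6), round(min(transformed), 2), round(max(transformed), 2))
```

Since $\hat{F}_Y$ is strictly increasing when $K$ is the normal distribution function, $\hat{g}$ is a monotonic increasing transformation, as required of $g_\lambda$ in (1.2).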

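Table 1. Summary of the transformation functions in the black box.

Convex or concave shapes

| Transformation | $g_\lambda(z)$ | Restrictions |
|---|---|---|
| Shifted Box–Cox | $\{(z + \lambda_1)^{\lambda_2} - 1\}/\lambda_2$ | $\lambda_2 \neq 0$, $\lambda_1 > -\min(z)$ |
| | $\log(z + \lambda_1)$ | $\lambda_2 = 0$, $\lambda_1 > -\min(z)$ |
| Yeo and Johnson | $\{(z + 1)^{\lambda} - 1\}/\lambda$ | $z \ge 0$, $\lambda \neq 0$ |
| | $\log(z + 1)$ | $z \ge 0$, $\lambda = 0$ |
| | $-\{(-z + 1)^{2-\lambda} - 1\}/(2 - \lambda)$ | $z < 0$, $\lambda \neq 2$ |
| | $-\log(-z + 1)$ | $z < 0$, $\lambda = 2$ |
| Johnson | $\log(1 + \lambda J z)/(\lambda J)$ | $\gamma = 1$, $J = -1$, $0 < \lambda < 1/\max(z)$ |
| | | $\gamma = 1$, $J = +1$, $0 < \lambda < -1/\min(z)$ |

Convex to concave and concave to convex shapes

| Transformation | $g_\lambda(z)$ | Restrictions |
|---|---|---|
| Ruppert and Wand | $\lambda z + (1 - \lambda)\, \hat{\sigma}_z \sqrt{2\pi}\, \{\Phi(z/\hat{\sigma}_z) - 1/2\}$ | $0 \le \lambda \le 1$ |
| Johnson | $\log\{\lambda z + \sqrt{\lambda^2 z^2 + 1}\}/\lambda$ | $\gamma = 2$, $\lambda > 0$ |
| | $\log\{(1 + \lambda z)/(1 - \lambda z)\}/(2\lambda)$ | $\gamma = 3$, $0 < \lambda < \min\{-1/\min(z),\ 1/\max(z)\}$ |
| | $z$ | $\gamma = 4$ |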
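A few of the pilot transformations can be written down directly. The sketch below follows the formulas as we read them from Table 1; the function names are hypothetical, and parameter estimation and the Shapiro–Wilk selection step are not shown.

```python
import math

def shifted_box_cox(z, lam1, lam2):
    """Shifted Box-Cox transformation; requires z + lam1 > 0."""
    if z + lam1 <= 0.0:
        raise ValueError("lam1 must exceed -min(z)")
    if lam2 == 0.0:
        return math.log(z + lam1)
    return ((z + lam1) ** lam2 - 1.0) / lam2

def yeo_johnson(z, lam):
    """Yeo and Johnson transformation, defined for every real z."""
    if z >= 0.0:
        if lam == 0.0:
            return math.log(z + 1.0)
        return ((z + 1.0) ** lam - 1.0) / lam
    if lam == 2.0:
        return -math.log(-z + 1.0)
    return -((-z + 1.0) ** (2.0 - lam) - 1.0) / (2.0 - lam)

def johnson_gamma2(z, lam):
    """Johnson type with gamma = 2: log(lam z + sqrt(lam^2 z^2 + 1)) / lam."""
    return math.asinh(lam * z) / lam

def johnson_gamma3(z, lam):
    """Johnson type with gamma = 3: log((1 + lam z) / (1 - lam z)) / (2 lam)."""
    if abs(lam * z) >= 1.0:
        raise ValueError("need |lam * z| < 1")
    return math.atanh(lam * z) / lam

for z in (-1.5, 0.0, 1.5):
    print(z, round(yeo_johnson(z, 0.5), 4), round(johnson_gamma2(z, 1.0), 4),
          round(johnson_gamma3(z, 0.5), 4))
```

The Yeo and Johnson family with $\lambda = 1$ and the Johnson family with $\gamma = 4$ both reduce to the identity, so the black box always contains the option of leaving the standardized data untransformed.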

4. A GENERAL ADAPTATION SCHEME

Empirical studies suggest that the standard TKDE suffers from potentially explosive behavior in the tail region when the transformation used is concave at the left tail or convex at the right tail. A remedy for this potential problem is to use bandwidth adaptation. Consider the random variables $Y_1, \ldots, Y_n$ having density function $f_Y$ and distribution function $F_Y$. Our proposed adaptation scheme uses both pilot density function and pilot distribution function estimates and includes the popular adaptation scheme of Abramson (1982) as a special case. The method is based on the fact that the TKDE can be considered as an ordinary kernel density estimator with variable bandwidth. Furthermore, we suggest using the flexible class of beta($\alpha$, $\beta$)-distributions as a transformation target.

Let $\mathrm{be}(\cdot, \alpha, \beta)$ and $\mathrm{Be}(\cdot, \alpha, \beta)$ be the beta-density function and distribution function, respectively, with support $[0, 1]$. A nonparametric transformation (Koekemoer and Swanepoel 2008) to a predetermined beta-distribution is given by

$$\hat{l}(Y_i) = \mathrm{Be}^{-1}\left\{ \hat{F}_Y(Y_i; \hat{h}), \alpha, \beta \right\}, \quad i = 1, \ldots, n,$$

where $\hat{F}_Y$ is an ordinary kernel distribution function estimate (see (3.3)). The derivative of this transformation is

$$\hat{l}'(Y_i) = \frac{\hat{f}_Y(Y_i; \hat{h})}{\mathrm{be}\left[ \mathrm{Be}^{-1}\left\{ \hat{F}_Y(Y_i; \hat{h}), \alpha, \beta \right\}, \alpha, \beta \right]}, \quad i = 1, \ldots, n,$$

where $\hat{f}_Y$ is an ordinary kernel density estimate (see (1.1)). Let

$$\tilde{\tau}_i = \left\{ \frac{\hat{l}'(Y_i)}{g} \right\}^{-\tilde{\alpha}}, \quad \text{where} \quad \log g = \frac{1}{n} \sum_{i=1}^{n} \log \hat{l}'(Y_i).$$

Also, let $\tilde{\tau}_{(1)}, \ldots, \tilde{\tau}_{(n)}$ be the order statistics associated with $\tilde{\tau}_1, \ldots, \tilde{\tau}_n$. The proposed adaptation scheme is then given by

$$\tau_i = \tilde{\tau}_i + \frac{1 - \tilde{\tau}_{(1)}}{c_a}, \tag{4.1}$$

where $c_a$ is a positive constant that determines the smallest bandwidth used. The shift $(1 - \tilde{\tau}_{(1)})/c_a$ is motivated by the fact that smaller bandwidths lead to an increase in variance of the resulting density function and distribution function estimators. The corresponding adaptive density function and distribution function estimators are then given by

$$\hat{f}_Y(y; \hat{b}, \tau) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{\tau_i \hat{b}}\, k\left(\frac{y - Y_i}{\tau_i \hat{b}}\right) \quad \text{and} \quad \hat{F}_Y(y; \hat{h}, \tau) = \frac{1}{n} \sum_{i=1}^{n} K\left(\frac{y - Y_i}{\tau_i \hat{h}}\right). \tag{4.2}$$

For the choice $c_a = 1$ (which is used in this article), the smallest bandwidths used in the estimates presented in (4.2) are $\hat{b}$ and $\hat{h}$, respectively. The sensitivity parameter $\tilde{\alpha}$ is fixed at $\tilde{\alpha} = 1/2$, which is similar to the choice made by Abramson (1982). For $\alpha = \beta = 1$ the beta-density becomes the uniform density, and as $\alpha = \beta$ increases the resulting density takes an increasingly normal-like form. Hence, $\alpha = \beta = 1$ causes more drastic adaptation, while larger $\alpha = \beta$ results in less adaptation. It should also be noted that the proposed adaptation scheme reduces to the adaptation scheme proposed by Abramson (1982) when $\alpha = \beta = 1$ and $c_a = +\infty$. The adaptation scheme of Abramson (1982) can therefore be seen as similar to the application of a TKDE for a transformation to a uniform target distribution. The effect of various values of $\alpha = \beta$ for randomly selected standard normal data is illustrated in Figure 1.
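The adaptation scheme in (4.1) and the adaptive estimator in (4.2) can be sketched as follows. The beta distribution function is obtained by Simpson's rule and its inverse by bisection, which is an implementation shortcut assumed here (adequate for $\alpha, \beta \ge 1$) rather than the authors' method; the pilot bandwidth, the idealized sample of normal quantiles, and all function names are likewise illustrative assumptions.

```python
import math
from statistics import NormalDist

PHI = NormalDist()

def beta_pdf(u, a, b):
    """Density of the beta(a, b) distribution on [0, 1]."""
    const = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return const * u ** (a - 1.0) * (1.0 - u) ** (b - 1.0)

def beta_cdf(u, a, b, pieces=100):
    """Beta distribution function by Simpson's rule (assumes a, b >= 1)."""
    step = u / pieces
    total = beta_pdf(0.0, a, b) + beta_pdf(u, a, b)
    for j in range(1, pieces):
        total += (4.0 if j % 2 else 2.0) * beta_pdf(j * step, a, b)
    return total * step / 3.0

def beta_inv_cdf(p, a, b):
    """Beta quantile by bisection on beta_cdf."""
    lo, hi = 0.0, 1.0
    for _ in range(30):
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def adaptation_weights(data, h, a=1.0, b=1.0, sensitivity=0.5, c_a=1.0):
    """Bandwidth multipliers tau_i of (4.1) for a beta(a, b) target."""
    n = len(data)
    slopes = []
    for y in data:
        f_hat = sum(PHI.pdf((y - yj) / h) for yj in data) / (n * h)
        big_f_hat = sum(PHI.cdf((y - yj) / h) for yj in data) / n
        target_point = beta_inv_cdf(big_f_hat, a, b)
        slopes.append(f_hat / beta_pdf(target_point, a, b))   # derivative l'(Y_i)
    geo_mean = math.exp(sum(math.log(s) for s in slopes) / n)
    tilde = [(s / geo_mean) ** (-sensitivity) for s in slopes]
    shift = (1.0 - min(tilde)) / c_a
    return [t + shift for t in tilde]

def adaptive_kde(y, data, b_hat, tau):
    """Adaptive density estimate from (4.2)."""
    return sum(PHI.pdf((y - yi) / (t * b_hat)) / (t * b_hat)
               for yi, t in zip(data, tau)) / len(data)

n = 41
data = [PHI.inv_cdf((i + 0.5) / n) for i in range(n)]   # idealized normal sample
tau_uniform = adaptation_weights(data, h=0.5, a=1.0, b=1.0)
tau_beta4 = adaptation_weights(data, h=0.5, a=4.0, b=4.0)
abramson = adaptation_weights(data, h=0.5, c_a=math.inf)   # no shift applied
print(round(min(tau_uniform), 3), round(max(tau_uniform), 3), round(max(tau_beta4), 3))
```

With $c_a = 1$ the smallest multiplier equals one, so no observation receives a bandwidth below the global one; with $c_a = +\infty$ the shift vanishes and the multipliers have geometric mean one, as in the scheme of Abramson (1982).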

Figure 1. Values of $\tau$ based on $n = 200$ standard normal data points.

5. TRANSFORMATION KERNEL DENSITY ESTIMATION

5.1 OVERVIEW

Perhaps the most informative article on the topic of the transformation kernel density estimator was written by Wand et al. (1991). The authors primarily considered density estimation for positive, skewed data based on the shifted Box–Cox transformation function. Park et al. (1992) investigated the performance of the TKDE based on the procedure proposed by Wand et al. (1991) through a simulation study. Ruppert and Wand (1992) considered densities with high kurtosis. They concluded that the TKDE based on their proposed transformation (see Table 1) is superior to the ordinary kernel density estimate for densities with high kurtosis, while similar performance can be expected for densities close to normality, that is, densities that are easy to estimate. Marron and Ruppert (1994) used the TKDE to reduce boundary bias if the support of the unknown density is the interval $[0, 1]$.

Ruppert and Cline (1994) were the first authors to propose the use of a nonparametric transformation to a predetermined target distribution $G_0(\cdot)$ to reduce bias in kernel density estimation. The authors argued that a uniform target distribution is particularly interesting, since its density has all derivatives equal to 0 so that bias is asymptotically negligible. However, boundary bias will occur when estimating the uniform target density of the transformed data by applying ordinary kernel density estimation. Moreover, spurious bumps in the tails can appear when estimating the derivative of the transformation function. Despite the occurrence of these potentially harmful effects, Ruppert and Cline (1994) implemented the procedure with $G_0(\cdot)$ being the uniform distribution. Their findings are that the nonparametric TKDE seems to be highly effective at capturing interesting features such as multiple modes and densities with sharp peaks. However, for extremely skewed or heavy-tailed densities, the poor performance of the estimate of the transformation function derivative seriously degrades the performance of the resulting nonparametric TKDE. Hössjer and Ruppert (1995) studied the asymptotics for the nonparametric TKDE proposed by Ruppert and Cline (1994).

Yang and Marron (1999) proposed the use of the ordinary TKDE, where the transformation used is defined as a reparameterization of the versatile transformation family proposed by Johnson (1949). The authors found that transforming the data twice yields an estimate much superior to the estimate without transformation and that, in most cases, little improvement is achieved after two transformation steps. Specifically, they reported that the transformation density estimate is better than the untransformed estimate in overall smoothness and the capturing of peaks. Markovitch and Krieger (2000) considered the standard TKDE to study World Wide Web (WWW) traffic measurements, since different traffic characteristics can be modeled by long-tailed densities.

Bolancé et al. (2003) estimated actuarial loss distributions based on a symmetrized version of the transformation approach proposed by Wand et al. (1991). From an actuarial background, the authors gave valuable motivational insight for the application of the TKDE in the actuarial context. They showed by means of a simulation study that the proposed method is able to estimate all three possible kinds of tails, as defined by Embrechts et al. (1997), namely the Fréchet type, the Gumbel type, and the Weibull type, which makes the methodology extremely powerful for actuaries in various disciplines. In conclusion, Bolancé et al. (2003) mentioned that the ordinary TKDE can also be useful to estimate the tail index of an extreme value distribution.

5.2 SEMIPARAMETRIC TRANSFORMATION KERNEL DENSITY ESTIMATOR

The results derived in the previous sections will now be used to define our semiparametric TKDEs. The parametric transformation in the black box is responsible for removing spurious bumps in the tails as well as minimizing boundary bias. The nonparametric transformation (defined in (3.2) and (3.3)) is responsible for capturing density curvature more effectively. It should be noted that a transformation is potentially harmful to the TKDEs in regions where the input data are stretched out, that is, when the transformation is concave at the lower bound and/or convex at the upper bound. In these regions the derivative of the parametric transformation function can be large, hence the resulting TKDEs may overestimate in these regions.

At the parametric level we propose adding a shift quantity to the standardized input data, after the best transformation function has been selected from the black box. The transformation is then applied to the shifted standardized input data. In this way the shape of the transformation selected is preserved. Note that Park et al. (1992) also proposed adding a constant $c > 0$ to the data when applying the TKDE proposed by Wand et al. (1991), and reported that this improved the performance of the estimate. However, no data-based procedure of choosing $c$ was given. At the nonparametric level, potentially explosive behavior in the derivative is controlled by using the general bandwidth adaptation scheme derived in Section 4.

We first introduce some new notation. Suppose that $Z_{x,j} = (X_j - \hat{\mu}_x)/\hat{\sigma}_x$, $j = 1, \ldots, n$; then, for ease of notation, the subscript $j$ will be suppressed, that is, we write $Z_x$ instead of $Z_{x,j}$. The subscript $x$ therefore indicates that the $X$-data are standardized. Also, if $Y_{i,j}$, $j = 1, \ldots, n$, denote the transformed data at the $i$th iteration step, we will simply write $Y_i$ (omitting the index $j$). The index $i$ will therefore be used as an iteration step index, for $i = 0, 1, \ldots, r$ (say). Write $\tau^i = (\tau_1^i, \tau_2^i, \ldots, \tau_n^i)$, as defined in (4.1). If $\alpha$ and $\beta$ are fixed at $\alpha = \beta = 1$, we denote $\tau^i$ by $\tau^{i*}$. Let $x_1, \ldots, x_m$ be a finely spaced grid. Define $\delta$ as a shift constant and let $0 < \varepsilon \le 1$. Next, the stepwise algorithm of our semiparametric TKDEs is presented.

Step 1. Standardize the input data, that is, let

$$Z_x = \frac{X - \hat{\mu}_x}{\hat{\sigma}_x},$$

where $\hat{\mu}_x = \hat{q}_2$ is the sample median and $\hat{\sigma}_x$ is defined in expression (3.1). The standardized grid is then given by $z_x = (x - \hat{\mu}_x)/\hat{\sigma}_x$. Here, the $x$ in $z_x$ refers to the grid points $x_1, \ldots, x_m$.

Step 2. Apply the black box of transformations to $Z_x$. The parameter $\lambda$ is estimated using the profile maximum likelihood, minimum residual, and minimum distance methods. Choose the transformation yielding the highest Shapiro–Wilk test statistic value.

Step 3. Determine the curvature of the parametric transformation selected. Knowledge of the curvature is utilized to protect the TKDEs against potentially explosive behavior in the tail regions as follows: Let $m_x$ and $M_x$ be the minimum and maximum standardized input data points, respectively. Define, for some estimate $\hat{\delta}$,

$$Y_0 = g_{\hat{\lambda}}(Z_x + \hat{\delta}). \tag{5.1}$$

Consider the bias-corrected density estimate adapted according to the scheme presented in expression (4.1) with $\alpha = \beta = 4$, that is,

$$\hat{f}_X(X_i; \hat{\gamma}, \tilde{a}, \tilde{b}) = \left[ \int_{(X_i - \tilde{b})/\hat{\gamma}}^{(X_i - \tilde{a})/\hat{\gamma}} k(t)\, dt \right]^{-1} \frac{1}{n} \sum_{j=1}^{n} \frac{1}{\hat{\tau}_j \hat{\gamma}}\, k\left(\frac{X_i - X_j}{\hat{\tau}_j \hat{\gamma}}\right), \tag{5.2}$$

where $\tilde{a}$ and $\tilde{b}$ are the left and right bounds of the domain of $X$. The smoother $\hat{\gamma}$ is selected according to the method of Sheather and Jones (1991). The choice $\alpha = \beta = 4$ corresponds to a transformation to the beta(4,4)-density, which is the easiest density to estimate using kernel density estimation methods (see Section 2.1).

• If $g_{\hat{\lambda}}(\cdot)$ is concave, then potentially explosive behavior in the TKDEs can occur at the left bound of the input data, since the derivative of the parametric transformation can be very large in this region. To reduce this effect, $\hat{\delta}$ is added to $Z_x$ as in (5.1), ensuring that the transformed data are pushed away from the lower bound. The estimate $\hat{\delta}$ is derived by choosing that $\delta$, $0 \le \delta \le 0.5$, which minimizes

$$\sum_{i=1}^{n} \left[ g'_{\hat{\lambda}}(Z_{x,i} + \delta)\, \hat{f}_{Y_0}\{g_{\hat{\lambda}}(Z_{x,i} + \delta); \hat{b}\} - \hat{f}_X(X_i; \hat{\gamma}, \tilde{a}, \tilde{b}) \right]^2, \tag{5.3}$$

where $\hat{f}_{Y_0}$ is based on no bandwidth adaptation.

• If $g_{\hat{\lambda}}(\cdot)$ is convex, then potentially explosive behavior in the TKDEs can occur at the right bound of the input data, since the derivative of the parametric transformation can be very large in this region. In this case, $\hat{\delta}$ is chosen as that $\delta$, $-0.5 \le \delta \le 0$, which minimizes the discrepancy measure given in (5.3). The effect of $\hat{\delta}$ is that the transformed data are pulled away from the upper bound.

• For the case where $g_{\hat{\lambda}}(\cdot)$ is concave to convex, potentially explosive behavior in the TKDEs can occur at both the left and right bounds of the transformed data. If $|g_{\hat{\lambda}}(m_x)| \ge |g_{\hat{\lambda}}(M_x)|$, minimize the discrepancy measure given in (5.3) for $0 \le \delta \le 0.5$; else, minimize the measure for $-0.5 \le \delta \le 0$. The effect of $\hat{\delta}$ is to pull or push ($\hat{\delta}$ can be negative or positive) the transformed data away from the bound where the transformation derivative is the largest, hence, the bound where the most damage is caused in the TKDEs.

• For the case where $g_{\hat{\lambda}}(\cdot)$ is convex to concave, select $\hat{\delta}$ exactly as above (concave to convex transformations).

It should be noted that $\hat{\delta} = 0$ corresponds to the original transformation. The following stop criteria are introduced in the search for a suitably small $\hat{\delta}$. Let $\delta_i$, $i = 1, \ldots, r$, be the $i$th $\delta$-value considered and select $\delta_1 = 0$. If $g_{\hat{\lambda}}(\cdot)$ is a concave to convex transformation function, we apply Criterion 1, and for all other transformation shapes Criterion 2 is applied:

Criterion 1: Stop the search when

$$\max\left\{ g'_{\hat{\lambda}}(m_x + \delta_i),\ g'_{\hat{\lambda}}(M_x + \delta_i) \right\} \ge \max\left\{ g'_{\hat{\lambda}}(m_x + \delta_{i-1}),\ g'_{\hat{\lambda}}(M_x + \delta_{i-1}) \right\}, \quad i = 2, \ldots, r.$$

This will ensure that no artificial outliers are created in the transformed data.

Criterion 2: Stop the search when

$$\left[ g_{\hat{\lambda}}(M_x + \delta_i) - g_{\hat{\lambda}}(m_x + \delta_i) \right] \le \varepsilon \left[ g_{\hat{\lambda}}(M_x) - g_{\hat{\lambda}}(m_x) \right], \quad i = 2, \ldots, r.$$

For the simulation study presented in Section 6, we used $\varepsilon = 0.8$. This criterion ensures that the range of the transformed data is not too small.

Other ways of choosing $\hat{\delta}$ can be developed to improve the performance of the eventual TKDEs. However, this was not pursued any further.

Step 4. Once $\hat{\delta}$ is determined, the transformed data and grid used are given, respectively, by

$$Y_0 = g_{\hat{\lambda}}(Z_x + \hat{\delta}) \quad \text{and} \quad y_0 = g_{\hat{\lambda}}(z_x + \hat{\delta}).$$

The 0-step TKDE is then given by

$$\hat{f}_X(x; \hat{b}_0, \hat{\lambda}, \hat{\tau}^0) = \left(\frac{dy_0}{dx}\right) \hat{f}_{Y_0}(y_0; \hat{b}_0, \hat{\tau}^0), \tag{5.4}$$

where $\hat{f}_{Y_0}(y_0; \hat{b}_0, \hat{\tau}^0)$ is the adaptive KDE defined in (4.2). Also, $\hat{\tau}^0$ is derived from (4.1), where $\hat{\alpha}$ and $\hat{\beta}$ are obtained by minimizing the following discrepancy measure with respect to $\alpha$ and $\beta$:

$$\sum_{i=1}^{n} \left[ \hat{f}_X(X_i; \hat{b}_0, \hat{\lambda}, \tau^0) - \hat{f}_X(X_i; \hat{\gamma}, \tilde{a}, \tilde{b}) \right]^2. \tag{5.5}$$

For the simulation study presented in Section 6, we restricted the values of $\alpha$ and $\beta$ to $1 \le \alpha \le 1.5$ and $1 \le \beta \le 1.5$. The selector $\hat{b}_0$ is chosen by the Sheather and Jones (1991) procedure.

Step 5. The semiparametric transformed data and grid (see (3.2) and (3.3)) are defined as

$$Y_1 = \Phi^{-1}\left[ \hat{F}_{Y_0}(Y_0; \hat{h}_0, \hat{\tau}^{1*}) \right] \quad \text{and} \quad y_1 = \Phi^{-1}\left[ \hat{F}_{Y_0}(y_0; \hat{h}_0, \hat{\tau}^{1*}) \right],$$

respectively, which yield the 1-step TKDE

$$\hat{f}_X(x; \hat{b}_1, \hat{h}_0, \hat{\lambda}, \hat{\tau}^1) = \left(\frac{dy_0}{dx}\right) \frac{\hat{f}_{Y_0}(y_0; \hat{h}_0, \hat{\tau}^{1*})}{\phi\left( \Phi^{-1}\left[ \hat{F}_{Y_0}(y_0; \hat{h}_0, \hat{\tau}^{1*}) \right] \right)}\, \hat{f}_{Y_1}(y_1; \hat{b}_1, \hat{\tau}^1). \tag{5.6}$$

Furthermore, $\hat{\tau}^1$ is derived from (4.1), where $\hat{\alpha}$ and $\hat{\beta}$ are obtained by minimizing the measure given in expression (5.5) with $\hat{f}_X(X_i; \hat{b}_0, \hat{\lambda}, \tau^0)$ replaced by $\hat{f}_X(X_i; \hat{b}_1, \hat{h}_0, \hat{\lambda}, \hat{\tau}^1)$. Also, $\hat{h}_0$ and $\hat{b}_1$ above are chosen according to the procedures of Polansky and Baker (2000) and Sheather and Jones (1991), respectively.

Step 6. The semiparametric transformation can be iterated to obtain an $r$-step TKDE by repeating Step 5.
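The overall flow of Steps 1 to 5 can be sketched in a heavily simplified form. The version below is an illustration under stated assumptions, not the authors' procedure: it fixes a Yeo and Johnson transformation with $\lambda = 0.5$ instead of selecting from the black box, uses the sample standard deviation instead of (3.1), sets $\hat{\delta} = 0$, applies no bandwidth adaptation ($\tau \equiv 1$), and replaces the Sheather and Jones (1991) and Polansky and Baker (2000) selectors with a normal rule of thumb. All function names are hypothetical.

```python
import random
from statistics import NormalDist, median, stdev

PHI = NormalDist()

def yeo_johnson(z, lam):
    """Yeo and Johnson transformation and its derivative (0 < lam < 2 assumed)."""
    if z >= 0.0:
        return ((z + 1.0) ** lam - 1.0) / lam, (z + 1.0) ** (lam - 1.0)
    return -((1.0 - z) ** (2.0 - lam) - 1.0) / (2.0 - lam), (1.0 - z) ** (1.0 - lam)

def kde(t, data, b):
    return sum(PHI.pdf((t - y) / b) for y in data) / (len(data) * b)

def kernel_cdf(t, data, h):
    return sum(PHI.cdf((t - y) / h) for y in data) / len(data)

def rule_of_thumb(data):
    # Stand-in bandwidth; the article uses data-based selectors instead.
    return 1.06 * stdev(data) * len(data) ** (-0.2)

def to_normal(t, data, h):
    """Nonparametric transformation (3.2), clipped so that inv_cdf is defined."""
    u = min(max(kernel_cdf(t, data, h), 1e-12), 1.0 - 1e-12)
    return PHI.inv_cdf(u)

def semiparametric_tkde(grid, sample, lam=0.5):
    """Return simplified 0-step and 1-step estimates on grid."""
    center, scale = median(sample), stdev(sample)
    y0_data = [yeo_johnson((x - center) / scale, lam)[0] for x in sample]
    bw0 = rule_of_thumb(y0_data)            # used for both b_0 and h_0 here
    y1_data = [to_normal(y, y0_data, bw0) for y in y0_data]
    bw1 = rule_of_thumb(y1_data)
    step0, step1 = [], []
    for x in grid:
        y0, slope = yeo_johnson((x - center) / scale, lam)
        dy0_dx = slope / scale
        dens0 = kde(y0, y0_data, bw0)
        step0.append(dy0_dx * dens0)                              # cf. (5.4)
        y1 = to_normal(y0, y0_data, bw0)
        dy1_dy0 = dens0 / PHI.pdf(y1)
        step1.append(dy0_dx * dy1_dy0 * kde(y1, y1_data, bw1))    # cf. (5.6)
    return step0, step1

def trapezoid(values, grid):
    return sum(0.5 * (values[j] + values[j + 1]) * (grid[j + 1] - grid[j])
               for j in range(len(grid) - 1))

random.seed(1)
sample = [random.expovariate(1.0) for _ in range(120)]
sd = stdev(sample)
lo, hi = min(sample) - 4.0 * sd, max(sample) + 8.0 * sd
grid = [lo + (hi - lo) * j / 799 for j in range(800)]
f0, f1 = semiparametric_tkde(grid, sample)
area0, area1 = trapezoid(f0, grid), trapezoid(f1, grid)
print(round(area0, 3), round(area1, 3))  # both should be close to 1
```

Each step multiplies the previous estimate's Jacobian by the derivative of the new transformation, so an $r$-step version would repeat the loop body with the latest transformed data in place of `y0_data`.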

Software: An executable program performing density estimates using the proposed TKDEs as well as the competitor density estimators considered in the Monte Carlo simulation study can be downloaded at http://puk.ac.za/fakulteite/natuur/rsw/stt/density/index.html.

6. EMPIRICAL STUDIES

In this section we present a Monte Carlo simulation study to evaluate the performance of the newly proposed semiparametric TKDEs. Real-life applications are also presented. 6.1

S IMULATION S TUDY

We claim that our semiparametric TKDEs address shortcomings of the standard kernel density estimator with regard to artifacts, such as boundary bias and spurious bumps in the tails, in an automatic and quite natural manner. In addition, density curvature is captured more profoundly. In order to illustrate these claims, a broad spectrum of candidate distributions from which data are drawn are considered. Among these are normal mixture densities (i.e., bimodal, trimodal, claw, skewed bimodal, skewed unimodal, kurtotic unimodal, and outlier)

762

G. KOEKEMOER AND J.W.H. S WANEPOEL

which were introduced by Marron and Wand (1992) and are considered to be extremely difficult to estimate using the ordinary kernel density estimator. The standard normal [1] density is included to test the performance of the semiparametric TKDEs when little or no transformation is required. The standard uniform [2] density is included since it has a low kurtosis and a bounded support. The bimodal [3], trimodal [4], claw [5], and skewed bimodal [6] densities are considered since they contain more than one mode. Densities that are skewed to the left or right with potentially long tails include the skewed bimodal, skewed unimodal [7], Weibull [8] with scale parameter α = 1 and shape parameter β = 1.5, standard lognormal [9], standard exponential [10] and strict Pareto [11] with shape parameter α = 1.5. For these densities the performance of the semiparametric TKDEs are tested in the tail regions where spurious bumps can occur. Also, boundary bias occurs for the exponential and strict-Pareto densities. The kurtotic unimodal [12] density is studied since it has a high kurtosis as well as relatively heavy tails. The numbers assigned to each distribution above are used to tabulate the Monte Carlo output (see Table 2). The competitive density estimators are the ordinary KDE (see (1.1)), denoted by ODE, the adaptive KDE proposed by Abramson (1982), denoted by ADAP, the TKDE proposed by Wand et al. (1991), denoted by W-M-R and the TKDE proposed by Yang and Marron (1999), denoted by Y-M. It should be noted that the TKDE procedures proposed by Wand et al. (1991), and Yang and Marron (1999) also suffer from the potential explosive behavior (when the derivative is large) observed for the proposed semiparametric TKDE’s. For this reason we added a shift δˆ (see (5.1)) according to the criterion stated in (5.3) to the input data. However, no bandwidth adaptation was performed for these two density estimators. 
Our two candidates are:

• the semiparametric TKDE without iteration, denoted by SEMI 0; and

• the semiparametric TKDE with one iteration, denoted by SEMI 1.
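To make the normal mixture test densities concrete, the following minimal sketch evaluates and samples from the claw density [5]. The mixture parameters are those given by Marron and Wand (1992); the function names `mixture_pdf` and `mixture_sample` are hypothetical helpers chosen for illustration and are not part of the article.

```python
import numpy as np

# Claw density of Marron and Wand (1992):
#   0.5 * N(0, 1) + sum_{l=0}^{4} 0.1 * N(l/2 - 1, 0.1^2)
WEIGHTS = np.array([0.5] + [0.1] * 5)
MEANS = np.array([0.0] + [l / 2.0 - 1.0 for l in range(5)])
SDS = np.array([1.0] + [0.1] * 5)


def mixture_pdf(x, weights=WEIGHTS, means=MEANS, sds=SDS):
    """Evaluate a normal mixture density at x (scalar or array)."""
    x = np.asarray(x, dtype=float)[..., None]
    comps = np.exp(-0.5 * ((x - means) / sds) ** 2) / (sds * np.sqrt(2.0 * np.pi))
    return comps @ weights


def mixture_sample(n, rng, weights=WEIGHTS, means=MEANS, sds=SDS):
    """Draw n observations: pick a component, then draw from that normal."""
    idx = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(means[idx], sds[idx])


rng = np.random.default_rng(2008)       # hypothetical seed
sample = mixture_sample(500, rng)       # one Monte Carlo sample of size n = 500
```

Other normal mixtures from the same family can be obtained by swapping in their weights, means, and standard deviations.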

For all the densities considered (except the claw density), the Monte Carlo study was performed for sample sizes n = 100, n = 200, and n = 500. Sample sizes considered for the claw density were n = 500, n = 800, and n = 1000. The Monte Carlo repetition number used was MC = 200. A general fixed grid was constructed between the minimum and maximum data values observed after pooling all the Monte Carlo samples. For each density estimator an average value (over all the Monte Carlo samples) was obtained at each grid point, enabling us to assess the performance of the estimators with regard to bias graphically (see Figure 2). Let $\hat{f}_{c,i}(x)$ be any of the candidate density estimators evaluated for the ith Monte Carlo sample, i = 1, . . . , MC. We then define

$$\widehat{\mathrm{ISE}}\left[\hat{f}_{c,i}(\cdot)\right] = \int_{-\infty}^{+\infty} \left[\hat{f}_{c,i}(x) - f(x)\right]^2 dx, \quad i = 1, \ldots, \mathrm{MC},$$

which was calculated for each Monte Carlo repetition. The Monte Carlo estimate of the MISE (mean integrated squared error) of the candidate density estimate $\hat{f}_c$ was then calculated as

$$\widehat{\mathrm{MISE}}\left[\hat{f}_c(\cdot)\right] = \frac{1}{\mathrm{MC}} \sum_{i=1}^{\mathrm{MC}} \widehat{\mathrm{ISE}}\left[\hat{f}_{c,i}(\cdot)\right]$$

Table 2. Mean integrated squared error (× 10²). The three rows for each density correspond to the sample sizes n = 100, 200, 500 (n = 500, 800, 1000 for the claw density [5]).

Density  n     ODE          SEMI 0       SEMI 1       ADAP         W-M-R        Y-M
[1]      100   0.68(0.04)   0.76(0.04)   0.93(0.04)   0.76(0.05)   0.71(0.04)   0.75(0.04)
         200   0.38(0.02)   0.42(0.02)   0.57(0.02)   0.41(0.02)   0.39(0.02)   0.39(0.02)
         500   0.18(0.01)   0.20(0.01)   0.32(0.01)   0.21(0.01)   0.18(0.01)   0.18(0.01)
[2]      100   4.18(0.13)   2.93(0.18)   3.85(0.20)   5.88(0.19)   4.38(0.15)   3.17(0.19)
         200   3.04(0.07)   1.66(0.08)   2.40(0.11)   4.24(0.10)   3.11(0.08)   1.68(0.08)
         500   2.17(0.05)   0.91(0.04)   1.38(0.05)   3.02(0.06)   2.20(0.05)   0.93(0.04)
[3]      100   0.79(0.03)   0.83(0.03)   0.85(0.04)   0.88(0.04)   0.81(0.03)   0.90(0.04)
         200   0.48(0.02)   0.49(0.02)   0.52(0.02)   0.51(0.02)   0.49(0.02)   0.52(0.02)
         500   0.24(0.01)   0.24(0.01)   0.29(0.01)   0.24(0.01)   0.24(0.01)   0.26(0.01)
[4]      100   0.95(0.03)   0.98(0.03)   0.97(0.04)   1.02(0.04)   0.98(0.03)   1.03(0.04)
         200   0.61(0.02)   0.61(0.02)   0.62(0.02)   0.61(0.02)   0.62(0.02)   0.64(0.02)
         500   0.33(0.01)   0.32(0.01)   0.34(0.01)   0.30(0.01)   0.33(0.01)   0.35(0.01)
[5]      500   4.26(0.02)   4.30(0.02)   3.43(0.02)   4.08(0.02)   4.25(0.02)   4.29(0.02)
         800   3.74(0.02)   3.77(0.02)   2.30(0.03)   3.31(0.03)   3.71(0.02)   3.77(0.02)
         1000  3.45(0.02)   3.45(0.02)   1.78(0.02)   2.89(0.03)   3.42(0.02)   3.48(0.02)
[6]      100   0.99(0.03)   0.95(0.03)   0.98(0.04)   1.03(0.04)   0.90(0.03)   0.95(0.03)
         200   0.68(0.02)   0.63(0.02)   0.65(0.02)   0.65(0.03)   0.61(0.02)   0.62(0.02)
         500   0.33(0.01)   0.28(0.01)   0.32(0.01)   0.30(0.01)   0.28(0.01)   0.28(0.01)
[7]      100   0.90(0.04)   0.88(0.04)   1.11(0.05)   1.03(0.06)   0.85(0.04)   0.99(0.05)
         200   0.59(0.03)   0.59(0.03)   0.77(0.03)   0.65(0.03)   0.55(0.03)   0.62(0.03)
         500   0.29(0.01)   0.30(0.01)   0.44(0.02)   0.32(0.01)   0.27(0.01)   0.31(0.01)
[8]      100   1.49(0.08)   1.40(0.08)   1.80(0.09)   1.89(0.10)   1.32(0.08)   1.31(0.09)
         200   0.86(0.04)   0.84(0.04)   1.15(0.05)   1.13(0.05)   0.71(0.03)   0.67(0.03)
         500   0.47(0.02)   0.41(0.02)   0.65(0.02)   0.68(0.03)   0.35(0.02)   0.33(0.02)
[9]      100   2.00(0.07)   1.12(0.05)   1.33(0.05)   1.78(0.07)   1.08(0.06)   1.05(0.05)
         200   1.31(0.05)   0.74(0.03)   0.91(0.04)   1.16(0.05)   0.65(0.04)   0.67(0.03)
         500   0.69(0.02)   0.38(0.01)   0.50(0.02)   0.60(0.02)   0.31(0.01)   0.33(0.01)
[10]     100   4.07(0.11)   1.20(0.07)   1.48(0.08)   3.92(0.12)   1.62(0.09)   1.54(0.08)
         200   2.94(0.07)   0.64(0.03)   0.90(0.04)   2.78(0.08)   0.95(0.05)   0.93(0.04)
         500   2.05(0.04)   0.35(0.02)   0.56(0.02)   1.96(0.05)   0.61(0.03)   0.60(0.03)
[11]     100   9.54(0.17)   2.95(0.16)   3.32(0.20)   8.65(0.15)   3.85(0.18)   3.59(0.15)
         200   7.90(0.10)   1.94(0.10)   2.24(0.12)   7.53(0.10)   3.30(0.12)   2.99(0.10)
         500   63.76(0.53)  13.50(0.83)  12.75(0.87)  62.68(0.70)  31.59(0.92)  25.05(0.78)
[12]     100   6.59(0.23)   5.64(0.23)   4.68(0.22)   4.00(0.19)   6.62(0.23)   3.81(0.21)
         200   3.86(0.14)   2.94(0.13)   2.36(0.11)   2.15(0.09)   3.86(0.14)   2.11(0.10)
         500   1.60(0.05)   0.97(0.04)   0.91(0.03)   0.97(0.03)   1.60(0.05)   0.84(0.03)

[Figure 2: three rows of two panels each, titled “Curvature capturing” (top), “Spurious bumps removal” (middle), and “Boundary bias correction” (bottom).]

Figure 2. (top row) Average density estimates: trimodal and claw densities, to illustrate curvature capturing ability. (middle row) Skewed bimodal density and tail of the standard lognormal density, to illustrate the removal of spurious bumps in tail regions as well as curvature capturing ability. (bottom row) Uniform density and the boundary region of the standard exponential density, to illustrate boundary bias correction. Legend: theoretical density (solid line), 0-step TKDE (dotted line), 1-step TKDE (dash-dotted line), Y-M estimate (dashed line).

The results are displayed in Table 2 for the three sample sizes chosen. The estimated standard errors were omitted in the output since these values were small and of the same order for all estimators considered. From Table 2 and Figure 2 we conclude that:

• In terms of the global discrepancy measure (MISE), the 0-step TKDE generally performs well when applied to data from densities without excessive curvature, such as the normal [1], uniform [2], skewed unimodal [7], Weibull [8], lognormal [9], exponential [10], and Pareto [11] densities. In cases [10] and [11] the 0-step TKDE outperforms all the other competitors significantly. We recommend the 0-step TKDE for the estimation of unimodal, low, and high kurtosis densities.

• Regarding the MISE, the 1-step TKDE is suitable for capturing density curvature which appears in the bimodal [3], trimodal [4], claw [5], skewed bimodal [6], and kurtotic unimodal [12] densities. In case [5] the 1-step TKDE outperforms all the other competitors significantly. Figure 2 displays graphs for the trimodal, claw, and skewed bimodal densities. From these density estimates (averages of 200 Monte Carlo trials) it is evident that the 1-step TKDE outperforms the other competitors. We recommend the 1-step TKDE for the estimation of densities with moderate to excessive curvature.

• Spurious bumps in the tail regions are significantly removed by the proposed procedures: graphical output of the skewed bimodal density and the tail of the standard lognormal density is presented in Figure 2. It should be noted that the 0-step procedure seems to be most effective at estimating density tails.

• Boundary bias is addressed automatically by the 0-step and 1-step TKDEs, as is evident from the estimates of the uniform density and the boundary region of the standard exponential density as shown in Figure 2.
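The Monte Carlo estimate of the MISE used in this study can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the estimator is an ordinary Gaussian-kernel KDE, the bandwidth is the normal-reference rule 1.06 s n^(-1/5) (the article's bandwidth selector is not reproduced here), the target density is the standard normal, and the integral is approximated by the trapezoidal rule on a fixed grid. All function names are hypothetical.

```python
import numpy as np


def gaussian_kde_fixed(x_grid, data, h):
    """Ordinary kernel density estimate with a Gaussian kernel and bandwidth h."""
    u = (x_grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u ** 2).sum(axis=1) / (len(data) * h * np.sqrt(2.0 * np.pi))


def integrated_squared_error(f_hat, f_true, x_grid):
    """Trapezoidal approximation of the integral of (f_hat - f_true)^2."""
    sq = (f_hat - f_true) ** 2
    return float(np.sum(0.5 * (sq[1:] + sq[:-1]) * np.diff(x_grid)))


def monte_carlo_mise(n, mc, rng, x_grid):
    """Average the ISE over mc samples of size n from the standard normal."""
    f_true = np.exp(-0.5 * x_grid ** 2) / np.sqrt(2.0 * np.pi)
    ise = np.empty(mc)
    for i in range(mc):
        data = rng.standard_normal(n)
        h = 1.06 * data.std(ddof=1) * n ** (-0.2)  # assumed bandwidth rule
        f_hat = gaussian_kde_fixed(x_grid, data, h)
        ise[i] = integrated_squared_error(f_hat, f_true, x_grid)
    # Monte Carlo MISE estimate and its estimated standard error
    return ise.mean(), ise.std(ddof=1) / np.sqrt(mc)


grid = np.linspace(-5.0, 5.0, 401)
mise_hat, se_hat = monte_carlo_mise(n=100, mc=200, rng=np.random.default_rng(0), x_grid=grid)
```

Replacing `gaussian_kde_fixed` with any other candidate estimator, and `f_true` with another target density, gives the corresponding entry of a table such as Table 2.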

6.2 APPLICATIONS TO REAL DATA

6.2.1 Example 1: British Income Data

These data consist of 7,201 British incomes for the year 1975 (see Wand 1997), divided by their sample average to yield the observations $X_1, \ldots, X_n$. Let

$$Z_{x,i} = \frac{X_i - 0.9188}{0.551}, \quad i = 1, \ldots, 7201.$$

The shifted Box–Cox transformation was selected with the minimum residual parameter estimation technique. The estimated parameter values are $\hat{\lambda}_1 = 1.7459$ and $\hat{\lambda}_2 = 0$, rendering the log transformation. The shift estimate $\hat{\delta}$ (see (5.1)) is 0.185. The 0-step and 1-step semiparametric TKDEs clearly show a sharp bimodal structure in the data (see Figure 3), which was also found by Wand (1997) and De Beer and Swanepoel (1999).
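The role of a log transformation in a transformation kernel density estimator can be illustrated with a generic sketch. It is not the semiparametric procedure of this article: it simply applies an ordinary Gaussian KDE to y = log(x + shift) and maps the estimate back with the change-of-variables factor 1/(x + shift). The `shift` argument, the normal-reference bandwidth, and the function names are assumptions made for illustration, and the data are simulated lognormal values rather than the income data.

```python
import numpy as np


def kde(y_eval, y_data, h):
    """Ordinary Gaussian-kernel density estimate evaluated at y_eval."""
    u = (y_eval[:, None] - y_data[None, :]) / h
    return np.exp(-0.5 * u ** 2).sum(axis=1) / (len(y_data) * h * np.sqrt(2.0 * np.pi))


def log_tkde(x_eval, x_data, shift):
    """KDE on y = log(x + shift), mapped back via f_X(x) = f_Y(log(x + shift)) / (x + shift)."""
    x_eval = np.asarray(x_eval, dtype=float)
    y_data = np.log(x_data + shift)                         # requires x_data + shift > 0
    h = 1.06 * y_data.std(ddof=1) * len(y_data) ** (-0.2)   # assumed bandwidth rule
    out = np.zeros_like(x_eval, dtype=float)
    ok = x_eval + shift > 0                                  # estimate is zero outside the support
    y_eval = np.log(x_eval[ok] + shift)
    out[ok] = kde(y_eval, y_data, h) / (x_eval[ok] + shift)
    return out


rng = np.random.default_rng(1975)                 # hypothetical seed
x_data = rng.lognormal(size=500)                  # simulated right-skewed data
x_grid = np.exp(np.linspace(-6.0, 6.0, 2001))     # log-spaced evaluation grid
f_hat = log_tkde(x_grid, x_data, shift=0.0)
```

Because the smoothing is done on the log scale, the back-transformed estimate places no mass below the lower boundary and effectively uses wider kernels in the right tail, which is the mechanism behind the boundary and tail behavior discussed above.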

Figure 3. 0-step TKDE (dotted line), 1-step TKDE (dash-dotted line) of the British income data. The left panel displays density estimates over the entire range of the data. The right panel contains only the modal region.

6.2.2 Example 2: Astrophysical Data

These data were obtained from an astrophysical experiment that considered all pulsar phases above 50 MeV for the Geminga pulsar (Mayer-Hasselwander 1994). The data, which consist of 5,018 phases, were extracted from the public domain Phase I of the EGRET experiment on the Compton Gamma Ray Observatory. The data were standardized, yielding

$$Z_{x,i} = \frac{X_i - 0.5487}{0.3026}, \quad i = 1, \ldots, 5018.$$

The Johnson (1949) family of transformations with γ = 3 was selected with the minimum residual parameter estimation technique. The estimated parameters are $\hat{c} = 0.5448$ and $\hat{\delta} = 0.165$. The 0-step and especially the 1-step semiparametric TKDEs conform (see left panel of Figure 4) to all the profiles of the Geminga pulsar described by De Jager (1994). De Beer and Swanepoel (1999) reached similar conclusions.

6.2.3 Example 3: Buffalo Snowfall Data

This dataset is the well-known Buffalo snowfall data, which consist of 63 observations (Scott 1992, p. 279). Much controversy exists in the literature regarding the distribution of these data. According to Scott (1992, p. 109) some researchers argued that these data appear to be trimodal, while others suggested a unimodal distribution. These data were standardized, yielding

$$Z_{x,i} = \frac{X_i - 79.6}{23.72}, \quad i = 1, \ldots, 63.$$

The Johnson (1949) family of transformations with γ = 3 was selected with the profile maximum likelihood parameter estimation technique. The estimated parameters are $\hat{c} = 0.3545$ and $\hat{\delta} = 0.105$. Both the 1-step and 2-step semiparametric TKDEs clearly suggest a trimodal density (see right panel of Figure 4). This conclusion is in contrast to the finding of other researchers who suggested a unimodal density. Moreover, it is also in contrast to the conclusion reached by De Beer and Swanepoel (1999) that the underlying density of the data is bimodal.
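Statements about the number of modes, such as those made for the Buffalo snowfall data, can be checked on a grid of density values. The sketch below is a simple illustration and not a procedure from the article: the helper `count_modes` and its tolerance `rel_tol` are hypothetical, and it is applied here to the exact trimodal normal mixture of Marron and Wand (1992) rather than to an estimate from real data.

```python
import numpy as np


def count_modes(density_values, rel_tol=1e-3):
    """Count interior local maxima whose height exceeds rel_tol times the global maximum."""
    f = np.asarray(density_values, dtype=float)
    interior = f[1:-1]
    is_peak = (interior > f[:-2]) & (interior >= f[2:]) & (interior > rel_tol * f.max())
    return int(is_peak.sum())


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2)."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


# Trimodal density of Marron and Wand (1992):
#   9/20 N(-6/5, (3/5)^2) + 9/20 N(6/5, (3/5)^2) + 1/10 N(0, (1/4)^2)
x = np.linspace(-3.0, 3.0, 601)
f = 0.45 * normal_pdf(x, -1.2, 0.6) + 0.45 * normal_pdf(x, 1.2, 0.6) + 0.10 * normal_pdf(x, 0.0, 0.25)
n_modes = count_modes(f)
```

With an estimated density in place of `f`, small wiggles in the tails may register as modes, so the relative tolerance would need to be chosen with care.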

Figure 4. The left panel displays density estimates of the astrophysical data. The right panel contains density estimates of the Buffalo snowfall data. Legend: 0-step TKDE (dotted line), 1-step TKDE (dash-dotted line), 2-step TKDE (solid line).

7. CONCLUSIONS

The newly proposed semiparametric TKDEs are at least as effective as established kernel estimators in the literature. The 0-step TKDE is recommended for the estimation of unimodal, low, and high kurtosis densities as well as density tails. The 1-step TKDE is recommended for the estimation of densities with moderate to excessive curvature. Spurious bumps in the tail regions and boundary bias are addressed automatically. Hence, the proposed estimators are applicable to a broad range of target densities. An open research problem is to construct an effective, data-dependent choice of ca . Also, more advanced procedures can be invented to estimate α and β, required to implement the newly proposed adaptation scheme given in expression (4.1). We are currently investigating these issues and will report on our findings at a later stage.

ACKNOWLEDGMENTS

This work was supported by the National Research Foundation of South Africa, by the Ministry of the Flemish Community (Project BIL 00/28, International Scientific and Technological Cooperation), and by the Interuniversity Attraction Poles research network nr P5/24 of the Belgian Government (Belgian Science Policy).

[Received December 2005. Revised September 2007.]

REFERENCES

Abramson, I. S. (1982), “On Bandwidth Variation in Kernel Estimates-A Square Root Law,” The Annals of Statistics, 10, 1217–1223.
Alberts, T., and Karunamuni, R. J. (2003), “A Semiparametric Method of Boundary Correction for Kernel Density Estimation,” Statistics and Probability Letters, 61, 287–298.
Bickel, P. J., and Doksum, K. A. (1981), “An Analysis of Transformations Revisited,” Journal of the American Statistical Association, 76, 296–311.
Bolancé, C., Guillen, M., and Nielsen, J. P. (2003), “Kernel Density Estimation of Actuarial Loss Functions,” Insurance: Mathematics and Economics, 32, 19–36.
Bowman, A. W. (1984), “An Alternative Method of Cross-Validation for the Smoothing of Density Estimates,” Biometrika, 71, 353–360.
Box, G. E. P., and Cox, D. R. (1964), “An Analysis of Transformations,” Journal of the Royal Statistical Society, Series B, 26, 211–252.
Burbidge, J. B., Magee, L., and Robb, A. L. (1988), “Alternative Transformations to Handle Extreme Values of the Dependent Variable,” Journal of the American Statistical Association, 83, 123–127.
Burnham, K. P., Anderson, D. R., and Laake, J. L. (1980), “Estimation of Density from Line Transect Sampling of Biological Populations,” Wildlife Monograph, 72.
Cao, R., Cuevas, A., and González-Manteiga, W. (1994), “A Comparative Study of Several Smoothing Methods in Density Estimation,” Computational Statistics & Data Analysis, 17, 153–176.
Chiu, S.-T. (1991a), “Bandwidth Selection for Kernel Density Estimation,” The Annals of Statistics, 19, 1883–1905.
Chiu, S.-T. (1991b), “The Effect of Discretization Error on Bandwidth Selection for Kernel Density Estimation,” Biometrika, 78, 436–441.
Chiu, S.-T. (1992), “An Automatic Bandwidth Selector for Kernel Density Estimation,” Biometrika, 79, 771–782.
Chiu, S.-T. (1996), “A Comparative Review of Bandwidth Selection for Kernel Density Estimation,” Statistica Sinica, 6, 129–145.
Cline, D. B. H., and Hart, J. D. (1991), “Kernel Estimation of Densities with Discontinuities or Discontinuous Derivatives,” Statistics, 22, 69–84.
Cowling, A., and Hall, P. (1996), “On Pseudodata Methods for Removing Boundary Effects in Kernel Density Estimation,” Journal of the Royal Statistical Society, Series B, 58, 551–563.
Devroye, L., and Györfi, L. (1985), Nonparametric Density Estimation: The L1 View, New York: Wiley.
De Beer, C. F., and Swanepoel, J. W. H. (1999), “Simple and Effective Number-of-Bins Circumference Selectors for a Histogram,” Statistics and Computing, 9, 27–35.
De Jager, O. C. (1994), “On Periodicity Tests and Flux Limit Calculations for Gamma-Ray Pulsars,” The Astrophysical Journal, 436, 239–248.
Embrechts, P., Klüppelberg, C., and Mikosch, T. (1997), Modelling Extremal Events for Insurance and Finance, Berlin: Springer.
Gasser, T., and Müller, H.-G. (1979), in Smoothing Techniques for Curve Estimation, eds. Gasser, T., and Rosenblatt, M., Heidelberg: Springer-Verlag, pp. 23–68.
Hall, P., and Davison, A. C. (1997), “On Kernel Density Estimation Without Bumps in the Tail,” unpublished manuscript.
Hall, P., Marron, J. S., and Park, B. U. (1992), “Smoothed Cross-Validation,” Probability Theory and Related Fields, 92, 1–20.
Hössjer, O., and Ruppert, D. (1995), “Asymptotics for the Transformation Kernel Density Estimator,” The Annals of Statistics, 23, 1198–1222.
Hjort, N. L. (1996), in Bayesian Statistics, eds. Bernardo, J. M., Berger, J. O., Dawid, A. P., and Smith, A. F. M., New York: Oxford University Press, vol. 5, pp. 223–253.
John, J. A., and Draper, N. R. (1980), “An Alternative Family of Transformations,” Journal of the Royal Statistical Society, Series C, 29, 190–197.
Johnson, N. L. (1949), “Systems of Frequency Curves Generated by Methods of Translation,” Biometrika, 36, 149–176.
Jones, M. C., and Foster, P. J. (1996), “A Simple Nonnegative Boundary Correction Method for Kernel Density Estimation,” Statistica Sinica, 6, 1005–1013.
Jones, M. C., Marron, J. S., and Sheather, S. J. (1996), “Progress in Data-Based Bandwidth Selection for Kernel Density Estimation,” Computational Statistics, 11, 337–379.
Koekemoer, G. (1999), “A Comparative Study of Nonparametric Density Estimators,” unpublished master’s thesis, Potchefstroom University.
Koekemoer, G. (2004), “A New Method for Transforming Data to Normality with Application to Density Estimation,” unpublished Ph.D. thesis, Potchefstroom University.
Koekemoer, G., and Swanepoel, J. W. H. (2008), “A Semiparametric Method for Transforming Data to Normality,” Statistics and Computing, DOI: 10.1007/s11222-008-9053-3. Available online at http://www.springerlink.com/content/8tu515g5r2371982/.
Loader, C. R. (1995), “Old Faithful Erupts: Bandwidth Selection Reviewed,” unpublished manuscript.
Manly, B. F. J. (1976), “Exponential Data Transformations,” The Statistician, 25, 37–42.
Markovitch, N. M., and Krieger, U. R. (2000), “Nonparametric Estimation of Long-Tailed Density Functions and its Application to the Analysis of World Wide Web Traffic,” Performance Evaluation, 42, 205–222.
Marron, J. S., and Ruppert, D. (1994), “Transformations to Reduce Boundary Bias in Kernel Density Estimation,” Journal of the Royal Statistical Society, Series B, 56, 653–671.
Marron, J. S., and Wand, M. P. (1992), “Exact Mean Integrated Squared Error,” The Annals of Statistics, 20, 712–736.
Mayer-Hasselwander (1994), “High Energy Gamma Radiation from Geminga Observed by EGRET,” The Astrophysical Journal, 421, 276–283.
Müller, H.-G. (1985), “Empirical Bandwidth Choice for Nonparametric Kernel Regression by Means of Pilot Estimators,” Statistical Decisions, 2, 193–206.
Park, B. U., Chung, S. S., and Seog, K. H. (1992), “An Empirical Investigation of the Shifted Power Transformation Method in Density Estimation,” Computational Statistics & Data Analysis, 14, 183–191.
Park, B. U., and Marron, J. S. (1990), “Comparison of Data-Driven Bandwidth Selectors,” Journal of the American Statistical Association, 85, 66–72.
Park, B. U., and Turlach, B. A. (1992), “Practical Performance of Several Data Driven Bandwidth Selectors,” Computational Statistics, 7, 251–270.
Polansky, A. M., and Baker, E. R. (2000), “Multistage Plug-in Bandwidth Selection for Kernel Distribution Function Estimates,” Journal of Statistical Computation and Simulation, 65, 63–80.
Rudemo, M. (1982), “Empirical Choice of Histograms and Kernel Density Estimators,” Scandinavian Journal of Statistics, 9, 65–78.
Ruppert, D., and Cline, D. B. H. (1994), “Bias Reduction in Kernel Density Estimation by Smoothed Empirical Transformations,” The Annals of Statistics, 22, 185–210.
Ruppert, D., and Wand, M. P. (1992), “Correcting for Kurtosis in Density Estimation,” Australian Journal of Statistics, 34, 19–29.
Sakia, R. M. (1992), “The Box-Cox Transformation Technique: A Review,” The Statistician, 41, 169–178.
Schuster, E. F. (1985), “Incorporating Support Constraints into Nonparametric Estimators of Densities,” Communications in Statistics - Theory and Methods, 14, 1123–1136.
Scott, D. W. (1992), Multivariate Density Estimation: Theory, Practice, and Visualization, New York: Wiley.
Scott, D. W., and Terrell, G. R. (1987), “Biased and Unbiased Cross-Validation in Density Estimation,” Journal of the American Statistical Association, 82, 1131–1146.
Shapiro, S. S., and Wilk, M. B. (1965), “An Analysis of Variance Test for Normality (Complete Samples),” Biometrika, 52, 591–611.
Sheather, S. J., and Jones, M. C. (1991), “A Reliable Data-Based Bandwidth Selection Method for Kernel Density Estimation,” Journal of the Royal Statistical Society, Series B, 53, 683–690.
Silverman, B. W. (1986), Density Estimation for Statistics and Data Analysis, London: Chapman and Hall.
Staniswalis, J. G. (1989), “Local Bandwidth Selection for Kernel Estimates,” Journal of the American Statistical Association, 84, 284–288.
Terrell, G. R. (1990), “The Maximal Smoothing Principle in Density Estimation,” Journal of the American Statistical Association, 85, 470–477.
Tukey, J. W. (1957), “The Comparative Anatomy of Transformations,” The Annals of Mathematical Statistics, 28, 602–632.
Wand, M. P. (1997), “Data-Based Choice of Histogram Bin Width,” The American Statistician, 51, 59–64.
Wand, M. P., and Jones, M. C. (1995), Kernel Smoothing, London: Chapman and Hall.
Wand, M. P., Marron, J. S., and Ruppert, D. (1991), “Transformations in Density Estimation,” Journal of the American Statistical Association, 86, 343–361.
Yang, L., and Marron, J. S. (1999), “Iterated Transformation Kernel Density Estimation,” Journal of the American Statistical Association, 94, 580–589.
Yeo, I.-K., and Johnson, R. A. (2000), “A New Family of Power Transformations to Improve Normality or Symmetry,” Biometrika, 87, 954–959.
Zhang, S., and Karunamuni, R. J. (1998), “On Kernel Density Estimation Near Endpoints,” Journal of Statistical Planning and Inference, 70, 301–316.
Zhang, S., Karunamuni, R. J., and Jones, M. C. (1999), “An Improved Estimator of the Density Function at the Boundary,” Journal of the American Statistical Association, 94, 1231–1241.
