A Functional Central Limit Theorem for Kernel Type Density Estimators

Kernel type density estimators are studied for random fields. A functional central limit theorem in the space of square integrable functions is proved if the locations of observations become more and more dense in an increasing sequence of domains.


Introduction
In this paper a functional central limit theorem in the space L 2 [0, 1] is proved for kernel type density estimators for α-mixing random fields if the locations of observations become more and more dense in an increasing sequence of domains.
Kernel type density estimators are widely studied, see e.g. Prakasa Rao (1983), Devroye-Györfi (1985). In Bosq-Merlevède-Peligrad (1999), the asymptotic normality of these estimators is proved for α-mixing stochastic sequences and continuous time processes. In Fazekas-Chuprunov (2003) a so called infill-increasing setup is used to obtain a result that is in some sense between the discrete and the continuous time cases. See also Fazekas-Chuprunov (2004).
In statistics, most asymptotic results concern the increasing domain case, i.e. when the random process (or field) is observed in an increasing sequence of domains T n , with |T n | → ∞. However, if we observe a random field in a fixed domain and intend to prove an asymptotic theorem when the observations become dense in that domain, we obtain the so-called infill asymptotics (see Cressie (1991)). It is known that several estimators that are consistent for weakly dependent observations in the increasing domain setup are not consistent under the infill approach.
In this paper we combine the infill and the increasing domain approaches. We call the approach infill-increasing if the observations become more and more dense in an increasing sequence of domains. Using this setup, Lahiri (1999) and Fazekas (2003) studied the asymptotic behaviour of the empirical distribution function. Also in the infill-increasing case, consistency and asymptotic normality of the least squares estimator for linear errors-in-variables models were proved in Fazekas-Kukush (2000).
In this paper we follow the line of Fazekas-Chuprunov (2003). There, asymptotic normality of the kernel type density estimator (2.4) is proved in the infill-increasing case. We quote that result in Theorem 2.1.
The main result of this paper is a functional central limit theorem for kernel type density estimators (Theorem 3.1). It is the functional version of the ordinary central limit theorem, i.e. of Theorem 2.1.
We prove our functional central limit theorem in L 2 [0, 1], i.e. in the space of square integrable functions defined on the interval [0, 1]. We have to mention that most functional limit theorems are established in the space C of continuous functions, or in the Skorohod space D, see Billingsley (1968). However, there are papers establishing criteria for functional limit theorems in L p and containing applications of such theorems (Grinblat (1976), Ivanov (1980), Oliveira-Suquet (1998)). To prove our result we apply criteria given in Grinblat (1976).

Notation and central limit theorems
The following notation is used. $N$ is the set of positive integers, $Z$ is the set of all integers, $N^d$ and $Z^d$ are the sets of $d$-dimensional lattice points, where $d$ is a fixed positive integer. $R$ is the real line, $R^d$ is the $d$-dimensional space with the usual Euclidean norm $\|x\|$. In $R^d$ we shall also consider the distance corresponding to the maximum norm: $\varrho(x, y) = \max_{1 \le i \le d} |x^{(i)} - y^{(i)}|$, where $x = (x^{(1)}, \dots, x^{(d)})$, $y = (y^{(1)}, \dots, y^{(d)})$. The distance of two sets in $R^d$ corresponding to the maximum norm is also denoted by $\varrho$, i.e. $\varrho(A, B) = \min\{\varrho(a, b) : a \in A, b \in B\}$.
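The maximum-norm distance between points and between sets can be illustrated by a minimal sketch. The function names `max_norm_distance` and `set_distance` are hypothetical, and the sets are assumed to be finite so that the minimum is attained.

```python
from itertools import product


def max_norm_distance(x, y):
    """Distance between two points of R^d in the maximum norm."""
    if len(x) != len(y):
        raise ValueError("points must have the same dimension")
    return max(abs(a - b) for a, b in zip(x, y))


def set_distance(A, B):
    """Smallest max-norm distance between a point of A and a point of B.

    A and B are assumed to be finite, non-empty collections of points.
    """
    return min(max_norm_distance(a, b) for a, b in product(A, B))


# Illustrative values only.
print(max_norm_distance((0, 0), (3, -4)))
print(set_distance([(0, 0), (1, 1)], [(3, 1), (5, 5)]))
```

In the second call the closest pair is (1, 1) and (3, 1), whose coordinates differ by at most 2.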
For real valued sequences {a n } and {b n }, a n = o(b n ) (resp. a n = O(b n )) means that the sequence a n /b n converges to 0 (resp. is bounded). We shall denote different constants with the same letter c (or C). |D| denotes the cardinality of the finite set D and at the same time |T | denotes the volume of the domain T . The indicator function of the set A is I{A}.
We shall suppose the existence of an underlying probability space $(\Omega, \mathcal{F}, P)$. The σ-algebra generated by a set of events or by a set of random variables will be denoted by σ{.}. The symbol $E$ stands for the expectation. The variance and the covariance are denoted by var(.) and cov(., .), respectively. The $L_p$-norm of a random (vector) variable $\eta$ is $\|\eta\|_p = (E\|\eta\|^p)^{1/p}$, $1 \le p < \infty$. The symbol ⇒ denotes convergence in distribution. $N(m, \Sigma)$ stands for the (vector) normal distribution with mean (vector) $m$ and covariance (matrix) $\Sigma$.
We now describe the scheme of observations. For simplicity we restrict ourselves to rectangles as domains of observations. Let $\Lambda > 0$ be fixed. By $(Z/\Lambda)^d$ we denote the $\Lambda$-lattice points in $R^d$, i.e. lattice points with distance $1/\Lambda$: $(Z/\Lambda)^d = \{(k_1/\Lambda, \dots, k_d/\Lambda) : (k_1, \dots, k_d) \in Z^d\}$. $T$ will be a bounded, closed rectangle in $R^d$ with edges parallel to the axes and $D$ will denote the $\Lambda$-lattice points belonging to $T$, i.e. $D = T \cap (Z/\Lambda)^d$. To describe the limit distribution we consider a sequence of the previous objects, i.e. let $T_1, T_2, \dots$ be bounded, closed rectangles in $R^d$. Suppose that $T_1 \subset T_2 \subset T_3 \subset \dots$ and $\bigcup_{i=1}^{\infty} T_i = T_\infty$. We assume that the length of each edge of $T_n$ is an integer and converges to ∞ as $n \to \infty$. Let $\{\Lambda_n\}$ be an increasing sequence of positive integers (the non-integer case is essentially the same) and let $D_n$ be the set of $\Lambda_n$-lattice points belonging to $T_n$.
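The observation scheme can be made concrete with a small sketch that enumerates the lattice points of a rectangle. The helper name `lattice_points`, and the choices T_n = [-n, n]^2 and Λ_n = n, are assumptions made only for illustration; exact fractions are used so that no rounding affects membership.

```python
from fractions import Fraction
from itertools import product


def lattice_points(lower, upper, lam):
    """Points of the rectangle [lower, upper] lying on the lattice (Z/lam)^d.

    lower, upper: integer corner coordinates (one entry per axis).
    lam: positive integer, so neighbouring lattice points are 1/lam apart.
    """
    if lam <= 0:
        raise ValueError("lam must be a positive integer")
    axes = [
        [Fraction(k, lam) for k in range(lo * lam, hi * lam + 1)]
        for lo, hi in zip(lower, upper)
    ]
    return list(product(*axes))


# Infill-increasing illustration: the domain grows and the mesh shrinks.
for n in (1, 2, 3):
    D_n = lattice_points((-n, -n), (n, n), lam=n)
    print(n, len(D_n))
```

Along an edge of integer length L the lattice with mesh 1/Λ has LΛ + 1 points, so the number of observations grows both because the domain expands and because the mesh becomes finer.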
Let {ξ t , t ∈ T ∞ } be a random field. The n-th set of observations involves the values of the random field ξ t taken at each point k ∈ D n . Actually, each k = k (n) ∈ D n depends on n but to avoid complicated notation we often omit superscript (n). By our assumptions, lim n→∞ |D n | = ∞.
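We need the notion of α-mixing (see e.g. Guyon (1995), Lin-Lu (1996)). Let $\mathcal{A}$ and $\mathcal{B}$ be two σ-algebras in $\mathcal{F}$. The α-mixing coefficient of $\mathcal{A}$ and $\mathcal{B}$ is defined as follows: $\alpha(\mathcal{A}, \mathcal{B}) = \sup\{|P(A \cap B) - P(A)P(B)| : A \in \mathcal{A}, B \in \mathcal{B}\}$. The α-mixing coefficient of the random field is defined through the coefficients $\alpha(\mathcal{F}_{I_1}, \mathcal{F}_{I_2})$, where $I_i$ is a finite subset in $T_\infty$ with cardinality $|I_i|$ and $\mathcal{F}_{I_i} = \sigma\{\xi_t : t \in I_i\}$, $i = 1, 2$. We shall use the following condition, referred to below as (2.2), which is imposed for some $1 < a < \infty$. From now on we shall use the following assumptions throughout the paper.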
Suppose that ξ t , t ∈ T ∞ , is a strictly stationary random field with unknown continuous marginal density function f . We shall estimate f from the data ξ i , i ∈ D n .
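To illustrate the α-mixing coefficient of two σ-algebras mentioned above, the following sketch computes it by brute force for the σ-algebras generated by two discrete random variables. It assumes the usual definition, the supremum of |P(A ∩ B) − P(A)P(B)| over events A and B from the two σ-algebras; the function name `alpha_coefficient` and the joint distributions are hypothetical.

```python
from fractions import Fraction
from itertools import chain, combinations


def subsets(values):
    """All subsets of a finite collection, as tuples."""
    values = list(values)
    return chain.from_iterable(
        combinations(values, r) for r in range(len(values) + 1)
    )


def alpha_coefficient(joint):
    """Alpha-mixing coefficient of sigma(X) and sigma(Y).

    joint: dict mapping (x, y) to P(X = x, Y = y).
    Every event of sigma(X) is of the form {X in A}, so it suffices
    to scan all subsets A of the values of X, and likewise for Y.
    """
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})
    best = 0
    for A in subsets(xs):
        for B in subsets(ys):
            p_ab = sum(p for (x, y), p in joint.items() if x in A and y in B)
            p_a = sum(p for (x, _), p in joint.items() if x in A)
            p_b = sum(p for (_, y), p in joint.items() if y in B)
            best = max(best, abs(p_ab - p_a * p_b))
    return best


half, quarter = Fraction(1, 2), Fraction(1, 4)
identical = {(0, 0): half, (1, 1): half}  # Y = X
independent = {(x, y): quarter for x in (0, 1) for y in (0, 1)}
print(alpha_coefficient(identical), alpha_coefficient(independent))
```

Independent variables give the value 0, while the fully dependent pair attains 1/4, the largest value the coefficient can take.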
A function $K : R \to R$ will be called a kernel if $K$ is a bounded, continuous, symmetric density function (with respect to the Lebesgue measure), and

Let $K$ be a kernel and let $h_n > 0$; then the kernel-type density estimator is
$$f_n(x) = \frac{1}{|D_n|} \frac{1}{h_n} \sum_{i \in D_n} K\left(\frac{x - \xi_i}{h_n}\right), \quad x \in R. \tag{2.4}$$

We assume that $g_u(x, y)$ is continuous in $x$ and $y$ for each fixed $u$. Let $g_u$ denote $g_u(x, y)$ as a function $g : R^d_0 \to C(R^2)$, i.e. a function with values in $C(R^2)$, the space of continuous real-valued functions over $R^2$. Let $\|g_u\| = \sup_{(x,y) \in R^2} |g_u(x, y)|$ be the norm of $g_u$.
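A kernel-type density estimator of the kind discussed above can be sketched as follows. The sketch assumes the standard form, the average over the observations of K((x − ξ_i)/h_n)/h_n, and uses the Gaussian density as the kernel because it is bounded, continuous and symmetric. The names `gaussian_kernel` and `kernel_density_estimate` and the sample values are hypothetical, and the sample does not imitate a dependent random field.

```python
import math


def gaussian_kernel(u):
    """Standard normal density: a bounded, continuous, symmetric density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_density_estimate(x, observations, bandwidth, kernel=gaussian_kernel):
    """Kernel-type density estimate at the point x.

    observations: list of observed values (xi_i, i in D_n).
    bandwidth: the positive number h_n.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if len(observations) == 0:
        raise ValueError("at least one observation is required")
    total = sum(kernel((x - xi) / bandwidth) for xi in observations)
    return total / (len(observations) * bandwidth)


# Illustrative numbers only.
sample = [0.2, 0.4, 0.5, 0.9]
h = 0.1
step = 0.01
grid = [-2.0 + step * k for k in range(501)]
# Riemann sum of the estimate over [-2, 3]; it should be close to 1.
mass = sum(kernel_density_estimate(x, sample, h) for x in grid) * step
print(round(mass, 6))
```

Since each rescaled kernel K((x − ξ_i)/h_n)/h_n is itself a density in x, the estimate is nonnegative and integrates to one, which the Riemann sum checks numerically.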
For a fixed positive integer $m$ and fixed distinct real numbers $x_1, \dots, x_m$, introduce the notation

Theorem 2.1. (Theorem 3 in Fazekas-Chuprunov (2003).) Assume that $g_u$ is Riemann integrable (as a function $g : R^d_0 \to C(R^2)$) on each bounded closed $d$-dimensional rectangle $R \subset R^d_0$; moreover, $g_u$ is directly Riemann integrable (as a function $g : R^d_0 \to R$). Let $x_1, \dots, x_m$ be given distinct real numbers and assume that $\Sigma^{(m)}$ in (2.7) is positive definite. Suppose that there exists $1 < a < \infty$ such that (2.2) is satisfied and for each n . (2.8) Assume that $\lim_{n \to \infty} \Lambda_n = \infty$, $\lim_{n \to \infty} h_n = 0$, and
If $f(x)$ has bounded second derivative and $\lim_{n \to \infty} |T_n| h_n^4 = 0$, then in (2.10) $E f_n(x_i)$ can be replaced by $f(x_i)$, $i = 1, \dots, m$, and the above statement remains valid. □

Remark 2.2. The notion of direct Riemann integrability can be found e.g. in Asmussen (1987), p. 118. It is somewhat stronger than Lebesgue integrability. In Fazekas-Chuprunov (2003) one can find an appropriate version of direct Riemann integrability for nonnegative functions defined on $R^d_0$ and (possibly) unbounded at the origin.

The functional central limit theorem
In this section we shall prove functional central limit theorems in the space L 2 [0, 1]. We shall use the assumptions of the previous section. Moreover, in this section we suppose that both f and f n are equal to 0 outside of the interval [0, 1]. If we restrict our study to densities and kernel functions with compact supports, by appropriate transformation, this condition can be realized.
In the following theorem we shall use conditions of Theorem 2.1.
Theorem 3.1. Assume that g u is Riemann integrable on each bounded closed d-dimensional rectangle R ⊂ R d 0 , moreover g u is directly Riemann integrable. Let the function σ(x, y) = be positive definite. Suppose that there exists 1 < a < ∞ such that (2.2) and (2.8) are satisfied. Assume that lim n→∞ Λ n = ∞, lim n→∞ h n = 0, and Assume that f (x) has bounded second derivative and lim n→∞ |T n |h 4 n = 0. Then, as n → ∞, where G is a Gaussian process with mean 0 and with covariance function σ( . , . ).
To prove our theorem we need the following criterion (see Grinblat (1976) and Ivanov (1980)). Then, as $n \to \infty$, $\xi_n(t) \Rightarrow \xi(t)$ in $L_2[0, 1]$. □

Proof of Theorem 3.1. First we prove that To this end we have to check the conditions of the preceding proposition.
Condition (3.5), i.e. the convergence of the finite dimensional distributions of L 0 n (x) to those of G(x) is a consequence of Theorem 2.1. Now we turn to (3.6). The following calculation is a version of what is included in the proof of Theorem 3 in Fazekas-Chuprunov (2003).
where and A n (x) denotes the part of the sum with i = j, while B n (x) denotes the part of the sum with i ≠ j.
as f (being continuous on a compact set) is bounded. Now, turn to B n (x).
As the random field is strictly stationary, we can assume that the center of the rectangle $T_n$ is the origin. Then the set of vectors of the form $i - j$ with $i, j \in D_n$ is $2D_n$, where $2D_n$ is defined as $(2T_n) \cap (Z/\Lambda_n)^d$. If $u \in 2D_n$ is fixed, then denote by $|D_{n,u}|$ the number of pairs $(i, j) \in D_n \times D_n$ with $i - j = u$. Then where $2D_n^0 = 2D_n \setminus \{0\}$. Now fix an $\varepsilon > 0$. As $g_u$ is directly Riemann integrable, one can find a stripe $M_\varepsilon \subset R^d$ (centered at the origin) such that (3.12) and at the same time the Riemann approximating sums of this integral do not exceed $\varepsilon$ if the diagonal of the subdivision is small enough. Therefore, as $|D_{n,u}| / |D_n| \le 1$, when $1/\Lambda_n^d$ is small enough, i.e. when $n$ is large enough: $n \ge n_\varepsilon$. Fix $\varepsilon$, $M_\varepsilon$ and assume that $n \ge n_\varepsilon$. Because $g_u$ is Riemann integrable as a function $g : R^d_0 \to C(R^2)$ on $R$ for each bounded closed $d$-dimensional rectangle $R$ in $R^d_0$, we have
$$\left\| \frac{1}{\Lambda_n^d} \sum_{u \in 2D_n^0 \cap M_\varepsilon} g_u - \int_{M_\varepsilon} g_u \, du \right\| \le \varepsilon \tag{3.14}$$
in the space $C(R^2)$, if $n$ is large enough. This relation and (3.12) imply that $\int_{R^d_0} g_u(x, y) \, du$ exists and is continuous in $(x, y)$. As the length of each edge of $T_n$ converges to ∞, $|D_{n,u}| / |D_n| \to 1$ uniformly in $u \in M_\varepsilon$. Therefore, using that $g_u$ is directly Riemann integrable, we obtain that
$$\left\| \frac{1}{\Lambda_n^d} \sum_{u \in 2D_n^0 \cap M_\varepsilon} \frac{|D_{n,u}|}{|D_n|} g_u - \frac{1}{\Lambda_n^d} \sum_{u \in 2D_n^0 \cap M_\varepsilon} g_u \right\| \le \varepsilon, \tag{3.15}$$
if $n$ is large enough.
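The counting argument above, that the proportion of pairs (i, j) with i − j = u tends to 1 for fixed u as the edges grow, can be checked numerically with a minimal sketch. It assumes a cube [0, L]^d instead of a rectangle centered at the origin (a translation does not change the counts), and the names `pair_ratio` and `pair_ratio_brute` are hypothetical.

```python
from itertools import product


def pair_ratio(edge_length, lam, shift):
    """|D_{n,u}| / |D_n| for D_n = [0, edge_length]^d ∩ (Z/lam)^d.

    shift: integer lattice coordinates of u, i.e. u = shift / lam.
    """
    side = edge_length * lam + 1  # lattice points along one edge
    pairs = 1
    for m in shift:
        pairs *= max(side - abs(m), 0)  # admissible positions on this axis
    return pairs / side ** len(shift)


def pair_ratio_brute(edge_length, lam, shift):
    """Same ratio obtained by direct enumeration (small grids only)."""
    side = edge_length * lam + 1
    grid = set(product(range(side), repeat=len(shift)))
    hits = sum(
        1 for j in grid if tuple(a + b for a, b in zip(j, shift)) in grid
    )
    return hits / len(grid)


# Fixed u = (1, 0) with mesh 1/2, i.e. shift = (2, 0), and growing edges.
for L in (2, 10, 100):
    print(L, pair_ratio(L, 2, (2, 0)))
```

For an edge of length L and a fixed u the ratio along that axis behaves like 1 − |u|/L, so it stays below 1 and approaches 1 as L grows.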
Therefore, using that $\frac{1}{h_n} K\left(\frac{x - u}{h_n}\right)$ is a density function, we have if n is large enough. Moreover, the second term between the above absolute value signs is equal to x − h n s) du dtds .