Non-Parametric Signal Interpolation

Abstract: This paper considers the problem of interpolation (smoothing) of a partially observable Markov random sequence. For the dynamic observation models, an equation for the interpolation of the posterior probability density is derived. The main goal of this paper is to consider the smoothing problem for the case of unknown distributions of an unobservable component of a random Markov sequence. Successful results were obtained for the strongly stationary Markov processes with mixing and for the conditional density belonging to the exponential family of densities. The resulting method is based on the empirical Bayes approach and kernel nonparametric estimation. The equation for the optimal smoothing estimator is derived in the form independent of unknown distributions of an unobservable process. Such form of the equation allows to use the nonparametric estimates for some conditional functionals in the equation given a set of dependent observations. To compare the nonparametric estimators with optimal mean square smoothing estimators in Kalman scheme, simulation results are given.


Introduction
This paper is devoted to synthesis of interpolation algorithms for an unobservable stationary sequence in conditional Markov scheme.The characteristic feature of the problem consists in absence of knowledge of the probability distribution family for the unobservable sequence.Nevertheless, some efforts are made to construct the sequence of estimators that have the asymptotic behavior (under a large volume of observations) close to the behavior of optimal interpolation estimators built under full statistical information on observable and unobservable signal components.For the case of full information, the transformation equations for the posterior probability density of unobservable sequences are known.They were obtained by R. Stratonovich for filtration and interpolation problems in 1966.The paper of Khazen (1978) represents the interpolation equation in the form of the normalizing product of filtration posterior probability densities in forward and backward time.This was done only for static observation models.The present paper shows that the similar equation is valid also for dynamic observation models (e.g., autoregressive models), however, in this case the equation is to be supplemented by another recursive equation that reflects the dynamic properties of observations.
Unfortunately, it is impossible to make use of this equations when the probability family for the unobservable sequence is unknown.However, for some conditional probability family of observations it is possible to transform these equations eliminating (explicit) dependence on statistic characteristics unknown a priori.The solution is found on the principles of the empirical Bayes approach and the theory of kernel non-parametric functional estimation (see Vasiliev, Dobrovidov, and Koshkin, 2004).

Interpolation Equation for Dynamic Observation Models
Interpolation (smoothing) of the partly observable Markov random sequence , is the problem of constructing estimators of the unobservable vector S k or a known one-to-one function Q(S k ) by observations x n 1 = (x 1 , . . ., x n ) T of the sequence (X n ) n≥1 for all k ≤ n.It is well known, that the optimal mean-square smoothing estimator of Q(S k ), k ≤ n, equals to the conditional expectation where π k (s k |x n 1 ) is the conditional posterior probability density of S k given all observable realizations x n 1 which is called to be the interpolating posterior density.There are some ways to calculate this conditional density.One of the interesting approaches examined below and referred to as the two-filter smoothing consists in recursive calculation of the filtering posterior probability density w k (s k |x k 1 ) in forward time and the filtering posterior density w k (s k |x n k ) in backward time.This algorithm is as follows (Khazen, 1978;Briers, Douset, and Maskell, 2003): where the first factor is a normalizing constant depending only on observations.Here and further the function f without index denotes any probability density that may differ from another one even in the same formula, because its argument completely determines the object.The computation results for filtering densities in forward time and backward time are multiplied giving the interpolating posterior density.For static and dynamic observation models, the algorithms of such computation are different.For static models described by the conditional density f (x k |s k ), the joint probability density f (s k , x k ) in the denominator of (2) can be represented in the form of the product p(s k )f (x k |s k ), where p(s k ) is the prior density.For dynamic observation models described by the conditional density f (x k |x k−1 , s k ), such a product form can not be constructed, because the density f (x k |s k ) is unknown in this case.Nevertheless, the joint probability density f (s k , x k ) admits representation via f (x k |x k−1 , s k ) by means of the following recursive equation with the initial condition f (s 1 , x 1 ) that is given by the prior density of the Markov process (S n , X n ) n≥1 .This equation can be simply derived using the total probability formula.Consequently, for dynamic observation models, pair (2), (3) is the system of equations for interpolating posterior probability density.The subsequent computation in (2), (3) can be carried out if all distributions of the composed Markov process (S n , X n ) n≥1 are known.But the main purpose of this paper consists in solving the interpolation problem with unknown distributions of the unobservable strongly stationary Markov sequence (S n ) n≥1 .The central idea to solve this problem is based on the principle of the empirical Bayes approach and the theory of nonparametric functional estimation by weakly dependent observations (see Vasiliev et al., 2004).The empirical Bayes approach aims to construct the estimators that are explicitly independent of the probabilistic characteristics of the unobservable random variables.This can be achieved, for instance, by using the conditional densities of observations from the exponential density family (see Chentsov, 1972) where , and T j (•), j = 1, . . ., m are the given Borelian functions and C(s n ) is the normalizing factor.
3 Optimal Interpolation Estimator under Unknown Probability Distribution of the Unobservable Signal Some applied problems of extracting a signal from its mixture with a noise result in a situation when this useful signal is never observed in the pure form and consequently one can not gather any statistic about its distribution.Moreover, any information about possible parametric family which includes the probability distribution for the signal is usually absent.To solve this problem we propose to use the empirical Bayes approach to the interpolation problem.This means that one must make an attempt to find such a form of the estimation equation that does not depend explicitly on the unknown probability characteristics of the unobserved signal.Now there are no regular methods for constructing such equations for arbitrary observation models.The success can be attained for more narrow class of observation models described by conditional densities from the exponential family (4).At that, the success in constructing the equation is attained not for the signal posterior probability density (2), but directly for optimal mean square estimator (1) itself.To derive this equation let us use the Markovian property of the sequence where (s k , x k ) is the value of the random variable (S k , X k ), and f (s 1 , x 1 ) and g(•|•) are the prior density and the transition density of (Z n ) n≥1 .Taking into account (5), the interpolating posterior density In the future we must choose in equation ( 6) any factors depending on the observation x k .
For this purpose, the second factor of ( 6) is transformed as follows Austrian Journal of Statistics, Vol. 37 (2008), No. 1, 21-29 But, since for dynamic models the transition density g( Obviously, only the second factor of (7) depends on x k .
Let us consider the third factor in ( 6).Then we get Here only the last ratio f (x k+1 |x k , s k+1 )/f (x k+1|s k+1 ) depends on x k .It is this dependence on x k that does not yet permit us to construct a non-parametric version of the equation ( 2) for dynamic observation models.However, for static observation models of the form X n = ϕ(S n , η n ), where ϕ is a measurable function and η n is an independent noise, the conditional density of observations f (x k+1 |x k , s k+1 ) = f (x k+1 |s n+1 ) does not depend on x k , and this transforms the equation ( 6) to the form: where only the second factor depends on x k ; some new definitions for the normalizing constants are introduced here: Such a form of the interpolation equation allows us to obtain its counterpart independent of the statistical characteristics of the unobserved process and remark once more that u k is independent of x k .Then, equation ( 9) can be represented in the form Let us integrate this equation with respect to s k and carry over the normolizing factor depending only of the observations to the left-hand side: Assuming now that the density f (x k |s k ) belongs to the exponential family (4), differentiate ( 14) w.r.t.x k .The possibility of differentiating under the sign of integral is justified by the assumption of existence of the second prior moment EQ T (S k )Q(S k ), that is the natural restriction of signal power.The latter is sufficient for existence of the mean square risk.Differentiation w.r.t.x k results in the equation For the exponential conditional density f (x k |s k ), Substituting ( 16) to ( 15) and denoting by )ds k , let us find the equation for the optimal mean square estimator Q( s k ) where T is the Jacobi matrix with elements ∂T i /∂x [j] k , i = 1, dots, m, j = 1, . . ., r.The equation ( 17) is a simple linear vector equation w.r.t.Q(ŝ k ), but it can be solved only for a certain density f k (x k |x n 1 without x k ).In the case where all probability distributions are known, this density can be computed, and the result coincides with (1) and (2).But in the case where f k (x k |x n 1 without x k ) can not be explicitly computed and even its parametric form is unknown, we restore it from the observations using the kernel non-parametric procedures.

Non-Parametric Interpolation Estimator
To solve the problem of interpolating on the basis of one realization x n 1 of a process (X k ) 1≤k≤n , X k ∈ R l , we can use the asymptotically ε-optimal interpolating procedure from Vasiliev et al. (2004), in which the truncated conditional density f ( ), where the parameter τ is the order of the Markov process which approximates the non-Markovian weak dependent process (X n ) n≥1 .The criteria and methods of estimation of τ which were developed in Vasiliev et al. (2004) with regard to filtration, can be extended to the interpolation problems.In this way, the conditional density f (x k |x k−1 k−τ , x k+τ k+1 ) can be written as the ratio , where the numerator is the marginal density of l × (2τ + 1)-dimensional vector and the denominator is the marginal density of l × 2τ -dimensional vector of observations.By substituting the multivariate non-parametric kernel estimators for these densities, we get a non-parametric form of the equation ( 17) in the form Interpretation of this equation is quite obvious.To construct the interpolation estimator at the point k, one uses the data before and later of k in a distance not exceeding τ .The interpolation estimator in the equation ( 19) is obviously consistent, but it depends on the logarithmic gradient of the conditional probability density which is an unstable functional that takes infinite values, when the denominator is equal to zero.For a more strong convergence, one should construct a piecewise smooth approximation (see Vasiliev et al., 2004) providing the mean-square convergence under some additional regularity conditions.

Comparison of Non-Parametric Interpolation Estimator with the Optimal Estimator in the Kalman Scheme
To illustrate the proposed estimator and its performance, consider an example with univariate state and observation models (m = l = 1): Here, S 1 , ξ n and η n are the mutually independent random variables with the Gaussian distributions N (0, σ 2 ) for S 1 and N (0, 1) for ξ n and η n , n ≥ 1.The coefficients a, b, A, B are known, |a| < 1.With suitable initial conditions, such equations generate a strongly stationary process.For the model ( 20), ( 21), the conditional probability density of observations is Gaussian and, therefore, the Kalman filter and the optimal forward and backward recursive linear interpolation equations associated with it can be obtained using the results of Liptser and Shiryaev (1977).
The Kalman filter: where M is the number of repeated experiments.This performance measure is the mean square deviation of the process estimator ( S i ) 1≤i≤N from the process of the signal (S i ) 1≤i≤N averaged by M experiments.We consider three estimators: Kalman estimator S k from the equation ( 22) with the risk R K , optimal backward interpolation Sk from the equation ( 24) with the risk R OI , and non-parametric interpolation from the equation ( 25) with the risk for convenience of comparison, let us introduce the relative errors in percentage as It shows how much one estimator is better or worse than another one.The simulation results are presented in Figure 1 for n = 1000, σ 2 = 2, a = 0.7, b = 1, A = B = 1, and τ = 1.The relative errors ε K and ε N I are given in Table 1.The simulation shows that nonparametric estimators can superior the optimal Kalman filtering estimators by the performance, but it is always inferior w.r.t. the optimal backward interpolation.In spite of existence of some methods for bandwidth parameter selection in non-parametric density estimators (e.g., plug-in or cross-validation), there are no corresponding methods for non-parametric estimation of the density derivative.Therefore, the way for completely automatic bandwidth selection still remains open.

Conclusion
The paper presents three interpolating algorithms for estimation of an unobservable signal corrupted by noise on the fixed time interval.By the example of a linear model, the simulation experiments illustrate the performance of the proposed non-parametric interpolation estimator in comparison with the optimal Kalman filtering estimator and with the optimal backward interpolation.

Figure 1 :
Figure 1: Comparison of the non-parametric smoothing estimator with the optimal estimators

Table 1 :
Relative excess of the empirical risk over the optimal smoothing risk M Optimal Kalman ε K Non-par ε N I