Efficient Numerical Algorithms for Solving Nonlinear Filtering Problems
by
Feng Bao
A dissertation submitted to the Graduate Faculty of
Auburn University
in partial fulfillment of the
requirements for the Degree of
Doctor of Philosophy
Auburn, Alabama
August 2, 2014
Keywords: Nonlinear filter, Bayesian theory, implicit filter, Zakai filter, sparse grid,
backward doubly stochastic differential equation, Itô-Taylor expansion
Copyright 2014 by Feng Bao
Approved by
Yanzhao Cao, Chair, Professor of Mathematics
Amnon Meir, Co-Chair, Professor of Mathematics
Paul Schmidt, Professor of Mathematics
Erkan Nane, Professor of Mathematics
Jianjun Dong, Professor of Physics
Abstract
Examples of nonlinear filtering problems arise in biology, mathematical finance, signal
processing, image processing, target tracking, and many engineering applications. Commonly
used numerical simulation methods are the Bayesian filter, which is derived from the Bayesian
formula, and the Zakai filter, which is related to a stochastic partial differential
equation known as "the Zakai equation".
This dissertation mainly focuses on developing and analysing novel, efficient numerical
algorithms for solving nonlinear filtering problems. We first introduce a novel numerical
algorithm that lies in the general framework of the Bayesian filter. The algorithm is
constructed from samples of the current state obtained by solving the state equation
implicitly. We call this algorithm the "implicit filter method". Rigorous analysis is carried out
to prove the convergence of the algorithm. Through numerical experiments we show
that our algorithm is more accurate than the Kalman filter and more stable than the particle
filter.
In the second effort of this work, we propose a hybrid numerical algorithm for the Zakai
filter to solve nonlinear filtering problems efficiently. The algorithm combines the splitting-up
finite difference scheme and the hierarchical sparse grid method to solve moderately high
dimensional nonlinear filtering problems. When applying the hierarchical sparse grid method
to approximate the bell-shaped solutions that arise in most applications of nonlinear filtering,
we introduce a logarithmic approximation to reduce the approximation errors. Some space
adaptive methods are also introduced to make the algorithm more efficient. Numerical
experiments are carried out to demonstrate the performance and efficiency of our algorithm.
In this dissertation, we also develop high order numerical approximation methods for
backward doubly stochastic differential equations (BDSDEs). One of the most important
properties of BDSDEs is their equivalence to the Zakai equation. In this connection, our
numerical approximation methods for BDSDEs can be considered as efficient numerical
approaches to solving nonlinear filtering problems. The convergence order is proved through
rigorous error analysis for each algorithm. Numerical experiments are carried out to verify
the theoretical results and to demonstrate the efficiency of the proposed numerical schemes.
Acknowledgments
This dissertation would never have been possible without the guidance of my committee
members, help from friends, and support from my family and wife.
Foremost, I am heartily thankful to my advisors, Dr. Yanzhao Cao and Dr. Amnon J.
Meir. Through my five-year study at Auburn University, they provided excellent teaching,
great company, and a significant number of new ideas. I must also thank Dr. Yanzhao Cao,
who encouraged me to step into the deep ocean of the stochastic filter, for his unconditional
support and the full confidence he always offered.
I would like to thank my committee members for their helpful comments on my dissertation.
I would also like to thank all the collaborators, who were always willing to help and give their
best suggestions. I am also very grateful to Dr. Weidong Zhao, who offered great ideas in
my research on numerical methods for backward doubly stochastic differential equations.
Finally, I would like to express my deepest gratitude to my beloved wife Ye and my
parents for their constant support. Without them, this dissertation would hardly have been
possible.
Table of Contents
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2 Nonlinear Filtering Problems and Existing Numerical Approaches . . . . . . . . 6
2.1 Mathematical Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2 Nonlinear Filtering Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.3 Bayesian Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3.1 Extended Kalman Filter . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3.2 Particle filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.4 Zakai Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3 An Implicit Algorithm of Solving Nonlinear Filtering Problems . . . . . . . . . . 15
3.1 Methodology and the implicit filter algorithm . . . . . . . . . . . . . . . . . 15
3.1.1 Bayesian Optimal Filter . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.1.2 An inverse algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.2 Weak Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.3 Numerical experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4 A Hybrid Sparse Grid Approach for Nonlinear Filtering Problems Based on
Adaptive-Domain of the Zakai Equation Approximations . . . . . . . . . . . . . 38
4.1 Nonlinear Filtering Problems and Zakai Equations . . . . . . . . . . . . . . . 38
4.2 Hierarchical Local Sparse Grid Interpolation . . . . . . . . . . . . . . . . . . 40
4.2.1 Standard hierarchical sparse grid interpolation . . . . . . . . . . . . 40
4.2.2 Hierarchical sparse grid approximation of bell-shaped curves . . . . 43
4.3 Hybrid Approach for Numerical Solution of the Nonlinear Filtering Problem. 45
4.3.1 Adaptive selection of solution domains . . . . . . . . . . . . . . . . . 46
4.3.2 Splitting-up finite difference method on sparse grid . . . . . . . . . 48
4.4 Numerical Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5 Numerical Algorithms for Backward Doubly Stochastic Differential Equations
and Their Applications to Nonlinear Filtering Problems . . . . . . . . . . . . . 63
5.1 FBDSDEs and SPDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.2 Half Order Numerical Algorithms . . . . . . . . . . . . . . . . . . . . . . . . 66
5.2.1 Numerical algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.2.2 Error Estimates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.2.3 Numerical experiments . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.3 First Order Numerical Algorithms . . . . . . . . . . . . . . . . . . . . . . . . 89
5.3.1 Variational Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.3.2 Reference equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.3.3 Numerical scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
5.3.4 Error Estimates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.3.5 Fully discrete scheme and its error estimate . . . . . . . . . . . . . . 113
5.3.6 Numerical experiment . . . . . . . . . . . . . . . . . . . . . . . . . . 124
6 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
List of Figures
3.1 Original Position . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.2 Extended Kalman Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.3 Particle Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.4 Implicit Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.5 Target Positions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.6 Comparison result for the observation angle . . . . . . . . . . . . . . . . . . . . 36
3.7 Simulation result of EKF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.8 Comparison result between Particle Filter and Implicit Algorithm . . . . . . . . 37
4.1 The interpolation error for regular sparse grid approximation up to 5 dimensions. 44
4.2 The interpolation error for logarithmic sparse grid approximation up to 5 dimen-
sions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.3 (a) is the x1x2-marginal surface of U with x3 = x4 = 0, while (b) is its interpolated
approximation obtained by applying the regular sparse grid approximation to the
function U. We can see from the figure that the approximation is quite poor
and significant oscillations occur at the bottom of the surface. On the other hand,
from (d) one can see that the logarithmic interpolation approximation described
above is far more accurate. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.4 Target Trajectory and Adaptive Solution Domain. The red curve shows the real
target state. The blue points are actual states of the target and the blue boxes
are the corresponding solution domains. . . . . . . . . . . . . . . . . . . . . . . 52
4.5 Target state PDF at time step (a) n=1; (b) n=50; (c) n=80; in the corresponding
solution domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.6 2D nonlinear filtering problem . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.7 Marginal distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.8 Sparse grid Zakai filter estimate of posterior mean with 95% probability region.
(a) Estimate of posterior mean and 95% probability region: X-coordinate. (b)
Estimate of posterior mean and 95% probability region: Y-coordinate . . . . . . 57
4.9 Bearing-only tracking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.10 Comparison of estimated mean of the target state between the particle filter and
the sparse grid Zakai filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.11 Sparse grid Zakai filter estimate of posterior mean with 95% probability region.
(a) Estimate of posterior mean of the location x and 95% probability region. (b)
Estimate of posterior mean of the location y and 95% probability region. (c)
Estimate of posterior mean of the velocity u and 95% probability region. (d)
Estimate of posterior mean of the velocity v and 95% probability region . . . . 60
4.12 Marginal distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.1 Example 1: Convergence comparison between the direct finite difference scheme
and our scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5.2 Example 2: Convergence comparison between the approximations of y and z . . 89
5.3 Target tracking by using one detector . . . . . . . . . . . . . . . . . . . . . . . . 126
5.4 Tracking Estimate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
5.5 Probability Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
List of Tables
4.1 Comparison of computing costs . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.2 Comparison of computing costs . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.1 Example 1 of Section 6.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.2 Example 2 of Section 6.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5.3 Example 1 of Section 6.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Chapter 1
Introduction
Stochastic filtering theory was first established in the early 1940s through the pioneering
work of Norbert Wiener [79, 80] and Andrey N. Kolmogorov [46]. The goal of data assimilation
in the stochastic filtering problem is to obtain good estimates of the state of a stochastic
dynamic system model based on noisy partial observations. Examples of stochastic filtering
problems arise in biology [4], mathematical finance [11, 31, 37, 69], signal processing [2, 3, 74],
image processing [12, 71, 91], target tracking [13, 20, 26, 29, 62, 85] and many engineering
applications. The major breakthrough for the classic filtering problem was the landmark work
of Kalman in 1960 [45] and the subsequent Kalman-Bucy filter in 1961. Most people
now call this type of filter theory the "Kalman filter" (see also [28, 40, 49, 56, 57, 66, 72, 77]).
For decades, the Kalman filter has been one of the dominant tools for solving filtering problems.
According to the Kalman filter, under the standard assumptions of linearity and Gaussianity
of the noise, the conditional distribution of the state, given the observations, is Gaussian.
This conditional distribution gives the best estimate of the statistical description of the state
of the system based on all the available observation information up to the current time. The
power of the Kalman filter lies in its simplicity and accuracy. Largely because of the success of
Kalman filters, linear and nonlinear filters have been applied in various engineering and
scientific areas, including communications such as positioning in wireless networks, signal
processing such as tracking and navigation, economics and business, and many others.
In most practical application problems, however, the linearity assumption is not valid
because of the nonlinearity in the model specification process as well as the observation
process. For example, in typical atmospheric data assimilation problems, the dynamic system
is described by a rather complex system of equations, which is highly nonlinear. In bearing-only
tracking problems, depending on how the measurements are obtained, the observation
could also have a nonlinear relation with the state process.
Numerous nonlinear filtering methods have been proposed and developed to overcome
the linearity limitation of the Kalman filter. Two of the widely used methods for nonlinear
filtering problems are the Bayesian filter and the Zakai filter.
The Bayesian filter is based on Bayesian theory, which originates in the work of the
British scholar Thomas Bayes, published in 1763. The most well-known Bayesian filter methods
include the extended Kalman filter (EKF) and the particle filter method (PFM). For the EKF,
the state equation and the observation equation are linearized so that the standard Kalman
filter can be applied [7, 14, 27, 33, 42, 43, 44, 47, 63, 73]. The central theme behind the PFM
is the representation of the desired probability density function (PDF) of the system
state with a set of adaptively selected random samples [15, 18, 21, 24, 38, 59, 58, 65, 76, 78].
Since PFMs are essentially sequential Monte Carlo methods, with a sufficiently large number of
samples PFMs can provide an accurate representation of the PDF of the nonlinear filtering
solution. While the aforementioned methods have been remarkably successful in attacking
the nonlinear filtering problem, each of them has its drawbacks and limitations. For instance,
when the state equation describing the signal process and the observation equation are highly
nonlinear, the EKF can give particularly poor performance. The PFM has a number of
advantages over the EKF, including its ability to represent arbitrary densities and to
adaptively focus on the most probable regions of the state space. However, it also has a
number of disadvantages, including high computational complexity, degeneracy over long
simulation periods, and the difficulty of determining the optimal number of particles. The
first effort of this work is to construct a new algorithm for numerical simulations of nonlinear
filtering problems using Bayesian filtering theory. Each recursive time step consists of two
stages: the prediction stage and the update stage. The prediction stage gives the estimate
of the prior PDF of the future state based on the currently available observation information,
while the update stage gives the
posterior PDF from the updated observation information and the result obtained in the
prediction stage. However, instead of attempting to search for a representation of the PDF
as in the PFM, we approximate the PDF as a function over a grid in state space. Specifically,
at the prediction stage, we seek the predicted pdf of the future state variable
through a Monte Carlo method by evaluating the conditional expectation of the future state
with respect to the current state. Since the sample points for the current state are computed
by solving the state equation implicitly, we name our method the "implicit filter method".
The following two items summarize the novelty of our approach.
(i) We propose an accurate implicit scheme for prediction purposes. The implicit scheme
has a stabilizing effect on the proposed numerical algorithm. This is verified in our
numerical experiments.
(ii) Based on Bayesian theory, we apply a novel Monte Carlo-like method to approximate
the conditional expectation in the update stage to compute the prior PDF.
The Zakai filter represents the PDF of the nonlinear filtering solution through the solution
of a parabolic-type stochastic partial differential equation known as the Zakai equation
[82]. Similar to the PFM, the Zakai filter allows one to accurately compute conditional
distributions. A number of numerical algorithms have been proposed for solving Zakai equations
[1, 10, 22, 23, 30, 32, 36, 39, 67, 75, 85]. One of the most effective methods is the splitting-up
approximation scheme [10, 30, 85], where the original Zakai equation is split into a second
order deterministic PDE, related to the prediction step, and a degenerate second order
stochastic PDE for the update step. In the numerical simulation process, a prior PDF is
obtained by solving the deterministic PDE in the prediction step; then this prior PDF is updated
according to an a posteriori criterion. The main drawback of the Zakai filter is that the numerical
approximation is grid based; thus it suffers from the so-called "curse of dimensionality", since the
computing cost increases exponentially as the dimension of the system increases. Another
difficulty related to solutions of the Zakai equation is that the domain is the whole space
$\mathbb{R}^d$. To address these challenges, we propose the construction of an efficient hybrid numerical
algorithm which combines the advantages of the splitting-up approximation scheme for
the Zakai equation, a hierarchical sparse grid method [41, 50, 55, 68, 70, 81, 83, 84] for
moderately high dimensional nonlinear filtering problems to compute the numerical solution
of the Zakai filter, and an importance sampling method to adaptively construct a bounded
domain at each time step of the temporal discretization. Specifically, this enables us to use
the splitting-up finite difference scheme to solve the Zakai equation on the sparse grid of
the bounded domain. The hierarchical sparse grid method, which was originally created to
approximate multi-variable functions, uses only $O(n(\log n)^{d-1})$ grid points instead
of the $O(n^d)$ grid points required by the standard full-grid approximation.
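As a rough illustration of this point-count comparison, the two growth rates can be tabulated; the snippet below is a sketch added for illustration (it is not part of the dissertation's algorithms, and the sparse grid formula omits the level-dependent constants).

```python
# Illustrative comparison of grid-point growth: a full tensor grid needs
# O(n^d) points, while a hierarchical sparse grid needs O(n (log n)^(d-1)).
# Constants are omitted, so these are order-of-magnitude estimates only.
import math

def full_grid_points(n, d):
    """Full tensor-product grid with n points per dimension."""
    return n ** d

def sparse_grid_points(n, d):
    """Order-of-magnitude sparse grid estimate O(n (log n)^(d-1))."""
    return int(n * math.log(n) ** (d - 1))

for d in (2, 4, 6):
    n = 64
    print(d, full_grid_points(n, d), sparse_grid_points(n, d))
```

Even for moderate $d$, the gap between the two counts is dramatic, which is the motivation for using sparse grids in moderately high dimensional filtering.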
Backward doubly stochastic differential equations (BDSDEs) were introduced as Feynman-Kac
type probabilistic representations of semi-linear parabolic stochastic partial differential
equations (SPDEs), which are generalized Zakai equations [61]. In this connection, numerical
approximation methods for simulating solutions of BDSDEs are also numerical approximation
methods for solving Zakai type equations. Thus they can be considered as alternative
numerical approaches to nonlinear filtering problems. In this work, we propose efficient
numerical algorithms for approximating solutions of BDSDEs. Several effective numerical
approaches for backward stochastic differential equations (BSDEs) and forward backward
stochastic differential equations (FBSDEs) have been proposed in the last decade, including
primary schemes for BSDEs ([5, 19, 52, 86]), the four-step scheme for FBSDEs ([9, 53, 54, 51]),
and the $\theta$-scheme with high convergence rate for BSDEs ([87, 88, 89, 90]). In comparison,
efficient high order numerical algorithms for BDSDEs are not well developed. Solving
BDSDEs numerically is more difficult than solving SDEs and BSDEs, as they contain
two Brownian motions. In this work, we first construct a half order algorithm using the
simple Euler method. It is much more involved to construct higher order algorithms. The
bottleneck is the difficulty of approximating the forward and backward Itô integrals with
high order quadratures. To tackle this difficulty, we propose to use an Itô-Taylor formula
for two-sided stochastic integrals [60, 61] in order to obtain a higher order quadrature for
the backward Itô integral; for the forward stochastic integral, we propose to use the variational
equations for BDSDEs [61] to derive a high order quadrature rule. It is worth noting
that although our focus is on the construction of a first order algorithm, the methodology
developed in this work can be used to obtain even higher order algorithms.
The outline of this work is as follows. In Chapter 2, we give a brief overview of the probability
theory that will be used throughout the rest of the chapters. We also introduce
the mathematical definition of the nonlinear filtering problem and some existing numerical
approximation methods for nonlinear filtering problems. In Chapter 3, we present our novel
"implicit filter algorithm" for nonlinear filtering problems. We also give a convergence analysis
which shows the weak convergence of our algorithm. Numerical experiments demonstrate that
our algorithm is more accurate than the Kalman filter and more stable than the particle filter.
In Chapter 4, we construct a hybrid finite difference algorithm for the Zakai equation to
solve nonlinear filtering problems. The algorithm combines the splitting-up finite difference
scheme and the hierarchical sparse grid method to solve moderately high dimensional nonlinear
filtering problems. A space adaptive method is introduced to make the algorithm more
efficient. Numerical experiments are carried out to demonstrate the performance and efficiency
of our algorithm. In Chapter 5, we focus on numerical algorithms for BDSDEs, which can be
considered as an alternative numerical approach for nonlinear filtering problems. We first
introduce a half order convergence algorithm based on Euler approximation of the stochastic
integrals. Then, we develop a first order convergence scheme by using the two-sided Itô-Taylor
expansion. For each algorithm, we give rigorous error analysis and numerical experiments
to verify the convergence rates of our algorithms.
Chapter 2
Nonlinear Filtering Problems and Existing Numerical Approaches
In this chapter, we first give some definitions from general probability theory. Then,
we introduce the mathematical formulation of nonlinear filtering problems and some of
the most well-known numerical schemes that have been studied in the literature. There
are mainly two numerical approaches to solving the nonlinear filtering problem. One is the
Bayesian filter, which is based on the Bayes rule; the other is the Zakai filter, which is based
on the Zakai equation, a stochastic evolution equation.
2.1 Mathematical Preliminaries
Definition 1 If $\Omega$ is a given set, then a $\sigma$-algebra $\mathcal{F}$ on $\Omega$ is a family $\mathcal{F}$ of subsets of $\Omega$
with the following properties:
(i) $\emptyset \in \mathcal{F}$;
(ii) $F \in \mathcal{F} \Rightarrow F^C \in \mathcal{F}$, where $F^C = \Omega \setminus F$ is the complement of $F$ in $\Omega$;
(iii) $A_1, A_2, \ldots \in \mathcal{F} \Rightarrow A := \bigcup_{i=1}^{\infty} A_i \in \mathcal{F}$.
The pair $(\Omega, \mathcal{F})$ is called a measurable space. A probability measure $P$ on a measurable
space $(\Omega, \mathcal{F})$ is a function $P: \mathcal{F} \to [0,1]$ such that
(a) $P(\emptyset) = 0$, $P(\Omega) = 1$;
(b) if $A_1, A_2, \ldots \in \mathcal{F}$ and $\{A_i\}_{i=1}^{\infty}$ is disjoint (i.e. $A_i \cap A_j = \emptyset$ if $i \neq j$), then
\[
P\Big( \bigcup_{i=1}^{\infty} A_i \Big) = \sum_{i=1}^{\infty} P(A_i).
\]
The triple $(\Omega, \mathcal{F}, P)$ is called a probability space.
Given any family $\mathcal{U}$ of subsets of $\Omega$, there is a smallest $\sigma$-algebra $\mathcal{H}_{\mathcal{U}}$ containing $\mathcal{U}$,
namely
\[
\mathcal{H}_{\mathcal{U}} = \bigcap \{ \mathcal{H} :\ \mathcal{H}\ \text{$\sigma$-algebra on}\ \Omega,\ \mathcal{U} \subset \mathcal{H} \}.
\]
We call $\mathcal{H}_{\mathcal{U}}$ the $\sigma$-algebra generated by $\mathcal{U}$.
If $(\Omega, \mathcal{F}, P)$ is a given probability space, then a function $f: \Omega \to \mathbb{R}^n$ is called
$\mathcal{F}$-measurable if
\[
f^{-1}(U) := \{ \omega \in \Omega :\ f(\omega) \in U \} \in \mathcal{F}
\]
for all open sets $U \subset \mathbb{R}^n$.
If $X: \Omega \to \mathbb{R}^n$ is any function, then the $\sigma$-algebra $\mathcal{H}_X$ generated by $X$ is the smallest
$\sigma$-algebra on $\Omega$ containing all the sets
\[
X^{-1}(U), \qquad U \subset \mathbb{R}^n\ \text{open}.
\]
Definition 2 A random variable $X$ is an $\mathcal{F}$-measurable function $X: \Omega \to \mathbb{R}^n$. Every
random variable induces a probability measure $\mu_X$ on $\mathbb{R}^n$, defined by
\[
\mu_X(B) = P(X^{-1}(B)).
\]
$\mu_X$ is called the distribution of $X$.
If $\int_{\Omega} |X(\omega)|\,dP(\omega) < \infty$, then the number
\[
E[X] := \int_{\Omega} X(\omega)\,dP(\omega) = \int_{\mathbb{R}^n} x\,d\mu_X(x)
\]
is called the expectation of $X$ (w.r.t. $P$).
Definition 3 A stochastic process is a parameterized collection of random variables
$\{X_t\}_{t \in T}$ defined on a probability space $(\Omega, \mathcal{F}, P)$ and assuming values in $\mathbb{R}^n$.
Next, we introduce a general form of stochastic differential equations. First, we
need the following definition of Brownian motion.
Definition 4 Brownian motion (or Wiener process) $B_t$ starting at $x$ is a stochastic
process satisfying the following properties:
(i) $B_0 = x$;
(ii) $B_t$ is almost surely continuous;
(iii) $B_t$ has independent increments;
(iv) $B_t - B_s \sim N(0,\ t-s)$ for $0 \le s \le t$.
Here $N(\mu, \sigma^2)$ denotes the normal distribution with mean $\mu$ and variance $\sigma^2$.
To proceed, we show how to define the Itô integral $\int_S^T f(t,\omega)\,dB_t(\omega)$.
A function $\phi$ is called elementary if it has the form
\[
\phi(t,\omega) = \sum_{j} e_j(\omega)\,\mathcal{X}_{[t_j, t_{j+1})}(t).
\]
For an elementary function $\phi$, we define the Itô integral by
\[
\int_S^T \phi(t,\omega)\,dB_t(\omega) = \sum_{j \ge 0} e_j(\omega)\,[B_{t_{j+1}} - B_{t_j}](\omega).
\]
Then we define the Itô integral as follows.
Definition 5 Let $f$ be a measurable function. Then the Itô integral of $f$ (from $S$ to $T$) is
defined by
\[
\int_S^T f(t,\omega)\,dB_t(\omega) = \lim_{n \to \infty} \int_S^T \phi_n(t,\omega)\,dB_t(\omega),
\]
where $\{\phi_n\}$ is a sequence of elementary functions such that
\[
E\Big[ \int_S^T \big( f(t,\omega) - \phi_n(t,\omega) \big)^2\,dt \Big] \to 0 \quad \text{as}\ n \to \infty.
\]
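As a numerical illustration of the elementary-sum definition, the following sketch (added here for illustration, with hypothetical discretization parameters) approximates $\int_0^T B_t\,dB_t$ by left-endpoint sums and compares the result with the known closed form $(B_T^2 - T)/2$.

```python
# Monte Carlo check (illustrative sketch) that the left-endpoint elementary
# sums above approximate the Ito integral: for f(t, w) = B_t, the limit
# has the closed form (B_T^2 - T) / 2.
import random

def ito_integral_of_B(T=1.0, n=10_000, rng=random):
    """Approximate \\int_0^T B_t dB_t with left-endpoint (Ito) sums."""
    dt = T / n
    B = 0.0
    integral = 0.0
    for _ in range(n):
        dB = rng.gauss(0.0, dt ** 0.5)  # Brownian increment ~ N(0, dt)
        integral += B * dB              # e_j(w) [B_{t_{j+1}} - B_{t_j}]
        B += dB
    return integral, (B * B - T) / 2.0  # (approximation, closed form)

random.seed(0)
approx, exact = ito_integral_of_B()
print(abs(approx - exact))  # small discretization error
```

Evaluating the integrand at the left endpoint of each subinterval is exactly what makes this the Itô (rather than Stratonovich) integral.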
With the definition of the Itô integral, a general form of stochastic differential equation is
given by
\[
dX_t = b(X_t)\,dt + \sigma(X_t)\,dB_t, \qquad X_0 = x,
\]
where $X_t$ is the solution of the stochastic differential equation.
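Such an SDE can be simulated with the Euler-Maruyama scheme; the sketch below is illustrative only, and the coefficient functions $b$ and $\sigma$ in the example call are hypothetical choices, not taken from the dissertation.

```python
# Minimal Euler-Maruyama sketch for dX_t = b(X_t) dt + sigma(X_t) dB_t,
# X_0 = x. The example coefficients at the bottom are hypothetical.
import random

def euler_maruyama(b, sigma, x0, T=1.0, n=1000, rng=random):
    """Return the discrete path X_{t_0}, ..., X_{t_n} on a uniform grid."""
    dt = T / n
    path = [x0]
    x = x0
    for _ in range(n):
        dB = rng.gauss(0.0, dt ** 0.5)    # Brownian increment ~ N(0, dt)
        x = x + b(x) * dt + sigma(x) * dB  # one Euler-Maruyama step
        path.append(x)
    return path

# Example: mean-reverting drift with constant diffusion (hypothetical).
random.seed(1)
path = euler_maruyama(b=lambda x: -x, sigma=lambda x: 0.5, x0=1.0)
print(len(path))
```

The same step reappears later as the particle propagation rule in the particle filter's prediction stage.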
2.2 Nonlinear Filtering Problems
Now, let us consider the following generic stochastic filtering problem in dynamic
state-space form:
\begin{align}
\frac{dX_t}{dt} &= f(t, X_t, \dot{W}_t), \tag{2.1} \\
\frac{dY_t}{dt} &= g(t, X_t, \dot{B}_t). \tag{2.2}
\end{align}
Equations (2.1) and (2.2) are called the signal state equation and the measurement equation (or
observation equation), respectively; $X_t$ represents the state vector and $Y_t$ the measurement
vector; $f: \mathbb{R}^{n_x} \to \mathbb{R}^{n_x}$ and $g: \mathbb{R}^{n_x} \to \mathbb{R}^{n_y}$ are two vector-valued functions, which are
potentially time-varying; $W_t$ and $B_t$ are two independent Wiener processes, with covariances
$I$ and $R$, respectively, which represent the process noise and the measurement noise. When
$f$ and $g$ are both linear functions, the filtering problem (2.1)-(2.2) is called the linear
filtering problem; otherwise, it is called the nonlinear filtering problem.
In many applications, the noise can be assumed to be additive, and (2.1)-(2.2) become
\begin{align}
dX_t &= b(X_t)\,dt + \sigma(X_t)\,dW_t, \tag{2.3} \\
dY_t &= h(X_t)\,dt + dB_t, \tag{2.4}
\end{align}
where $b: \mathbb{R}^{n_x} \to \mathbb{R}^{n_x}$ and $h: \mathbb{R}^{n_x} \to \mathbb{R}^{n_y}$ are two vector-valued functions, which are
potentially time-varying, and $\sigma: \mathbb{R}^{n_x} \to \mathbb{R}^{n_x \times n_w}$ is a matrix-valued function.
The main purpose of numerical simulation of a filtering process is to obtain, recursively
in time, the best estimate of the probability density function (pdf) of the state $X_t$ based on
the observations up until time $t$, i.e. $\{Y_s : 0 \le s \le t\}$. This can be expressed as finding the
stochastic process $\tilde{X}_t$ such that
\[
\tilde{X}_t = E[X_t \mid \mathcal{Y}_t], \qquad
E\big[|X_t - \tilde{X}_t|^2\big] = \inf\big\{ E[|X_t - Y|^2] :\ Y \in \mathcal{K} \big\},
\]
where $\mathcal{Y}_t$ is the $\sigma$-algebra generated by the observation process up to $t$, and $\mathcal{K}$ is the space
of all $\mathcal{Y}_t$-measurable and square integrable random variables.
In typical applications, observations are available only at a discrete sequence of time
instants. We assume that the observations are collected at time instants $\Delta t, 2\Delta t, \ldots$, where
$\Delta t$ is a given time step size. Then, denoting
\[
\Delta Y_n := Y(n \Delta t) - Y((n-1)\Delta t),
\]
we have
\[
\Delta Y_n = \int_{(n-1)\Delta t}^{n \Delta t} h(X_s)\,ds + B_{n\Delta t} - B_{(n-1)\Delta t}
\approx h(X_{n\Delta t})\,\Delta t + \sigma\,\xi_n,
\]
where $\{\xi_n\}$ is an i.i.d. sequence of standard Gaussian variables and $\sigma := \sqrt{R\,\Delta t}$. Denoting
$h(x)\Delta t$ by $g(x)$, we rewrite a discrete analog of the observation equation (2.4) as follows:
\[
\Delta Y_n = g(X_{n\Delta t}) + \sigma\,\xi_n, \qquad n \ge 1. \tag{2.5}
\]
Alternatively, one can consider (2.5) as the original underlying observation model and
not a discrete approximation to (2.4).
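The discrete observation model (2.5) can be sketched directly in code; in the following illustration the sensor function `h`, the noise intensity `R`, and the sample states are all hypothetical choices, not values from the dissertation.

```python
# Sketch of the discrete observation model (2.5):
#   Delta Y_n = g(X_{n dt}) + sigma * xi_n,
# with g(x) = h(x) * dt and sigma = sqrt(R * dt).
import random

def discrete_observations(states, h, dt, R, rng=random):
    """Map true states X_{n dt} to noisy observation increments Delta Y_n."""
    sigma = (R * dt) ** 0.5
    return [h(x) * dt + sigma * rng.gauss(0.0, 1.0) for x in states]

random.seed(2)
obs = discrete_observations(states=[0.0, 0.5, 1.0],
                            h=lambda x: x,   # hypothetical sensor function
                            dt=0.1, R=0.04)
print(len(obs))
```

These increments are exactly the inputs consumed by the filters described in the next section.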
2.3 Bayesian Filters
We first review the extended Kalman filter (EKF), the most commonly used numerical
approach for solving the nonlinear filtering problem. Then, we discuss a popular class of
approximation methods, the particle filter.
2.3.1 Extended Kalman Filter
The basic idea of the EKF is to linearize the nonlinear functions in both the state equation
and the observation equation and, as a result, to approximate the conditional distribution $P_n$ by a
normal distribution with parameters $(\mu_n, \Sigma_n)$. Suppose the approximation $\bar{P}_{n-1}$ to $P_{n-1}$ is a
normal distribution with mean $\mu_{n-1}$ and covariance $\Sigma_{n-1}$. The approximation $\bar{P}_n$ to $P_n$ is
obtained by the following two steps.
Prediction step. Let $\{x^*(t),\ t \ge 0\}$ be the trajectory about which the linearization is
performed. Typically, the trajectory $x^*$ is chosen as the solution of the deterministic ODE
\[
\dot{x}^*_t = b(x^*_t)
\]
with initial condition $\mu_{n-1}$. We denote $\delta X_t := X_t - x^*(t)$. Then, the linearized dynamic
signal state is given as
\[
d\,\delta X_t = \Big[ \frac{\partial b}{\partial x} \Big]_{x = x^*}\,\delta X_t\,dt + \sigma(x^*(t))\,dB_t.
\]
The prediction step of the standard continuous-time Kalman filter is then applied with this
linearized state model to give the approximation $\hat{\bar{P}}_n$ to $\hat{P}_n$ via a normal distribution with
parameters $(\hat{\mu}_n, \hat{\Sigma}_n)$.
Update step. Similar to the state model, the observation model (2.5) is also linearized
and is given as follows:
\[
\Delta Y_n - g(\hat{\mu}_n) = \Big[ \frac{\partial g}{\partial x} \Big]_{x = \hat{\mu}_n}\,(X_{t_n} - \hat{\mu}_n) + \sigma\,\xi_n.
\]
The update step of the usual Kalman filter [45] is used with this linearized observation model
to obtain $\bar{P}_n$ as a normal distribution with parameters $(\mu_n, \Sigma_n)$.
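The two steps above can be sketched in a scalar setting; the following is an illustrative sketch only, in which the continuous-time prediction is replaced by a single Euler step and the model functions and all numerical values are hypothetical.

```python
# Scalar EKF sketch of the prediction/update steps above (illustrative;
# the continuous-time prediction is collapsed into one Euler step, and
# the model functions b, g and all numbers below are hypothetical).
def ekf_step(mu, P, y, b, db, g, dg, q, r, dt):
    """One EKF cycle: (mu_{n-1}, P_{n-1}) and observation y -> (mu_n, P_n).

    b, g  : drift and observation functions
    db, dg: their derivatives (the linearizations)
    q, r  : process and observation noise variances
    """
    # Prediction: mean follows the deterministic ODE, covariance follows
    # the linearized dynamics around the nominal trajectory.
    mu_pred = mu + b(mu) * dt
    F = 1.0 + db(mu) * dt
    P_pred = F * P * F + q * dt
    # Update: Kalman gain from the linearized observation model.
    H = dg(mu_pred)
    K = P_pred * H / (H * P_pred * H + r)
    mu_new = mu_pred + K * (y - g(mu_pred))
    P_new = (1.0 - K * H) * P_pred
    return mu_new, P_new

mu, P = ekf_step(mu=0.0, P=1.0, y=0.2,
                 b=lambda x: -x, db=lambda x: -1.0,
                 g=lambda x: x, dg=lambda x: 1.0,
                 q=0.01, r=0.1, dt=0.1)
print(mu, P)
```

When $b$ and $g$ are linear, the Jacobians are constants and the cycle reduces to the standard Kalman filter.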
2.3.2 Particle lter
The particle lter, also known as sequential Monte-Carlo method is a recursive Bayesian
lter based on the idea of approximating expected values by suitable Monte-Carlo sample
averages. It is also called bootstrap lter, due to Gordon, Salmond and Smith [38]. The basic
idea of the particle lter is as follows. The state space is partitioned into subregions, in which
some random samples, called particles, are lled according to some probability measure. The
higher the probability of a subregion, the denser the particles are concentrated. Since the
pdf can be approximated by the point-mass histogram, by random sampling of the state
space, we get a number of particles representing the pdf.
To be more speci c, the approximation P n to Pn is given by a discrete probability
measure supported on random points x(n)1 ; ;x(n)L and corresponding weights p(n)1 ; ;p(n)L .
Here, fx(n)l gl=1; ;L are called particles and L represents the number of particles that are
used to approximate the distribution Pn. Given the approximation P n to Pn by the discrete
probability measuref(x(n 1)1 ;p((n 1))1 ); (x(n 1)L ;p((n 1))L )gthe two key steps of the algorithm
are as follows.
Prediction step. Propagate each of the particles according to the state equation, i.e.
x(n 1)j ! ^x(n)j . This requires simulating the SDE (2.3) with initial condition x(n 1)j by some
discretization scheme. For example, we can use Euler scheme to get
^x(n)j = x(n 1)j +b(x(n 1)j ) t+ (x(n 1)j )p t n 1;
wheref ngn=0;1;2;::: is an i.i.d sequence of standard Gaussian random variables. This gives an
approximation ^P n to ^Pn as the discrete probability distribution f(^x(n)1 ; ^p(n)1 ); (^x(n)L ; ^p(n)L )g
where we set ^p(n)j := p(n 1)j .
12
Update step. Update the weights ^p(n)j !p(n)j using the observation Yn by setting
p(n)j = c^p(n)j R(^x(n)j ;Yn);
where c is a normalization constant and
R(^x(n)j ;Yn) := exp(g(^x
(n)
j )Yn
1
2jg(^x
(n)
j )j
2
2 ):
The approximation P n is now given as f(x(n)1 ;p(n)1 ); (x(n)L ;p(n)L )g where we set x(n)j := ^x(n)j .
Although the scheme is very easy to implement, it suffers from several degeneracy problems, especially in high dimensions. The main difficulty is that after a few time steps the weights tend to concentrate on very few particles, which drastically reduces the effective sample size. A common remedy for this problem is to resample all the particles in order to rejuvenate the particle cloud.
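A minimal sketch of the update and resampling steps: the Gaussian observation likelihood and the effective-sample-size threshold below are illustrative choices standing in for R(\hat{x}, Y_n), not the dissertation's exact settings.

```python
import math
import random

def update(particles, weights, likelihood, y):
    """Reweight each particle by the observation likelihood and
    normalize, mirroring p_j = c * p_j * R(x_j, y)."""
    w = [p * likelihood(x, y) for x, p in zip(particles, weights)]
    total = sum(w)
    return [wi / total for wi in w]

def effective_sample_size(weights):
    """ESS = 1 / sum w_j^2; small values signal weight degeneracy."""
    return 1.0 / sum(w * w for w in weights)

def resample(particles, weights, rng):
    """Multinomial resampling: redraw particles with prob. = weight."""
    new = rng.choices(particles, weights=weights, k=len(particles))
    return new, [1.0 / len(particles)] * len(particles)

rng = random.Random(1)
particles = [rng.gauss(0.0, 1.0) for _ in range(500)]
weights = [1.0 / 500] * 500
# Gaussian observation likelihood (an illustrative stand-in):
lik = lambda x, y: math.exp(-0.5 * (y - x) ** 2)
weights = update(particles, weights, lik, y=2.0)
if effective_sample_size(weights) < len(particles) / 2:
    particles, weights = resample(particles, weights, rng)
```

Resampling only when the effective sample size drops below a threshold is the standard way to limit both degeneracy and the extra Monte-Carlo noise resampling introduces.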
2.4 Zakai Filter
An alternative approach to the computation of the nonlinear filter is to develop an evolution equation for the conditional pdf p(t, X_t) := p(X_t | Y_t) (see [82]). To be more specific, under suitable regularity conditions one can show that the conditional pdf p(t, x) is the unique solution of the following stochastic partial differential equation (SPDE):

    dp(t, x) = L^* p(t, x)\, dt + h(x)\, p(t, x)\, dY_t,    (2.6)

where L is the infinitesimal generator of the state process X_t and L^* is the adjoint of L. The above SPDE is called the Zakai equation. The Zakai equation gives a recursive way of evaluating p(t, x). Several works [10, 23, 30, 36, 85] have developed suitable time and space discretization schemes for the Zakai equation for the purpose of obtaining numerical approximations of the nonlinear filter. Approximation methods of this type are called "Zakai filters".
Chapter 3
An Implicit Algorithm for Solving Nonlinear Filtering Problems
In this chapter, we consider the following state and observation equations in dynamic state-space form:

    dX_t / dt = f(t, X_t, W_t),    (3.1)
    Y_t = g(t, X_t, V_t),          (3.2)

where X_t ∈ R^{n_x} denotes the state vector, Y_t ∈ R^{n_y} denotes the measurement vector, W_t ∈ R^{n_w} is a random vector representing the uncertainties in the model, and V_t ∈ R^{n_v} denotes the random measurement error. In many applications the measurement noise can be assumed to be additive, and the problem can be formulated in a discrete manner as

    X_{t+1} = f_t(X_t, W_t),       (3.3)
    Y_t = g_t(X_t) + V_t,          (3.4)

where {W_t}_{t∈N} ⊂ R^{n_w} and {V_t}_{t∈N\{0}} ⊂ R^{n_v} are mutually independent white noises and the subscript t indexes the discrete time level at which the functions are evaluated. In data assimilation, the observations Y_t arrive sequentially in time and the goal is to estimate the state vector X_t given the information {Y_s; 0 ≤ s ≤ t}.
We say that ψ_n converges to ψ uniformly on B, and write lim_{n→∞} ψ_n = ψ, if for any ε > 0 there exists N_0 such that |ψ_n(z) − ψ(z)| < ε for all z ∈ B and n > N_0.
We say that ψ_n converges to ψ ∈ P(B) weakly, and write lim_{n→∞} ψ_n =_w ψ, if

    lim_{n→∞} ⟨ψ_n, φ⟩ = ⟨ψ, φ⟩,   ∀φ ∈ C_b(B),

where C_b(B) is the set of all bounded continuous functions on B.
We will prove the weak convergence of π(x_t | I_t) to p(x_t | I_t), i.e., the convergence of ⟨π(x_t | I_t), φ⟩ to ⟨p(x_t | I_t), φ⟩.
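The pairing ⟨ψ_n, φ⟩ in the definition above can be checked numerically on a toy case: for the empirical measure of n standard-normal samples and the bounded continuous test function φ(z) = cos z, the pairings approach ⟨ψ, φ⟩ = e^{−1/2}. The setup is purely illustrative, not from the dissertation.

```python
import math
import random

rng = random.Random(42)

def pairing(n, phi):
    """<psi_n, phi>: the empirical measure of n standard-normal
    samples paired with a bounded continuous test function phi."""
    return sum(phi(rng.gauss(0.0, 1.0)) for _ in range(n)) / n

exact = math.exp(-0.5)  # E[cos X] = e^{-1/2} for X ~ N(0, 1)
errors = [abs(pairing(n, math.cos) - exact) for n in (100, 100_000)]
```

As n grows, the pairing error shrinks, which is exactly what weak convergence of the empirical measures asserts for every bounded continuous test function.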
To guarantee that the Bayes formula in (3.7) is well defined and can be fulfilled in our algorithm, we make the following standing assumptions:
(A1) For given I_t, the denominator in (3.7) (the normalization constant) satisfies

    ∫_B p(y_t | x_t) p(x_t | I_{t−1}) dx_t > δ > 0.

(A2) The conditional kernel densities p(x_t | x_{t−1}) and p(y_t | x_t) are uniformly continuous, bounded, and strictly positive.
Lemma 2 Let {ψ_{M,N}}_{M,N=1}^∞ ⊂ P_C(B) and ψ_M ∈ P_U(B) with lim_{N→∞} ψ_{M,N} = ψ_M for each M ∈ N, and assume there exists Γ > 0 such that |∂f_t^{−1}/∂x| ≤ Γ. Then lim_{N→∞} s_M ∘ a_t(ψ_{M,N}) = s_M ∘ a_t(ψ_M), s_M ∘ a_t(ψ_{M,N}) ∈ P_C(B), and s_M ∘ a_t(ψ_M) ∈ P_U(B).
Proof. For any x ∈ B, by the definitions of s_M and a_t we have

    s_M ∘ a_t(ψ_{M,N})(x) = (1/M) Σ_{j=1}^M ψ_{M,N}( x^{(x,j)}_{t−1} )

and

    s_M ∘ a_t(ψ_M)(x) = (1/M) Σ_{j=1}^M ψ_M( x^{(x,j)}_{t−1} ).
Since lim_{N→∞} ψ_{M,N} = ψ_M for each M ∈ N, given any ε > 0 there exists N_0 such that for all z ∈ B and all N > N_0, |ψ_{M,N}(z) − ψ_M(z)| < ε for each M ∈ N. Therefore, for all x ∈ B,

    | s_M ∘ a_t(ψ_{M,N})(x) − s_M ∘ a_t(ψ_M)(x) |
      = | (1/M) Σ_{j=1}^M ( ψ_{M,N}(x^{(x,j)}_{t−1}) − ψ_M(x^{(x,j)}_{t−1}) ) |
      ≤ (1/M) Σ_{j=1}^M | ψ_{M,N}(x^{(x,j)}_{t−1}) − ψ_M(x^{(x,j)}_{t−1}) |
      < (1/M) Σ_{j=1}^M ε = ε.

This proves that lim_{N→∞} s_M ∘ a_t(ψ_{M,N}) = s_M ∘ a_t(ψ_M) for all M ∈ N.
We next prove that s_M ∘ a_t(ψ_{M,N}) ∈ P_C(B). In fact, for any ε > 0 and z_0 ∈ B, since ψ_{M,N} ∈ P_C(B), there exists δ > 0 such that when |z − z_0| < δ,

    | ψ_{M,N}(z) − ψ_{M,N}(z_0) | < ε.

Fix an arbitrary x_0 ∈ B. For any x ∈ B satisfying |x − x_0| < δ̃ = δ/Γ, using that f_t(x^{(x,j)}_{t−1}, w^{(j)}_{t−1}) = x and f_t(x^{(x_0,j)}_{t−1}, w^{(j)}_{t−1}) = x_0, we have

    | x^{(x,j)}_{t−1} − x^{(x_0,j)}_{t−1} | = | f_t^{−1}(x, w^{(j)}_{t−1}) − f_t^{−1}(x_0, w^{(j)}_{t−1}) | ≤ | ∂f_t^{−1}/∂x | · | x − x_0 | < δ,

and thus

    | s_M ∘ a_t(ψ_{M,N})(x) − s_M ∘ a_t(ψ_{M,N})(x_0) | ≤ (1/M) Σ_{j=1}^M | ψ_{M,N}(x^{(x,j)}_{t−1}) − ψ_{M,N}(x^{(x_0,j)}_{t−1}) | < ε.
It remains to show that s_M ∘ a_t(ψ_M) ∈ P_U(B). In fact, given any x_1, x_2 ∈ B,

    s_M ∘ a_t(ψ_M)(x_1) − s_M ∘ a_t(ψ_M)(x_2) = (1/M) Σ_{j=1}^M ( ψ_M(x^{(x_1,j)}_{t−1}) − ψ_M(x^{(x_2,j)}_{t−1}) ).
For any ε > 0, by the uniform continuity of ψ_M, there exists δ > 0 such that for any z_1, z_2 ∈ B with |z_1 − z_2| < δ, |ψ_M(z_1) − ψ_M(z_2)| < ε. Let δ̃ = δ/Γ; then |x_1 − x_2| < δ̃ implies that

    | x^{(x_1,j)}_{t−1} − x^{(x_2,j)}_{t−1} | = | f_t^{−1}(x_1, w^{(j)}_{t−1}) − f_t^{−1}(x_2, w^{(j)}_{t−1}) | ≤ | ∂f_t^{−1}/∂x | · | x_1 − x_2 | < δ.

Hence

    | s_M ∘ a_t(ψ_M)(x_1) − s_M ∘ a_t(ψ_M)(x_2) | ≤ (1/M) Σ_{j=1}^M | ψ_M(x^{(x_1,j)}_{t−1}) − ψ_M(x^{(x_2,j)}_{t−1}) | < ε.

The proof is complete.
Lemma 3 For {ψ_{M,N}}_{M,N=1}^∞ ⊂ P_C(B) and ψ_M ∈ P_U(B) with lim_{N→∞} ψ_{M,N} = ψ_M for each M ∈ N, it holds that

    lim_{N→∞} T_N(ψ_{M,N}) = ψ_M,   ∀M ∈ N.    (3.24)
Proof. For any x_t ∈ B,

    | T_N(ψ_{M,N})(x_t) − ψ_M(x_t) | ≤ | T_N(ψ_{M,N})(x_t) − T_N(ψ_M)(x_t) | + | T_N(ψ_M)(x_t) − ψ_M(x_t) |.    (3.25)

Since lim_{N→∞} ψ_{M,N} = ψ_M, for any ε > 0 there exists N_1 = N_1(M) > 0 such that when N > N_1,

    | ψ_{M,N}(x_t) − ψ_M(x_t) | < ε/2.

Thus, by the linearity of T_N, we have

    | T_N(ψ_{M,N})(x_t) − T_N(ψ_M)(x_t) | = | T_N(ψ_{M,N} − ψ_M)(x_t) | < ε/2.    (3.26)

For the second term on the right-hand side of inequality (3.25), since T_N is the linear interpolation operator and ψ_M is uniformly continuous, for any ε > 0 there exists N_2 = N_2(M) > 0 such that when N > N_2 we have

    | T_N(ψ_M)(x_t) − ψ_M(x_t) | < ε/2.    (3.27)
In summary, letting N_0 = max{N_1, N_2}, we have by (3.25), (3.26) and (3.27) that for any ε > 0,

    | T_N(ψ_{M,N})(x_t) − ψ_M(x_t) | < ε,   ∀N > N_0, ∀x_t ∈ B, ∀M ∈ N.    (3.28)

The proof is complete.
We next prove the weak convergence of the operator Φ_t^{M,N} to Φ_t. Letting Φ_t^{M,N} and Φ_t be the composition operators defined as above, we have the following theorem.
Theorem 3.1 (Local convergence) Assume that the transition kernel p(x_t | x_{t−1}) is Feller and p(y_t | x_t) is bounded, uniformly continuous, and strictly positive. Also assume that ∂f_t^{−1}/∂x is bounded. Then, for any {ψ_{M,N}}_{M,N=1}^∞ ⊂ P_C(B) and ψ_M, ψ ∈ P_U(B) with lim_{N→∞} ψ_{M,N} = ψ_M for each M ∈ N and lim_{M→∞} ψ_M =_w ψ, it holds that

    lim_{M→∞} lim_{N→∞} Φ_t^{M,N}(ψ_{M,N}) =_w Φ_t(ψ).    (3.29)
Proof. Given lim_{M→∞} ψ_M =_w ψ, by Lemma 1,

    lim_{M→∞} s_M ∘ a_t(ψ_M) =_w a_t(ψ).    (3.30)

Given lim_{N→∞} ψ_{M,N} = ψ_M for each M ∈ N, by Lemma 2 we have lim_{N→∞} s_M ∘ a_t(ψ_{M,N}) = s_M ∘ a_t(ψ_M), with s_M ∘ a_t(ψ_{M,N}) ∈ P_C(B) and s_M ∘ a_t(ψ_M) ∈ P_U(B). Thus, by letting ψ_{M,N} := s_M ∘ a_t(ψ_{M,N}) and ψ_M := s_M ∘ a_t(ψ_M) in Lemma 3, we get

    lim_{N→∞} T_N ∘ s_M ∘ a_t(ψ_{M,N}) = s_M ∘ a_t(ψ_M).    (3.31)
Equations (3.30) and (3.31) together give

    lim_{M→∞} lim_{N→∞} T_N ∘ s_M ∘ a_t(ψ_{M,N}) =_w a_t(ψ).

Therefore it follows directly from (3.19) that

    lim_{M→∞} lim_{N→∞} b_t ∘ T_N ∘ s_M ∘ a_t(ψ_{M,N}) =_w b_t ∘ a_t(ψ).

The proof is complete.
To prove the global weak convergence result, we also need the following lemma.
Lemma 4 Assume p(y_t | x_t) is bounded, uniformly continuous, and strictly positive. For {ψ_{M,N}}_{M,N=1}^∞ ⊂ P_C(B) and ψ_M ∈ P_U(B) with lim_{N→∞} ψ_{M,N} = ψ_M for each M ∈ N, if there exists δ_0 > 0 such that ∫_B p(y_t | x_t) ψ_M(x_t) dx_t ≥ δ_0, then we have

    lim_{N→∞} b_t(ψ_{M,N}) = b_t(ψ_M) ∈ P_U(B),   ∀M ∈ N.
Proof. Since lim_{N→∞} ψ_{M,N} = ψ_M for each M ∈ N, for any 0 < ε < δ_0/2 there exists N_0 such that when N > N_0,

    | ψ_{M,N}(x_t) − ψ_M(x_t) | < ε,   ∀x_t ∈ B and ∀M ∈ N.

It then follows that | ∫_B p(y_t | x_t) ( ψ_{M,N}(x_t) − ψ_M(x_t) ) dx_t | < ε and

    ∫_B p(y_t | x_t) ψ_{M,N}(x_t) dx_t > ∫_B p(y_t | x_t) ψ_M(x_t) dx_t − ε > δ_0 / 2.
Thus, for any x_t ∈ B, when N > N_0 we have

    | b_t(ψ_{M,N})(x_t) − b_t(ψ_M)(x_t) |
      = p(y_t | x_t) | ψ_{M,N}(x_t) / ∫_B p(y_t | x_t) ψ_{M,N}(x_t) dx_t − ψ_M(x_t) / ∫_B p(y_t | x_t) ψ_M(x_t) dx_t |
      ≤ p(y_t | x_t) [ ψ_{M,N}(x_t) | ∫_B p(y_t | x_t) ( ψ_M(x_t) − ψ_{M,N}(x_t) ) dx_t | / ( ∫_B p(y_t | x_t) ψ_{M,N}(x_t) dx_t · ∫_B p(y_t | x_t) ψ_M(x_t) dx_t )
        + | ψ_M(x_t) − ψ_{M,N}(x_t) | / ∫_B p(y_t | x_t) ψ_M(x_t) dx_t ]
      < ( 4/δ_0^2 + 2/δ_0 ) ε.    (3.32)
Therefore lim_{N→∞} b_t(ψ_{M,N}) = b_t(ψ_M). It remains to show that b_t(ψ_M) ∈ P_U(B). In fact, by the definition of b_t, we have for any x^{(1)}_t, x^{(2)}_t ∈ B,

    | b_t(ψ_M)(x^{(1)}_t) − b_t(ψ_M)(x^{(2)}_t) | ≤ (2/δ_0) | p(y_t | x^{(1)}_t) ψ_M(x^{(1)}_t) − p(y_t | x^{(2)}_t) ψ_M(x^{(2)}_t) |.

From the uniform continuity and boundedness of ψ_M and p(y_t | ·), for any ε > 0 there exists δ > 0 such that when x^{(1)}_t, x^{(2)}_t ∈ B with |x^{(1)}_t − x^{(2)}_t| < δ, we have |ψ_M(x^{(1)}_t) − ψ_M(x^{(2)}_t)| < ε/2 and |p(y_t | x^{(1)}_t) − p(y_t | x^{(2)}_t)| < ε/2. Thus

    | b_t(ψ_M)(x^{(1)}_t) − b_t(ψ_M)(x^{(2)}_t) | < Cε

for a constant C depending only on δ_0 and the bounds of ψ_M and p(y_t | ·), which proves b_t(ψ_M) ∈ P_U(B). The proof is complete.
Note 1 By assumption (A1), there exists δ_0 > 0 such that ∫_B p(y_t | x_t) s_M( p(x_t | I_{t−1}) )(x_t) dx_t > δ_0 for M sufficiently large.
Applying Theorem 3.1 in the context of filtering problems, we obtain the weak convergence of our implicit filtering simulation to the Bayesian optimal filter. The main result of this work is stated in the following theorem. For simplicity, we define two new operators Φ_t^M and Φ_{1:t}^M by

    Φ_t^M = b_t ∘ s_M ∘ a_t   and   Φ_{1:t}^M = Φ_t^M ∘ Φ_{t−1}^M ∘ ··· ∘ Φ_1^M.
Theorem 3.2 (Global convergence) Assume that the transition kernel p(x_t | x_{t−1}) is Feller and p(y_t | x_t) is bounded, uniformly continuous, and strictly positive. Also assume that ∂f_t^{−1}/∂x is bounded. Then

    lim_{N→∞} Φ_{1:t}^{M,N}(p_{0|0}) = Φ_{1:t}^M(p_{0|0}),   ∀M ∈ N,   and   lim_{M→∞} Φ_{1:t}^M(p_{0|0}) =_w Φ_{1:t}(p_{0|0}),

which implies that

    lim_{M→∞} lim_{N→∞} π_{t|t} =_w p_{t|t}.
Proof. We prove Theorem 3.2 by induction.
(1) t = 1: choose ψ_{M,N} = ψ_M = ψ = p_{0|0} in equation (3.29). Obviously,

    lim_{N→∞} ψ_{M,N} = ψ_M,   lim_{M→∞} ψ_M =_w ψ,

and ψ_M = p_{0|0} is uniformly continuous. By Lemma 2, Lemma 3, Lemma 4 and Note 1,

    lim_{N→∞} Φ_1^{M,N}(p_{0|0}) = Φ_1^M(p_{0|0}) ∈ P_U(B),   ∀M ∈ N.

By Lemma 1 and the continuity of b_t, we have

    lim_{M→∞} Φ_1^M(p_{0|0}) =_w Φ_1(p_{0|0}) = p_{1|1}.

It then follows from Theorem 3.1 that

    lim_{M→∞} lim_{N→∞} Φ_1^{M,N}(p_{0|0}) =_w Φ_1(p_{0|0}) = p_{1|1}.
(2) Assume that

    lim_{N→∞} Φ_{1:t−1}^{M,N}(p_{0|0}) = Φ_{1:t−1}^M(p_{0|0}) ∈ P_U(B),   ∀M ∈ N,   and   lim_{M→∞} Φ_{1:t−1}^M(p_{0|0}) =_w Φ_{1:t−1}(p_{0|0}).

We choose ψ_{M,N} = Φ_{1:t−1}^{M,N}(p_{0|0}) and ψ = Φ_{1:t−1}(p_{0|0}) = p_{t−1|t−1} in equation (3.29), and ψ_M = Φ_{1:t−1}^M(p_{0|0}) in Theorem 3.1. From the induction assumption,

    lim_{N→∞} ψ_{M,N} = ψ_M   and   lim_{M→∞} ψ_M =_w ψ.

By Lemma 2, Lemma 3, Lemma 4 and Note 1,

    lim_{N→∞} Φ_{1:t}^{M,N}(p_{0|0}) = lim_{N→∞} b_t ∘ T_N ∘ s_M ∘ a_t( Φ_{1:t−1}^{M,N}(p_{0|0}) ) = Φ_{1:t}^M(p_{0|0}) ∈ P_U(B).    (3.33)

By Lemma 1 and the continuity of b_t, we have

    lim_{M→∞} Φ_{1:t}^M(p_{0|0}) =_w p_{t|t}.    (3.34)

Therefore it follows from Theorem 3.1 that

    lim_{M→∞} lim_{N→∞} Φ_t^{M,N}( Φ_{1:t−1}^{M,N}(p_{0|0}) ) =_w Φ_t(p_{t−1|t−1}) = Φ_{1:t}(p_{0|0}) = p_{t|t}.

The proof is complete.
3.3 Numerical experiments
In this section, we present two numerical examples to demonstrate the efficiency of our method. The first example involves a one-dimensional nonlinear system and measurement equation, while the second is a 2-D bearing-only tracking problem. We compare our method with the standard EKF and the particle filter; the particle filter used here is sequential importance sampling with resampling (SIR).
Example 1
Consider the following nonlinear model:

    x_k = 40 tan(x_{k−1} + 10) + 50 w_{k−1},
    y_k = 40 x_k / (2000 + x_k) + v_k,    (3.35)

where w_k and v_k are two independent zero-mean white noise processes with variance 1.0, and y_k is the noise-perturbed observation of x_k. The initial position is taken to be x_0 = 2, and Figure 3.1 shows a 50-step realization of the state equation in model (3.35).
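Model (3.35) can be simulated directly; the sketch below follows the equations as reconstructed here, with an arbitrary seed (an illustrative reproduction, not the code used for the experiments).

```python
import math
import random

def simulate(steps, x0=2.0, seed=0):
    """Generate a state path and observations from model (3.35)."""
    rng = random.Random(seed)
    xs, ys = [x0], []
    x = x0
    for _ in range(steps):
        # tan produces occasional very large jumps, which is exactly
        # the regime where the EKF and particle filter struggle below.
        x = 40.0 * math.tan(x + 10.0) + 50.0 * rng.gauss(0.0, 1.0)
        y = 40.0 * x / (2000.0 + x) + rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(y)
    return xs, ys

xs, ys = simulate(50)
```

A 50-step run of this kind is what is plotted in Figure 3.1.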
Figure 3.1: Original Position (the state x_k plotted against the time step k)
Figure 3.2, Figure 3.3 and Figure 3.4 show the simulation results obtained using the EKF, the particle filter and our implicit particle filter method, respectively. The true state is represented by blue diamonds, while simulation results are given as red stars connected by solid lines. The prior pdf p(x_0) is initialized as a normal distribution with mean x_0 and variance 1.0.
In the particle filter method, we use 500 particles (sample points) to represent the pdf; in our implicit filter, we use 100 nodes to partition the region and the number of Monte-Carlo samples is M = 10.
Figure 3.2: Extended Kalman Filter
Figure 3.3: Particle Filter
From the three figures, one can see that when the variation between two consecutive points is not very large, all three methods produce very accurate approximations to the true state. On the other hand, when the true state has very large variations at some time steps, i.e., the state variable has a large jump from its previous state, both the EKF and the particle filter fail to produce accurate approximations. Our implicit particle algorithm, however, still produces accurate estimates at these points.
Figure 3.4: Implicit Algorithm
Example 2
In this example, we consider the following bearing-only tracking problem:

    dX_1(t) = −ω X_2(t) dt + α X_1(t) / ( (X_1(t))^2 + (X_2(t))^2 ) dt + σ_1 dW_1(t),
    dX_2(t) = ω X_1(t) dt + α X_2(t) / ( (X_1(t))^2 + (X_2(t))^2 ) dt + σ_2 dW_2(t),    (3.36)

where W_1(t) and W_2(t) are two independent Brownian motions. This stochastic dynamical system may serve to model the motion of a ship moving with constant radial and angular velocity, perturbed by white noise. The observations are collected by a detector located at a platform at time intervals of length Δ = 0.05; the data are angular measurements corrupted by noise.
To approximate the state variables X = (X_1, X_2), we discretize the dynamical system (3.36) in time and obtain a discrete nonlinear filtering problem. Let x_k = (x^1_k, x^2_k). We have the discrete system model

    x^1_k = x^1_{k−1} − ω x^2_{k−1} Δ + α x^1_{k−1} / ( (x^1_{k−1})^2 + (x^2_{k−1})^2 ) Δ + σ_1 √Δ w^1_{k−1},
    x^2_k = x^2_{k−1} + ω x^1_{k−1} Δ + α x^2_{k−1} / ( (x^1_{k−1})^2 + (x^2_{k−1})^2 ) Δ + σ_2 √Δ w^2_{k−1}.    (3.37)

The measurement equation is given by

    y_k = arctan( (x^2_k − x^2_platform) / (x^1_k − x^1_platform) ) + √Δ v_k,    (3.38)

where x_platform = (x^1_platform, x^2_platform) is the location of the platform where the detector is placed.
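A sketch of one simulation step of (3.37)-(3.38) follows, using the coefficient names from the reconstruction above (omega for the angular term, alpha for the radial term); the seed and step count are arbitrary, and atan2 is used in place of a plain arctan of the ratio to avoid the quadrant ambiguity.

```python
import math
import random

def step(x1, x2, omega, alpha, s1, s2, dt, rng):
    """One step of the discrete bearing-only model (3.37)."""
    r2 = x1 * x1 + x2 * x2
    sq = math.sqrt(dt)
    n1 = (x1 - omega * x2 * dt + alpha * x1 / r2 * dt
          + s1 * sq * rng.gauss(0.0, 1.0))
    n2 = (x2 + omega * x1 * dt + alpha * x2 / r2 * dt
          + s2 * sq * rng.gauss(0.0, 1.0))
    return n1, n2

def bearing(x1, x2, platform, dt, rng):
    """Noisy angular measurement, as in (3.38)."""
    return (math.atan2(x2 - platform[1], x1 - platform[0])
            + math.sqrt(dt) * rng.gauss(0.0, 1.0))

rng = random.Random(3)
x1, x2 = 0.5, 0.5
path = [(x1, x2)]
for _ in range(80):
    x1, x2 = step(x1, x2, omega=5.0, alpha=2.0, s1=8.0, s2=8.0,
                  dt=0.05, rng=rng)
    path.append((x1, x2))
y = bearing(x1, x2, platform=(-15.0, -15.0), dt=0.05, rng=rng)
```

The parameter values mirror those used in the experiment below.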
Figure 3.5: Target Positions
In the numerical simulations the model parameters are chosen as ω = 5, α = 2 and σ_1 = σ_2 = 8. Figure 3.5 gives the target path in the x-y plane, with the position of the target at each time step shown by a diamond. The location of the detector platform is chosen as x_platform = (−15, −15), marked by a red box.
The problem is initialized with a best guess of the target position at the initial time, which is (x^1_0, x^2_0) = (0.5, 0.5). In this example, we use 8000 particles (sample points) to represent the pdf in the particle filter method; in the implicit filter, we use 1600 nodes to partition the region and the number of Monte-Carlo samples is M = 10. Figure 3.6 shows the simulation results for the observation angle using the EKF, the particle filter and our implicit particle filter method. From this figure, one can see that both the particle filter and the implicit particle filter produce good approximations of the relative observation angle of the target. Although the EKF captures the trend of the movement of the target, its estimate lags a few steps behind the true target observation angle.
Figure 3.6: Comparison result for the observation angle
Figure 3.7 shows the results of the system state simulations using the EKF, while Figure 3.8 compares the performance of the particle filter and the implicit particle filter. Clearly both the particle filter and the implicit particle filter outperform the EKF. While the accuracies of the particle filter and the implicit particle filter are very close to each other at the initial stage, the implicit particle filter becomes more accurate toward the end of the time interval.
Figure 3.7: Simulation result of EKF
Figure 3.8: Comparison result between Particle Filter and Implicit Algorithm
Chapter 4
A Hybrid Sparse Grid Approach for Nonlinear Filtering Problems Based on Adaptive-Domain Approximations of the Zakai Equation
In this chapter, we develop an efficient hybrid sparse grid approach for nonlinear filtering problems based on numerical approximations of the Zakai equation. Consider the following stochastic differential system, which combines an equation for the "state" with one for the "observation", defined on the probability space (Ω, F, P):

    dX_t = b(X_t) dt + σ(X_t) dW_t,   (state)
    dY_t = h(X_t) dt + dB_t.          (observation)    (4.1)

Here {X_t ∈ R^d, t ≥ 0} and {Y_t ∈ R^r, t ≥ 0} are two stochastic processes, {W_t, t ≥ 0} and {B_t, t ≥ 0} are independent Brownian motions in R^p and R^r, with covariance matrices C_W (identity) and C_B, respectively, and the given initial value X_0 is independent of W_t and B_t with probability distribution u_0(x) dx.
4.1 Nonlinear Filtering Problems and Zakai Equations
We now outline the derivation of the Zakai equation and its relationship to the nonlinear filtering problem (4.1). Throughout this chapter, we assume that the coefficients b : R^d → R^d, σ : R^d → R^{d×p} and h : R^d → R^r in (4.1) are globally Lipschitz continuous functions. Denote

    ρ(t) := exp( ∫_0^t h^⊤(X_s) dY_s − (1/2) ∫_0^t |h(X_s)|^2 ds );

then the measure P̃ defined by dP̃ = ρ(t) dP is also a probability measure on (Ω, F), equivalent to P. Furthermore, under the probability measure P̃, Y_t is a Brownian motion independent of X_t (for details, see [82]).
Assuming that u = u(t, x) is the (unnormalized) conditional density function of the state X_t given an observed path Y_t, the optimal filtering solution is given by (see [82, 85])

    E( φ(X_t) | Y_t ) = ∫ φ(x) u(t, x) dx / ∫ u(t, x) dx.    (4.2)

Under the regularity assumptions on the coefficients b, σ and h given above, u satisfies the following stochastic partial differential equation, known as the Zakai equation:

    du(t, x) = L^* u(t, x) dt + h^⊤(x) u(t, x) dY_t,   x ∈ R^d,    (4.3)

with initial value u(0, x), where L is the infinitesimal generator associated with the state process X_t,

    L u = (1/2) Σ_{i,j}^d (σσ^⊤)_{i,j} ∂²u/∂x_i∂x_j + Σ_{i=1}^d b_i ∂u/∂x_i,    (4.4)

and "*" denotes the adjoint operator, which transforms Lu into

    L^* u = (1/2) Σ_{i,j}^d ∂²( (σσ^⊤)_{i,j} u )/∂x_i∂x_j − Σ_{i=1}^d ∂(b_i u)/∂x_i.    (4.5)

The goal of the Zakai filter method is to obtain numerical solutions of the Zakai equation (4.3). However, there are several challenges in the construction of an efficient numerical algorithm for the Zakai equation: (i) the high dimensionality of the state equations; (ii) the low regularity of the solution; and (iii) the unbounded solution domain. In the next two sections we construct a hybrid algorithm combining the ideas of the splitting-up finite difference method, sparse grid interpolation and the importance sampling approach to overcome these obstacles.
4.2 Hierarchical Local Sparse Grid Interpolation
In this section, we introduce a sparse grid interpolation constructed from a local hierarchical basis, which will be used in the finite difference approximation of the Zakai equation in the spatial domain.
4.2.1 Standard hierarchical sparse grid interpolation
Assume that we have the following one-dimensional interpolation formula at our disposal:

    Q_i(u) = Σ_{j=1}^{m_i} u(x^i_j) φ^i_j(x),   x ∈ R,    (4.6)

where i ∈ N is the resolution level of the interpolant Q_i, m_i is the number of grid points on level i, and x^i_j and φ^i_j(x), for j = 1, ..., m_i, are the interpolation points and the corresponding basis functions, respectively. In the context of linear hierarchical interpolation, m_i, x^i_j and φ^i_j on the standard interval [−1, 1], for i ∈ N and j = 1, ..., m_i, are defined by

    m_i = 1 if i = 1;   m_i = 2^{i−1} + 1 if i > 1;    (4.7)

    x^i_j = 0 for j = 1, if m_i = 1;   x^i_j = 2(j − 1)/(m_i − 1) − 1 for j = 1, ..., m_i, if m_i > 1;    (4.8)

and, for i = 1, φ^1_1 = 1; for i > 1 and j = 1, ..., m_i,

    φ^i_j(x) = 1 − ((m_i − 1)/2) |x − x^i_j|   if |x − x^i_j| < 2/(m_i − 1);   φ^i_j(x) = 0 otherwise.    (4.9)

Note that the nodal basis function φ^i_j has local support [x^i_j − 2^{1−i}, x^i_j + 2^{1−i}].
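Formulas (4.6)-(4.9) translate directly into code; the following sketch evaluates the level-i grid, the hat functions, and the resulting interpolant (an illustrative implementation, not the dissertation's code).

```python
def m(i):
    """Number of grid points on level i, eq. (4.7)."""
    return 1 if i == 1 else 2 ** (i - 1) + 1

def node(i, j):
    """j-th interpolation point (1-based) on level i in [-1, 1], eq. (4.8)."""
    return 0.0 if m(i) == 1 else 2.0 * (j - 1) / (m(i) - 1) - 1.0

def basis(i, j, x):
    """Piecewise-linear hierarchical hat function, eq. (4.9)."""
    if i == 1:
        return 1.0
    r = abs(x - node(i, j))
    return 1.0 - (m(i) - 1) / 2.0 * r if r < 2.0 / (m(i) - 1) else 0.0

def interp(i, u, x):
    """The level-i interpolant Q_i(u) of eq. (4.6)."""
    return sum(u(node(i, j)) * basis(i, j, x) for j in range(1, m(i) + 1))
```

Because the hats form a partition-of-unity nodal basis, the interpolant reproduces the data exactly at the nodes and any linear function in between.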
In the multi-dimensional case, i.e. d > 1, the tensor-product interpolant is

    ( Q_{i_1} ⊗ ··· ⊗ Q_{i_d} )(u) = Σ_{j_1=1}^{m_{i_1}} ··· Σ_{j_d=1}^{m_{i_d}} u( x^{i_1}_{j_1}, ..., x^{i_d}_{j_d} ) φ^i_j(x),    (4.10)

where φ^i_j = ∏_{k=1}^d φ^{i_k}_{j_k}. Clearly, the above product requires ∏_{i=1}^d m_i function values, which is computationally prohibitive when d is large. The sparse grid interpolation [17] is a linear combination of a series of tensor-product interpolants, each of which is defined on a coarse grid with different resolutions in different dimensions, i.e.,

    I_{L,d}(u) = Σ_{L−d+1 ≤ |i| ≤ L} (−1)^{L−|i|} \binom{d−1}{L−|i|} ( Q_{i_1} ⊗ ··· ⊗ Q_{i_d} )(u),    (4.11)

where L ≥ d, the multi-index i = (i_1, ..., i_d) and |i| = i_1 + ··· + i_d. Here, i_k (k = 1, ..., d) is the level of the tensor-product interpolant Q_{i_1} ⊗ ··· ⊗ Q_{i_d} along the k-th direction. The Smolyak algorithm builds the interpolant by adding a combination of all tensor-product interpolants satisfying L − d + 1 ≤ |i| ≤ L. The structure of the algorithm becomes clearer when one considers the incremental interpolant Δ_i given in [17]:

    Q_0(u) = 0,   Δ_i = Q_i(u) − Q_{i−1}(u).    (4.12)

The sparse grid interpolant (4.11) is then equivalent to

    I_{L,d}(u) = Σ_{|i| ≤ L} ( Δ_{i_1} ⊗ ··· ⊗ Δ_{i_d} )(u) = I_{L−1,d}(u) + Σ_{|i| = L} ( Δ_{i_1} ⊗ ··· ⊗ Δ_{i_d} )(u).    (4.13)

The corresponding sparse grid associated with I_{L,d}(u) is

    H_{L,d} = ∪_{L−d+1 ≤ |i| ≤ L} ( χ^{i_1} × ··· × χ^{i_d} ),    (4.14)
where χ^i denotes the set of interpolation points used by Q_i. According to (4.13), to extend the Smolyak interpolant I_{L,d}(u) from level L − 1 to L, one only needs to evaluate the function at the incremental grid ΔH_{L,d} defined by

    ΔH_{L,d} = ∪_{|i| = L} ( Δχ^{i_1} × ··· × Δχ^{i_d} ),    (4.15)

where Δχ^{i_j} = χ^{i_j} \ χ^{i_j − 1}, j = 1, ..., d. By the nested structure of the one-dimensional hierarchical grid defined by (4.8), it is easy to see that χ^{i−1} ⊂ χ^i and Δχ^i = χ^i \ χ^{i−1} has Δm_i = m_i − m_{i−1} points. Consecutively numbering the points in Δχ^i and denoting the j-th point of Δχ^i by x^i_j, the incremental interpolant in (4.12) can be represented by (see [17, 55] for details)

    Δ_i(u) = Σ_{j=1}^{Δm_i} φ^i_j ( u(x^i_j) − Q_{i−1}(u)(x^i_j) ),    (4.16)

where ω^i_j = u(x^i_j) − Q_{i−1}(u)(x^i_j) is defined as the one-dimensional hierarchical surplus on level i. This is just the difference between the function value and the value of the interpolating polynomial at x^i_j. From (4.16), the hierarchical sparse grid interpolant (4.13) can be rewritten as

    I_{L,d}(u) = I_{L−1,d}(u) + Σ_{|i| = L} ( Δ_{i_1} ⊗ ··· ⊗ Δ_{i_d} )(u)
               = I_{L−1,d}(u) + Σ_{|i| = L} Σ_{j ∈ B_i} ω^i_j φ^i_j(x)
               = Σ_{|i| ≤ L} Σ_{j ∈ B_i} ω^i_j φ^i_j(x),    (4.17)

where the multi-index set B_i is

    B_i = { j ∈ N^d : x^{i_k}_{j_k} ∈ Δχ^{i_k} for j_k = 1, ..., Δm_{i_k}, k = 1, ..., d },    (4.18)
and the surpluses ω^i_j are

    ω^i_j = u( x^{i_1}_{j_1}, ..., x^{i_d}_{j_d} ) − I_{L−1,d}(u)( x^{i_1}_{j_1}, ..., x^{i_d}_{j_d} ).    (4.19)

As proved in [17], for smooth functions the hierarchical surpluses tend to zero as the interpolation level tends to infinity. On the other hand, the magnitude of the surplus is a good indicator of the smoothness of the interpolated function: in general, the larger the magnitude, the stronger the underlying discontinuity.
For a bounded box domain D ⊂ R^d,

    D = [a, b] := ∏_{i=1}^d [a_i, b_i],    (4.20)

we first transform it to [−1, 1]^d through a simple linear transformation. The corresponding sparse grid interpolation is then defined according to (4.11).
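The savings of the Smolyak construction can be counted directly: summing the per-dimension increments Δm over all admissible multi-indices gives the number of sparse grid points, to be compared with the m_L^d points of the full tensor grid. The level convention below (i_k ≥ 1, |i| ≤ L) is one consistent reading of (4.13)-(4.15); the sketch is illustrative.

```python
from itertools import product

def m(i):
    """Points on 1-D level i, eq. (4.7)."""
    return 1 if i == 1 else 2 ** (i - 1) + 1

def dm(i):
    """New points added on level i of the nested 1-D grid."""
    return m(i) if i == 1 else m(i) - m(i - 1)

def sparse_grid_size(L, d):
    """Count the points of the sparse grid by summing, over all
    multi-indices with |i| <= L, the product of per-dimension
    increments, following the incremental form (4.13)-(4.15)."""
    total = 0
    for idx in product(range(1, L + 1), repeat=d):
        if sum(idx) <= L:
            prod = 1
            for i in idx:
                prod *= dm(i)
            total += prod
    return total

def full_grid_size(L, d):
    """Points of the full tensor grid at level L."""
    return m(L) ** d
```

In one dimension the sparse and full grids coincide; in higher dimensions the sparse count grows far more slowly than m_L^d.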
4.2.2 Hierarchical sparse grid approximation of bell-shaped curves
In many practical nonlinear filtering problems, the conditional target state PDF resembles a "bell-shaped" Gaussian curve or surface, if it is not exactly Gaussian. In such cases, standard hierarchical sparse grid interpolation may lead to large errors. This is especially the case for two- or higher-dimensional problems. As a demonstration, we consider a Gaussian-type function

    U(x) := ∏_{i=1}^d exp( −(1/2) ( (x_i − μ)/σ )² ),   x ∈ R^d,    (4.21)

where μ ∈ R and σ ∈ R^+. From [17], we know that the L² error between the function U and its standard sparse grid approximation Û satisfies

    ‖U − Û‖_{L²} ≤ 2 |U|_{2,2} 12^{−d} 2^{−2L} A(d, L),    (4.22)

where L ∈ N^+ is the level of the hierarchical sparse grid, with

    A(d, L) = L^{d−1}/(d−1)! + O(L^{d−2})   and   |U|_{2,2} = ‖D²U‖_{L²}.

It can be shown that |U|_{2,2} in (4.22) is bounded by (see [64] for a detailed derivation)

    |U|_{2,2} ≤ (1/σ^{4d}) ( max{ x_i − μ } )^{2d} ‖U‖_{L²}.    (4.23)

We can see from the above estimate that when the constant σ is small, the right-hand side becomes exponentially large as d increases.
Figure 4.1: The interpolation error for regular sparse grid approximation up to 5 dimensions.
Here, we consider the special case of (4.21) with μ = 0 and σ = 0.15, i.e.,

    U(x) = ∏_{i=1}^d exp( −x_i² / 0.045 ).    (4.24)

Figure 4.1 shows the interpolation errors, measured in the L²-norm, for regular sparse grid approximations of U. We can see that as the dimension d increases, the interpolation errors barely decrease, even though the number of interpolation points has increased significantly.
Figure 4.2: The interpolation error for logarithmic sparse grid approximation up to 5 dimensions.
One way of alleviating this poor performance of the sparse grid interpolant when approximating bell-shaped functions is to use logarithmic interpolation (see [64]): we take the logarithm of U, i.e. V := log(U), build the approximation V̂ using the standard sparse grid approach, and then obtain the approximation of U as Û = e^{V̂}. Figure 4.2 shows the absolute interpolation errors, measured in the L²-norm, for the logarithmic sparse grid approximation of the function U defined in (4.24). Compared with Figure 4.1, we can see from Figure 4.2 that the convergence improves as the dimension increases.
A visual demonstration of the efficiency of the logarithmic sparse grid interpolation is shown in Figure 4.3, where we plot the level-6 approximation of a marginal distribution of both U and log(U) in d = 4 dimensions.
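The mechanism behind the logarithmic transform can be seen in the seminorm |U|_{2,2} appearing in (4.22): for the product Gaussian (4.21), V = log U is a sum of one-dimensional terms, so its mixed second derivatives vanish identically, while those of U itself are large near the peak. A quick two-dimensional finite-difference check (an illustrative sketch, not the dissertation's code):

```python
import math

SIG2 = 0.045  # 2*sigma^2 with sigma = 0.15, as in (4.24)

def U(x, y):
    return math.exp(-(x * x + y * y) / SIG2)

def V(x, y):
    return math.log(U(x, y))

def mixed2(f, x, y, h=1e-4):
    """Central finite-difference approximation of d^2 f / dx dy."""
    return (f(x + h, y + h) - f(x + h, y - h)
            - f(x - h, y + h) + f(x - h, y - h)) / (4.0 * h * h)

u_mixed = abs(mixed2(U, 0.1, 0.1))  # large: U has strong cross terms
v_mixed = abs(mixed2(V, 0.1, 0.1))  # ~0: log U is additively separable
```

Since it is precisely these mixed derivatives that drive the sparse grid error bound, interpolating log U and exponentiating recovers accuracy as the dimension grows.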
4.3 Hybrid Approach for the Numerical Solution of the Nonlinear Filtering Problem
In this section we describe our hybrid numerical algorithm for the solution of the Zakai equation (4.3).
Figure 4.3: (a) is the x1x2-marginal surface of U with x3 = x4 = 0, while (b) is its interpolant obtained by applying the regular sparse grid approximation to the function U. The approximation is quite poor and significant oscillations occur at the bottom of the surface. Panel (c) shows the regular sparse grid approximation of V = log(U), and panel (d) shows U = exp(V̂); one can see that the logarithmic interpolation approximation described above is far more accurate.
4.3.1 Adaptive selection of solution domains
Since the Zakai equation (4.3) is defined on the whole space R^d, it is essential to choose a proper bounded domain to achieve an accurate numerical approximation. Motivated by importance sampling and particle filter methods, we adaptively select a hypercube at each time step according to an estimate of the density function of the solution at the next step. In particular, let R_Δt = { t_n | t_n ∈ [0, T], t_n < t_{n+1} } be a partition of [0, T].
To approximate first order partial derivatives at a sparse grid point x ∈ H^{n+1}_{L,d} with coefficient b_i, we use the upwind finite difference

    b_i D̃_{x_i} u_n(x) := b_i ( û_n(x + e_i h_i) − u_n(x) ) / h_i   if b_i ≥ 0;
    b_i D̃_{x_i} u_n(x) := b_i ( u_n(x) − û_n(x − e_i h_i) ) / h_i   if b_i < 0,

where e_i is the unit vector in the i-th coordinate direction and h_i is a properly chosen mesh size in the i-th coordinate direction.
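A one-dimensional sketch of the upwind choice follows; the drift value, mesh, and test function are placeholder choices, not the dissertation's settings.

```python
def upwind_term(b, u_mid, u_plus, u_minus, h):
    """b * D~_x u at a grid point: forward difference when the
    coefficient b is nonnegative, backward difference otherwise,
    matching the upwind scheme above."""
    if b >= 0:
        return b * (u_plus - u_mid) / h
    return b * (u_mid - u_minus) / h

# Apply to u(x) = x^2 on a uniform mesh (illustrative).
h = 0.1
grid = [h * k for k in range(11)]
u = [x * x for x in grid]
vals = [upwind_term(1.0, u[k], u[k + 1], u[k - 1], h) for k in range(1, 10)]
```

For u(x) = x² the forward difference returns 2x + h, i.e. the exact derivative plus the expected first-order truncation error.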
To approximate second order partial derivatives at a sparse grid point x ∈ H^{n+1}_{L,d} with given coefficient function a = σσ^⊤ ∈ R^{d×d}, we use central differences to obtain

    a_{i,i} ∂²u_n/∂x_i∂x_i (x) ≈ a_{i,i} D̃²_{x_i x_i} u_n(x) := a_{i,i} ( û_n(x + e_i h_i) − 2u_n(x) + û_n(x − e_i h_i) ) / h_i²

and

    a_{i,j} ∂²u_n/∂x_i∂x_j (x) ≈ a_{i,j} D̃²_{x_i x_j} u_n(x) :=

    ( a_{i,j}/(2h_i) ) [ ( û_n(x + e_i h_i + e_j h_j) − û_n(x + e_i h_i) )/h_j − ( û_n(x + e_j h_j) − u_n(x) )/h_j
                        + ( u_n(x) − û_n(x − e_j h_j) )/h_j − ( û_n(x − e_i h_i) − û_n(x − e_i h_i − e_j h_j) )/h_j ],   if a_{i,j} ≥ 0;

    ( a_{i,j}/(2h_i) ) [ ( û_n(x + e_i h_i) − û_n(x + e_i h_i − e_j h_j) )/h_j − ( u_n(x) − û_n(x − e_j h_j) )/h_j
                        + ( û_n(x + e_j h_j) − u_n(x) )/h_j − ( û_n(x − e_i h_i + e_j h_j) − û_n(x − e_i h_i) )/h_j ],   if a_{i,j} < 0.
With the above finite difference operators in hand, we define the finite difference approximation of the Fokker-Planck equation (4.28) on the sparse grid H^{n+1}_{L,d} as follows:

    u_{n+1/2}(x) = u_n(x) + L*_n u_n(x) Δt_n,   x ∈ H^{n+1}_{L,d},    (4.29)

where

    L*_n u_n(x) = (1/2) Σ_{i,j}^d { ∂²(σσ^⊤)_{i,j}/∂x_i∂x_j u_n(x) + ∂(σσ^⊤)_{i,j}/∂x_i D̃_{x_j} u_n(x)
                  + ∂(σσ^⊤)_{i,j}/∂x_j D̃_{x_i} u_n(x) + (σσ^⊤)_{i,j} D̃²_{x_i x_j} u_n(x) }
                  − Σ_{i=1}^d { ∂b_i/∂x_i u_n(x) + b_i D̃_{x_i} u_n(x) }.
Update Step
In the update step, we use the new observation Z_n and the Bayes formula to update the prior u_{n+1/2} to the posterior u_{n+1} as follows:

    u_{n+1}(x) = C_n Ψ_n(x, Z_n) u_{n+1/2}(x),   x ∈ H^{n+1}_{L,d},    (4.30)

where C_n is a normalization factor and the function Ψ_n is defined by

    Ψ_n(x, Z_n) = exp( −(Δt_n/2) | Z_n − h(x) |²_R ).

The norm |·|_R is defined by |ξ|²_R = ξ^⊤ R^{−1} ξ, where R is the covariance matrix of {B_t, t ≥ 0} in (4.1) (see page 2 of [85] for more details). Finally, following the procedure described in Section 4.2, we obtain the logarithmic sparse grid interpolant u_{n+1} = u_{n+1}(x), x ∈ R^d, from its values on the sparse grid H^{n+1}_{L,d}.
We summarize our hybrid sparse grid adaptive-domain splitting-up finite difference algorithm as follows:
Step 1: Input u_0 as the initial value of the solution u of the Zakai equation (4.3).
Step 2: For n = 0, ..., N_T − 1:
(1) Compute the dynamic domain D_{n+1} for the solution u_{n+1} using the importance sampling method.
(2) Generate the sparse grid H^{n+1}_{L,d} on the solution domain D_{n+1}.
(3) Evaluate u_{n+1} on the sparse grid H^{n+1}_{L,d} using the finite difference scheme (4.29).
(4) Extend the solution u_{n+1} to the whole space R^d through the logarithmic sparse grid interpolation described in Section 4.2.
Step 3: Normalization.
Remark 2 Since we use an explicit finite difference scheme to solve equation (4.28), the time step Δt_n must satisfy the stability condition

    max_{0 ≤ n ≤ N_T − 1} Δt_n ≤ 1 / ( Σ_{i=1}^d ( |(σσ^⊤)_{i,i}| + |b_i| h_i ) / h_i² ).
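The stability bound of Remark 2, as reconstructed above, can be evaluated directly. The coefficients below mimic Example 1's drift (40, ·) and diffusion 0.5 (so the diagonal of σσ^⊤ is 0.25), but the mesh sizes are placeholder choices.

```python
def max_stable_dt(a_diag, b, h):
    """Largest time step allowed by the explicit-scheme stability
    condition of Remark 2:
        dt <= 1 / sum_i (|a_ii| + |b_i| * h_i) / h_i^2,
    where a = sigma * sigma^T."""
    s = sum((abs(aii) + abs(bi) * hi) / (hi * hi)
            for aii, bi, hi in zip(a_diag, b, h))
    return 1.0 / s

dt_max = max_stable_dt(a_diag=[0.25, 0.25], b=[40.0, 2.0], h=[0.5, 0.5])
```

Note that refining the spatial mesh tightens the bound, since the diffusion term contributes at the rate 1/h².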
4.4 Numerical Experiments
In this section, we present three numerical experiments to demonstrate the effectiveness of our new numerical algorithm for solving nonlinear filtering problems.
Example 1
In the first example, we use a two-dimensional nonlinear filtering problem to illustrate the accuracy of the selection process for the dynamic solution domain D_n. Consider the following dynamical system:

    dX_t = ( 40, 2(10t)² )^⊤ dt + 0.5 dW_t,    (4.31)
Figure 4.4: Target Trajectory and Adaptive Solution Domain. The red curve shows the real target state. The blue points are actual states of the target and the blue boxes are the corresponding solution domains.
where W_t is a two-dimensional Brownian motion and the initial state is given by X_0 = (30, 30)^⊤. The observation process is given by

    dY_t = √( (X^1_t − 20)² + (X^2_t)² ) dt + dB_t,    (4.32)

which measures the perturbed distance between the target state and a reference point P = (20, 0); B_t is a one-dimensional Brownian motion independent of W_t.
In this numerical simulation, we take T = 0.4 and use a uniform partition in time with step size Δt_n = 0.005. The initial value is given by u_0 ~ N(X_0, Σ), a normal distribution with mean X_0 and standard deviation σ = (1, 0.5)^⊤. In the adaptive solution domain selection process, we choose the sample size M = 500 and set the parameter in (4.25) to 4. Figure 4.4 shows the trajectory of the target state and the solution domains D_n for n = 1, 20, 50, 70, 80. In Figure 4.5 we show the solution domains D_n with the corresponding contour plots of the target state PDF for n = 1, 50, 80.
Figure 4.5: Target state PDF at time steps (a) n = 1; (b) n = 50; (c) n = 80, in the corresponding solution domain
From this figure one can see that the solution domains are extremely accurate in approximating the high-density area of u_n.
Example 2
In this example, we consider the following nonlinear filtering problem:

    dX_t = ( 10, 6(sin X^1_t + 2) )^⊤ dt + dW_t,   (state)
    dY_t = ( X^1_t, X^2_t )^⊤ dt + dB_t.           (observation)    (4.33)
In this numerical experiment, we let T = 0:5 and use a uniform partition in time with
stepsize tn = 0:01. An example of the signal trajectory is shown in gure 4.6a. The initial
value is given by u0 N(X0; ); a normal distribution with mean X0 = (2;0)T and standard
deviation = (0:5;0:5)T. We also choose the hierarchical sparse grid level as L = 6. In the
adaptive solution domain selection process, the sample size and parameter are given by
M = 500 and = 4, respectively.
Figure 4.6b shows the comparison of the estimated values of the target state between
the particle lter and sparse grid Zakai lter. In the particle lter simulation, we use 8000
particles to represent the PDF of the target state. The black dashed line shows the trajectory
of the real target state, the red triangles and blue dots show the estimate target state (the
mean of the estimate posterior PDF) obtained by using the particle lter and the Zakai lter
respectively. As we can see from Figure 4.6b, our hybrid approach is more accurate than the
PFM for a longer period of time.
To further examine the performance of the hybrid sparse grid Zakai filter method, in Figure 4.7a and Figure 4.7b we plot the marginal probability density functions at time T = 0.5, obtained by using the particle filter and our hybrid method, with respect to the x- and y-coordinates. From the plots we observe the convergence of the particle filter as the particle size increases. Moreover, the PDF obtained by our hybrid sparse grid Zakai filter is very close to the one obtained by the particle filter with 160,000 particles. However, in Table 4.1 we can see that with 160,000 particles the particle filter is far more costly in terms of CPU time than our method. Finally, we also show the confidence bands in Figure 4.8. The blue dashed
Figure 4.6: 2D nonlinear filtering problem. (a) Signal trajectory: Example 2. (b) Comparison of estimated mean of the target state between the particle filter and the sparse grid Zakai filter.
curves show the real target trajectory with respect to the x- and y-coordinates, respectively; the red
Figure 4.7: Marginal distributions. (a) Marginal distribution with respect to the X-coordinate. (b) Marginal distribution with respect to the Y-coordinate. Each panel compares the particle filter with 2,000; 8,000; 40,000 and 160,000 particles against the Zakai filter.
curves represent the estimates of the posterior means, and the green dashed curves represent the estimated 95% confidence bands.
Figure 4.8: Sparse grid Zakai filter estimate of the posterior mean with 95% probability region. (a) Estimate of posterior mean and 95% probability region: X-coordinate. (b) Estimate of posterior mean and 95% probability region: Y-coordinate.
Table 4.1: Comparison of computing costs

Particle filter            CPU time (seconds)
2,000 particles            1.49
8,000 particles            32.67
40,000 particles           516.91
160,000 particles          7622.29
Sparse grid Zakai filter   20.45
Example 3
In this example, we consider the following "bearing-only" tracking problem:
\[ dX_t = b\,dt + \sigma\,dW_t, \tag{4.34} \]
where X_t = (x, y, u, v)_t^T is a four-dimensional vector which models the movement of a target ship sailing on the sea plane (the x-y plane). Here (x, y) and (u, v) are the position and velocity
components, respectively, the vector b = (u, v, 0, 0)_t^T, and the covariance matrix \sigma^2 is defined by
\[
\sigma^2 = \begin{pmatrix} \sigma_1^2 & 0 & 0 & 0\\ 0 & \sigma_2^2 & 0 & 0\\ 0 & 0 & \sigma_3^2 & 0\\ 0 & 0 & 0 & \sigma_4^2 \end{pmatrix}, \tag{4.35}
\]
and Wt is a four-dimensional Brownian motion. To estimate the target state, a passive sonar
Figure 4.9: Bearing-only tracking
is located on an observation ship, denoted the "ownship". The observation process is given by
\[ dY_t = h(X_t)\,dt + dB_t, \]
where the observation function h is the bearing angle
\[ h(X_t) = \arctan\Big(\frac{y_t - y_t^{obs}}{x_t - x_t^{obs}}\Big), \]
and B_t is a one-dimensional Brownian motion independent of W_t. Here, X_t^{obs} = (x^{obs}, y^{obs})_t^T describes the movement of the ownship, given by dX_t^{obs} = b^{obs}\,dt.
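The bearing observation and the deterministic ownship motion above can be sketched directly. Note that `np.arctan2` is used in place of a plain arctan to keep the quadrant information; the ownship initial position is an illustrative assumption, since it is not stated in the text.

```python
import numpy as np

def bearing(target_xy, ownship_xy):
    """Bearing angle h(X_t) from the ownship to the target."""
    dx = target_xy[0] - ownship_xy[0]
    dy = target_xy[1] - ownship_xy[1]
    return np.arctan2(dy, dx)

def ownship_position(t, x0=(0.0, 0.0), b_obs=(8.0, 0.0)):
    """Deterministic ownship motion dX^obs = b^obs dt, integrated exactly."""
    return (x0[0] + b_obs[0] * t, x0[1] + b_obs[1] * t)
```

A filter for this model evaluates `bearing` at each particle or grid point to form the observation likelihood.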
Figure 4.10: Comparison of estimated mean of the target state between the particle filter and the sparse grid Zakai filter
In the numerical simulations, we choose \sigma_1 = \sigma_2 = 0.75 and \sigma_3 = \sigma_4 = 0.05 in (4.35), and b^{obs} = (8, 0)^T; thus the movement of the ownship is along the Y-axis with a constant speed. In addition, we set T = 1, the time partition \Delta t_n = 0.005, the initial target state X_0 = (2, 8, 20, 8)^T, and the initial PDF of the target state N(\bar{X}, \bar{\sigma}), where \bar{X} = (3, 9, 19.8, 7.9)^T and \bar{\sigma} = (0.75, 0.75, 0.5, 0.25)^T.
Figure 4.11: Sparse grid Zakai filter estimate of the posterior mean with 95% probability region. (a) Estimate of the posterior mean of the location x and 95% probability region. (b) Estimate of the posterior mean of the location y and 95% probability region. (c) Estimate of the posterior mean of the velocity u and 95% probability region. (d) Estimate of the posterior mean of the velocity v and 95% probability region.
The target-observer plane (the x-y plane) is illustrated in Figure 4.9. The red dot shows the initial position of the target ship while the blue triangle shows the initial position of our ownship. The dashed red curve gives a possible trajectory of the target ship and the blue arrow describes the movement of our ownship.
Figure 4.10 shows the comparison of the estimated mean values of the relative target position with respect to the ownship in the target-observer plane. For the sparse grid Zakai
Figure 4.12: Marginal distributions. (a) Marginal distribution of the target state on the x-coordinate. (b) Marginal distribution of the target state on the y-coordinate. Each panel compares the particle filter with 20,000; 40,000; 80,000 and 160,000 particles against the Zakai filter.
filter, we let the hierarchical level L = 6 and, for the adaptive solution domain selection process, we set M = 1,000 and the selection parameter equal to 4. The estimate for the particle filter is obtained by using 160,000 particles. The actual trajectory of the target position relative to our ownship is given by the black curve. The red curve and the blue curve show the estimated target state (the mean of the estimated posterior PDF) obtained by using the particle filter and the Zakai
Table 4.2: Comparison of computing costs

Particle filter            CPU time (seconds)
20,000 particles           554.24
40,000 particles           1976.39
80,000 particles           7702.57
160,000 particles          32380.38
Sparse grid Zakai filter   587.14
filter, respectively. We can see from Figure 4.10 that our hybrid sparse grid Zakai filter yields estimates similar to those of the particle filter with 160,000 particles.
In Figures 4.12a and 4.12b we compare the marginal probability distribution functions at time T = 1 with respect to the x- and y-coordinates, using our hybrid sparse grid Zakai filter and the particle filter. Similar to Example 2, we see that the PDF obtained by the particle filter with 160,000 particles is very close to the approximate PDF obtained by the hybrid sparse grid Zakai filter. In Table 4.2 we show the comparison of computational costs, from which one can see that the particle filter with 160,000 particles takes 60 times more computing time to achieve a similar PDF. We also plot the confidence curves in Figure 4.11 for further examination of the performance of the sparse grid Zakai filter. The blue dashed curves show the real target trajectory with respect to the x- and y-coordinates. The red curves represent the estimates of the posterior means, and the green dashed curves give the estimated 95% confidence curves (\pm 2 times the estimated standard deviation).
Chapter 5
Numerical Algorithms for Backward Doubly Stochastic Differential Equations and Their Applications to Nonlinear Filtering Problems
5.1 FBDSDEs and SPDEs
To derive the numerical algorithms and conduct their convergence analysis, we provide a brief introduction to forward backward doubly stochastic differential equations (FBDSDEs) and the relationship between FBDSDEs and SPDEs.
Let (\Omega, \mathcal{F}, P) be a complete probability space and T > 0 the terminal time, and let \{W_t,\ 0\le t\le T\} and \{B_t,\ 0\le t\le T\} be two mutually independent standard Brownian motions defined on (\Omega, \mathcal{F}, P) with values in \mathbb{R}^d and \mathbb{R}^l, respectively. Let \mathcal{N} denote the class of P-null sets of \mathcal{F}. For each t\in[0,T], we define
\[ \mathcal{F}_t := \mathcal{F}_t^W \vee \mathcal{F}_{t,T}^B, \]
where \mathcal{F}_{s,t}^\eta = \sigma\{\eta_r - \eta_s,\ s\le r\le t\} \vee \mathcal{N} is the \sigma-field generated by \{\eta_r - \eta_s,\ s\le r\le t\}, and \mathcal{F}_t^\eta = \mathcal{F}_{0,t}^\eta for a stochastic process \eta. Note that the collection \{\mathcal{F}_t,\ t\in[0,T]\} is neither increasing nor decreasing, and it is not a filtration. For a positive integer n\in\mathbb{N}, we define the spaces M^2(0,T;\mathbb{R}^n) and S^2([0,T];\mathbb{R}^n) as follows:
\[ M^2(0,T;\mathbb{R}^n) := \Big\{\varphi_t \,\Big|\, \varphi_t\in\mathbb{R}^n,\ E\int_0^T |\varphi_t|^2\,dt < \infty,\ \varphi_t\in\mathcal{F}_t\ \text{a.e. } t\in[0,T]\Big\},
\]
and
\[ S^2([0,T];\mathbb{R}^n) := \Big\{\varphi_t \,\Big|\, \varphi_t\in\mathbb{R}^n,\ E\big(\sup_{0\le t\le T}|\varphi_t|^2\big) < \infty,\ \varphi_t\in\mathcal{F}_t,\ t\in[0,T]\Big\}. \]
Let
\[ f : [0,T]\times\mathbb{R}^k\times\mathbb{R}^{k\times d} \to \mathbb{R}^k \]
and
\[ g : [0,T]\times\mathbb{R}^k\times\mathbb{R}^{k\times d} \to \mathbb{R}^{k\times l} \]
be jointly measurable such that for any (y,z)\in\mathbb{R}^k\times\mathbb{R}^{k\times d},
\[ f_t(y,z)\in M^2(0,T;\mathbb{R}^k), \qquad g_t(y,z)\in M^2(0,T;\mathbb{R}^{k\times l}). \]
We assume moreover that there exist constants c > 0 and 0 < \alpha < 1 such that for any (\omega,t)\in\Omega\times[0,T] and (y_1,z_1), (y_2,z_2)\in\mathbb{R}^k\times\mathbb{R}^{k\times d},
\[ |f_t(y_1,z_1) - f_t(y_2,z_2)|^2 \le c\big(|y_1-y_2|^2 + \|z_1-z_2\|^2\big), \]
\[ \|g_t(y_1,z_1) - g_t(y_2,z_2)\|^2 \le c\,|y_1-y_2|^2 + \alpha\,\|z_1-z_2\|^2. \]
From [61], under the above assumptions and standard conditions on b and \sigma, we know that there exists a pair of processes \{(Y_s^{t,x}, Z_s^{t,x});\ (t,x)\in[0,T]\times\mathbb{R}^d\} which is the unique solution to the following FBDSDE: for (t,x)\in\mathbb{R}_+\times\mathbb{R}^d,
\[ X_s^{t,x} = x + \int_t^s b(X_r^{t,x})\,dr + \int_t^s \sigma(X_r^{t,x})\,dW_r, \quad t\le s\le T, \tag{5.1} \]
\[
\begin{aligned}
Y_s^{t,x} &= h(X_T^{t,x}) + \int_s^T f(r, X_r^{t,x}, Y_r^{t,x}, Z_r^{t,x})\,dr\\
&\quad + \int_s^T g(r, X_r^{t,x}, Y_r^{t,x}, Z_r^{t,x})\,d\overleftarrow{B}_r - \int_s^T Z_r^{t,x}\,dW_r, \quad t\le s\le T,
\end{aligned} \tag{5.2}
\]
where (Y_s^{t,x}, Z_s^{t,x}) \in S^2([0,T];\mathbb{R}^k)\times M^2(0,T;\mathbb{R}^{k\times l}). Here d\overleftarrow{B}_r denotes backward Itô integration; i.e., for an \mathcal{F}_{t,T}^B-adapted process V_t the backward integral is defined through quasi-uniform time partitions \pi : 0 = t_0 < t_1 < \cdots
The FBDSDE is related to the following SPDE:
\[
\begin{cases}
u_t(x) = h(x) + \displaystyle\int_t^T \Big[\tfrac12\sum_{i=1}^d \frac{\partial^2 u_s(x)}{\partial x_i^2} + f(s,x,u_s(x),\nabla u_s(x))\Big]ds + \int_t^T g(s,x,u_s(x),\nabla u_s(x))\,d\overleftarrow{B}_s, & x\in D,\ 0\le t\le T,\\
u_t(x) = \varphi(t,x), & \text{on } [0,T]\times\partial D,
\end{cases}
\tag{5.3}
\]
where D denotes the spatial domain and \varphi the boundary data.
5.2 Half Order Numerical Algorithms
We first study numerical algorithms for BDSDEs and assume that b \equiv 0 and \sigma \equiv 1 in (5.1). Thus we have
\[ X_s^{0,x} = x + W_s, \quad x\in D,\ s\in[0,T], \]
and the elliptic partial differential operator L becomes
\[ L = \frac12\sum_{i=1}^d \frac{\partial^2}{\partial x_i^2}. \]
When the spatial domain of the SPDE (5.3) is a subset D of \mathbb{R}^d and the boundary condition
\[ u_t(x) = \varphi(t,x) \quad \text{on } [0,T]\times\partial D \]
is prescribed, the corresponding BDSDE is defined through a stopping time
\[ \tau = \inf\{s : X_s^{t,x}\in\partial D,\ s\ge t,\ x\in D\}. \tag{5.6} \]
Then we have the BDSDE with stopping time as follows:
\[
\begin{aligned}
Y_s^{t,x} &= \Phi(X_{\tau\wedge T}^x) + \int_s^{T\wedge\tau} f(r, X_r^{t,x}, Y_r^{t,x}, Z_r^{t,x})\,dr\\
&\quad + \int_s^{T\wedge\tau} g(r, X_r^{t,x}, Y_r^{t,x}, Z_r^{t,x})\,d\overleftarrow{B}_r - \int_s^{T\wedge\tau} Z_r^{t,x}\,dW_r, \quad t\le s\le T,\ x\in D,
\end{aligned} \tag{5.7}
\]
where \Phi(X_\tau^x, T) = h(X_T^{t,x})\,\mathbf{1}_{\{\tau\ge T\}} + \varphi(\tau, X_\tau^{t,x})\,\mathbf{1}_{\{\tau<T\}}. When t = 0, the stopping time defined in (5.6) becomes
\[ \tau = \inf\{s : X_s^{0,x}\in\partial D,\ s\ge 0,\ x\in D\}. \]
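The stopping time (5.6), with X_s^{0,x} = x + W_s, can be estimated numerically by Monte Carlo with discretely monitored Brownian paths. The following is a minimal sketch on an interval domain D = (a, b); all parameters are illustrative, and discrete monitoring slightly overestimates \tau because excursions between grid points are missed.

```python
import numpy as np

def exit_time_mc(x0=0.0, a=-1.0, b=1.0, T=1.0, dt=1e-3,
                 n_paths=20000, seed=3):
    """Monte Carlo estimate of E[tau ^ T], where
    tau = inf{s : x0 + W_s not in (a, b)}."""
    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    X = np.full(n_paths, x0)
    tau = np.full(n_paths, T)          # paths that never exit get tau ^ T = T
    alive = np.ones(n_paths, dtype=bool)
    for k in range(1, n + 1):
        X[alive] += rng.normal(0.0, np.sqrt(dt), alive.sum())
        exited = alive & ((X <= a) | (X >= b))
        tau[exited] = k * dt
        alive &= ~exited
    return tau.mean()
```

For x in the interior of D the estimate lies strictly between 0 and T, and it shrinks as x approaches the boundary.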
Thus, BDSDE (5.7) changes to the following equation:
\[
\begin{aligned}
Y_t^{0,x} &= \Phi(X_{\tau\wedge T}^{0,x}) + \int_t^{T\wedge\tau} f(s, X_s^{0,x}, Y_s^{0,x}, Z_s^{0,x})\,ds\\
&\quad - \int_t^{T\wedge\tau} Z_s^{0,x}\,dW_s + \int_t^{T\wedge\tau} g(s, X_s^{0,x}, Y_s^{0,x}, Z_s^{0,x})\,d\overleftarrow{B}_s, \quad t\in[0,T],\ x\in D,
\end{aligned}
\]
where, for given x, X_0^{0,x} = x and \Phi(X_{\tau\wedge T}^{0,x}) = h(X_T^{0,x})\,\mathbf{1}_{\{\tau\ge T\}} + \varphi(\tau, X_\tau^{0,x})\,\mathbf{1}_{\{\tau<T\}}. The related SPDE is
\[
\begin{cases}
u_t(x) = h(x) + \displaystyle\int_t^T \Big[\tfrac12\sum_{i=1}^d \frac{\partial^2 u_s(x)}{\partial x_i^2} + f(s,x,u_s(x),\nabla u_s(x))\Big]ds + \int_t^T g(s,x,u_s(x),\nabla u_s(x))\,d\overleftarrow{B}_s, & x\in D,\ 0\le t\le T,\\
u_t(x) = \varphi(t,x), & \text{on } [0,T]\times\partial D.
\end{cases}
\]
5.2.1 Numerical algorithms
For simplicity of presentation we only consider the one-dimensional case; the high-dimensional cases can be handled through a straightforward generalization. To simplify the notation we shall use (y_t, z_t) to denote the solution (Y_t^{t,x}, Z_t^{t,x}) of the BDSDE (5.2). We also denote E_s^{t,x}[X] = E[X \mid \mathcal{F}_s^{W,t,x}], where \mathcal{F}_s^{W,t,x} := \sigma(x + W_s - W_t,\ t\le s\le T) \vee \sigma(B_t,\ 0\le t\le T).
Reference equations
To further simplify the notation, we denote f(s, y_s, z_s) = f(s, X_s^{t,x}, y_s, z_s) and g(s, y_s, z_s) = g(s, X_s^{t,x}, y_s, z_s), where x\in\mathbb{R}. Then we have
\[ y_t = y_{t+\delta} + \int_t^{t+\delta} f(s,y_s,z_s)\,ds - \int_t^{t+\delta} z_s\,dW_s + \int_t^{t+\delta} g(s,y_s,z_s)\,d\overleftarrow{B}_s, \tag{5.8} \]
where \delta is a deterministic nonnegative number with t + \delta \le T. Taking the conditional expectation E_t^{t,x}[\,\cdot\,] on (5.8), we obtain
\[ y_t^{t,x} = E_t^{t,x}[y_{t+\delta}] + \int_t^{t+\delta} E_t^{t,x}[f(s,y_s,z_s)]\,ds + \int_t^{t+\delta} E_t^{t,x}[g(s,y_s,z_s)]\,d\overleftarrow{B}_s, \tag{5.9} \]
where y_t^{t,x} = E_t^{t,x}[y_t]; that is, y_t^{t,x} is the value of y_t at the time-space point (t,x). We use the simple right-point formula to approximate the integrals in (5.9):
\[ \int_t^{t+\delta} E_t^{t,x}[f(s,y_s,z_s)]\,ds = \delta\,E_t^{t,x}[f(t+\delta, y_{t+\delta}, z_{t+\delta})] + R_y^W, \tag{5.10} \]
\[ \int_t^{t+\delta} E_t^{t,x}[g(s,y_s,z_s)]\,d\overleftarrow{B}_s = E_t^{t,x}[g(t+\delta, y_{t+\delta}, z_{t+\delta})]\,\Delta B_t + R_y^B, \tag{5.11} \]
where \Delta B_t = B_{t+\delta} - B_t, and R_y^W and R_y^B denote the corresponding approximation errors. Inserting (5.10) and (5.11) into (5.9), we obtain
\[ y_t^{t,x} = E_t^{t,x}[y_{t+\delta}] + \delta\,E_t^{t,x}[f(t+\delta, y_{t+\delta}, z_{t+\delta})] + E_t^{t,x}[g(t+\delta, y_{t+\delta}, z_{t+\delta})]\,\Delta B_t + R_y, \tag{5.12} \]
where R_y = R_y^W + R_y^B is the truncation error for solving y_t. Let \Delta W_s = W_s - W_t for t\le s\le t+\delta. Multiplying both sides of (5.8) by \Delta W_{t+\delta}, taking the conditional expectation E_t^{t,x}[\,\cdot\,] and applying the Itô isometry, we get
\[ E_t^{t,x}[y_{t+\delta}\,\Delta W_{t+\delta}] = -\int_t^{t+\delta} E_t^{t,x}[f(s,y_s,z_s)\,\Delta W_s]\,ds - \int_t^{t+\delta} E_t^{t,x}[g(s,y_s,z_s)\,\Delta W_s]\,d\overleftarrow{B}_s + \int_t^{t+\delta} E_t^{t,x}[z_s]\,ds. \tag{5.13} \]
Similar to (5.12), we approximate the integrals in (5.13) with the right-point formula to obtain
\[ \int_t^{t+\delta} E_t^{t,x}[f(s,y_s,z_s)\,\Delta W_s]\,ds = \delta\,E_t^{t,x}[f(t+\delta, y_{t+\delta}, z_{t+\delta})\,\Delta W_{t+\delta}] + R_{z1}^W, \tag{5.14} \]
\[ \int_t^{t+\delta} E_t^{t,x}[z_s]\,ds = \delta\,z_t^{t,x} + R_{z2}^W, \tag{5.15} \]
and
\[ \int_t^{t+\delta} E_t^{t,x}[g(s,y_s,z_s)\,\Delta W_s]\,d\overleftarrow{B}_s = E_t^{t,x}[g(t+\delta, y_{t+\delta}, z_{t+\delta})\,\Delta W_{t+\delta}]\,\Delta B_t + R_z^B, \tag{5.16} \]
where z_t^{t,x} is the value of z_t at the time-space point (t,x), and R_{z1}^W, R_{z2}^W and R_z^B are the corresponding approximation errors. Inserting (5.14), (5.15) and (5.16) into (5.13), we get the second approximation equation for (5.8):
\[ E_t^{t,x}[y_{t+\delta}\,\Delta W_{t+\delta}] = -\delta\,E_t^{t,x}[f(t+\delta, y_{t+\delta}, z_{t+\delta})\,\Delta W_{t+\delta}] + \delta\,z_t^{t,x} - E_t^{t,x}[g(t+\delta, y_{t+\delta}, z_{t+\delta})\,\Delta W_{t+\delta}]\,\Delta B_t + R_z, \tag{5.17} \]
where R_z = R_{z2}^W - R_{z1}^W - R_z^B is the truncation error for solving z_t. Equations (5.12) and (5.17) are the two key equations for solving the BDSDE (5.8) numerically. We refer to them as the reference equations.
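Dropping the truncation errors in the reference equations and replacing the conditional expectations E_t^{t,x}[\,\cdot\,] by a two-point quadrature over the Brownian increment yields one computable backward step. The following is a hedged sketch consistent with the reference equations, not the verbatim scheme (5.18)-(5.19); function names and signatures are illustrative.

```python
import numpy as np

def backward_step(y_next, z_next, f, g, t_next, dt, dB, x):
    """One backward step at spatial point x. Conditional expectations over
    the increment dW ~ N(0, dt) are replaced by the two-point rule
    E[phi(x + dW)] ~ (phi(x + sqrt(dt)) + phi(x - sqrt(dt))) / 2.
    y_next, z_next: callables returning y^{n+1}, z^{n+1} at a point."""
    s = np.sqrt(dt)
    xp, xm = x + s, x - s
    fp = f(t_next, xp, y_next(xp), z_next(xp))
    fm = f(t_next, xm, y_next(xm), z_next(xm))
    gp = g(t_next, xp, y_next(xp), z_next(xp))
    gm = g(t_next, xm, y_next(xm), z_next(xm))
    Ey  = 0.5 * (y_next(xp) + y_next(xm))        # E[y^{n+1}]
    Ef  = 0.5 * (fp + fm)
    Eg  = 0.5 * (gp + gm)
    EyW = 0.5 * s * (y_next(xp) - y_next(xm))    # E[y^{n+1} dW]
    EfW = 0.5 * s * (fp - fm)
    EgW = 0.5 * s * (gp - gm)
    y_n = Ey + dt * Ef + Eg * dB                 # from (5.12)
    z_n = (EyW + dt * EfW + dB * EgW) / dt       # from (5.17)
    return y_n, z_n
```

With f = g = 0 and y^{n+1}(x) = x, the step reproduces y^n(x) = x and z^n = 1, matching z = \nabla y in that trivial case.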
Discrete scheme
To derive a numerical algorithm from the reference equations (5.12) and (5.17), we introduce the following time partition on [0,T]:
\[ R_t^h = \{t_i \mid t_i\in[0,T],\ t_i < t_{i+1},\ i = 0, 1, \dots, N,\ t_0 = 0,\ t_N = T\}. \]
Here \gamma_1 > 0 is a constant to be determined later. By the Lipschitz continuity of f, we have
\[ B_y = E\big[(\Delta t_n E_{t_n}[e_f^{n+1}])^2\big] \le L(\Delta t_n)^2\big(E[(e_y^{n+1})^2] + E[(e_z^{n+1})^2]\big). \tag{5.26} \]
Similarly for C_y, using Cauchy's inequality and Young's inequality, we obtain
\[
\begin{aligned}
C_y &= E\big[2\big(E_{t_n}[e_y^{n+1}] + \Delta B_{t_{n+1}} E_{t_n}[e_g^{n+1}] + R_y^n\big)\big(\Delta t_n E_{t_n}[e_f^{n+1}]\big)\big]\\
&\le \frac{\Delta t_n}{\gamma_2}\, E\big[E_{t_n}[e_y^{n+1}] + \Delta B_{t_{n+1}} E_{t_n}[e_g^{n+1}] + R_y^n\big]^2 + \Delta t_n\,\gamma_2\, E[(e_f^{n+1})^2]\\
&\le \frac{3\,\Delta t_n}{\gamma_2}\, E\big[E_{t_n}[(e_y^{n+1})^2] + \Delta t_n\big(L_1 E_{t_n}[(e_y^{n+1})^2] + L_2 E_{t_n}[(e_z^{n+1})^2]\big) + (R_y^n)^2\big]\\
&\quad + \Delta t_n\, E\big[L\gamma_2\, E_{t_n}[(e_y^{n+1})^2] + L\gamma_2\, E_{t_n}[(e_z^{n+1})^2]\big]\\
&\le \Delta t_n\, C_2\, E[(e_y^{n+1})^2] + (\Delta t_n)^2 C_3\big(E[(e_y^{n+1})^2] + E[(e_z^{n+1})^2]\big) + \Delta t_n\, L\gamma_2\, E[(e_z^{n+1})^2] + C_4(\Delta t)^2,
\end{aligned} \tag{5.27}
\]
where \gamma_2 > 0 is a constant to be determined later. Combining (5.24), (5.25), (5.26) and (5.27), we obtain
\[
\begin{aligned}
E[(e_y^n)^2] &\le E\big[|E_{t_n}[e_y^{n+1}]|^2\big] + \Delta t_n(1+\gamma_1)\,E\big[|E_{t_n}[e_g^{n+1}]|^2\big] + \Delta t_n\, L\gamma_2\, E[(e_z^{n+1})^2]\\
&\quad + K_1\,\Delta t_n\, E[(e_y^{n+1})^2] + K_2(\Delta t_n)^2 E[(e_z^{n+1})^2] + K_3(\Delta t)^2.
\end{aligned} \tag{5.28}
\]
For e_z^n, we have the identity
\[ \Delta t_n\, e_z^n = E_{t_n}[e_y^{n+1}\Delta W_{t_{n+1}}] + \Delta t_n\, E_{t_n}[e_f^{n+1}\Delta W_{t_{n+1}}] + \Delta B_{t_{n+1}}\, E_{t_n}[e_g^{n+1}\Delta W_{t_{n+1}}] + R_z^n. \tag{5.29} \]
Squaring both sides of (5.29) and then taking expectation, we obtain
\[
\begin{aligned}
E[(\Delta t_n\, e_z^n)^2] &= E\big[\big(E_{t_n}[e_y^{n+1}\Delta W_{t_{n+1}}] + \Delta B_{t_{n+1}} E_{t_n}[e_g^{n+1}\Delta W_{t_{n+1}}] + R_z^n\big)^2\big]\\
&\quad + E\big[(\Delta t_n)^2\big(E_{t_n}[e_f^{n+1}\Delta W_{t_{n+1}}]\big)^2\big]\\
&\quad + 2E\big[\big(E_{t_n}[e_y^{n+1}\Delta W_{t_{n+1}}] + \Delta B_{t_{n+1}} E_{t_n}[e_g^{n+1}\Delta W_{t_{n+1}}] + R_z^n\big)\big(\Delta t_n E_{t_n}[e_f^{n+1}\Delta W_{t_{n+1}}]\big)\big]\\
&= A_z + B_z + C_z,
\end{aligned} \tag{5.30}
\]
where
\[
\begin{aligned}
A_z &= E\big[(E_{t_n}[e_y^{n+1}\Delta W_{t_{n+1}}])^2 + (\Delta B_{t_{n+1}})^2(E_{t_n}[e_g^{n+1}\Delta W_{t_{n+1}}])^2 + (R_z^n)^2\\
&\qquad + 2(E_{t_n}[e_y^{n+1}\Delta W_{t_{n+1}}])(E_{t_n}[e_g^{n+1}\Delta W_{t_{n+1}}])\Delta B_{t_{n+1}} + 2(E_{t_n}[e_y^{n+1}\Delta W_{t_{n+1}}])R_z^n\\
&\qquad + 2E_{t_n}[e_g^{n+1}\Delta W_{t_{n+1}}]\,\Delta B_{t_{n+1}} R_z^n\big],\\
B_z &= E\big[(\Delta t_n)^2(E_{t_n}[e_f^{n+1}\Delta W_{t_{n+1}}])^2\big],\\
C_z &= 2E\big[\big(E_{t_n}[e_y^{n+1}\Delta W_{t_{n+1}}] + \Delta B_{t_{n+1}} E_{t_n}[e_g^{n+1}\Delta W_{t_{n+1}}] + R_z^n\big)\big(\Delta t_n E_{t_n}[e_f^{n+1}\Delta W_{t_{n+1}}]\big)\big].
\end{aligned}
\]
For any \mathcal{F}_t^W-adapted process X_t we have
\[
\begin{aligned}
(E_{t_n}[X_{t_{n+1}}\Delta W_{t_{n+1}}])^2 &= \big(E_{t_n}[(X_{t_{n+1}} - E_{t_n}[X_{t_{n+1}}])\Delta W_{t_{n+1}}]\big)^2\\
&\le E_{t_n}\big[(X_{t_{n+1}} - E_{t_n}[X_{t_{n+1}}])^2\big]\,\Delta t_n\\
&= \Delta t_n\big(E_{t_n}[(X_{t_{n+1}})^2] - |E_{t_n}[X_{t_{n+1}}]|^2\big).
\end{aligned} \tag{5.31}
\]
For A_z, using (5.31), Cauchy's inequality and Young's inequality, we have
\[
\begin{aligned}
A_z &\le E\big[(E_{t_n}[e_y^{n+1}\Delta W_{t_{n+1}}])^2 + (\Delta B_{t_{n+1}})^2(E_{t_n}[e_g^{n+1}\Delta W_{t_{n+1}}])^2 + (R_z^n)^2\\
&\qquad + \gamma\,(E_{t_n}[e_y^{n+1}\Delta W_{t_{n+1}}])^2 + \tfrac{1}{\gamma}(R_z^n)^2 + \gamma_1\,(E_{t_n}[e_g^{n+1}\Delta W_{t_{n+1}}]\Delta B_{t_{n+1}})^2 + \tfrac{1}{\gamma_1}(R_z^n)^2\big]\\
&\le (1+\gamma)\,\Delta t_n\, E\big[E_{t_n}[(e_y^{n+1})^2] - |E_{t_n}[e_y^{n+1}]|^2\big]\\
&\quad + (1+\gamma_1)(\Delta t_n)^2\, E\big[E_{t_n}[(e_g^{n+1})^2] - |E_{t_n}[e_g^{n+1}]|^2\big] + C_5(\Delta t)^3.
\end{aligned} \tag{5.32}
\]
Under the conditions of the theorem, we have
\[ B_z = E\big[(\Delta t_n)^2(E_{t_n}[e_f^{n+1}\Delta W_{t_{n+1}}])^2\big] \le C_6(\Delta t)^3. \tag{5.33} \]
Similarly, using Cauchy's inequality and Young's inequality, we obtain
\[
\begin{aligned}
C_z &\le \frac{\Delta t_n}{\gamma_2}\, E\big[\big(E_{t_n}[e_y^{n+1}\Delta W_{t_{n+1}}] + \Delta B_{t_{n+1}} E_{t_n}[e_g^{n+1}\Delta W_{t_{n+1}}] + R_z^n\big)\big]^2 + \Delta t_n\,\gamma_2\, E\big[(E_{t_n}[e_f^{n+1}\Delta W_{t_{n+1}}])^2\big]\\
&\le C_7\big((\Delta t_n)^2 E[(e_y^{n+1})^2] + (\Delta t_n)^3(E[(e_y^{n+1})^2] + E[(e_z^{n+1})^2])\big) + (\Delta t_n)^2 L\gamma_2\, E[(e_z^{n+1})^2] + C_8(\Delta t)^3.
\end{aligned} \tag{5.34}
\]
Here \gamma_1 and \gamma_2 are the same as in the estimates above, and \gamma is a positive constant which will be determined later. Combining (5.30), (5.32), (5.33) and (5.34), we get
\[
\begin{aligned}
E[(\Delta t_n\, e_z^n)^2] &\le (1+\gamma)\,\Delta t_n\, E\big[E_{t_n}[(e_y^{n+1})^2] - |E_{t_n}[e_y^{n+1}]|^2\big]\\
&\quad + (1+\gamma_1)(\Delta t_n)^2\, E\big[E_{t_n}[(e_g^{n+1})^2] - |E_{t_n}[e_g^{n+1}]|^2\big] + (\Delta t_n)^2 L\gamma_2\, E[(e_z^{n+1})^2]\\
&\quad + K_4(\Delta t_n)^2 E[(e_y^{n+1})^2] + K_5(\Delta t_n)^3 E[(e_z^{n+1})^2] + K_6(\Delta t)^3.
\end{aligned} \tag{5.35}
\]
Next we divide both sides of (5.35) by \Delta t_n(1+\gamma) to obtain
\[
\begin{aligned}
\frac{\Delta t_n}{1+\gamma}E[(e_z^n)^2] &\le E\big[E_{t_n}[(e_y^{n+1})^2] - |E_{t_n}[e_y^{n+1}]|^2\big] + \Delta t_n\frac{1+\gamma_1}{1+\gamma}\,E\big[E_{t_n}[(e_g^{n+1})^2] - |E_{t_n}[e_g^{n+1}]|^2\big]\\
&\quad + \Delta t_n\frac{L\gamma_2}{1+\gamma}\,E[(e_z^{n+1})^2] + K_4\,\Delta t_n\, E[(e_y^{n+1})^2] + K_5(\Delta t_n)^2 E[(e_z^{n+1})^2] + K_6(\Delta t)^2.
\end{aligned} \tag{5.36}
\]
Adding (5.36) to (5.28), we obtain
\[
\begin{aligned}
E[(e_y^n)^2] + \frac{\Delta t_n}{1+\gamma}E[(e_z^n)^2] &\le E[(e_y^{n+1})^2] + \Delta t_n\frac{1+\gamma_1}{1+\gamma}\,E\big[E_{t_n}[(e_g^{n+1})^2]\big] + \Delta t_n\frac{(1+\gamma_1)\gamma}{1+\gamma}\,E\big[|E_{t_n}[e_g^{n+1}]|^2\big]\\
&\quad + \Delta t_n\frac{L\gamma_2(2+\gamma)}{1+\gamma}\,E[(e_z^{n+1})^2] + G_1\,\Delta t_n\, E[(e_y^{n+1})^2] + G_2(\Delta t_n)^2 E[(e_z^{n+1})^2] + G_3(\Delta t)^2\\
&\le E[(e_y^{n+1})^2] + \Delta t_n\Big(L_2\frac{1+\gamma_1}{1+\gamma} + L_2\frac{(1+\gamma_1)\gamma}{1+\gamma} + \frac{L\gamma_2(2+\gamma)}{1+\gamma}\Big)E[(e_z^{n+1})^2]\\
&\quad + G_1\,\Delta t_n\, E[(e_y^{n+1})^2] + G_2(\Delta t_n)^2 E[(e_z^{n+1})^2] + G_3(\Delta t)^2. 
\end{aligned} \tag{5.37}
\]
Now we choose \gamma, \gamma_1 and \gamma_2, all positive and sufficiently small, such that
\[ L_2(1+\gamma_1) + L_2\,\gamma(1+\gamma_1) + L\gamma_2(2+\gamma) \le 1. \]
This is possible since L_2 < 1. Thus, by (5.37), we have
\[ E[(e_y^n)^2] + \frac{\Delta t_n}{1+\gamma}E[(e_z^n)^2] \le E[(e_y^{n+1})^2] + \frac{\Delta t_n}{1+\gamma}E[(e_z^{n+1})^2] + T_1\,\Delta t_n\Big(E[(e_y^{n+1})^2] + \frac{\Delta t_n}{1+\gamma}E[(e_z^{n+1})^2]\Big) + T_2(\Delta t)^2. \]
Denote e^n := E[(e_y^n)^2] + \frac{\Delta t_n}{1+\gamma}E[(e_z^n)^2]. Then the above inequality becomes
\[ e^n \le (1 + T_1\,\Delta t)\,e^{n+1} + T_2(\Delta t)^2. \]
By Gronwall's inequality, we have
\[ \max_{0\le n\le N-1}\Big(E[(e_y^n)^2] + \frac{\Delta t_n}{1+\gamma}E[(e_z^n)^2]\Big) \le C\,\Delta t, \]
as required.
Error estimate for z
We first construct an approximate solution (\tilde{Y}, \tilde{Z}) as a step process. Let
\[ \tilde{Y}_{t_{n+1}} = y^{n+1} + \Delta t_n\, f^{n+1} + \Delta B_{t_n}\, g^{n+1} \]
and \mathcal{G}_t = \mathcal{F}_t^W \vee \mathcal{F}_T^B. By an extension of Itô's martingale representation theorem, we can find a \mathcal{G}_t-adapted process \tilde{Z}_t such that
\[ \tilde{Y}_{t_{n+1}} = E[\tilde{Y}_{t_{n+1}} \mid \mathcal{G}_{t_n}] + \int_{t_n}^{t_{n+1}} \tilde{Z}_r\,dW_r. \tag{5.38} \]
Define a continuous approximate process (\tilde{Y}, \tilde{Z}) by
\[ \tilde{Y}_t = y^{n+1} + f^{n+1}(t_{n+1} - t) + g^{n+1}(B_{t_{n+1}} - B_t) - \int_t^{t_{n+1}} \tilde{Z}_r\,dW_r, \quad t\in(t_n, t_{n+1}],\ n = 0, \dots, N-1, \tag{5.39} \]
where
\[ f^{n+1} = f(t_{n+1}, y^{n+1}, z^{n+1}) \quad\text{and}\quad g^{n+1} = g(t_{n+1}, y^{n+1}, z^{n+1}). \]
By (5.19) and (5.38) it is easy to see that
\[ \Delta t_n\, z^n = \int_{t_n}^{t_{n+1}} E_{t_n}[\tilde{Z}_r]\,dr. \]
Thus
\[
\begin{aligned}
\int_{t_n}^{t_{n+1}} E[(z_s - z^n)^2]\,ds &= \int_{t_n}^{t_{n+1}} E\Big[\Big(z_s - \frac{1}{\Delta t_n}\int_{t_n}^{t_{n+1}} E_{t_n}[\tilde{Z}_r]\,dr\Big)^2\Big]ds\\
&= \int_{t_n}^{t_{n+1}} E\Big[\Big(\frac{1}{\Delta t_n}\int_{t_n}^{t_{n+1}} E_{t_n}[z_s - \tilde{Z}_r]\,dr\Big)^2\Big]ds\\
&\le E\int_{t_n}^{t_{n+1}} \frac{1}{\Delta t_n}\int_{t_n}^{t_{n+1}} E_{t_n}\big[(z_s - \tilde{Z}_r)^2\big]\,dr\,ds\\
&\le 2\Big(\int_{t_n}^{t_{n+1}} E\big[E_{t_n}[(z_r - \tilde{Z}_r)^2]\big]\,dr + \int_{t_n}^{t_{n+1}} \frac{1}{\Delta t_n}\int_{t_n}^{t_{n+1}} E\big[E_{t_n}[(z_s - z_r)^2]\big]\,dr\,ds\Big)\\
&\le 2\Big(\int_{t_n}^{t_{n+1}} E[(z_r - \tilde{Z}_r)^2]\,dr + C(\Delta t_n)^2\Big).
\end{aligned} \tag{5.40}
\]
Now we are ready to prove an error estimate for z.
Theorem 5.2. Let (y_t, z_t) be the exact solution and (y^n, z^n) the solution of the scheme (5.18)-(5.19). Assume that Hypothesis (5.20) holds and that the derivatives f'_x, f'_y, f'_z, g'_x, g'_y and g'_z of f and g are all bounded. Then, for \Delta t sufficiently small, we have
\[ \sum_{n=0}^{N-1} E\int_{t_n}^{t_{n+1}} (z_s - z^n)^2\,ds \le C\,\Delta t. \]
Proof: For t\in[t_n, t_{n+1}], let e_y^t = y_t - \tilde{Y}_t, e_z^t = z_t - \tilde{Z}_t, f_t = f(t, y_t, z_t) and g_t = g(t, y_t, z_t). Subtracting (5.39) from the BDSDE (5.2), we have
\[ e_y^t = e_y^{t_{n+1}} + \int_t^{t_{n+1}} (f_s - f^{n+1})\,ds + \int_t^{t_{n+1}} (g_s - g^{n+1})\,d\overleftarrow{B}_s - \int_t^{t_{n+1}} e_z^s\,dW_s. \tag{5.41} \]
Squaring both sides of (5.41), applying Itô's formula (Pardoux and Peng (1994)) and taking expectation, we have
\[
\begin{aligned}
&E[(e_y^t)^2] + E\int_t^{t_{n+1}} (e_z^s)^2\,ds\\
&\quad= E[(e_y^{t_{n+1}})^2] + 2E\int_t^{t_{n+1}} e_y^s\,(f_s - f^{n+1})\,ds + E\int_t^{t_{n+1}} (g_s - g^{n+1})^2\,ds\\
&\quad\le E[(e_y^{t_{n+1}})^2] + \frac{1}{\gamma_0}E\int_t^{t_{n+1}} (e_y^s)^2\,ds + \gamma_0\, E\int_t^{t_{n+1}} (f_s - f^{n+1})^2\,ds + E\int_t^{t_{n+1}} (g_s - g^{n+1})^2\,ds\\
&\quad\le E[(e_y^{t_{n+1}})^2] + \frac{1}{\gamma_0}E\int_t^{t_{n+1}} (e_y^s)^2\,ds + E\int_t^{t_{n+1}} \big[2c\,(y_s - y^{n+1})^2 + (\alpha + \gamma_1)(z_s - z^{n+1})^2\big]\,ds\\
&\quad\le E[(e_y^{t_{n+1}})^2] + \frac{1}{\gamma_0}E\int_t^{t_{n+1}} (e_y^s)^2\,ds + C\,E\int_t^{t_{n+1}} \big[(y_s - y_{t_{n+1}})^2 + (y_{t_{n+1}} - y^{n+1})^2\big]\,ds\\
&\qquad + E\int_t^{t_{n+1}} (\alpha + \gamma_1)\Big[\Big(z_s - (\Delta t_n)^{-1}\int_{t_{n+1}}^{t_{n+2}} E_{t_{n+1}}[z_r]\,dr\Big)^2\Big]\\
&\qquad + (\alpha + \gamma_2)\Big[\Big((\Delta t_n)^{-1}\int_{t_{n+1}}^{t_{n+2}} E_{t_{n+1}}[z_r]\,dr - z^{n+1}\Big)^2\Big]\,ds,
\end{aligned} \tag{5.42}
\]
where \gamma_0, \gamma_1 and \gamma_2 are positive constants to be determined later. Since E[(y_s - y_t)^2] + E[(z_s - z_t)^2] \le C|s - t|, we have
\[ E\int_t^{t_{n+1}} (y_s - y_{t_{n+1}})^2\,ds \le C(\Delta t)^2 \]
and
\[
\begin{aligned}
&E\int_t^{t_{n+1}} \Big(z_s - (\Delta t_n)^{-1}\int_{t_{n+1}}^{t_{n+2}} E_{t_{n+1}}[z_r]\,dr\Big)^2 ds\\
&\quad= E\int_t^{t_{n+1}} \Big(z_s - z_{t_{n+1}} + z_{t_{n+1}} - (\Delta t_n)^{-1}\int_{t_{n+1}}^{t_{n+2}} E_{t_{n+1}}[z_r]\,dr\Big)^2 ds\\
&\quad\le 2E\int_t^{t_{n+1}} \Big[(z_s - z_{t_{n+1}})^2 + \Big((\Delta t_n)^{-1}\int_{t_{n+1}}^{t_{n+2}} E_{t_{n+1}}[z_{t_{n+1}} - z_r]\,dr\Big)^2\Big]\,ds\\
&\quad\le C(\Delta t)^2.
\end{aligned}
\]
Also, since \Delta t\, z^{n+1} = \int_{t_{n+1}}^{t_{n+2}} E_{t_{n+1}}[\tilde{Z}_r]\,dr, we have
\[ \Big((\Delta t_n)^{-1}\int_{t_{n+1}}^{t_{n+2}} E_{t_{n+1}}[z_r]\,dr - z^{n+1}\Big)^2 \le (\Delta t_n)^{-1}\int_{t_{n+1}}^{t_{n+2}} E_{t_{n+1}}\big[(z_r - \tilde{Z}_r)^2\big]\,dr. \]
Thus we can rewrite (5.42) as
\[
\begin{aligned}
E[(e_y^t)^2] + E\int_t^{t_{n+1}} (e_z^s)^2\,ds &\le E[(e_y^{t_{n+1}})^2] + C\,E\int_t^{t_{n+1}} (e_y^{t_{n+1}})^2\,ds + (\alpha + \gamma_1 + \gamma_2)\,E\int_{t_{n+1}}^{t_{n+2}} (e_z^s)^2\,ds\\
&\quad + C(\Delta t)^2 + \frac{1}{\gamma_0}E\int_t^{t_{n+1}} (e_y^s)^2\,ds.
\end{aligned}
\]
Choose \gamma_1 and \gamma_2 small enough such that \alpha + \gamma_1 + \gamma_2 = K < 1 (which is possible since \alpha < 1). Then
\[
\begin{aligned}
E[(e_y^t)^2] + E\int_t^{t_{n+1}} (e_z^s)^2\,ds &\le C_1 E\int_t^{t_{n+1}} (e_y^s)^2\,ds + E[(e_y^{t_{n+1}})^2] + C_2 E\int_{t_n}^{t_{n+1}} (e_y^{t_{n+1}})^2\,ds\\
&\quad + K\,E\int_{t_{n+1}}^{t_{n+2}} (e_z^s)^2\,ds + C(\Delta t)^2\\
&\le (1 + C_2\,\Delta t)\,E[(e_y^{t_{n+1}})^2] + K\,E\int_{t_{n+1}}^{t_{n+2}} (e_z^s)^2\,ds + C(\Delta t)^2 + C_1 E\int_t^{t_{n+1}} (e_y^s)^2\,ds.
\end{aligned} \tag{5.43}
\]
By Gronwall's inequality, we get
\[ E[(e_y^s)^2] \le C\Big((1 + C_2\,\Delta t)\,E[(e_y^{t_{n+1}})^2] + E\int_{t_{n+1}}^{t_{n+2}} (e_z^s)^2\,ds + C(\Delta t)^2\Big) \tag{5.44} \]
for s\in[t_n, t_{n+1}]. Now we let t = t_n in (5.43) and substitute (5.44) into (5.43) to obtain
\[ E[(e_y^{t_n})^2] + E\int_{t_n}^{t_{n+1}} (e_z^s)^2\,ds \le (1 + C_1\,\Delta t)\,E[(e_y^{t_{n+1}})^2] + (K + C_2\,\Delta t)\,E\int_{t_{n+1}}^{t_{n+2}} (e_z^s)^2\,ds + C(\Delta t)^2. \]
Now by Theorem 5.1, we easily obtain
\[ E[(e_y^{t_n})^2] + E\int_{t_n}^{t_{n+1}} (e_z^s)^2\,ds \le E[(e_y^{t_{n+1}})^2] + (K + C\,\Delta t)\,E\int_{t_{n+1}}^{t_{n+2}} (e_z^s)^2\,ds + C(\Delta t)^2. \]
Let \Delta t be sufficiently small such that C\,\Delta t + K \le \bar{L} < 1, where \bar{L} is a constant. Summing the above inequality from n = 0 to n = N-1, we obtain
\[ (1 - \bar{L})\sum_{n=0}^{N-1} E\int_{t_n}^{t_{n+1}} (e_z^s)^2\,ds \le C\,\Delta t + \bar{L}\int_{t_{N-1}}^{t_N} E[(e_z^s)^2]\,ds \le C\,\Delta t. \tag{5.45} \]
Through a similar argument, it is easy to obtain
\[ \int_{t_{N-1}}^{t_N} E[(e_z^s)^2]\,ds \le C\,\Delta t. \tag{5.46} \]
By (5.40), (5.45) and (5.46), we conclude that
\[ \sum_{n=0}^{N-1} E\int_{t_n}^{t_{n+1}} (z_s - z^n)^2\,ds \le C\,\Delta t, \]
as required in Theorem 5.2.
5.2.3 Numerical experiments
In this section we carry out numerical experiments to verify the rate of convergence results obtained above and compare our numerical method with the finite difference method for stochastic parabolic partial differential equations ([?]). The conditional expectations in (5.18) and (5.19) can be evaluated using the Monte Carlo method or Gaussian quadratures ([?, 90]). In our examples, we use the binomial tree method, which amounts to two-point Gaussian quadrature ([?]).
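The two-point rule just mentioned replaces E[\varphi(x + W_\delta)], with W_\delta \sim N(0, \delta), by equally weighted evaluations at x \pm \sqrt{\delta}. These nodes and weights match the moments of the Gaussian increment up to order three, which is why the binomial tree is equivalent to two-point Gauss-Hermite quadrature. A minimal sketch (the function name is illustrative):

```python
import numpy as np

def cond_exp_two_point(phi, x, dt):
    """Two-point Gaussian quadrature for E[phi(x + W_dt)], W_dt ~ N(0, dt).
    Exact for polynomials of degree up to 3 in the increment."""
    s = np.sqrt(dt)
    return 0.5 * (phi(x + s) + phi(x - s))
```

For example, E[(x + W_dt)^3] = x^3 + 3 x dt, and the rule reproduces this exactly; the first neglected term is the fourth moment, which contributes O(dt^2) per step.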
Example 1: In the first example, we consider the initial boundary value problem
\[
\begin{cases}
u_t(x) = e^{x-T}\sin\!\big(\tfrac{B_T}{2}\big) + \displaystyle\int_t^T \Big[\tfrac12\frac{\partial^2}{\partial x^2}u_s(x) - \big(x + \tfrac18\big)u_s(x) - \tfrac{s}{2}\frac{\partial}{\partial x}u_s(x)\Big]ds + \int_t^T \tfrac12 e^{x-s}\cos\!\big(B_T - \tfrac{B_s}{2}\big)\,d\overleftarrow{B}_s,\\[4pt]
u_t(-1) = e^{-1-t}\sin\!\big(B_T - \tfrac{B_t}{2}\big),\\[2pt]
u_t(1) = e^{1-t}\sin\!\big(B_T - \tfrac{B_t}{2}\big).
\end{cases}
\tag{5.47}
\]
Table 5.1: Example 1 of Section 5.2.3

J      N_T (FD)   error (FD)   N_T (BDSDE)   error (BDSDE)
2^2    2^5        0.0213       2^4           0.200
2^3    2^7        0.0177       2^6           0.0383
2^4    2^9        0.0112       2^8           0.0106
2^5    2^11       0.00587      2^10          0.00376
2^6    2^13       0.00313      2^12          0.00152
We construct the SPDE (5.47) in such a way that u_t(x) = e^{x-t}\sin\!\big(B_T - \tfrac{B_t}{2}\big) is the exact solution. The corresponding BDSDE is given by
\[
\begin{aligned}
y_0^{0,x} &= e^{X_T^{0,x}-T}\sin\!\big(\tfrac{B_T}{2}\big)\,\mathbf{1}_{\{\tau\ge T\}} + e^{X_\tau^{0,x}-\tau}\sin\!\big(B_T - \tfrac{B_\tau}{2}\big)\,\mathbf{1}_{\{\tau<T\}}\\
&\quad + \int_0^{T\wedge\tau}\Big[-\big(X_t^{0,x}+\tfrac18\big)y_t^{0,x} - \tfrac{t}{2}\,z_t^{0,x}\Big]\,dt - \int_0^{T\wedge\tau} z_t^{0,x}\,dW_t\\
&\quad + \int_0^{T\wedge\tau} \tfrac12\, e^{X_t^{0,x}-t}\cos\!\big(B_T - \tfrac{B_t}{2}\big)\,d\overleftarrow{B}_t.
\end{aligned}
\]
The numerical results are shown in Table 5.1 and Figure 5.1. Here J denotes the number of spatial partition grids, N_T(FD) the number of time steps used in the finite difference method, N_T(BDSDE) the number of time steps used in our method for solving the related BDSDE, and error(FD) and error(BDSDE) the errors of the finite difference method and our method, respectively. The results indicate that our algorithm is comparable to the algorithm of solving the SPDE directly using the finite difference scheme, with a slightly higher rate of convergence.
Example 2: In this example, we consider the unbounded SPDE initial value problem
\[
\begin{aligned}
u_t(x) &= \sin(x+T)\cos(2B_T) + \int_t^T \Big[\tfrac12\frac{\partial^2}{\partial x^2}u_s(x) - \frac{\partial}{\partial x}u_s(x)\Big]ds\\
&\quad + \int_t^T \big[\sin(W_s + s)\big(\sin(B_T + B_s) - \cos(B_T + B_s)\big) + u_s(x)\big]\,d\overleftarrow{B}_s,
\end{aligned} \tag{5.48}
\]
Figure 5.1: Example 1: Convergence comparison between the direct finite difference scheme and our scheme
Table 5.2: Example 2 of Section 5.2.3

J      N_T    error(Y) (u)    error(Z) (\nabla u)
2^3    2^3    6.4096E-002     0.1188
2^4    2^4    3.2019E-002     9.0028E-002
2^5    2^5    1.4426E-002     6.3314E-002
2^6    2^6    7.2577E-003     4.1337E-002
2^7    2^7    3.5995E-003     2.7149E-002
where u_t(x) = \sin(x+t)\cos(B_T + B_t) is the solution of the SPDE (5.48). The corresponding FBDSDE is given by
\[
\begin{aligned}
y_0^{0,x} &= \sin(W_T + T)\cos(2B_T) - \int_0^T z_s^{0,x}\,ds\\
&\quad + \int_0^T \big[\sin(W_s + s)\big(\sin(B_T + B_s) - \cos(B_T + B_s)\big) + y_s^{0,x}\big]\,d\overleftarrow{B}_s - \int_0^T z_s^{0,x}\,dW_s.
\end{aligned}
\]
The errors are shown in Table 5.2, in which error(Y) and error(Z) are the errors of the approximations of Y and Z at the time-space point (t,x) = (0,0), respectively. These data also confirm our rate of convergence results.
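The claimed rates can be checked directly from the tabulated errors: since J doubles (so the mesh size h halves) between consecutive rows of Table 5.2, the observed order of convergence is log2 of the ratio of successive errors. A small sketch using the Table 5.2 data (the uniform-halving assumption comes from the table layout, not from a stated formula):

```python
import numpy as np

# Errors from Table 5.2, rows J = 2^3, ..., 2^7 (h halves between rows).
err_y = np.array([6.4096e-2, 3.2019e-2, 1.4426e-2, 7.2577e-3, 3.5995e-3])
err_z = np.array([0.1188, 9.0028e-2, 6.3314e-2, 4.1337e-2, 2.7149e-2])

def observed_order(err):
    """Average observed order log2(e_i / e_{i+1}) when h is halved each row."""
    return float(np.mean(np.log2(err[:-1] / err[1:])))

order_y = observed_order(err_y)  # near 1: roughly first order in y
order_z = observed_order(err_z)  # near 0.5: roughly half order in z
```

The averages come out close to 1 for Y and close to 1/2 for Z, consistent with the half-order estimate for z and the better behavior observed for y.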
Figure 5.2: Example 2: Convergence comparison between the approximations of y and z
5.3 First Order Numerical Algorithms
To develop first order numerical algorithms for BDSDEs, we simplify the BDSDE system to
\[
\begin{aligned}
Y_s^{t,x} &= h(W_T^{t,x}) + \int_s^T f(r, W_r^{t,x}, Y_r^{t,x}, Z_r^{t,x})\,dr\\
&\quad + \int_s^T g(r, W_r^{t,x}, Y_r^{t,x})\,d\overleftarrow{B}_r - \int_s^T Z_r^{t,x}\,dW_r, \quad t\le s\le T.
\end{aligned} \tag{5.49}
\]
In addition to the definitions and assumptions in Subsection 5.1, we introduce variational equations of BDSDEs.
5.3.1 Variational Equations
Let \nabla Y_s^{t,x}, \nabla Z_s^{t,x} and \nabla W_s^{t,x} be the variations of Y_s^{t,x}, Z_s^{t,x} and W_s^{t,x}, respectively, with respect to x at the time level s = t. Notice that \nabla W_s^{t,x} \equiv 1. To simplify the presentation, we write y_s = Y_s^{t,x}, z_s = Z_s^{t,x}, W_s = W_s^{t,x}, \nabla y_s = \nabla Y_s^{t,x}, \nabla z_s = \nabla Z_s^{t,x} and \nabla W_s = \nabla W_s^{t,x}. From [61], \{(\nabla y_t, \nabla z_t);\ 0\le t\le T\} is the unique solution of the equation
\[ \nabla y_t = \nabla y_T + \int_t^T \nabla F_s\,ds + \int_t^T \nabla G_s\,d\overleftarrow{B}_s - \int_t^T \nabla z_s\,dW_s, \tag{5.50} \]
where
\[ \nabla y_T = h'(X_T)\,\nabla X_T, \]
\[ \nabla F_s = f'_W(s, W_s, y_s, z_s) + f'_y(s, W_s, y_s, z_s)\,\nabla y_s + f'_z(s, W_s, y_s, z_s)\,\nabla z_s, \quad t\le s\le T, \]
and
\[ \nabla G_s = g'_W(s, W_s, y_s) + g'_y(s, W_s, y_s)\,\nabla y_s, \quad t\le s\le T. \]
Here f'_W, f'_y, f'_z and g'_W, g'_y are the first order partial derivatives of the functions f and g with respect to W_s, y_s and z_s. By Proposition 2.3 of [61],
\[ z_s = \nabla y_s\,(\nabla W_s)^{-1} = \nabla y_s, \quad 0\le s\le T. \tag{5.51} \]
5.3.2 Reference equations
In this subsection we approximate the integrals in (5.49) with quadrature rules of first order accuracy and call the resulting equations reference equations (see [87]). For this purpose, we define \mathcal{F}_s^{W,t} := \sigma(W_r;\ t\le r\le s) \vee \sigma(B_p;\ 0\le p\le T). Let E[X] denote the mathematical expectation of the random variable X and E_t^{t,x}[X] denote the conditional expectation E[X \mid \mathcal{F}_t^{W,t}] of the random variable X with W_t = x.
To further simplify the notation, we denote
\[ f(s, y_s, z_s) := f(s, W_s, B_s, y_s, z_s), \quad g(s, y_s) := g(s, W_s, B_s, y_s), \]
\[ g'_t(s, y_s) := g'_t(s, W_s, B_s, y_s), \quad g'_W(s, y_s) := g'_W(s, W_s, B_s, y_s), \]
\[ g'_B(s, y_s) := g'_B(s, W_s, B_s, y_s), \quad g'_y(s, y_s) := g'_y(s, W_s, B_s, y_s), \]
\[ g''_{WW}(s, y_s) := g''_{WW}(s, W_s, B_s, y_s), \quad g''_{BB}(s, y_s) := g''_{BB}(s, W_s, B_s, y_s), \]
and g''_{yy}(s, y_s) := g''_{yy}(s, W_s, B_s, y_s), where g'_t, g'_W, g'_B and g'_y are the first order partial derivatives of g with respect to the variables t, W_t, B_t and y_t, respectively, and g''_{WW}, g''_{BB} and g''_{yy} are the second order partial derivatives with respect to W_t, B_t and y_t, respectively.
With the above notation and from equation (5.49), we have, for n = 1, \dots, N_T - 1,
\[ y_{t_n} = y_{t_{n+1}} + \int_{t_n}^{t_{n+1}} f(s, y_s, z_s)\,ds - \int_{t_n}^{t_{n+1}} z_s\,dW_s + \int_{t_n}^{t_{n+1}} g(s, y_s)\,d\overleftarrow{B}_s. \tag{5.52} \]
Reference equation for y_s
We first eliminate the forward Itô integral from (5.52). To this end we take the conditional expectation E_{t_n}^{t_n,x}[\,\cdot\,] on both sides of (5.52) to obtain
\[ y_{t_n}^{t_n,x} = E_{t_n}^{t_n,x}[y_{t_{n+1}}] + \int_{t_n}^{t_{n+1}} E_{t_n}^{t_n,x}[f(s, y_s, z_s)]\,ds + \int_{t_n}^{t_{n+1}} E_{t_n}^{t_n,x}[g(s, y_s)]\,d\overleftarrow{B}_s, \tag{5.53} \]
where y_{t_n}^{t_n,x} = E_{t_n}^{t_n,x}[y_{t_n}] is the value of y_{t_n} at the space point x.
Next we approximate the integrals in the above equation with appropriate quadrature formulas. For the first integral on the right-hand side of (5.53), we simply use the right-point formula to obtain
\[ \int_{t_n}^{t_{n+1}} E_{t_n}^{t_n,x}[f(s, y_s, z_s)]\,ds = \Delta t_n\, E_{t_n}^{t_n,x}[f(t_{n+1}, y_{t_{n+1}}, z_{t_{n+1}})] + R_y^{W,n}, \tag{5.54} \]
where
\[ R_y^{W,n} = \int_{t_n}^{t_{n+1}} \big\{E_{t_n}^{t_n,x}[f(s, y_s, z_s)] - E_{t_n}^{t_n,x}[f(t_{n+1}, y_{t_{n+1}}, z_{t_{n+1}})]\big\}\,ds. \tag{5.55} \]
For the backward Itô integral term in (5.53), the application of the right-point formula would result in only a half order scheme (see [6]) and thus is inadequate for deriving a first order algorithm. To obtain a first order numerical scheme, we introduce Itô's formula for doubly stochastic integrals ([61], Lemma 1.3).
Let \eta_t = (t, W_t, B_t, y_t), \alpha_t = (1, 0, 0, -f(t, y_t, z_t)), \beta_t = (0, 0, 1, -g(t, y_t)) and \gamma_t = (0, 1, 0, z_t). In light of (5.49), (5.50) and the identity (5.51), we have the following SDE for \eta_t:
\[ \eta_t = \eta_s + \int_s^t \alpha_r\,dr + \int_s^t \beta_r\,d\overleftarrow{B}_r + \int_s^t \gamma_r\,dW_r, \quad 0\le s\le t\le T. \]
For any \varepsilon > 0, it follows from Young's inequality that
\[ 2E\big[E_{t_n}^{t_n,x}[e_y^{n+1}\Delta W_{t_n}]\cdot R_z^n\big] \le \varepsilon\, E\big[(E_{t_n}^{t_n,x}[e_y^{n+1}\Delta W_{t_n}])^2\big] + \frac{1}{\varepsilon}\,E[(R_z^n)^2]. \tag{5.97} \]
Inserting (5.95), (5.96) and (5.97) into equation (5.94) and using (5.93), we obtain the estimate
\[
\begin{aligned}
(\Delta t_n)^2 E[(e_z^n)^2] &\le (1+\varepsilon)\,E\big[(E_{t_n}^{t_n,x}[e_y^{n+1}\Delta W_{t_n}])^2\big] + C_{\varepsilon_2}(\Delta t_n)^2\,E\big[E_{t_n}^{t_n,x}[(e_y^{n+1})^2]\big]\\
&\quad + \varepsilon_2(\Delta t_n)^2\,E\big[E_{t_n}^{t_n,x}[(e_z^{n+1})^2]\big] + 3C_3(\Delta t_n)^3\,E\big[E_{t_n}^{t_n,x}[(e_z^{n+1})^2]\big] + C_\varepsilon\,E[(R_z^n)^2],
\end{aligned} \tag{5.98}
\]
where C_{\varepsilon_2} and C_\varepsilon are constants depending on \varepsilon_2 and \varepsilon, respectively, and on the functions f and g. Dividing both sides of equation (5.98) by \Delta t_n(1+\varepsilon) and noticing that
\[ (E_{t_n}^{t_n,x}[e_y^{n+1}\Delta W_{t_n}])^2 \le \Delta t_n\big(E_{t_n}^{t_n,x}[(e_y^{n+1})^2] - (E_{t_n}^{t_n,x}[e_y^{n+1}])^2\big), \]
we obtain
\[
\begin{aligned}
\frac{\Delta t_n}{1+\varepsilon}E[(e_z^n)^2] &\le E\big[E_{t_n}^{t_n,x}[(e_y^{n+1})^2] - (E_{t_n}^{t_n,x}[e_y^{n+1}])^2\big] + C_{\varepsilon_2}\frac{\Delta t_n}{1+\varepsilon}\,E\big[E_{t_n}^{t_n,x}[(e_y^{n+1})^2]\big]\\
&\quad + \varepsilon_2\frac{\Delta t_n}{1+\varepsilon}\,E\big[E_{t_n}^{t_n,x}[(e_z^{n+1})^2]\big] + \frac{3C_3}{1+\varepsilon}(\Delta t_n)^2\,E\big[E_{t_n}^{t_n,x}[(e_z^{n+1})^2]\big] + \frac{C_\varepsilon(\Delta t_n)^{-1}}{1+\varepsilon}\,E[(R_z^n)^2].
\end{aligned}
\]
Since \frac{1}{1+\varepsilon} < 1, the above inequality can be rewritten as
\[
\begin{aligned}
\frac{\Delta t_n}{1+\varepsilon}E[(e_z^n)^2] &\le E\big[E_{t_n}^{t_n,x}[(e_y^{n+1})^2] - (E_{t_n}^{t_n,x}[e_y^{n+1}])^2\big] + C_{\varepsilon_2}\,\Delta t_n\,E\big[E_{t_n}^{t_n,x}[(e_y^{n+1})^2]\big]\\
&\quad + \varepsilon_2\,\Delta t_n\,E\big[E_{t_n}^{t_n,x}[(e_z^{n+1})^2]\big] + 3C_3(\Delta t_n)^2\,E\big[E_{t_n}^{t_n,x}[(e_z^{n+1})^2]\big] + C_\varepsilon(\Delta t_n)^{-1}\,E[(R_z^n)^2].
\end{aligned} \tag{5.99}
\]
With (5.91) and (5.99) in hand, we are ready to derive the result of the theorem. First we add (5.99) to (5.91) to obtain
\[
\begin{aligned}
E[(e_y^n)^2] + \frac{\Delta t_n}{1+\varepsilon}E[(e_z^n)^2] &\le E\big[E_{t_n}^{t_n,x}[(e_y^{n+1})^2]\big] + (C_{\varepsilon_1} + C_{\varepsilon_2})\,\Delta t_n\,E\big[E_{t_n}^{t_n,x}[(e_y^{n+1})^2]\big]\\
&\quad + (\varepsilon_1 + \varepsilon_2)\,\Delta t_n\,E\big[E_{t_n}^{t_n,x}[(e_z^{n+1})^2]\big] + 3(C_1 + C_3)(\Delta t_n)^2\,E\big[E_{t_n}^{t_n,x}[(e_z^{n+1})^2]\big]\\
&\quad + 3E[(R_y^n)^2] + \frac{(E[R_y^{W,n}])^2}{\Delta t_n} + C_\varepsilon(\Delta t_n)^{-1}\,E[(R_z^n)^2].
\end{aligned} \tag{5.100}
\]
For a fixed positive constant \varepsilon, we choose positive numbers \varepsilon_1 and \varepsilon_2 so that
\[ \varepsilon_1 + \varepsilon_2 \le \frac{1}{1+\varepsilon}. \]
Then the application of Jensen's inequality leads to
\[
\begin{aligned}
E[(e_y^n)^2] + \frac{\Delta t_n}{1+\varepsilon}E[(e_z^n)^2] &\le (1 + C_{\varepsilon_1,\varepsilon_2}\,\Delta t_n)\Big(E[(e_y^{n+1})^2] + \frac{\Delta t_n}{1+\varepsilon}E[(e_z^{n+1})^2]\Big)\\
&\quad + 3E[(R_y^n)^2] + \frac{(E[R_y^{W,n}])^2}{\Delta t_n} + C_\varepsilon(\Delta t_n)^{-1}\,E[(R_z^n)^2],
\end{aligned} \tag{5.101}
\]
where C_{\varepsilon_1,\varepsilon_2} is a constant depending on \varepsilon_1, \varepsilon_2 and the functions f and g.
Finally, using the discrete Gronwall inequality, we obtain
\[
\max_{0\le n\le N_T-1}\Big(E[(e_y^n)^2] + \frac{\Delta t_n}{1+\varepsilon}E[(e_z^n)^2]\Big) \le C\Big(E[(e_y^{N_T})^2] + \frac{\Delta t_{N_T-1}}{1+\varepsilon}E[(e_z^{N_T})^2] + \sum_{n=0}^{N_T-1}\Big\{3E[(R_y^n)^2] + \frac{(E[R_y^{W,n}])^2}{\Delta t_n} + C_\varepsilon(\Delta t_n)^{-1}\,E[(R_z^n)^2]\Big\}\Big) \tag{5.102}
\]
as required.
We now apply Theorem 5.3 to obtain error estimates for the proposed numerical scheme (5.80)-(5.81). In light of (5.102), it suffices to estimate the truncation errors R_y^n, R_z^n and R_y^{W,n}. First we need a couple of regularity results for the exact solution (y_t, z_t).
Lemma 6 ([61]). For bounded f, g and h,
\[ E\big[(y_s^{t,x} - y_t^{t,x})^2\big] + E\Big[\int_s^t (z_r)^2\,dr\Big] \le C|t - s|, \tag{5.103} \]
and for any bounded function \psi with bounded second-order derivatives,
\[ \big(E[\psi(t, y_t, z_t) - \psi(s, y_s, z_s)]\big)^2 \le C(t - s)^2, \]
where C is a constant independent of s and t.
Lemma 7 ([61]). Assume that f, g, h, f'_W, f'_y, f'_z, g'_W, g'_y, g'_z and h' are all bounded. Then
\[ E\big[(z_s^{t,x} - z_t^{t,x})^2\big] + E\Big[\int_s^t (\nabla z_r)^2\,dr\Big] \le C|t - s|, \tag{5.104} \]
where C is a constant independent of s and t.
Now we are ready to derive estimates for the truncation errors R_y^{W,n}, R_y^n and R_z^n.
Proposition 2. Assume that the functions f, g, g'_B, g'_W, g'_y are Lipschitz continuous and that f, g and h are bounded. Furthermore, assume that for any s\in[0,T] the map (x, y, z) \mapsto (f(s, x, y, z), g(s, x, y)) is of class C^3 with all derivatives bounded on [0,T]\times\mathbb{R}^d\times\mathbb{R}^k, and that h\in C^3(\mathbb{R}^d; \mathbb{R}^k). Then we have the following estimates:
\[ \text{(i)}\quad (E[R_y^{W,n}])^2 \le C(\Delta t_n)^4; \tag{5.105} \]
\[ \text{(ii)}\quad E[(R_y^n)^2] \le C(\Delta t_n)^3; \tag{5.106} \]
\[ \text{(iii)}\quad E[(R_z^n)^2] \le C(\Delta t_n)^4; \tag{5.107} \]
where C is a constant related only to the functions f and g.
Proof: (i). By Lemma 6, the definition of R_y^{W,n} given by (5.55), and the assumptions of the Proposition, we have
(E[R_y^{W,n}])^2 = \Big(\int_{t_n}^{t_{n+1}} E[f(s,y_s,z_s) - f(t_{n+1},y_{t_{n+1}},z_{t_{n+1}})]\,ds\Big)^2
\le \Delta t_n \int_{t_n}^{t_{n+1}} \big(E[f(s,y_s,z_s) - f(t_{n+1},y_{t_{n+1}},z_{t_{n+1}})]\big)^2\,ds
\le C\,(\Delta t_n)^4,
which is (5.105).
(ii). By definition (5.62), R_y^n = R_y^{B,n} + R_y^{W,n}. Thus it suffices to estimate R_y^{B,n} and R_y^{W,n}.
For R_g^n(s) given by (5.56), it is easy to see that
E[(R_{g_1}^n(s))^2] = E\big[(E^{t,x_t}[R_g^n(s)])^2\big] = O((\Delta t_n)^2).   (5.108)
From the definition of R_{g_2}^n(s) given by (5.57) we have
E[(R_{g_2}^n(s))^2] = E\Big[\Big(E^{t_n,x_{t_n}}\Big[\int_s^{t_{n+1}} \big(g'_B(r,y_r) - g'_y g(r,y_r)\big)\,d\overleftarrow{B}_r
- \big(g'_B(t_{n+1},y_{t_{n+1}}) - g'_y(t_{n+1},y_{t_{n+1}})\,g(t_{n+1},y_{t_{n+1}})\big)\int_s^{t_{n+1}} d\overleftarrow{B}_r\Big]\Big)^2\Big]
\le 2E\Big[\Big(E^{t_n,x_{t_n}}\Big[\int_s^{t_{n+1}} \big(g'_B(r,y_r) - g'_B(t_{n+1},y_{t_{n+1}})\big)\,d\overleftarrow{B}_r\Big]\Big)^2\Big]
+ 2E\Big[\Big(E^{t_n,x_{t_n}}\Big[\int_s^{t_{n+1}} \big(g'_y(r,y_r)\,g(r,y_r) - g'_y(t_{n+1},y_{t_{n+1}})\,g(t_{n+1},y_{t_{n+1}})\big)\,d\overleftarrow{B}_r\Big]\Big)^2\Big].
For the first term on the right hand side of the above inequality, we have, by Lemma 6, Itô's isometry and the assumptions of the Proposition, that
E\Big[\Big(E^{t_n,x_{t_n}}\Big[\int_s^{t_{n+1}} \big(g'_B(r,y_r) - g'_B(t_{n+1},y_{t_{n+1}})\big)\,d\overleftarrow{B}_r\Big]\Big)^2\Big] = O((\Delta t_n)^2).   (5.109)
Similarly,
E\Big[\Big(E^{t_n,x_{t_n}}\Big[\int_s^{t_{n+1}} \big(g'_y(r,y_r)\,g(r,y_r) - g'_y(t_{n+1},y_{t_{n+1}})\,g(t_{n+1},y_{t_{n+1}})\big)\,d\overleftarrow{B}_r\Big]\Big)^2\Big] = O((\Delta t_n)^2).   (5.110)
Thus
E[(R_{g_2}^n(s))^2] = O((\Delta t_n)^2).   (5.111)
It follows from (5.108) and (5.111) that
E[(R_y^{B,n})^2] = E\Big[\Big(\int_{t_n}^{t_{n+1}} \big(R_{g_1}^n(s) + R_{g_2}^n(s)\big)\,d\overleftarrow{B}_s\Big)^2\Big] = O((\Delta t_n)^3).   (5.112)
For R_y^{W,n}, it follows from Lemma 6, Lemma 7 and the assumptions of the Proposition that
E[(R_y^{W,n})^2] = E\Big[\Big(\int_{t_n}^{t_{n+1}} \big\{E^{t_n,x_{t_n}}[f(s,y_s,z_s)] - E^{t_n,x_{t_n}}[f(t_{n+1},y_{t_{n+1}},z_{t_{n+1}})]\big\}\,ds\Big)^2\Big]
\le \Delta t_n\, E\Big[\int_{t_n}^{t_{n+1}} \big\{E^{t_n,x_{t_n}}[f(s,y_s,z_s)] - E^{t_n,x_{t_n}}[f(t_{n+1},y_{t_{n+1}},z_{t_{n+1}})]\big\}^2\,ds\Big]   (5.113)
\le \Delta t_n\, E\Big[\int_{t_n}^{t_{n+1}} \big(C(\Delta t_n)^2 + L\,E^{t_n,x_{t_n}}[(y_s - y_{t_{n+1}})^2] + L\,E^{t_n,x_{t_n}}[(z_s - z_{t_{n+1}})^2]\big)\,ds\Big]
= O((\Delta t_n)^3).
Combining (5.112) with (5.113), we obtain the desired estimate (5.106) for R_y^n.
(iii). Finally, we derive estimate (5.107) for R_z^n, which is defined by R_z^n = R_z^{s,n} + R_z^{W,n} + R_z^{B,n} in (5.72). To this end, we derive estimates for R_z^{s,n}, R_z^{W,n} and R_z^{B,n}.
We first estimate the error term R_z^{s,n}. Under the assumptions of the Proposition,
E\Big[\int_{t_n}^{t_{n+1}} (\nabla F_s)^2\,ds\Big] = O(\Delta t_n)
and
E\big[(E^{t_n,x_{t_n}}[\nabla G_r] - E^{t_n,x_{t_n}}[\nabla G_{t_{n+1}}])^2\big] = O(\Delta t_n).
It follows from the definition of R_z^{s,n} given by (5.66) and the above two estimates that
E[(R_z^{s,n})^2] = E\Big[\Big(\int_{t_n}^{t_{n+1}}\int_{t_n}^{s} E^{t_n,x_{t_n}}[\nabla F_r]\,dr\,ds
- \int_{t_n}^{t_{n+1}}\int_{t_n}^{s} \big(E^{t_n,x_{t_n}}[\nabla G_r] - E^{t_n,x_{t_n}}[\nabla G_{t_{n+1}}]\big)\,d\overleftarrow{B}_r\,ds\Big)^2\Big]
\le 2E\Big[\Big(\int_{t_n}^{t_{n+1}}\int_{t_n}^{s} E^{t_n,x_{t_n}}[\nabla F_r]\,dr\,ds\Big)^2\Big]   (5.114)
+ 2E\Big[\Big(\int_{t_n}^{t_{n+1}}\int_{t_n}^{s} \big(E^{t_n,x_{t_n}}[\nabla G_r] - E^{t_n,x_{t_n}}[\nabla G_{t_{n+1}}]\big)\,d\overleftarrow{B}_r\,ds\Big)^2\Big]
= O((\Delta t_n)^4).
Similarly to (5.113),
E[(R_z^{W,n})^2] = E\Big[\Big(\int_{t_n}^{t_{n+1}} E^{t_n,x_{t_n}}\big[\big(f(s,y_s,z_s) - f(t_{n+1},y_{t_{n+1}},z_{t_{n+1}})\big)\,\Delta W_{t_n}\big]\,ds\Big)^2\Big] = O((\Delta t_n)^4).   (5.115)
To get the estimate for R_z^{B,n} given by (5.70), we need estimates for R_{z_1}^{B,n}, R_{z_2}^{B,n} and R_{z_3}^{B,n}. With estimates (5.108), (5.109) and (5.110) in hand, it is easy to obtain
E\big[(E^{t,x_t}[R_g^n(s)\,\Delta W_{t_n}])^2\big] = O((\Delta t_n)^3),
E\Big[\Big(E^{t_n,x_{t_n}}\Big[\Big(\int_s^{t_{n+1}} \big(g'_B(r,y_r) - g'_B(t_{n+1},y_{t_{n+1}})\big)\,d\overleftarrow{B}_r\Big)\,\Delta W_{t_n}\Big]\Big)^2\Big] = O((\Delta t_n)^3),   (5.116)
and
E\Big[\Big(E^{t_n,x_{t_n}}\Big[\Big(\int_s^{t_{n+1}} \big(g'_y(r,y_r)\,g(r,y_r) - g'_y(t_{n+1},y_{t_{n+1}})\,g(t_{n+1},y_{t_{n+1}})\big)\,d\overleftarrow{B}_r\Big)\,\Delta W_{t_n}\Big]\Big)^2\Big] = O((\Delta t_n)^3).   (5.117)
Thus, we get the estimates for R_{z_1}^{B,n} and R_{z_2}^{B,n} as follows:
E[(R_{z_1}^{B,n}(s))^2] = E\big[(E^{t,x_t}[R_g^n(s)\,\Delta W_{t_n}])^2\big] = O((\Delta t_n)^3),
and
E[(R_{z_2}^{B,n}(s))^2] = E\Big[\Big(\int_s^{t_{n+1}} \Big\{E^{t_n,x_{t_n}}\big[\big(g'_B(r,y_r) - g'_y(r,y_r)\,g(r,y_r)\big)\,\Delta W_{t_n}\big]
- E^{t_n,x_{t_n}}\big[\big(g'_B(t_{n+1},y_{t_{n+1}}) - g'_y(t_{n+1},y_{t_{n+1}})\,g(t_{n+1},y_{t_{n+1}})\big)\,\Delta W_{t_n}\big]\Big\}\,d\overleftarrow{B}_r\Big)^2\Big]
\le 2E\Big[\Big(E^{t_n,x_{t_n}}\Big[\Big(\int_s^{t_{n+1}} \big(g'_B(r,y_r) - g'_B(t_{n+1},y_{t_{n+1}})\big)\,d\overleftarrow{B}_r\Big)\,\Delta W_{t_n}\Big]\Big)^2\Big]
+ 2E\Big[\Big(E^{t_n,x_{t_n}}\Big[\Big(\int_s^{t_{n+1}} \big(g'_y(r,y_r)\,g(r,y_r) - g'_y(t_{n+1},y_{t_{n+1}})\,g(t_{n+1},y_{t_{n+1}})\big)\,d\overleftarrow{B}_r\Big)\,\Delta W_{t_n}\Big]\Big)^2\Big]
= O((\Delta t_n)^3).
By using the following estimates, obtained directly from Lemma 6, Lemma 7 and the assumptions of the Proposition,
E\Big[\Big(E^{t_n,x_{t_n}}\Big[\int_s^{t_{n+1}} \big(g'_W(r,y_r) - g'_W(t_{n+1},y_{t_{n+1}})\big)\,dr\Big]\Big)^2\Big] = O((\Delta t_n)^3),   (5.118)
E\Big[\Big(E^{t_n,x_{t_n}}\Big[\int_s^{t_{n+1}} \big(g'_y(r,y_r)\,z_r - g'_y(t_{n+1},y_{t_{n+1}})\,z_{t_{n+1}}\big)\,dr\Big]\Big)^2\Big] = O((\Delta t_n)^3),   (5.119)
we get an estimate for R_{z_3}^{B,n}:
E[(R_{z_3}^{B,n}(s))^2] = E\Big[\Big(\int_s^{t_{n+1}} \Big\{E^{t_n,x_{t_n}}\big[g'_W(r,y_r) + g'_y(r,y_r)\,z_r\big]
- E^{t_n,x_{t_n}}\big[g'_W(t_{n+1},y_{t_{n+1}}) + g'_y(t_{n+1},y_{t_{n+1}})\,z_{t_{n+1}}\big]\Big\}\,dr\Big)^2\Big]
\le 2E\Big[\Big(E^{t_n,x_{t_n}}\Big[\int_s^{t_{n+1}} \big(g'_W(r,y_r) - g'_W(t_{n+1},y_{t_{n+1}})\big)\,dr\Big]\Big)^2\Big]
+ 2E\Big[\Big(E^{t_n,x_{t_n}}\Big[\int_s^{t_{n+1}} \big(g'_y(r,y_r)\,z_r - g'_y(t_{n+1},y_{t_{n+1}})\,z_{t_{n+1}}\big)\,dr\Big]\Big)^2\Big]
= O((\Delta t_n)^3).
Therefore,
E[(R_z^{B,n})^2] = E\Big[\Big(\int_{t_n}^{t_{n+1}} \big(R_{z_1}^{B,n}(s) + R_{z_2}^{B,n}(s) + R_{z_3}^{B,n}(s)\big)\,d\overleftarrow{B}_s\Big)^2\Big] = O((\Delta t_n)^4).   (5.120)
Then, from estimates (5.114), (5.115) and (5.120), we have
E[(R_z^n)^2] \le C\,(\Delta t_n)^4.   (5.121)
Combining Theorem 5.3 and Proposition 2, we obtain the error estimates for our numerical scheme (5.80), (5.81).
Theorem 5.4 Under the conditions of Theorem 5.3 and Proposition 2, if y^{N_T} = y_{t_{N_T}} and z^{N_T} = z_{t_{N_T}}, we have
\max_{0\le n\le N_T-1} E[(e_y^n)^2] \le C\,(\Delta t)^2, \qquad \max_{0\le n\le N_T-1} E[(e_z^n)^2] \le C\,\Delta t.   (5.122)
5.3.5 Fully discrete scheme and its error estimate
Fully discrete scheme
The scheme provided above is a semi-discrete scheme. In order to solve for (y^n, z^n) numerically, spatial approximations are needed. To simplify the presentation, in this section we only consider the one-dimensional case; the multi-dimensional cases can be deduced through a straightforward generalization. In addition to the temporal partition provided in Section 3, we introduce the following spatial partition of the real line \mathbb{R}:
R_h = \{x_i \mid x_i \in \mathbb{R},\ i \in \mathbb{Z},\ x_i < x_{i+1}\},
... > 0 and the functions f and g.
Recall that for a given random variable \phi,
\hat{E}^{t_n,x_i}_{t_n}[\phi] = \frac{1}{\sqrt{\pi}} \sum_{j=1}^{K} w_j\, \phi\big(x_i + \sqrt{2\Delta t}\, a_j\big).   (5.130)
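The approximation (5.130) is a Gauss-Hermite quadrature: a_j and w_j are the Hermite abscissas and weights, and the factor 1/\sqrt{\pi} normalizes the Gaussian density of the Brownian increment. A minimal sketch (illustrative only; NumPy's `hermgauss` supplies the pairs (a_j, w_j)):

```python
import numpy as np

def E_hat(phi, x, dt, K):
    """Approximate E[phi(x + W_dt)], W_dt ~ N(0, dt), by rule (5.130):
    (1/sqrt(pi)) * sum_j w_j * phi(x + sqrt(2 dt) * a_j)."""
    a, w = np.polynomial.hermite.hermgauss(K)  # nodes/weights for int e^{-u^2} f(u) du
    return np.sum(w * phi(x + np.sqrt(2 * dt) * a)) / np.sqrt(np.pi)

# For phi(u) = u^2 the exact value is x^2 + dt; a K-point rule is exact
# for polynomials of degree up to 2K - 1, so K = 5 reproduces it exactly.
val = E_hat(lambda u: u**2, x=1.5, dt=0.01, K=5)
print(val)  # 2.26 (= 1.5**2 + 0.01) up to round-off
```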
To present the error estimates for the fully discrete scheme, we need the following lemma.
Lemma 8 We assume that u, v are two functions, u, v: \mathbb{R} \to \mathbb{R}, and \phi is any random variable. Then the following inequalities hold:
\big(\hat{E}^{t_n,x_i}_{t_n}[\hat{u} - \hat{v}]\big)^2 \le \max_i\,(u(x_i) - v(x_i))^2,
\hat{E}^{t_n,x_i}_{t_n}\big[(\hat{u} - \hat{v})^2\big] \le \max_i\,(u(x_i) - v(x_i))^2,   (5.131)
\big(\hat{E}^{t_n,x_i}_{t_n}[\phi\,\Delta W_{t_n}]\big)^2 \le \Delta t\,\big[\hat{E}^{t_n,x_i}_{t_n}[\phi^2] - (\hat{E}^{t_n,x_i}_{t_n}[\phi])^2\big]   (5.132)
and
\big(\hat{E}^{t_n,x_i}_{t_n}[\phi\,\Delta W_{t_n}]\big)^2 \le \Delta t\,\hat{E}^{t_n,x_i}_{t_n}[\phi^2].   (5.133)
Proof: From the definition of \hat{E}^{t_n,x_i}_{t_n}[\cdot] and the fact that \frac{1}{\sqrt{\pi}}\sum_{j=1}^{K} w_j = 1,
\big(\hat{E}^{t_n,x_i}_{t_n}[\hat{u} - \hat{v}]\big)^2 = \Big(\frac{1}{\sqrt{\pi}}\sum_{j=1}^{K} w_j\,\big[\hat{u}(x_i + \sqrt{2\Delta t}\,a_j) - \hat{v}(x_i + \sqrt{2\Delta t}\,a_j)\big]\Big)^2
\le \Big(\frac{1}{\sqrt{\pi}}\sum_{j=1}^{K} w_j\,\big|\hat{u}(x_i + \sqrt{2\Delta t}\,a_j) - \hat{v}(x_i + \sqrt{2\Delta t}\,a_j)\big|\Big)^2
\le \Big(\frac{1}{\sqrt{\pi}}\sum_{j=1}^{K} w_j\,\max_i |u(x_i) - v(x_i)|\Big)^2
= \big(\max_i |u(x_i) - v(x_i)|\big)^2 = \max_i\,(u(x_i) - v(x_i))^2
and
\hat{E}^{t_n,x_i}_{t_n}\big[(\hat{u} - \hat{v})^2\big] = \frac{1}{\sqrt{\pi}}\sum_{j=1}^{K} w_j\,\big(\hat{u}(x_i + \sqrt{2\Delta t}\,a_j) - \hat{v}(x_i + \sqrt{2\Delta t}\,a_j)\big)^2
\le \frac{1}{\sqrt{\pi}}\sum_{j=1}^{K} w_j\,\max_i\,(u(x_i) - v(x_i))^2 = \max_i\,(u(x_i) - v(x_i))^2,
as required in (5.131).
To prove (5.132), set \phi^i_j = \phi(x_i + \sqrt{2\Delta t}\,a_j). It follows from the Cauchy-Schwarz inequality and the fact that \frac{1}{\sqrt{\pi}}\sum_{j=1}^{K} \sqrt{2\Delta t}\,a_j\,w_j = 0 that
\big(\hat{E}^{t_n,x_i}_{t_n}[\phi\,\Delta W_{t_n}]\big)^2 = \Big(\frac{1}{\sqrt{\pi}}\sum_{j=1}^{K} \sqrt{2\Delta t}\,a_j\,w_j\,\phi^i_j\Big)^2
= \Big(\frac{1}{\sqrt{\pi}}\sum_{j=1}^{K} \sqrt{2\Delta t}\,a_j\,w_j\,\Big[\phi^i_j - \sum_{p=1}^{K}\frac{w_p}{\sqrt{\pi}}\,\phi^i_p\Big]\Big)^2
\le \Big[\sum_{j=1}^{K}\frac{w_j}{\sqrt{\pi}}\,\big(\sqrt{2\Delta t}\,a_j\big)^2\Big]\,\Big[\sum_{j=1}^{K}\frac{w_j}{\sqrt{\pi}}\,\Big(\phi^i_j - \sum_{p=1}^{K}\frac{w_p}{\sqrt{\pi}}\,\phi^i_p\Big)^2\Big].
Since \sum_{j=1}^{K}\frac{2w_j}{\sqrt{\pi}}\,(a_j)^2 = 1 and \sum_{j=1}^{K}\frac{w_j}{\sqrt{\pi}} = 1, we see that
\Big[\sum_{j=1}^{K}\frac{w_j}{\sqrt{\pi}}\,\big(\sqrt{2\Delta t}\,a_j\big)^2\Big]\,\Big[\sum_{j=1}^{K}\frac{w_j}{\sqrt{\pi}}\,\Big(\phi^i_j - \sum_{p=1}^{K}\frac{w_p}{\sqrt{\pi}}\,\phi^i_p\Big)^2\Big]
= \Delta t\,\Big[\sum_{j=1}^{K}\frac{2w_j}{\sqrt{\pi}}\,(a_j)^2\Big]\,\Big[\sum_{j=1}^{K}\frac{w_j}{\sqrt{\pi}}\,\Big((\phi^i_j)^2 - 2\phi^i_j\sum_{p=1}^{K}\frac{w_p}{\sqrt{\pi}}\,\phi^i_p + \Big(\sum_{p=1}^{K}\frac{w_p}{\sqrt{\pi}}\,\phi^i_p\Big)^2\Big)\Big]
= \Delta t\,\Big[\sum_{j=1}^{K}\frac{w_j}{\sqrt{\pi}}\,(\phi^i_j)^2 - \Big(\sum_{j=1}^{K}\frac{w_j}{\sqrt{\pi}}\,\phi^i_j\Big)^2\Big]
= \Delta t\,\big[\hat{E}^{t_n,x_i}_{t_n}[\phi^2] - (\hat{E}^{t_n,x_i}_{t_n}[\phi])^2\big],
which proves inequality (5.132).
The inequality (5.133) follows directly from (5.132).
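Inequality (5.132) is a Cauchy-Schwarz bound, so it holds exactly for the quadrature sums and can be checked numerically for any sample function. A small sketch (\phi = \sin and the parameters are arbitrary choices for illustration):

```python
import numpy as np

K, dt, x = 8, 0.1, 0.3
a, w = np.polynomial.hermite.hermgauss(K)
phi = np.sin(x + np.sqrt(2 * dt) * a)        # phi evaluated at the quadrature nodes

Ehat = lambda vals: np.sum(w * vals) / np.sqrt(np.pi)
lhs = Ehat(np.sqrt(2 * dt) * a * phi) ** 2   # (E^[phi * dW])^2: dW maps to sqrt(2 dt) a_j
rhs = dt * (Ehat(phi**2) - Ehat(phi) ** 2)   # dt times the quadrature variance of phi
print(lhs <= rhs + 1e-15)  # True: (5.132) holds up to round-off
```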
Similarly to the proof of (5.131),
\big(\hat{E}^{t_n,x_i}_{t_n}[\hat{f}(t_{n+1},\hat{y}_{t_{n+1}},\hat{z}_{t_{n+1}}) - \hat{f}(t_{n+1},\hat{y}^{n+1},\hat{z}^{n+1})]\big)^2
= \Big[\frac{1}{\sqrt{\pi}}\sum_{j=1}^{K} w_j\,\Big(\hat{f}\big(t_{n+1},\hat{y}_{t_{n+1}}(x_i + \sqrt{2\Delta t}\,a_j),\hat{z}_{t_{n+1}}(x_i + \sqrt{2\Delta t}\,a_j)\big)
- \hat{f}\big(t_{n+1},\hat{y}^{n+1}(x_i + \sqrt{2\Delta t}\,a_j),\hat{z}^{n+1}(x_i + \sqrt{2\Delta t}\,a_j)\big)\Big)\Big]^2
\le \Big[\frac{1}{\sqrt{\pi}}\sum_{j=1}^{K} w_j\,\max_i\big|f(t_{n+1},y^{t_{n+1},x_i}_{t_{n+1}},z^{t_{n+1},x_i}_{t_{n+1}}) - f(t_{n+1},y^{n+1}_i,z^{n+1}_i)\big|\Big]^2
= \max_i\,\big(f(t_{n+1},y^{t_{n+1},x_i}_{t_{n+1}},z^{t_{n+1},x_i}_{t_{n+1}}) - f(t_{n+1},y^{n+1}_i,z^{n+1}_i)\big)^2
\le L\,\big[\max_i\,(e^{n+1,i}_y)^2 + \max_i\,(e^{n+1,i}_z)^2\big].
Following the above procedure, we see that
\big(\hat{E}^{t_n,x_i}_{t_n}[\hat{g}(t_{n+1},\hat{y}_{t_{n+1}}) - \hat{g}(t_{n+1},\hat{y}^{n+1})]\big)^2 \le L\,\max_i\,(e^{n+1,i}_y)^2,
\big(\hat{E}^{t_n,x_i}_{t_n}[\hat{g}'_B(t_{n+1},\hat{y}_{t_{n+1}}) - \hat{g}'_B(t_{n+1},\hat{y}^{n+1})]\big)^2 \le L\,\max_i\,(e^{n+1,i}_y)^2,
\big(\hat{E}^{t_n,x_i}_{t_n}[\hat{g}'_B(t_{n+1},\hat{y}_{t_{n+1}})\,\hat{g}(t_{n+1},\hat{y}_{t_{n+1}}) - \hat{g}'_B(t_{n+1},\hat{y}^{n+1})\,\hat{g}(t_{n+1},\hat{y}^{n+1})]\big)^2 \le L\,\max_i\,(e^{n+1,i}_y)^2.   (5.134)
Proof of Theorem 5.5: Notice that for any random variables \varphi_{t_{n+1}} and \varphi^{n+1},
E^{t_n,x_i}_{t_n}[\varphi_{t_{n+1}}] - \hat{E}^{t_n,x_i}_{t_n}[\hat{\varphi}^{n+1}]
= \big(E^{t_n,x_i}_{t_n}[\varphi_{t_{n+1}}] - \hat{E}^{t_n,x_i}_{t_n}[\varphi_{t_{n+1}}]\big) + \hat{E}^{t_n,x_i}_{t_n}[\varphi_{t_{n+1}} - \hat{\varphi}_{t_{n+1}}] + \hat{E}^{t_n,x_i}_{t_n}[\hat{\varphi}_{t_{n+1}} - \hat{\varphi}^{n+1}].   (5.135)
In what follows we abbreviate E^{t_n,x_i}_{t_n} and \hat{E}^{t_n,x_i}_{t_n} as E and \hat{E}. From equation (5.128), we have
e^{n,i}_y = \hat{E}[\hat{y}_{t_{n+1}} - \hat{y}^{n+1}] + \Delta t\,\big\{\hat{E}[\hat{f}(t_{n+1},\hat{y}_{t_{n+1}},\hat{z}_{t_{n+1}}) - \hat{f}(t_{n+1},\hat{y}^{n+1},\hat{z}^{n+1})]
+ \tfrac{1}{2}\hat{E}[\hat{g}'_B(t_{n+1},\hat{y}_{t_{n+1}}) - \hat{g}'_B(t_{n+1},\hat{y}^{n+1})] - \tfrac{1}{2}\hat{E}[\hat{g}'_y(t_{n+1},\hat{y}_{t_{n+1}})\,\hat{g}(t_{n+1},\hat{y}_{t_{n+1}}) - \hat{g}'_y(t_{n+1},\hat{y}^{n+1})\,\hat{g}(t_{n+1},\hat{y}^{n+1})]\big\}
+ \Delta B_{t_n}\,\big\{\hat{E}[\hat{g}(t_{n+1},\hat{y}_{t_{n+1}}) - \hat{g}(t_{n+1},\hat{y}^{n+1})]
- \tfrac{1}{2}\hat{E}[\hat{g}'_B(t_{n+1},\hat{y}_{t_{n+1}}) - \hat{g}'_B(t_{n+1},\hat{y}^{n+1})]\,\Delta B_{t_n}   (5.136)
+ \tfrac{1}{2}\hat{E}[\hat{g}'_y(t_{n+1},\hat{y}_{t_{n+1}})\,\hat{g}(t_{n+1},\hat{y}_{t_{n+1}}) - \hat{g}'_y(t_{n+1},\hat{y}^{n+1})\,\hat{g}(t_{n+1},\hat{y}^{n+1})]\,\Delta B_{t_n}\big\}
+ I^y_1 + I^y_2 + R^{n,i}_y,
where
I^y_1 = (E - \hat{E})[y_{t_{n+1}}] + \Delta t\,\big\{(E - \hat{E})[f(t_{n+1},y_{t_{n+1}},z_{t_{n+1}})] + \tfrac{1}{2}(E - \hat{E})[g'_B(t_{n+1},y_{t_{n+1}})]
- \tfrac{1}{2}(E - \hat{E})[g'_y(t_{n+1},y_{t_{n+1}})\,g(t_{n+1},y_{t_{n+1}})]\big\}
+ \Delta B_{t_n}\,\big\{(E - \hat{E})[g(t_{n+1},y_{t_{n+1}})] - \tfrac{1}{2}(E - \hat{E})[g'_B(t_{n+1},y_{t_{n+1}})]\,\Delta B_{t_n}
+ \tfrac{1}{2}(E - \hat{E})[g'_y(t_{n+1},y_{t_{n+1}})\,g(t_{n+1},y_{t_{n+1}})]\,\Delta B_{t_n}\big\},
and
I^y_2 = \hat{E}[y_{t_{n+1}} - \hat{y}_{t_{n+1}}] + \Delta t\,\big\{\hat{E}[f(t_{n+1},y_{t_{n+1}},z_{t_{n+1}}) - \hat{f}(t_{n+1},\hat{y}_{t_{n+1}},\hat{z}_{t_{n+1}})]
+ \tfrac{1}{2}\hat{E}[g'_B(t_{n+1},y_{t_{n+1}}) - \hat{g}'_B(t_{n+1},\hat{y}_{t_{n+1}})] - \tfrac{1}{2}\hat{E}[g'_y(t_{n+1},y_{t_{n+1}})\,g(t_{n+1},y_{t_{n+1}}) - \hat{g}'_y(t_{n+1},\hat{y}_{t_{n+1}})\,\hat{g}(t_{n+1},\hat{y}_{t_{n+1}})]\big\}
+ \Delta B_{t_n}\,\big\{\hat{E}[g(t_{n+1},y_{t_{n+1}}) - \hat{g}(t_{n+1},\hat{y}_{t_{n+1}})] - \tfrac{1}{2}\hat{E}[g'_B(t_{n+1},y_{t_{n+1}}) - \hat{g}'_B(t_{n+1},\hat{y}_{t_{n+1}})]\,\Delta B_{t_n}
+ \tfrac{1}{2}\hat{E}[g'_y(t_{n+1},y_{t_{n+1}})\,g(t_{n+1},y_{t_{n+1}}) - \hat{g}'_y(t_{n+1},\hat{y}_{t_{n+1}})\,\hat{g}(t_{n+1},\hat{y}_{t_{n+1}})]\,\Delta B_{t_n}\big\}.
Similarly, equation (5.129) becomes
\Delta t\, e^{n,i}_z = \hat{E}[(\hat{y}_{t_{n+1}} - \hat{y}^{n+1})\,\Delta W_{t_{n+1}}]
+ \Delta t\,\big\{\hat{E}[(\hat{f}(t_{n+1},\hat{y}_{t_{n+1}},\hat{z}_{t_{n+1}}) - \hat{f}(t_{n+1},\hat{y}^{n+1},\hat{z}^{n+1}))\,\Delta W_{t_{n+1}}]
+ \tfrac{1}{2}\hat{E}[(\hat{g}'_B(t_{n+1},\hat{y}_{t_{n+1}}) - \hat{g}'_B(t_{n+1},\hat{y}^{n+1}))\,\Delta W_{t_{n+1}}]
- \tfrac{1}{2}\hat{E}[(\hat{g}'_y(t_{n+1},\hat{y}_{t_{n+1}})\,\hat{g}(t_{n+1},\hat{y}_{t_{n+1}}) - \hat{g}'_y(t_{n+1},\hat{y}^{n+1})\,\hat{g}(t_{n+1},\hat{y}^{n+1}))\,\Delta W_{t_{n+1}}]\big\}
+ \Delta B_{t_n}\,\big\{\hat{E}[(\hat{g}(t_{n+1},\hat{y}_{t_{n+1}}) - \hat{g}(t_{n+1},\hat{y}^{n+1}))\,\Delta W_{t_{n+1}}]
- \tfrac{1}{2}\hat{E}[(\hat{g}'_B(t_{n+1},\hat{y}_{t_{n+1}}) - \hat{g}'_B(t_{n+1},\hat{y}^{n+1}))\,\Delta W_{t_{n+1}}]\,\Delta B_{t_n}   (5.137)
+ \tfrac{1}{2}\hat{E}[(\hat{g}'_y(t_{n+1},\hat{y}_{t_{n+1}})\,\hat{g}(t_{n+1},\hat{y}_{t_{n+1}}) - \hat{g}'_y(t_{n+1},\hat{y}^{n+1})\,\hat{g}(t_{n+1},\hat{y}^{n+1}))\,\Delta W_{t_{n+1}}]\,\Delta B_{t_n}\big\}
+ I^z_1 + I^z_2 + R^{n,i}_z,
where
I^z_1 = (E - \hat{E})[y_{t_{n+1}}\,\Delta W_{t_{n+1}}] + \Delta t\,\big\{(E - \hat{E})[f(t_{n+1},y_{t_{n+1}},z_{t_{n+1}})\,\Delta W_{t_{n+1}}]
+ \tfrac{1}{2}(E - \hat{E})[g'_B(t_{n+1},y_{t_{n+1}})\,\Delta W_{t_{n+1}}] - \tfrac{1}{2}(E - \hat{E})[g'_y(t_{n+1},y_{t_{n+1}})\,g(t_{n+1},y_{t_{n+1}})\,\Delta W_{t_{n+1}}]\big\}
+ \Delta B_{t_n}\,\big\{(E - \hat{E})[g(t_{n+1},y_{t_{n+1}})\,\Delta W_{t_{n+1}}] - \tfrac{1}{2}(E - \hat{E})[g'_B(t_{n+1},y_{t_{n+1}})\,\Delta W_{t_{n+1}}]\,\Delta B_{t_n}
+ \tfrac{1}{2}(E - \hat{E})[g'_y(t_{n+1},y_{t_{n+1}})\,g(t_{n+1},y_{t_{n+1}})\,\Delta W_{t_{n+1}}]\,\Delta B_{t_n}\big\},
I^z_2 = \hat{E}[(y_{t_{n+1}} - \hat{y}_{t_{n+1}})\,\Delta W_{t_{n+1}}] + \Delta t\,\big\{\hat{E}[(f(t_{n+1},y_{t_{n+1}},z_{t_{n+1}}) - \hat{f}(t_{n+1},\hat{y}_{t_{n+1}},\hat{z}_{t_{n+1}}))\,\Delta W_{t_{n+1}}]
+ \tfrac{1}{2}\hat{E}[(g'_B(t_{n+1},y_{t_{n+1}}) - \hat{g}'_B(t_{n+1},\hat{y}_{t_{n+1}}))\,\Delta W_{t_{n+1}}]
- \tfrac{1}{2}\hat{E}[(g'_y(t_{n+1},y_{t_{n+1}})\,g(t_{n+1},y_{t_{n+1}}) - \hat{g}'_y(t_{n+1},\hat{y}_{t_{n+1}})\,\hat{g}(t_{n+1},\hat{y}_{t_{n+1}}))\,\Delta W_{t_{n+1}}]\big\}
+ \Delta B_{t_n}\,\big\{\hat{E}[(g(t_{n+1},y_{t_{n+1}}) - \hat{g}(t_{n+1},\hat{y}_{t_{n+1}}))\,\Delta W_{t_{n+1}}]
- \tfrac{1}{2}\hat{E}[(g'_B(t_{n+1},y_{t_{n+1}}) - \hat{g}'_B(t_{n+1},\hat{y}_{t_{n+1}}))\,\Delta W_{t_{n+1}}]\,\Delta B_{t_n}
+ \tfrac{1}{2}\hat{E}[(g'_y(t_{n+1},y_{t_{n+1}})\,g(t_{n+1},y_{t_{n+1}}) - \hat{g}'_y(t_{n+1},\hat{y}_{t_{n+1}})\,\hat{g}(t_{n+1},\hat{y}_{t_{n+1}}))\,\Delta W_{t_{n+1}}]\,\Delta B_{t_n}\big\}.
For any function \varphi,
\big|E^{t_n,x_i}_{t_n}[\varphi] - \hat{E}^{t_n,x_i}_{t_n}[\varphi]\big| \le D_\varphi\,\frac{K!}{2^K\,(2K)!},
where D_\varphi is the upper bound for |\partial_x^{2K}\varphi|.
Since y, z, f, g, g'_B, g'_y \in C^{(2K)}_b, we have
E[(I^y_1)^2] \le C\,\Big(\frac{K!}{2^K\,(2K)!}\Big)^2
and
E[\max_i\,(e^{n,i}_y)^2] + \frac{\Delta t}{1+\epsilon}\,E[\max_i\,(e^{n,i}_z)^2]
\le (1 + C\,\Delta t)\,\Big(E[\max_i\,(e^{n+1,i}_y)^2] + \frac{\Delta t}{1+\epsilon}\,E[\max_i\,(e^{n+1,i}_z)^2]\Big)
+ 3E\big[(I^y_1 + I^y_2 + \max_i R^{n,i}_y)^2\big] + \frac{\big(E[I^y_1 + I^y_2 + \max_i R^{W,n,i}_y]\big)^2}{\Delta t}
+ C\,(\Delta t)^{-1}\,E\big[(I^z_1 + I^z_2 + \max_i R^{n,i}_z)^2\big].
Therefore,
E[\max_i\,(e^{n,i}_y)^2] + \frac{\Delta t}{1+\epsilon}\,E[\max_i\,(e^{n,i}_z)^2]
\le (1 + C\,\Delta t)\,\Big(E[\max_i\,(e^{n+1,i}_y)^2] + \frac{\Delta t}{1+\epsilon}\,E[\max_i\,(e^{n+1,i}_z)^2]\Big)
+ 3C\,\Big[\Big(\frac{K!}{2^K(2K)!}\Big)^2 + h^4 + (\Delta t)^3\Big]
+ C\,(\Delta t)^{-1}\,\Big[\Big(\frac{K!}{2^K(2K)!}\Big)^2 + h^4 + (\Delta t)^4\Big]
+ C\,(\Delta t)^{-1}\,C\,\Big[\Big(\frac{K!}{2^K(2K)!}\Big)^2 + h^4 + (\Delta t)^4\Big]
= (1 + C\,\Delta t)\,\Big(E[\max_i\,(e^{n+1,i}_y)^2] + \frac{\Delta t}{1+\epsilon}\,E[\max_i\,(e^{n+1,i}_z)^2]\Big)
+ C\,(\Delta t)^{-1}\,C\,\Big[\Big(\frac{K!}{2^K(2K)!}\Big)^2 + h^4 + (\Delta t)^4\Big].   (5.138)
By applying the discrete Gronwall inequality, we obtain
E[\max_i\,(e^{n,i}_y)^2] + \frac{\Delta t}{1+\epsilon}\,E[\max_i\,(e^{n,i}_z)^2] \le \sum_{j=n}^{N_T} C\,(\Delta t)^{-1}\,C\,\Big[\Big(\frac{K!}{2^K(2K)!}\Big)^2 + h^4 + (\Delta t)^4\Big].
Thus,
\max_{0\le n\le N_T-1} E[\max_i\,(e^{n,i}_y)^2] \le C\,\Big[(\Delta t)^{-2}\Big(\frac{K!}{2^K(2K)!}\Big)^2 + \frac{h^4}{(\Delta t)^2} + (\Delta t)^2\Big]
and
\max_{0\le n\le N_T-1} E[\max_i\,(e^{n,i}_z)^2] \le C\,\Big[(\Delta t)^{-3}\Big(\frac{K!}{2^K(2K)!}\Big)^2 + \frac{h^4}{(\Delta t)^3} + \Delta t\Big],
as required.
Remark 3 Notice that the error terms (\Delta t)^{-2}\big(\frac{K!}{2^K(2K)!}\big)^2 and (\Delta t)^{-3}\big(\frac{K!}{2^K(2K)!}\big)^2 decrease very rapidly as the parameter K becomes large.
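The decay claimed in Remark 3 is easy to tabulate; a short sketch:

```python
from math import factorial

# The quadrature-error factor K!/(2^K (2K)!) from the estimates above, K = 1..5.
terms = [factorial(K) / (2**K * factorial(2 * K)) for K in range(1, 6)]
print(terms)
# Each successive term shrinks by more than an order of magnitude
# (0.25, ~2.1e-2, ~1.0e-3, ~3.7e-5, ~1.0e-6), so even the (dt)^{-2}
# and (dt)^{-3} prefactors are quickly overwhelmed as K grows.
```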
5.3.6 Numerical experiment
We now demonstrate the effectiveness and accuracy of our method for solving the BDSDE (5.49). We will compare our first order scheme with the half order scheme (see [6]) in Example 1. In Example 2, we show an application of BDSDEs in solving a nonlinear filtering problem.

Table 5.3: Example 1 of Section 5.3.6

N_T    Error_{1/2}(Y)   Error_{1/2}(Z)   Error_1(Y)   Error_1(Z)
2^4    0.2899           0.4197           0.1036       0.1486
2^5    0.2162           0.3193           5.35E-02     7.74E-02
2^6    0.13478          0.1942           2.52E-02     3.74E-02
2^7    8.76E-02         0.1292           1.21E-02     1.81E-02
CR     0.5864           0.5817           1.0380       1.0162
Example 1. Consider the BDSDE
Y^{t,x}_t = \exp(T + W^{t,x}_T) + \int_t^T \big[\sin(s)\,Y^{t,x}_s\,Z^{t,x}_s - \sin(s)\,\exp(s + W^{t,x}_s + B_s - B_T)\big]\,ds
- \int_t^T Z^{t,x}_s\,dW_s - \int_t^T Y^{t,x}_s\,d\overleftarrow{B}_s.
The exact solution is Y^{t,x}_s = \exp(s + W^{t,x}_s + B_s - B_T), Z^{t,x}_s = \exp(s + W^{t,x}_s + B_s - B_T).
Numerical results are shown in Table 5.3. Here, the integer N_T is the number of temporal partitions, CR is the convergence rate, and Error_1(Y) and Error_1(Z) represent, respectively, the errors in the approximation of Y^{t,x}_t and Z^{t,x}_t by the first order scheme. Also, Error_{1/2}(Y) and Error_{1/2}(Z) are, respectively, the errors in the approximation of Y^{t,x}_t and Z^{t,x}_t by the half order scheme. The results verify the theoretical error estimates we obtained in Section 4.
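The convergence rates CR can be recovered from the tabulated errors: doubling N_T (halving \Delta t) should divide the error by about 2^p for a scheme of order p. A sketch using the Error(Y) columns of Table 5.3 (averaging successive rates; the CR row in the table may have been obtained by a slightly different fit, e.g. least-squares regression):

```python
import math

# Error(Y) columns of Table 5.3, for N_T = 2^4, ..., 2^7.
err_half  = [0.2899, 0.2162, 0.13478, 8.76e-2]   # half order scheme
err_first = [0.1036, 5.35e-2, 2.52e-2, 1.21e-2]  # first order scheme

def avg_rate(errs):
    # Observed order between successive refinements: log2(e_N / e_2N).
    rates = [math.log2(errs[i] / errs[i + 1]) for i in range(len(errs) - 1)]
    return sum(rates) / len(rates)

print(round(avg_rate(err_half), 2), round(avg_rate(err_first), 2))
# close to the tabulated CR values 0.5864 and 1.0380
```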
Example 2. We present a practical example which illustrates the application of our numerical method to a nonlinear filtering problem. This example is a classical "bearing-only tracking" problem (see [8]).
The simulation scenario is shown in Figure 5.3. Consider a target moving along the x-axis according to the state equation
dX_t = (4 + 2\sin X_t)\,dt + d\tilde{W}_t, \qquad X_0 = 40.

[Figure 5.3: Target tracking by using one detector. A platform on the y-axis measures the observation angle to a target moving along the x-axis with speed v = 4 + 2 sin x (m/s).]

A fixed observer located on a platform on the y-axis takes a noisy measurement O_t of the target bearing,
dO_t = \tan^{-1}\Big(\frac{20}{X_t}\Big)\,dt + d\tilde{B}_t, \qquad O_0 = 0.
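For reference, the state and observation processes above can be simulated by an Euler-Maruyama discretization. The sketch below is illustrative only: the step size and horizon are taken from the experiment described later in this example, and the drift is used with the sign as printed in the state equation:

```python
import numpy as np

rng = np.random.default_rng(0)
dt, T = 0.125, 12.0                 # observation interval and tracking horizon
n = int(T / dt)
X = np.empty(n + 1); O = np.empty(n + 1)
X[0], O[0] = 40.0, 0.0

for k in range(n):
    # Euler-Maruyama for dX = (4 + 2 sin X) dt + dW~, dO = arctan(20/X) dt + dB~
    X[k + 1] = X[k] + (4 + 2 * np.sin(X[k])) * dt + np.sqrt(dt) * rng.standard_normal()
    O[k + 1] = O[k] + np.arctan(20 / X[k]) * dt + np.sqrt(dt) * rng.standard_normal()
```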
Here, \tilde{W}_t and \tilde{B}_t are two independent standard Brownian motions. The goal is to find the best estimate of the target position X_t based on the observations up to time t, i.e. E[X_t \mid \mathcal{F}(O^t_0)], where \mathcal{F}(O^t_0) denotes the \sigma-algebra generated by the observations \{O_s\}_{0\le s\le t}. The initial guess of the position of the target is X_0 = 44.

[Figure 5.4: Tracking Estimate. The real state and the estimated state (position in meters) over time (seconds).]

One has the conditional density
[Figure 5.5: Probability Distribution. The simulated conditional probability density and the real target position after (a) 30 steps, (b) 60 steps, and (c) 90 steps.]
p(X_t \mid \mathcal{F}(O^t_0)) = Y^{T-t,X_t}_{T-t}, where Y^{T-t,x}_s satisfies the following BDSDE (see [61, ?]):
Y^{T-t,x}_s = Y^{T-t,x}_T + \int_s^T \big[2\cos(W^{T-t,x}_r)\,Y^{T-t,x}_r + (4 + 2\sin(W^{T-t,x}_r))\,Z^{T-t,x}_r\big]\,dr
- \int_s^T Z^{T-t,x}_r\,dW_r + \int_s^T \tan^{-1}\Big(\frac{20}{W^{T-t,x}_r}\Big)\,Y^{T-t,x}_r\,d\overleftarrow{O}_{T-r},
where T-t \le s \le T and x \in \mathbb{R}. The initial condition Y^{T-t,x}_T of the above equation is the normal distribution N(44, 1). The stochastic process W_t is a standard Brownian motion independent of the observation process O_t. Define the equivalent probability measure Q by
\frac{dQ}{dP}\Big|_{\mathcal{G}_t} = \exp\Big(\int_0^t \tan^{-1}\Big(\frac{20}{X_s}\Big)\,dO_s - \frac{1}{2}\int_0^t \Big|\tan^{-1}\Big(\frac{20}{X_s}\Big)\Big|^2\,ds\Big), \quad t \in \mathbb{R}_+,
where \mathcal{G}_t is the \sigma-algebra generated by \{X_s\}_{0\le s\le t} and \{O_s\}_{0\le s\le t}, t \in \mathbb{R}_+. Under the probability measure Q, W_t and O_t are two independent standard Brownian motions (see [?]).
In our simulation example, we assume that the observations are collected at intervals of length \Delta t = 0.125s and we track this target for 12s. Figure 5.4 compares the simulation results with the real target state. Figure 5.5 shows distributions of the simulated conditional probability p(X_t \mid \mathcal{F}(O^t_0)). We present the simulation results at three time levels: t = 3.75s (30\Delta t), 7.5s (60\Delta t), and 11.25s (90\Delta t). The red lines represent the real target positions and the blue curves show the simulated distributions.
Chapter 6
Concluding Remarks
The major contribution of this work is the development of three novel numerical approximation methods for the solution of nonlinear filtering problems: (i) an implicit filter method; (ii) a hybrid sparse grid Zakai filter method; (iii) the FBDSDE filter method.
Our first effort in this work is the development of an implicit filter algorithm for nonlinear filtering problems. This method falls under the general framework of the Bayesian filter. The essential idea is to evaluate the probability distribution of the current state in the prediction step by evaluating the previous state through the state equation, given the value of the current state and a sample of the noise. Through rigorous analysis we proved the convergence of the algorithm. Numerical experiments indicate that our method is more accurate than the standard Kalman filter and more stable than the standard particle filter method for long term simulations. It should be pointed out that our method is a grid method in which the probability distributions are evaluated at all grid points. As such, the computing cost increases exponentially as the dimension increases, so our method in its current form is suitable only for low dimensional problems such as target tracking problems. In the future, we plan to improve the algorithm by adding an adaptive mechanism and a new interpolation method, the "radial basis approximation", so that it will be more efficient in solving higher dimensional problems.
The second effort in this work focuses on the Zakai filter. We proposed a hybrid numerical algorithm to improve the efficiency of the Zakai filter for moderately high dimensional nonlinear filtering problems. Our algorithm combines the advantages of the splitting-up approximation scheme for the Zakai filter, a hierarchical sparse grid method to compute the numerical solution of the Zakai equation in moderately high dimensions, and an importance sampling method that adaptively constructs a bounded solution domain at each time step of the temporal discretization. The hierarchical sparse grid method reduces the number of grid points needed in the splitting-up approximation scheme for the Zakai filter. The solution domain adaptive method allows us to solve the Zakai equation on the high density region of the target state pdf, which reduces the size of the domains on which we build the sparse grid points. We used one numerical experiment to show the effectiveness of our solution domain adaptive method and two other examples to demonstrate that our algorithm is more efficient than the standard Zakai filter while maintaining its high accuracy.
In addition, we proposed two numerical schemes for computing backward doubly stochastic differential equations (BDSDEs). Because of the equivalence between BDSDEs and SPDEs, our schemes can also be used to find numerical solutions of the Zakai equation; thus our algorithms also provide numerical methods for solving nonlinear filtering problems. It is worth mentioning that when solving BDSDEs we only need to solve stochastic ordinary differential equations, instead of the SPDEs arising in the Zakai filter. In this connection, high order numerical approximation methods for stochastic ordinary differential equations can be applied to develop high order numerical schemes for BDSDEs. The first numerical scheme we proposed in this work provides a half order approximation method for BDSDEs, derived by using the Euler scheme to discretize the stochastic integrals. The second numerical scheme gives a first order approximation method. The main idea of the first order scheme is to use the two-sided Itô-Taylor expansion for forward and backward stochastic differentials to construct high order quadratures for the stochastic integrals involving both backward and forward Brownian motions. Through rigorous analysis, we proved the convergence rates of both the half order scheme and the first order scheme. Although we only introduced a first order numerical scheme in this work, higher order schemes can also be developed using higher order Itô-Taylor expansions.
Bibliography
[1] N. U. Ahmed and S. M. Radaideh. A powerful numerical technique solving Zakai equation for nonlinear filtering. Dynam. Control, 7(3):293–308, 1997.
[2] K. J. Åström, G. C. Goodwin, and P. R. Kumar, editors. Adaptive control, filtering, and signal processing, volume 74 of The IMA Volumes in Mathematics and its Applications. Springer-Verlag, New York, 1995.
[3] Tuncer Can Aysal and Kenneth E. Barner. Meridian filtering for robust signal processing. IEEE Trans. Signal Process., 55(8):3949–3962, 2007.
[4] Syed Baker, Hart Poskar, Falk Schreiber, and Björn Junker. An improved constraint filtering technique for inferring hidden states and parameters of a biological model. Bioinformatics, 29:1052–1059, 2013.
[5] V. Bally. Approximation scheme for solutions of BSDE. In Backward stochastic differential equations (Paris, 1995–1996), volume 364 of Pitman Res. Notes Math. Ser., pages 177–191. Longman, Harlow, 1997.
[6] Feng Bao, Yanzhao Cao, and Weidong Zhao. Numerical solutions for forward backward doubly stochastic differential equations and Zakai equations. International Journal for Uncertainty Quantification, 1(4):351–367, 2011.
[7] Yaakov Bar-Shalom and Thomas E. Fortmann. Tracking and data association, volume 179 of Mathematics in Science and Engineering. Academic Press Inc., San Diego, CA, 1988.
[8] Yaakov Bar-Shalom and Thomas E. Fortmann. Tracking and data association, volume 179 of Mathematics in Science and Engineering. Academic Press Inc., San Diego, CA, 1988.
[9] Christian Bender and Jessica Steiner. A posteriori estimates for backward SDEs. SIAM/ASA J. Uncertainty Quantification, 1(1):139–163, 2013.
[10] A. Bensoussan, R. Glowinski, and A. Răşcanu. Approximation of some stochastic differential equations by the splitting up method. Appl. Math. Optim., 25(1):81–106, 1992.
[11] Alain Bensoussan, Jussi Keppo, and Suresh P. Sethi. Optimal consumption and portfolio decisions with partially observed real prices. Math. Finance, 19(2):215–236, 2009.
[12] A. V. Bernshteĭn and A. P. Kuleshov. Optimal filtering of a random background in image processing problems. Problemy Peredachi Informatsii, 44(3):70–80, 2008.
[13] Jose Bins, Leandro L. Dihl, and Claudio R. Jung. Target tracking using multiple patches and weighted vector median filters. J. Math. Imaging Vision, 45(3):293–307, 2013.
[14] AlMuatazbellah M. Boker and Hassan K. Khalil. Nonlinear observers comprising high-gain observers and extended Kalman filters. Automatica J. IFAC, 49(12):3583–3590, 2013.
[15] Miodrag Bolić, Petar M. Djurić, and Sangjin Hong. Resampling algorithms and architectures for distributed particle filters. IEEE Trans. Signal Process., 53(7):2442–2450, 2005.
[16] Bruno Bouchard and Nizar Touzi. Discrete-time approximation and Monte-Carlo simulation of backward stochastic differential equations. Stochastic Process. Appl., 111(2):175–206, 2004.
[17] Hans-Joachim Bungartz and Michael Griebel. Sparse grids. Acta Numer., 13:147–269, 2004.
[18] Zhe Chen. Bayesian filtering: from Kalman filters to particle filters, and beyond. Unpublished manuscript, 2011.
[19] D. Chevance. Numerical methods for backward stochastic differential equations. In Numerical methods in finance, Publ. Newton Inst., pages 232–244. Cambridge Univ. Press, Cambridge, 1997.
[20] J. M. C. Clark, S. A. Robbiati, and R. B. Vinter. The shifted Rayleigh mixture filter for bearings-only tracking of maneuvering targets. IEEE Trans. Signal Process., 55(7, part 1):3218–3226, 2007.
[21] D. Crisan and O. Obanubi. Particle filters with random resampling times. Stochastic Process. Appl., 122(4):1332–1368, 2012.
[22] Dan Crisan. Exact rates of convergence for a branching particle approximation to the solution of the Zakai equation. Ann. Probab., 31(2):693–718, 2003.
[23] Dan Crisan. Exact rates of convergence for a branching particle approximation to the solution of the Zakai equation. Ann. Probab., 31(2):693–718, 2003.
[24] Dan Crisan and Jie Xiong. A central limit type theorem for a class of particle filters. Commun. Stoch. Anal., 1(1):103–122, 2007.
[25] D. Crisan and A. Doucet. A survey of convergence results on particle filtering methods for practitioners. IEEE Trans. Signal Process.
[26] D. H. Dini, C. Jahanchahi, and D. P. Mandic. Kalman filtering for widely linear complex and quaternion valued bearings only tracking. IET Signal Process., 6(5):435–445, 2012.
[27] Jindřich Duník, Miroslav Šimandl, and Ondřej Straka. Unscented Kalman filter: aspects and adaptive setting of scaling parameter. IEEE Trans. Automat. Control, 57(9):2411–2416, 2012.
[28] Garry A. Einicke. Asymptotic optimality of the minimum-variance fixed-interval smoother. IEEE Trans. Signal Process., 55(4):1543–1547, 2007.
[29] Bertil Ekstrand. Some aspects on filter design for target tracking. J. Control Sci. Eng., pages Art. ID 870890, 15, 2012.
[30] Patrick Florchinger and François Le Gland. Time-discretization of the Zakai equation for diffusion processes observed in correlated noise. Stochastics Stochastics Rep., 35(4):233–256, 1991.
[31] Rüdiger Frey and Thorsten Schmidt. Pricing corporate securities under noisy asset information. Math. Finance, 19(3):403–421, 2009.
[32] Rüdiger Frey, Thorsten Schmidt, and Ling Xu. On Galerkin approximations for the Zakai equation with diffusive and point process observations. SIAM J. Numer. Anal., 51(4):2036–2062, 2013.
[33] Paul Frogerais, Jean-Jacques Bellanger, and Lotfi Senhadji. Various ways to compute the continuous-discrete extended Kalman filter. IEEE Trans. Automat. Control, 57(4):1000–1004, 2012.
[34] Emmanuel Gobet and Céline Labart. Error expansion for the discretization of backward stochastic differential equations. Stochastic Process. Appl., 117(7):803–829, 2007.
[35] Emmanuel Gobet, Jean-Philippe Lemor, and Xavier Warin. A regression-based Monte Carlo method to solve backward stochastic differential equations. Ann. Appl. Probab., 15(3):2172–2202, 2005.
[36] Emmanuel Gobet, Gilles Pagès, Huyên Pham, and Jacques Printems. Discretization and simulation of the Zakai equation. SIAM J. Numer. Anal., 44(6):2505–2538 (electronic), 2006.
[37] H. Gong, J. Singh, and A. Thavaneswaran. Filtering and option pricing with transformation. J. Appl. Statist. Sci., 17(3):363–375, 2010.
[38] N. J. Gordon, D. J. Salmond, and A. F. M. Smith. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings-F, 140(2):107–113, 1993.
[39] Y. Hu, G. Kallianpur, and J. Xiong. An approximation for the Zakai equation. Appl. Math. Optim., 45(1):23–44, 2002.
[40] Jeffrey Humpherys, Preston Redd, and Jeremy West. A fresh look at the Kalman filter. SIAM Rev., 54(4):801–823, 2012.
[41] John D. Jakeman, Richard Archibald, and Dongbin Xiu. Characterization of discontinuities in high-dimensional stochastic problems on adaptive sparse grids. J. Comput. Phys., 230(10):3977–3997, 2011.
[42] Andrew H. Jazwinski. Stochastic Processes and Filtering Theory, volume 64. Academic Press, New York, 1973.
[43] Simon J. Julier and Joseph J. LaViola, Jr. On Kalman filtering with nonlinear equality constraints. IEEE Trans. Signal Process., 55(6, part 2):2774–2784, 2007.
[44] Xiong Kai, Liu Liangdong, and Liu Yiwu. Robust extended Kalman filtering for nonlinear systems with multiplicative noises. Optimal Control Appl. Methods, 32(1):47–63, 2011.
[45] R. E. Kalman and R. S. Bucy. New results in linear filtering and prediction theory. Trans. ASME Ser. D. J. Basic Engrg., 83:95–108, 1961.
[46] A. N. Kolmogoroff. Stationary sequences in Hilbert's space. Bolletin Moskovskogo Gosudarstvenogo Universiteta. Matematika, 2:40pp, 1941.
[47] Gennady Yu. Kulikov and Maria V. Kulikova. Accurate numerical implementation of the continuous-discrete extended Kalman filter. IEEE Trans. Automat. Control, 59(1):273–279, 2014.
[48] François Le Gland. Time discretization of nonlinear filtering equations. In Proceedings of the 28th IEEE Conference on Decision and Control, Vol. 1–3 (Tampa, FL, 1989), pages 2601–2606, New York, 1989. IEEE.
[49] Binghuai Lin and Dennis McLaughlin. Efficient characterization of uncertain model parameters with a reduced-order ensemble Kalman filter. SIAM J. Sci. Comput., 36(2):B198–B224, 2014.
[50] G. Lin, A. M. Tartakovsky, and D. M. Tartakovsky. Uncertainty quantification via random domain decomposition and probabilistic collocation on sparse grids. J. Comput. Phys., 229(19):6995–7012, 2010.
[51] J. Ma and J. Yong. Approximate solvability of forward-backward stochastic differential equations. Appl. Math. Optim., 45(1):1–22, 2002.
[52] Jin Ma, Philip Protter, Jaime San Martín, and Soledad Torres. Numerical method for backward stochastic differential equations. Ann. Appl. Probab., 12(1):302–316, 2002.
[53] Jin Ma, Philip Protter, and Jiong Min Yong. Solving forward-backward stochastic differential equations explicitly – a four step scheme. Probab. Theory Related Fields, 98(3):339–359, 1994.
[54] Jin Ma, Jie Shen, and Yanhong Zhao. On numerical approximations of forward-backward stochastic differential equations. SIAM J. Numer. Anal., 46(5):2636–2661, 2008.
[55] Xiang Ma and Nicholas Zabaras. An adaptive hierarchical sparse grid collocation algorithm for the solution of stochastic differential equations. J. Comput. Phys., 228(8):3084–3113, 2009.
[56] C. Johan Masreliez and R. Douglas Martin. Robust Bayesian estimation for the linear model and robustifying the Kalman filter. IEEE Trans. Automatic Control, AC-22(3):361–371, 1977.
[57] Motoyasu Nagata and Yoshikazu Sawaragi. Error analysis of the Schmidt-Kalman filter. Internat. J. Systems Sci., 7(7):769–778, 1976.
[58] Boris N. Oreshkin and Mark J. Coates. Analysis of error propagation in particle filters with approximation. Ann. Appl. Probab., 21(6):2343–2378, 2011.
[59] Umut Orguner and Fredrik Gustafsson. Target tracking with particle filters under signal propagation delays. IEEE Trans. Signal Process., 59(6):2485–2495, 2011.
[60] E. Pardoux and P. Protter. A two-sided stochastic integral and its calculus. Probab. Theory Related Fields, 76(1):15–49, 1987.
[61] Étienne Pardoux and Shi Ge Peng. Backward doubly stochastic differential equations and systems of quasilinear SPDEs. Probab. Theory Related Fields, 98(2):209–227, 1994.
[62] Nigel George Peach. Bearings-only tracking using a set of range parameterised extended Kalman filters. ProQuest LLC, Ann Arbor, MI, 1997. Thesis (Ph.D.), University of Bath (United Kingdom).
[63] A. Peirce and F. Rochinha. An integrated extended Kalman filter-implicit level set algorithm for monitoring planar hydraulic fractures. Inverse Problems, 28(1):015009, 22, 2012.
[64] Dirk Pflüger. Spatially Adaptive Sparse Grids for High-Dimensional Problems. Thesis (Ph.D.), Technische Universität München, 2010.
[65] J. A. Pocock, S. L. Dance, and A. S. Lawless. State estimation using the particle filter with mode tracking. Comput. & Fluids, 46:392–397, 2011.
[66] Charles F. Price. An analysis of the divergence problem in the Kalman filter. IEEE Trans. Automatic Control, AC-13:699–702, 1968.
[67] Jacques Printems. On the discretization in time of parabolic stochastic partial differential equations. M2AN Math. Model. Numer. Anal., 35(6):1055–1078, 2001.
[68] Christoph Reisinger. Analysis of linear difference schemes in the sparse grid combination technique. IMA J. Numer. Anal., 33(2):544–581, 2013.
[69] Robert J. Elliott and Tak Kuen Siu. Option pricing and filtering with hidden Markov-modulated pure-jump processes. Appl. Math. Finance, 20(1):1–25, 2013.
135
[70] Jie Shen and Haijun Yu. E cient spectral sparse grid methods and applications to
high-dimensional elliptic equations II. Unbounded domains. SIAM J. Sci. Comput.,
34(2):A1141{A1164, 2012.
[71] S. Singh, S. Digumarthy, A Back, J. Shepard, and M Kalra. Radiation dose reduction
for chest ct with non-linear adaptive lters. Acta Radiologica, 55(2):169{174, 2013.
[72] M. W. A. Smith and A. P. Roberts. An exact equivalence between the discrete-
and continuous-time formulations of the Kalman lter. Math. Comput. Simulation,
20(2):102{109, 1978.
[73] Subchan and Tahiyatul Asfihani. The missile guidance estimation using extended
Kalman filter-unknown input-without direct feedthrough (EKF-UI-WDF) method. J.
Indones. Math. Soc., 19(1):1–14, 2013.
[74] Ali Akbar Tadaion, Mostafa Derakhtian, Saeed Gazor, and Mohammad R. Aref. A fast
multiple-source detection and localization array signal processing algorithm using the
spatial filtering and ML approach. IEEE Trans. Signal Process., 55(5, part 1):1815–
1827, 2007.
[75] Krystyna Twardowska, Tomasz Marnik, and Monika Pasławska-Połuniak. Approximation
of the Zakai equation in a nonlinear filtering problem with delay. Int. J. Appl.
Math. Comput. Sci., 13(2):151–160, 2003.
[76] Yener Ulker and Bilge Gunsel. Multiple model target tracking with variable rate particle
filters. Digit. Signal Process., 22(3):417–429, 2012.
[77] Xin Wang and Shu-Li Sun. Measurement feedback self-tuning weighted measurement
fusion Kalman filter for systems with correlated noises. J. Appl. Math., Art. ID 324296,
16 pp., 2012.
[78] Nick Whiteley and Anthony Lee. Twisted particle filters. Ann. Statist., 42(1):115–141,
2014.
[79] N. Wiener and E. Hopf. On a class of singular integral equations. Proc. Prussian Acad.
Math.-Phys. Ser., page 696, 1931.
[80] Norbert Wiener. Extrapolation, Interpolation, and Smoothing of Stationary Time Series.
With Engineering Applications. The Technology Press of the Massachusetts Institute of
Technology, Cambridge, Mass., 1949.
[81] Dongbin Xiu and Jan S. Hesthaven. High-order collocation methods for differential
equations with random inputs. SIAM J. Sci. Comput., 27(3):1118–1139 (electronic),
2005.
[82] Moshe Zakai. On the optimal filtering of diffusion processes. Z. Wahrscheinlichkeits-
theorie und Verw. Gebiete, 11:230–243, 1969.
[83] Andreas Zeiser. Fast matrix-vector multiplication in the sparse-grid Galerkin method.
J. Sci. Comput., 47(3):328–346, 2011.
[84] Guannan Zhang and Max Gunzburger. Error analysis of a stochastic collocation method
for parabolic partial differential equations with random input data. SIAM J. Numer.
Anal., 50(4):1922–1940, 2012.
[85] H. Zhang and D. Laneuville. Grid based solution of Zakai equation with adaptive local
refinement for bearing-only tracking. In IEEE Aerospace Conference, 2008.
[86] Jianfeng Zhang. A numerical scheme for BSDEs. Ann. Appl. Probab., 14(1):459–488,
2004.
[87] Weidong Zhao, Lifeng Chen, and Shige Peng. A new kind of accurate numerical method
for backward stochastic differential equations. SIAM J. Sci. Comput., 28(4):1563–1581,
2006.
[88] Weidong Zhao, Yang Li, and Yu Fu. Second-order schemes for solving decoupled forward
backward stochastic differential equations. Sci. China Math., 57(4):665–686, 2014.
[89] Weidong Zhao, Yang Li, and Guannan Zhang. A generalized θ-scheme for solving backward
stochastic differential equations. Discrete Contin. Dyn. Syst. Ser. B, 17(5):1585–
1603, 2012.
[90] Weidong Zhao, Guannan Zhang, and Lili Ju. A stable multistep scheme for solving
backward stochastic differential equations. SIAM J. Numer. Anal., 48(4):1369–1394,
2010.
[91] Y. Zhao, H. M. Guo, and S. A. Billings. Application of totalistic cellular automata for
noise filtering in image processing. J. Cell. Autom., 7(3):207–221, 2012.