LEBESGUE APPROXIMATION OF SUPERPROCESSES by Xin He A dissertation submitted to the Graduate Faculty of Auburn University in partial ful llment of the requirements for the Degree of Doctor of Philosophy Auburn, Alabama August 03, 2013 Keywords: Super-Brownian motion, measure-valued branching processes, deterministic distribution properties Copyright 2013 by Xin He Approved by Olav Kallenberg, Chair, Professor of Mathematics Ming Liao, Professor of Mathematics Jerzy Szulga, Professor of Mathematics Erkan Nane, Associate Professor of Mathematics Abstract Superprocesses are certain measure-valued Markov processes, whose distributions can be characterized by two components: the branching mechanism and the spatial motion. It is well known that some basic superprocesses are scaling limits of various random spatially distributed systems near criticality. We consider the Lebesgue approximation of superprocesses. The Lebesgue approxima- tion means that the processes at a xed time can be approximated by suitably normalized restrictions of Lebesgue measure to the small neighborhoods of their support. From this, we see that the processes distribute their mass over their support in a deterministic and \uniform" manner. It is known that the Lebesgue approximation holds for the most basic Dawson{Watanabe superprocesses but fails for certain superprocesses with discontinuous spatial motion. In this dissertation we rst prove that the Lebesgue approximation holds for superpro- cesses with Brownian spatial motion and a stable branching mechanism. Then we generalize the Lebesgue approximation even further to superprocesses with Brownian spatial motion and a regularly varying branching mechanism. We believe that the Lebesgue approxima- tion holds for superprocesses with Brownian spatial motion and any \reasonable" branching mechanism. Our present results may be regarded as some progress towards a complete proof of this very general conjecture. ii Acknowledgments I owe my deepest gratitude to my advisor Dr. Olav Kallenberg, who has put lots of his time and e ort in trying to develop me into a proper researcher in probability. His taste, style, and work ethic have all in uenced me immensely, and they will continue to in uence me in the future. I also sincerely thank Dr. Ming Liao, Dr. Jerzy Szulga, and Dr. Erkan Nane for serving in my committee and providing me much help when needed. Finally, I am grateful for the moral support of my family: My parents Anqing He and Ruizhen Hao, my sister Ni He, and my wife Yajie Chu. I am extremely lucky to have them and I simply can not imagine my life without them. iii Table of Contents Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii List of Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 A short introduction to superprocesses . . . . . . . . . . . . . . . . . . . . . 1 1.2 Summary of contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2 Some Basics of Superprocesses . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.1 Moment measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.2 Cluster representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.3 Historical superprocesses and random snakes . . . . . . . . . . . . . . . 
. . . 13 2.4 Hausdor dimensions and Hausdor measures . . . . . . . . . . . . . . . . . 16 2.5 Lebesgue approximations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 3 Lebesgue Approximation of Dawson-Watanabe Superprocesses . . . . . . . . . . 26 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 3.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 3.3 Lebesgue approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 3.4 Proofs of lemmas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 4 Lebesgue Approximation of (2; )-Superprocesses . . . . . . . . . . . . . . . . . 44 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 4.2 Truncated superprocesses and local niteness . . . . . . . . . . . . . . . . . . 46 4.3 Hitting bounds and neighborhood measures . . . . . . . . . . . . . . . . . . 54 4.4 Hitting asymptotics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 4.5 Lebesgue approximations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 iv 5 Lebesgue Approximation of Superprocesses with a Regularly Varying Branching Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 5.2 Truncation of superprocesses . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 5.3 Hitting bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 5.4 Hitting asymptotics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 5.5 Lebesgue approximations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 v List of Notation x Dirac measure at x =_ equality up to a constant factor ^Md space of nite measures on Rd d Lebesgue measure on Rd <_ inequality up to a constant factor f integral of function f with respect to measure " neighborhood measure of , de ned as the restriction of Lebesgue measure d to the "-neighborhood of supp A closure of set A kfk supremum norm of function f k k total variation of measure = (v) branching mechanism _ the combination of <_ and >_ = ( t) superprocess or discrete spatial branching process A" "-neighborhood of set A, A" =fx : d(x;A) <"g Ac compliment of set A Brx an open ball around x of radius r vi f g f g!0 f g f=g!1 Md space of - nite measures on Rd v! vague convergence of measures w! weak convergence of measures vii Chapter 1 Introduction 1.1 A short introduction to superprocesses In this section we give a short introduction to superprocesses. Three characterizations of superprocesses will be given. They are the Laplace functional approach, the weak conver- gence approach, and the martingale problem approach. Superprocesses were introduced by Watanabe [49] in 1968 and Dawson [3] in 1975, and have been studied extensively ever since. General surveys of superprocesses include the following excellent monographs and lecture notes: Dawson [4, 5], Dynkin [15, 16], Etheridge [17], Le Gall [32], Li [35], and Perkins [42]. Two extremely informative yet concise and very accessible introductions of superprocesses are Perkins [43] and Slade [46]. First let us explain the two de ning components of a superprocess: the branching mech- anism and the spatial motion. 
We begin with branching processes, which contain only one component of superprocesses: the branching mechanism. Galton-Watson processes are discrete branching processes. They describe the evolution in discrete time of a population of individuals who reproduce according to an offspring distribution, which is a probability measure on the nonnegative integers with expectation 1 (we only consider the critical case in this introduction). The distribution of a Galton-Watson process is determined by this offspring distribution. Continuous-state branching processes are continuous analogues of the Galton-Watson branching processes. Roughly speaking, they describe the evolution in continuous time of a "population" with values in the positive real line R₊. The "population" consists of uncountably many "individuals", if its value is not 0. The distribution of a continuous-state branching process is determined by a function of the following type (again, we only consider the critical case in this introduction, so there is no drift term here)

    ψ(v) = a v² + ∫₀^∞ (e^{−rv} − 1 + rv) ν(dr),    (1.1)

where a ≥ 0 and ν is a σ-finite measure on (0,∞) such that ∫₀^∞ (r ∧ r²) ν(dr) < ∞. This function ψ is called the branching mechanism. Continuous-state branching processes may also be obtained as weak limits of rescaled Galton-Watson processes, see (1.7). This is closely related to the weak convergence approach to superprocesses, see (1.8).

Spatial branching processes are obtained by combining the branching phenomenon with a spatial motion, which is usually given by a Markov process X. In the discrete setting, the branching phenomenon is a Galton-Watson process, and the individuals move independently in space according to the law of X. More precisely, when an individual dies at position x, her children begin to move from the initial point x, and they move in space independently according to the law of X. Writing Y¹_t, Y²_t, … for the positions of all individuals alive at time t, we may define

    ξ_t = Σ_i δ_{Y^i_t},    (1.2)

where δ_y denotes the Dirac measure at y. The process ξ = (ξ_t, t ≥ 0) is the spatial branching process corresponding to the branching phenomenon of a Galton-Watson process and the spatial motion X. Note that this is a measure-valued process, whose value at time t records the positions of all individuals alive at time t.

In the continuous setting, the branching phenomenon is a continuous-state branching process with branching mechanism ψ. The construction of the spatial motions is harder, and here we proceed only heuristically. For mathematical support of these heuristics, refer to the weak convergence approach later in this section (see (1.7) and (1.8)), the cluster representation in Section 2.2, and historical superprocesses and random snakes in Section 2.3. Here we let the "individuals" move independently in space according to the law of a Markov process X. Thus when an "individual" dies at the position x, her "children" begin to move from the initial point x, and they move in space independently according to the law of X. Again we get a measure-valued process ξ = (ξ_t, t ≥ 0), whose value at time t records the positions of all "individuals" alive at time t. This measure-valued process ξ = (ξ_t) is called the (X,ψ)-superprocess (or (X,ψ)-process, for short).

Superprocesses are measure-valued Markov processes. We first use the Laplace functional approach to characterize their distributions. For an (X,ψ)-process ξ on R^d, the spatial motion X is a Markov process in R^d. Use μf to denote the integral of the function f with respect to the measure μ.
Write P_μ(ξ ∈ ·) for the distribution of the process ξ with initial measure μ, and E_μ for the expectation corresponding to P_μ. The Laplace functional E_μ exp(−ξ_t f) satisfies

    E_μ[exp(−ξ_t f) | ξ_s] = exp(−ξ_s v_{t−s}),    (1.3)

where v_t(x), t ≥ 0, x ∈ R^d, is the unique nonnegative solution of the integral equation

    v_t(x) + Π_x ∫₀^t ψ(v_{t−s}(X_s)) ds = Π_x f(X_t).    (1.4)

Here we write Π_x(X ∈ ·) for the distribution of the process X starting from x. If X is a Feller process in R^d with generator L, the integral equation (1.4) is the integral form of the following PDE, the so-called evolution equation

    v̇ = Lv − ψ(v)    (1.5)

with initial condition v₀ = f. More explicitly, the PDE (1.5) means

    ∂v_t/∂t (x) = (L v_t)(x) − ψ(v_t(x)).

For the equivalence of the integral equation (1.4) and the differential equation (1.5), see Section 7.1 in [35]. If X is a rotation invariant (or spherically symmetric, or isotropic) α-stable Lévy process in R^d for some α ∈ (0,2] (see Definition 14.12 and Theorem 14.14 in [45]) and ψ(v) = c v^{1+β} for some constant c > 0 and some β ∈ (0,1], we get a superprocess corresponding to the PDE

    v̇ = ½ Δ_α v − c v^{1+β},

where ½ Δ_α is the generator of the rotation invariant α-stable process X (see Theorem 19.10 in [24]), and Δ_α = −(−Δ)^{α/2} is the fractional Laplacian (Δ₂ = Δ is the Laplacian, see Section 2.6 in [39]). Taking c = 1 in the above PDE, we get a superprocess corresponding to the PDE

    v̇ = ½ Δ_α v − v^{1+β}.    (1.6)

We call it the (α,β)-superprocess ((α,β)-process for short). For the most basic and most important superprocess, we take α = 2 and β = 1 to get a (2,1)-process, which is often called the Dawson–Watanabe superprocess (DW-process for short). Clearly a DW-process has Brownian spatial motion and branching mechanism ψ(v) = v². We may abuse the notation further by referring to (α,ψ)-processes and (X,β)-processes. Specifically, an (α,ψ)-process has rotation invariant α-stable spatial motion and branching mechanism ψ, and an (X,β)-process has spatial motion X and branching mechanism ψ(v) = v^{1+β}.

Next we move to the weak convergence approach, which is the most intuitive way to define superprocesses. Just as continuous-state branching processes may be obtained as weak limits of rescaled Galton-Watson processes (see (1.7)), superprocesses can be obtained as weak limits of rescaled discrete spatial branching processes (see (1.8)). Recall that we can get intuition about Brownian motion from rescaled random walks; similarly, here we may get some intuition about superprocesses from rescaled discrete spatial branching processes. We consider a sequence N^k, k ≥ 1, of Galton-Watson processes such that as k → ∞,

    (a_k^{−1} N^k_{[kt]}, t ≥ 0) →_fd (Z_t, t ≥ 0),    (1.7)

where the constants a_k ↑ ∞, Z is a continuous-state branching process with branching mechanism ψ, and the symbol →_fd means weak convergence of finite-dimensional marginals. Then, according to (1.2), we consider a sequence ξ^k, k ≥ 1, of spatial branching processes corresponding to the Galton-Watson processes N^k, k ≥ 1, and the spatial motion X. Clearly ξ^k_t is a random element with values in the space of finite measures on R^d, equipped with the topology of weak convergence. Now, according to (1.7), we consider the sequence of rescaled spatial branching processes a_k^{−1} ξ^k_{[k·]}, k ≥ 1. Suppose that the initial measures converge as k → ∞ (→_w denotes weak convergence):

    a_k^{−1} ξ^k_0 →_w μ,

where μ is a finite measure on R^d. Finally, under adequate regularity assumptions on the spatial motion X, there exists a measure-valued Markov process ξ such that

    (a_k^{−1} ξ^k_{[kt]}, t ≥ 0) →_fd (ξ_t, t ≥ 0),    (1.8)

where ξ is an (X,ψ)-process with initial measure μ.
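To build some intuition for (1.2), (1.7), and (1.8), the following minimal simulation sketch (an illustration added here, not part of the construction above) generates a discrete spatial branching process with a critical binary offspring law and Gaussian displacements, and tracks the normalized total mass. The binary offspring law, the generation length 1/k, and the normalization a_k = k are illustrative assumptions; any critical offspring law with finite variance would serve equally well.

import numpy as np

rng = np.random.default_rng(0)

def spatial_branching(k, t=1.0, d=2):
    """Critical binary branching random walk started from k particles at 0.
    Each generation lasts 1/k time units; every particle makes a Gaussian
    step of variance 1/k and then leaves 0 or 2 children with probability
    1/2 each (an illustrative choice of critical offspring law).
    Returns the particle positions alive at time t."""
    dt = 1.0 / k
    pos = np.zeros((k, d))
    for _ in range(int(t * k)):
        if len(pos) == 0:
            break
        pos = pos + rng.normal(scale=np.sqrt(dt), size=pos.shape)
        offspring = 2 * rng.integers(0, 2, size=len(pos))   # 0 or 2 children
        pos = np.repeat(pos, offspring, axis=0)
    return pos

# The normalized empirical measure k^{-1} sum_i delta_{Y^i_t} plays the role
# of a_k^{-1} xi^k_t in (1.8) with a_k = k; its total mass is the rescaled
# Galton-Watson process of (1.7).
for k in (100, 400, 1600):
    pos = spatial_branching(k)
    print(f"k = {k:5d}   normalized mass at t = 1: {len(pos) / k:.3f}")

As k grows, the law of the normalized mass stabilizes (its limit is a critical continuous-state branching process with ψ(v) proportional to v²), while the positions of the surviving particles approximate a sample from the limiting measure ξ_t.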
Finally, superprocesses can also be characterized as solutions to martingale problems. Chapter 7 in [35] is an excellent reference on martingale problems of very general superpro- cesses. We rst discuss a martingale problem of (X;1)-processes, where X is a Feller process in Rd with generator L. Write ^Md for the space of nite measures on Rd. Then write (D([0;1); ^Md); t;Ft) for the space of rcll ^Md-valued paths, the coordinate process, and the canonical completed right continuous ltration. For any f2D(L) (domain of generator L), de ne the process Mt(f) by Mt(f) = tf 0f Z t 0 s(Lf)ds: (1.9) 5 For any 2 ^Md, useL to denote the distribution of an (X;1)-process with initial measure . This is the unique distribution on F = (St 0Ft) such that the coordinate process satis es the following martingale problem: 0 = , and for any f2D(L), the process Mt(f) de ned in (1.9) is a continuous martingale with quadratic variation process [M(f)]t = Z t 0 s(f2)ds: For a (X; )-process, the corresponding martingale is not continuous in general. In this case, we may split the martingale into two parts: the continuous martingale Mct (f) and the purely discontinuous martingale Mdt (f) (see Theorem 26.14 in [24]). Then we write Mt(f) = Mct (f) +Mdt (f) = tf 0f Z t 0 s(Lf)ds; where Mct (f) is a continuous martingale with quadratic variation process [Mc(f)]t = Z t 0 s(af2)ds; (1.10) andMdt (f) is a purely discontinuous martingale, which can be de ned through a compensated random measure relating to the jumps of . For details, see Section 7.2 in [35]. Note that the jumps of are related to the measure in the branching mechanisam of (1.1), not the jumps of the spatial motion X (see Section 2.6 in [17]). We may also note that the continuous martingale Mct (f) is related to the term av2 in the branching mechanisam of (1.1) through its quadratic variation process [Mc(f)]t in (1.10). 6 1.2 Summary of contents The purpose of this dissertation is to discuss the Lebesgue approximation of superpro- cesses in details. In Section 1 we discussed the de nitions of superprocesses. Here we give a summary of the contents of the following chapters. In Chapter 2, we discuss some basic ingredients of superprocesses in the rst three sections, which are crucial for the Lebesgue approximation of superprocesses. Then we discuss the background of Lebesgue approximation and some related known results in the last two sections. In Section 1 we discuss the rst moment measure E t and the second moment measure E 2t . In particular, the second moment measure does not exist in general, which causes a real di culty for generalizing certain results. In Section 2 we discuss the very important cluster representation of superprocesses, which contains partial information of the whole genealogical evolution underlying superprocesses. This cluster representation is transparent in the discrete setting, however in the continuous setting it is not easy at all to obtain it rigorously. In Section 3 we discuss two approaches to encode the genealogical information and to obtain the cluster representation. They are Historical superprocesses approach and random snakes approach. In Section 4 we discuss some classical results about the Hausdor dimensions and Hausdor measures of superprocesses. The point is that the Hausdor measure approach is a more traditional, more successful way to do what the Lebesgue approximation approach tries to do: Construct nontrivial measures on some random null sets. 
Finally in Section 5 we discuss basic ideas of Lebesgue approximation and review almost all known Lebesgue approximation results. At the end of this section we also discuss some related open problems. In Chapter 3, we discuss the Lebesgue approximation of Dawson-Watanabe superpro- cesses of dimension d 3, which is the most basic and most transparent case. This chapter is based on Kallenberg?s proof of Lebesgue approximation of DW-processes of dimension d 3 in [25], with some technical simpli cations. Note that Tribe rst proved this result in [48]. Extra e orts haae been made to explain Kallenberg?s approach clearly and to make it more 7 accessible. In Section 1 we explain some crucial components in the proof and review some terminology and notation. In Section 2 we rst explain the crucial ideas about cluster rep- resentations, then state several lemmas which will not be used directly in the main proof of Lebesgue approximation, including the important upper bound of the hitting multiplicities. In Section 3 we state and prove the Lebesgue approximation for DW-processes of dimensions d 3. In order to do so, we list several lemmas that are needed in the main proof. Finally, in Section 4, we prove all the lemmas in this chapter. We suggest that the reader read the rst three sections in the linear order, then, when need arises, read the proofs of some lemmas in Section 4. In Chapter 4, we discuss the Lebesgue approximation of (2; )-superprocesses of di- mension d > 2= . This chapter is based on my 2013 paper [22]. In Section 1 we explain the additional di culties for the Lebesgue approximation of (2; )-processes and review our general approach, which overcomes these di culties. In Section 2 we develop further a trun- cation of ( ; )-processes from [38]. We also characterize the local niteness of any ( ; )- superprocess, which can be used to extend certain results to some superprocesses with - nite initial measures. In Section 3, we develop some lemmas about hitting bounds and neigh- borhood measures of (2; )-processes, in particular, we improve the upper bounds of hitting probabilities. In Section 4, we derive some asymptotic results of these hitting probabilities. In particular, for the (2; )-superprocess we show that "2= dP f tB"x > 0g!c ;d ( pt)(x), which extends the corresponding result for DW-processes. Finally in Section 5 we state and prove the Lebesgue approximation of (2; )-processes and their truncated processes. When- ever one feels the lack of details of some results in this chapter, refer back to appropriated places in Chapter 3. In Chapter 5, we discuss the Lebesgue approximation of superprocesses with a regu- larly varying branching mechanism. The branching mechanisms we consider here include the stable branching mechanisms considered in Chapter 4 as special cases. In Section 5.1 we explain the new di culties for the Lebesgue approximation of superprocesses with the 8 more general branching mechanism and review our general approach, which overcomes these di culties. In Section 2 we review the truncation of superprocesses in a more general setting. In Section 3, we develop some lemmas about hitting bounds and neighborhood measures of the more general superprocesses. In Section 4, we derive some asymptotic results of these hitting probabilities. Finally in Section 5 we state and prove the Lebesgue approximation of superprocesses with a regularly varying branching mechanism and their truncated processes. 
This general result contains all previous Lebesgue approximation of superprocesses as special cases. 9 Chapter 2 Some Basics of Superprocesses 2.1 Moment measures Moment measures play an important role in the study of superprocesses. For the Markov process X, write Ttf(x) = x(f(Xt)) for the semigroup of X, where x(X2 ) denotes the distribution of the process X starting from x. Then the rst moment measure of the (X; )- process (see (1.3) and (1.4)) is E ( tf) = (Ttf): (2.1) Note that the branching mechanism of (1.1) plays no role here. Write p t (x) for the transition density of the rotation invariant -stable L evy process with generator 12 (see (1.6)). Then the rst moment measure of the ( ; )-process takes the equivalent measure form E t = ( p t ) d; where p t (x) = R p t (x y) (dy) and f d denotes the measure de ned by (f d)(B) = R Bfd d. The second moment measure depends crucially on the branching mechanism. In fact, second moments do not exist in general. However, they do exist when the measure = 0 in the branching mechanism of (1.1), that is, for the (X; v2)-process . The second moment measure of the (X; v2)-process is E ( tf)2 = ( (Ttf))2 + 2 Z t 0 Ts(Tt sf)2 ds: (2.2) 10 Refer to Section 2.4 in [32] for the proofs of (2.1) and (2.2). For the (X; )-process with < 1, only moments of order less than 1 + exist. A useful inequality along this line is Lemma 2.1 in [37]: For 0 < < < 1, E ( tf)1+ 1 +c( ) ( (Ttf))1+ + Z t 0 Ts(Tt sf)1+ ds ; where c( )!1 as ! . When we need to use the second moments, we may truncate at any level K > 0 to get the truncated process K, which has nite second moments. For details about this truncation method, see pages 484 - 487 and Lemma 3 in [38]. Using series expansions of Laplace functionals, Dynkin [13] gives moment measure for- mulas for very general superprocesses. See Section 14.7 in [16] for a concise review of these formulas. Finally we mention that, for DW-processes, Theorem 4.2 of Kallenberg [27] con- tains a basic cluster decomposition of moment measures. Theorem 4.4 of that paper gives a fundamental connection between moment measures and certain uniform Brownian trees, rst noted by Etheridge in Section 2.1 of [17]. It would be interesting to study this connection for more general superprocesses. For details about the cluster decomposition of moment measures, See Theorem 5.1 in Kallenberg [26]. 2.2 Cluster representation In this section we discuss the very important cluster representation of superprocesses. Note that although a superprocess records the positions of all \individuals" alive at time t, they do not keep track of all the genealogy of these \individuals". More precisely, let us pick an \individual" alive at time t, then try to identify her \ancestor" at an earlier time s. Although we know from s the positions of all \individuals" alive at time s, we don?t know which speci c \individual" at time s is the \ancestor" of the \individual" we picked at time t. However, in the study of some deep properties of superprocesses, the genealogical structure underlying the evolution can be extremely useful, even when the nal results have 11 nothing directly to do with the genealogy. The cluster representation of superprocesses, while containing only partial information of the genealogy, is enough for many purposes. In order to discuss the cluster representation, let us rst recall the de nition of Poisson cluster processes. To de ne a cluster process, we start with a point process = Pi i on some space T. 
For a suitable classMS of measures on S, we consider a probability kernel from T to MS. Choosing the random measures i to be conditionally independent of the i with distributions i, we may introduce a random measure = Pi i on S. This random measure is called a -cluster process generated by . If is Poisson or Cox, we call a Poisson or Cox cluster process. Due to the underlying independence structure, superprocesses have the following branch- ing property: If and 0 are two independent (X; )-processes with initial measures and 0 respectively, then + 0 is an (X; )-process with initial measure + 0. This can be veri ed by using any of the three characterizations in Section 1.1. From this branching property, we see that, for any t, the superprocess t is an in nitely divisible random measure. A random measure is in nitely divisible i it is the sum of a Poisson cluster process and a deterministic measure (see Theorem 1.28 in [17]). Since Pf t = 0g > 0, the superprocess t is just a Poisson cluster process. The cluster representation of (X; )-processes depends crucially on the branching mech- anism of (1.1). For convenience, we rst discuss the cluster representation of (X;1)- processes (see Section 3.2 and 6.1 in [17]). For a (X;1)-process , at time 0, there are actually uncountably many \individuals". All \individuals" produce \o spring" randomly. However almost all \individuals" have no \o spring" alive at time t> 0, except nitely many \lucky" ones. In other words, the superprocess at time t is actually \o spring" of nitely many \ancestors". The point process records the locations of these nite many \ancestors" is a Poisson process 0 with intensity measure t 1 . This is the generating process in the Poisson cluster representation of t. Each one of these nitely many \ancestors" generates a random cluster at time t. Clearly this cluster is just her \o spring" at time t. These clusters 12 are \the same", means that they have the same distribution if we move their \ancestors" to a common point. In summary, t being a Poisson cluster process, is a nite sum of condi- tionally independent clusters, equally distributed apart from shifts and rooted at the points of a Poisson process 0 of \ancestors" with intensity measure t 1 . By the Markov property of , we have a similar representation of t for every s = t h2(0;t) as a countable sum of conditionally independent h-clusters (clusters of age h), rooted at the points of a Cox pro- cess s directed by h 1 s. In other words, s is conditionally Poisson given s with intensity measure h 1 s (see page 226 in [24]). Under some restrictions of the branching mechanism of (1.1), (X; )-processes also have a similar cluster representation (see Section 11.5 in [5] and Section 3 in [7]). The function t 1 in the above intensity measure t 1 should be replaced by another function of t, determined by the branching mechanism . The cluster distributions are also di erent, determined by both X and . 2.3 Historical superprocesses and random snakes It is clear that the cluster representation of the previous section cannot be recovered from the superprocess itself, since t records only the positions of all \individuals" alive at time t. A complete picture of the evolution underlying a superprocess is given by a random tree composed from the paths of all individuals. Two approaches to encode this picture are provided by historical superprocesses and by random snakes. Both approaches can be used to verify the cluster representation. 
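Before turning to these two encodings, the cluster representation of the previous section can be made concrete by a small numerical sketch (added for illustration, not taken from the text). For the (X,1)-process, that is ψ(v) = v², the total mass Z_t = ξ_t(R^d) is a continuous-state branching process with Laplace transform E_μ e^{−θZ_t} = exp(−‖μ‖ θ/(1+θt)); in the cluster picture this arises from a Poisson number of surviving ancestors with mean ‖μ‖/t, each contributing an independent cluster whose total mass at age t is exponential with mean t. The exponential cluster-mass law is the standard computation for ψ(v) = v² and should be treated as an assumption of this sketch.

import numpy as np

rng = np.random.default_rng(1)

def cluster_mass_samples(mu_mass, t, n_samples=100000):
    """Total mass xi_t(R^d) sampled from the cluster representation for
    psi(v) = v^2: a Poisson(mu_mass/t) number of surviving ancestors, each
    contributing an independent exponential cluster mass with mean t."""
    n_anc = rng.poisson(mu_mass / t, size=n_samples)
    return np.array([rng.exponential(t, size=n).sum() for n in n_anc])

mu_mass, t, theta = 1.0, 0.5, 2.0      # illustrative parameter choices
masses = cluster_mass_samples(mu_mass, t)

# Laplace transform of the total mass: cluster picture vs CSBP formula
print("simulated:", np.exp(-theta * masses).mean())
print("formula  :", np.exp(-mu_mass * theta / (1.0 + theta * t)))

The close agreement of the two numbers reflects the fact that ξ_t is a Poisson cluster process; for a general branching mechanism ψ only the intensity function and the cluster law change, as explained at the end of the previous section.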
The basic idea of historical superprocesses is very simple (see Section 1.9 in [17]). Let us explain the idea in the discrete setting, to make it even more transparent. For a discrete spatial branching process , pick two individuals alive at time t> 0, and assume that they have their last common ancestor at time s2 (0;t). Based on the Markov properties of the spatial motion X and the independence structures of spatial branching processes, clearly we can think that these two individuals perform the spatial motion X together as a single 13 individual before time s, then separate at time s and begin to perform independent spatial motion X ever since. In other words, we can think of these two individuals as a single path before time s, and this path splits into two independent paths at time s. The same idea still holds in the continuous setting, that is, for the superprocesses. Note that in the construction of superprocesses, the spatial motion Xt is only the location of an individual at time t. In order to remember the spatial locations of all the members in her genealogy line before her, we may just replace Xt by the corresponding path process ^Xt, which is a path-valued process. The value of ^Xt is the path of X over the time interval [0;t]. Now we construct the ( ^X; )-superprocess ^ , which corresponds to the (X; )-superprocess . We call ^ the (X; )-historical superprocess. Note that the way we de ne ^ from is di erent from the naive way we de ne ^X from X. If we de ne ^ as the corresponding path process of , then we would not be able to specify the ancestors of any individual. Denote the space of all rcll paths over the time interval [0;t] by Wt. Then the state space of ^Xt is Wt, and so ^Xt is a time-inhomogeneous Markov process. Write r;w for the probability measure under which ^X starts from the path w at time r. Clearly w is an rcll path over the time interval [0;r]. Let Mt be the space of all nite measures on Wt. This is the state space of ^ t, and so ^ t is also a time-inhomogeneous Markov process. The (X; )-historical superprocess ^ can be characterized by all three approaches in Section 1.1. It is obvious how to carry out the weak convergence approach. For the Laplace functional approach, we have Er; e ^ tf = e vrt; where is a nite measure on Wr, ^ t is a nite measure on Wt, f is a function on Wt, and vrt is a function on Wr. The function vrt(w) with r t and w2Wr is uniquely determined by the integral equation vrt(w) + r;w Z t r vst( ^Xs) ds = r;w f( ^Xt) : 14 This may be compared with (1.4), the integral equation of . The only di erence is that ^ is a superprocess of historical paths, while is a superprocess of spatial positions. The concept of historical superprocesses was developed in Dawson and Perkins [7] and Dynkin [14]. We may also refer to Chapter 12 in [5] and Section II.8 in [42]. From the construction of historical superprocesses it is clear that ^ t encodes the geneal- ogy information of all \individuals" alive at time t. The random snake approach developed by Le Gall and his co-authors allows to give a complete description of the genealogy. Here we only focus on the basic ideas, since the technical details and notation can be overwhelming. The basic idea of random snakes stems from an important fact of branching processes: The genealogical structures of branching processes can be completely encoded by a (random) function on R+. 
More precisely, the genealogical structure of Galton-Watson processes can be completely encoded by a discrete (random) function de ned on nonnegative integers. These nonnegative integers correspond to all the individuals and the function values are the generations of these individuals. For continuous-state branching processes, similarly the genealogical structure can be completely encoded by a continuous (random) function on R+. Again, any number in R+ corresponds to an \individual", and the function value is the \generation" or lifetime of this \individual". This continuous coding function is called the lifetime process. In fact, continuous coding function can be obtained from a sequence of rescaled discrete coding functions. Clearly this is closed related to the fact that continuous-state branching processes may be obtained as weak limits of rescaled Galton- Watson processes. For a continuous-state branching process with the branching mechanism = v2, the lifetime process &t is actually just the re ected one dimensional Brownian motion. The time parameter of the lifetime process &t is a labeling of all individuals in a certain order. For the complete evolution of (X;1)-processes, we then need to somehow combine the paths of individuals with this coding continuous random function. This is done by the so called Brownian snake Wt, which is a path-valued Markov process evolving according to both 15 the spatial motion X and the lifetime process &t. Note that the term \Brownian" refers to the branching mechanism, actually the lifetime process, not the spatial motion. The behavior of the Brownian snake is actually not hard to explain, at least informally. The value Wt at time t of the Brownian snake is a path of the underlying spatial motion X (started at a xed initial point) with the random lifetime &t. Informally, when &t decreases, the path Wt is shortened from its tip, and when &t increases, the path Wt is extended by adding (independently of the past) small \pieces of paths" following the law of the spatial motion X. In this way, we can generate the full set of historical paths of a (X;1)-process by running the Brownian snake according to the lifetime process, in this way we are visiting all the \individuals" one by one. For superprocesses with a general branching mechanism , similarly the so called L evy snakes can be de ned. The basic ideas are similar, but technically it is much more com- plicated. The main reason is that the corresponding lifetime process is not Markov and its de nition is quite involved. Actually part of the beauty, and the power, of the Brownian snake is that the lifetime process is itself a Markov process. The standard reference of Brow- nian snake is the excellent lecture notes [32] by Le Gall in 1999. For L evy snakes, refer to the excellent monograph [12] by Duquesne and Le Gall in 2002. 2.4 Hausdor dimensions and Hausdor measures In this section we review some classical results about the Hausdor dimensions and Hausdor measures of superprocesses. First let us review the de nitions of Hausdor di- mension and Hausdor measure. For a nice introduction of this topic in a probabilistic setting, see Chapter 4 and Section 6.4 in [36]. We rst de ne Hausdor measure, then Hausdor dimension. Assume A to be a metric space with the metric . 
Use |A| to denote the diameter of the set A, which is defined by

    |A| = sup{ρ(x,y) : x, y ∈ A}.

For every α ≥ 0 and δ > 0, define

    H^α_δ(A) = inf{ Σ_{i=1}^∞ |A_i|^α : A ⊂ ∪_{i=1}^∞ A_i, |A_i| ≤ δ for all i }.    (2.3)

It is easy to see that the quantity H^α_δ(A) increases as δ decreases, so that the limit H^α(A) = lim_{δ→0} H^α_δ(A) is well defined, although it may be infinite. We call the limit H^α(A) = lim_{δ→0} H^α_δ(A) the α-Hausdorff measure of A. Since subsets of a metric space are metric spaces in their own right, the α-Hausdorff measure H^α can be defined for all subsets of the space A. Using the definition of H^α, we can check that the function H^α defined for all subsets satisfies all the properties of a metric Carathéodory exterior measure (see Section 7.1 in [47], or Section 11.2 in [20]). Thus H^α is a countably additive measure when restricted to the Borel sets of A. So indeed, the α-Hausdorff measure defined on all Borel sets is a measure.

Let us define the Hausdorff dimension. The α-Hausdorff measure H^α(A) has the following natural properties: if 0 ≤ α < β and H^α(A) < ∞, then H^β(A) = 0; if 0 ≤ α < β and H^β(A) > 0, then H^α(A) = ∞. So there exists a unique number, denoted by dim A, such that H^α(A) = ∞ for α < dim A, and H^α(A) = 0 for α > dim A. We call this unique number the Hausdorff dimension of the set A. In other words, we define the Hausdorff dimension of the set A by

    dim A = sup{α : H^α(A) = ∞} = inf{α : H^α(A) = 0}.

Using the Hausdorff dimension, we can associate a nonnegative number to any set, which generalizes the usual integer dimensions. For example, the classical Cantor set has Hausdorff dimension log 2 / log 3. The graph of a one dimensional Brownian motion, which is a continuous (random) curve in R², has Hausdorff dimension 3/2 a.s. (see Theorem 16.4 in [19]). This is related to the fact that a one dimensional Brownian path is a.s. locally Hölder continuous with exponent c for any c ∈ (0, 1/2).

Now we turn to superprocesses. For a DW-process ξ in R^d, we denote the support of ξ_t by supp ξ_t, which is a random closed set in R^d. Actually this is even a random compact set, assuming ξ₀ = μ is a finite measure (see Theorem 1.2 in [6]). For fixed t > 0, if d ≥ 2, then a.s. this is a null set (meaning that it has Lebesgue measure 0). Here the Hausdorff dimension is useful for us to get some more understanding of the size of supp ξ_t. It is well known that a.s.

    dim(supp ξ_t) = 2 ∧ d   on {ξ_t ≠ 0}.

Note that if ξ_t = 0, then supp ξ_t = ∅. More generally, if ξ is a (2,β)-process in R^d with β ∈ (0,1], then for fixed t > 0, if d ≥ 2/β, a.s. supp ξ_t is a null set (again, this is a random compact set; adapt the proof of Theorem 1.2 in [6] to generalize Theorem 1.1 in [8], or see Section 4.3 in [1]) and

    dim(supp ξ_t) = (2/β) ∧ d   on {ξ_t ≠ 0}.

For this result and even more, see Theorem 2.1 in [9]. The situation for (2,β)-processes is in stark contrast to (α,β)-processes in R^d with α ∈ (0,2) and β ∈ (0,1], where the spatial motion has jumps. In this case, Evans and Perkins [18, 40] showed that for fixed t > 0, a.s.

    supp ξ_t = R^d   on {ξ_t ≠ 0}.    (2.4)

We can also discuss the Hausdorff dimension of the range of superprocesses. First, for I ⊂ R₊, define the range of ξ on I by

    R(I) = ∪_{t∈I} supp ξ_t,    (2.5)

and the closed range of ξ on I by R̄(I), the closure of R(I). Then the range of ξ is defined by

    R = ∪_{ε>0} R̄([ε,∞)).

For a DW-process ξ in R^d, if d ≥ 4, a.s. R is a null set and dim R = 4 ∧ d. More generally, if ξ is a (2,β)-process in R^d with β ∈ (0,1] and d ≥ (2/β) + 2, a.s. R is a null set and

    dim R = [(2/β) + 2] ∧ d;

see Corollary 2.2 in [9]. Again, for (α,β)-processes in R^d with α ∈ (0,2) and β ∈ (0,1], a.s. R = R^d.
This is immediate from (2.4) and the definition of the range.

Let us turn back to the α-Hausdorff measures. Although for any nonnegative α the α-Hausdorff measure is a Borel measure, for some metric spaces it is a trivial measure for every α, meaning that the α-Hausdorff measure H^α(B) can only be 0 or ∞ for any B ∈ B(A). For example, if ξ is a DW-process in R^d with d ≥ 2, then for a fixed t > 0, a.s. H²(supp ξ_t) = 0 and H^α(B ∩ supp ξ_t) = ∞ or 0 for any α < 2 and B ∈ B(R^d) (see (2.7), (2.8), and (2.9)). So we need to generalize the α-Hausdorff measures if we want to construct a nontrivial measure on supp ξ_t.

The definition of Hausdorff dimension still makes sense if we evaluate coverings by applying, instead of a simple power, an arbitrary non-decreasing function to the diameters of the sets in a covering. We call this function a gauge function. By a gauge function we mean a non-decreasing function φ : [0,ε) → [0,∞), for some ε > 0, with φ(0) = 0. As before, we define

    H^φ_δ(A) = inf{ Σ_{i=1}^∞ φ(|A_i|) : A ⊂ ∪_{i=1}^∞ A_i, |A_i| ≤ δ for all i }.    (2.6)

Clearly the α-Hausdorff measure H^α in (2.3) is just the special case of H^φ with φ(x) = x^α. Then define the φ-Hausdorff measure of A by

    H^φ(A) = lim_{δ→0} H^φ_δ(A).

As before, H^φ is a measure on the Borel sets. Under this more general framework it is more likely that nontrivial measures can be constructed on a metric space, although this is still not always possible. For a DW-superprocess ξ, this approach is extremely successful. Perkins and his co-authors determined the exact Hausdorff measure of the support at a fixed time and of the range of the process. First, about the support supp ξ_t: for a fixed t > 0, a.s. we have

    H^φ(· ∩ supp ξ_t) = ξ_t(·),    (2.7)

where for d ≥ 3 (see Theorem 5.2 in [7]),

    φ(x) = x² log log(1/x),    (2.8)

and for d = 2 (see Theorem 1.1 in [33]),

    φ(x) = x² log(1/x) log log log(1/x).    (2.9)

Next, the range R(0,t]: for a fixed t > 0, a.s. we have

    H^φ(· ∩ R(0,t]) = ∫₀^t ξ_s ds (·),    (2.10)

where for d ≥ 5,

    φ(x) = x⁴ log log(1/x),    (2.11)

and for d = 4,

    φ(x) = x⁴ log(1/x) log log log(1/x).    (2.12)

Note that ∫₀^t ξ_s ds is a measure on B(R^d), defined by

    (∫₀^t ξ_s ds)(B) = ∫₀^t ξ_s(B) ds   for any B ∈ B(R^d).    (2.13)

It is easy to see that the Hausdorff measure results here contain the Hausdorff dimension results that we reviewed previously. One obvious remaining question is the exact Hausdorff measure function of (2,β)-processes, but this may be technically too challenging. It is then also interesting to try to obtain good upper and lower bounds on the exact Hausdorff measure function.

2.5 Lebesgue approximations

From the previous section, we see that by carefully choosing a suitable gauge function φ, we can define some nontrivial random measures on certain random null sets. Since we know from the beginning that there are naturally defined nontrivial random measures on these random null sets (the DW-process ξ_t on supp ξ_t, and the local time measure of one dimensional Brownian motion on its level set, see below), the Hausdorff measure approach in fact gives representations of these measures in terms of their supports alone. So in order to recover these measures, we can forget about the related stochastic processes; only the support of these measures is needed. In this regard, we also have the packing measure approach (see [11, 34]), which is, generally speaking, similar to the Hausdorff measure approach. A quite different approach to this problem is the so-called Lebesgue approximation approach. Kingman [28] explained this approach in a very accessible manner and also used it to recover the local time measure of certain Markov processes intrinsically from the level set.
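Kingman's idea is explained in detail in the next paragraphs. As a quick numerical preview (a toy example added here, not from the text), the sketch below estimates the Lebesgue measure of the ε-neighborhood of a fixed smooth set, a unit segment in R², and exhibits the ε^{d−d₀} decay (here d = 2, d₀ = 1) that makes a normalized neighborhood volume informative. The box, sample size, and set are illustrative choices.

import numpy as np

rng = np.random.default_rng(2)

def neighborhood_area(eps, n=200000):
    """Monte Carlo estimate of lambda_2(E^eps) for the unit segment
    E = [0,1] x {0} in R^2, sampling uniformly from the box [-1,2] x [-1,1]
    (chosen large enough to contain the eps-neighborhood)."""
    pts = rng.uniform([-1.0, -1.0], [2.0, 1.0], size=(n, 2))
    nearest_x = np.clip(pts[:, 0], 0.0, 1.0)             # closest point of E
    dist = np.hypot(pts[:, 0] - nearest_x, pts[:, 1])    # distance to E
    box_area = 3.0 * 2.0
    return box_area * np.mean(dist < eps)

for eps in (0.2, 0.1, 0.05, 0.025):
    area = neighborhood_area(eps)
    # For this 1-dimensional set, lambda_2(E^eps) = 2*eps + pi*eps^2, so the
    # ratio area/eps should approach 2, twice the length of E.
    print(f"eps = {eps:0.3f}   area = {area:0.4f}   area/eps = {area/eps:0.3f}")

For the support of a superprocess the set is random and fractal, and in the critical dimension the correct normalization is no longer a plain power of ε, but the principle is the same: a suitably normalized volume of the ε-neighborhood recovers a measure carried by the set.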
Let us first explain Kingman's idea. For a subset A ⊂ R^d, we use A^ε to denote the ε-neighborhood of A, that is,

    A^ε = {x : d(x,A) < ε}.

It is easy to see that A^ε = (Ā)^ε, where Ā is the closure of A. For the Lebesgue measures of these neighborhoods, clearly λ_d E^ε ∈ [0,∞] and λ_d E^ε → λ_d Ē as ε → 0, for any set E ⊂ R^d. So when Ē is a null set, we get λ_d E^ε → 0. The interesting point is that the rate at which λ_d E^ε converges to zero is an indication of the size of E. For example, if E is part of a sufficiently smooth d₀-dimensional surface in R^d, where d₀ ≤ d, then λ_d E^ε decays like ε^{d−d₀}, so that a suitably normalized version of λ_d E^ε carries d₀-dimensional information about E.

For a measure μ on R^d and ε > 0, write μ^ε for the restriction of Lebesgue measure λ_d to the ε-neighborhood of supp μ. Note that in our notation the ε-neighborhood of supp μ is denoted by (supp μ)^ε. So we may write μ^ε explicitly as

    μ^ε(·) = λ_d(· ∩ (supp μ)^ε).

For a DW-process ξ in R^d with d ≥ 3, Tribe [48] showed that for any fixed t > 0 and any bounded Borel set B in R^d, a.s. as ε → 0,

    ε^{2−d} ξ_t^ε(B) → c_d ξ_t(B),

where c_d > 0 is a constant depending on d. Shortly after, in order to prove the strong Markov property of the support process supp ξ_t, Perkins [41] showed that the Lebesgue approximation result holds simultaneously for all times t > 0. More precisely, for a DW-process ξ in R^d with d ≥ 3, a.s. as ε → 0,

    ε^{2−d} ξ_t^ε →_w c_d ξ_t   for all t > 0,    (2.14)

where →_w denotes weak convergence of measures. The corresponding Lebesgue approximation of two dimensional DW-processes was still open at that time, even for fixed t > 0. However, Perkins conjectured that for fixed t > 0 and bounded Borel sets B in R², a.s. as ε → 0,

    |log ε| ξ_t^ε(B) → c ξ_t(B).

Later, Kallenberg [25] essentially confirmed the above conjecture. More precisely, for a DW-process ξ in R², Kallenberg showed that for fixed t > 0, a.s. as ε → 0,

    m̃(ε) |log ε| ξ_t^ε →_w ξ_t,

where m̃ is a suitable normalizing function bounded below and above by two positive constants. Note that both the conjecture of Perkins and the proof of Kallenberg depend crucially on the hitting bounds for DW-processes in R² from Le Gall [31]. Kallenberg's approach also works for DW-processes in R^d with d ≥ 3, and results in a more probabilistic proof of the Lebesgue approximation of ξ_t.

In [22], we adapted Kallenberg's probabilistic approach in [25] to prove the Lebesgue approximation of (2,β)-processes with β < 1, combined with a truncation method for superprocesses from Mytnik and Villa [38], in order to overcome the additional difficulty imposed by the infinite variance of (2,β)-processes. More precisely, for a (2,β)-process ξ in R^d with β < 1 and d > 2/β, we proved that, for fixed t > 0, a.s. as ε → 0,

    ε^{2/β−d} ξ_t^ε →_w c_{β,d} ξ_t,

where c_{β,d} > 0 is a constant depending on β and d.

In view of the Hausdorff measure results (2.10), (2.11), and (2.12), we may ask about the Lebesgue approximation of the range of a superprocess. Here, Delmas [10] proved the Lebesgue approximation of the range of DW-processes in R^d with d ≥ 4, using Le Gall's Brownian snake. More precisely, for a DW-process ξ in R^d with d ≥ 4, Delmas showed that, for fixed t > 0 and bounded Borel sets B in R^d, a.s. as ε → 0,

    φ(ε) R_t^ε(B) → c_d ∫_t^∞ ξ_s ds (B),    (2.15)

where R_t is the range R([t,∞)) defined in (2.5), R_t^ε is the corresponding neighborhood measure, and ∫_t^∞ ξ_s ds is defined as in (2.13). About the normalizing function φ, it is shown that

    φ(ε) = ε^{4−d} for d ≥ 5,   and   φ(ε) = |log ε| for d = 4.

These known results lead to a couple of immediate open problems. First, we may ask whether the Lebesgue approximation of (2,β)-processes holds simultaneously for all times t > 0, in view of (2.14). Since intuitively the (2,β)-process ξ and its support process supp ξ do not jump at the same time, an immediate guess should be no. We may then ask if it is possible to prove some results supporting this guess.
What about the strong Markov property of the (2; )-support process supp t? It seems that we need to nd new approaches to prove it (or disprove it). The second question is that, whether it is possible to prove the Lebesgue approximation of the range of (2; )-processes, in view of (2.15). More generally, we may try to \translate" all Hausdor measure results into corre- sponding Lebesgue approximation ones. Here the challenge is that while there is a solid theory behind Hausdor measures which one could rely on, there is no such support for Lebesgue approximation results. One has to \invent" some approaches when trying to es- tablish Lebesgue approximation results. Still it is very interesting to see that whenever we can get the Lebesgue approximation results, the results are always shorter and cleaner then the corresponding Hausdor measure results. 25 Chapter 3 Lebesgue Approximation of Dawson-Watanabe Superprocesses 3.1 Introduction In this chapter we discuss the Lebesgue approximation of Dawson-Watanabe superpro- cesses in detail. The Lebesgue approximation of DW-processes of dimension d 3 was rst proved by Tribe [48], using both probabilistic and analytic techniques. The case of critical dimension d = 2 is more di cult. However, Kallenberg [25] obtained a similar result for DW-processes in R2 using a more probabilistic approach. His approach can also be applied to DW-processes of dimension d 3, and indeed this was done in [25]. The present chapter is based on Kallenberg?s proof of Lebesgue approximation of DW-processes of dimension d 3 in [25], with some technical simpli cations. Extra e orts have been made to explain Kallenberg?s approach clearly and to make it more accessible. We use = ( t) to denote the DW-process of dimension d 3. Recall that is a measure-valued Markov process, so for xed t and !, the value t(!) is a measure on Rd. We write "t for the restriction of Lebesgue measure d to the "-neighborhood of supp t, the support of the measure t, which for xed t and !, is a compact set in Rd (see Theorem 1.2 in [6]). The Lebesgue approximation of DW-processes of dimension d 3, which is Theorem 3.5 in this chapter, states that for xed t> 0, "2 d "t w!cd t a.s. as "!0, where w! denotes weak convergence of measures and c d > 0 is a universal constant depending on d. In particular, this con rms that t \distributes its mass over supp t in a deterministic manner" (cf. [17], p. 115, or [42], p. 212), as previously inferred from some deep results involving the exact Hausdor measure (cf. [7]). The proof depends crucially on some basic hitting estimates, due to Dawson, Iscoe, and Perkins [6]. Here we need the lower bound and upper bound of P f tB"0 > 0g (Theorem 26 3.1.(a) in [6]), and also the precise convergence result "2 dP f tB"0 > 0g!cd pt for d 3 as "!0 (Theorem 3.1.(b) in [6]), where Brx denotes an open ball around x of radius r. The proof also depends crucially on the representation of the DW-process as a countable sum of conditionally independent clusters. Precisely, each t can be expressed as a countable sum of conditionally independent clusters of age h2(0;t], where the generating ancestors at time s = t h form a Cox process s directed by h 1 s (cf. [7, 30]). Typically we let h!0 at a suitable rate depending on ". However, a technical complication when dealing with cluster representations is the possibility of multiple hits. 
More speci cally, a single cluster may hit (charge) several of the "-neighborhoods of n distinct points x1;:::;xn, or one of those neighborhoods may be hit by several clusters. In particular, Lemma 2.4 deals with this multiple hitting of a single neighborhood by several clusters. To minimize the e ect of such multiplicities, we need the cluster age h to be su ciently small. On the other hand, it needs to be large enough for the mentioned hitting estimates to apply to the individual clusters. Notice that we can translate the hitting estimates for the superprocess to the hitting estimates for the cluster , based on the connection between the superprocess and its clusters. The reason we don?t cover the case of critical dimension d = 2 is that, although the two cases of d = 2 and d 3 use the same general approach, technically the case of d = 2 is much more involved, since we then have to deal with the Logarithm normalizing function jlog(")j rather than the power normalizing function "2 d as in the case of d 3. Also when d = 2, a corresponding crucial result to the precise convergence result for d 3, as "! 0, "2 dP f tB"0 > 0g! cd pt, is not readily available. So in this chapter, we restrict our attention to the case of d 3. We proceed with some general remarks on terminology and notation. A random measure on Rd is de ned as a measurable function from to the spaceMd of locally nite measures on Rd, equipped by the - eld generated by all evaluation maps B : 7! B with B2Bd, where Bd denotes the Borel - eld on Rd. The subclasses of measures and bounded sets 27 are denoted by ^Md and ^Bd, respectively. The weak topology in Md is generated by all integration maps f : 7! f = R fd with f belonging to the space Cdb of bounded, continuous functions Rd!R+. Thus, n w! in Md i nf! f for all f2Cdb. Throughout the chapter we use relations such as =_ , <_ , _ , and , where the rst three mean equality, inequality, and asymptotic equality up to a constant factor, and the last one is the combination of <_ and >_ . We often write a b to mean a=b! 0. The double bars k k denote the supremum norm when applied to functions and total variation when applied to signed measures. In any Euclidean space Rd, we write Brx for the open ball of radius r > 0 centered at x2Rd. The shift and scaling operators x and Sr are given by xy = x+y and Srx = rx, respectively, and for measures on Rd we de ne x and Sr by ( x)B = ( xB) and ( Sr)B = (SrB), respectively. In particular, ( Sr)f = (f S 1r ) for measurable functions f on Rd. Convolutions of measures with functions f are given by ( f)(x) = R f(x u) (du). This chapter is organized as follows. In Section 2 we rst explain the crucial ideas about cluster representations, then state several lemmas which will not be used directly in the main proof of Lebesgue approximation, including the important upper bound of the hitting mul- tiplicities. In Section 3 we state and prove the Lebesgue approximation for DW-processes of dimensions d 3. In order to do so, we list several lemmas that are needed in the main proof. Finally, in Section 4 we prove all the lemmas in this chapter. We suggest that the reader read the rst three sections in the linear order, then, when need arises, read the proofs of some lemmas in Section 4. 3.2 Preliminaries Let us rst explain the cluster representations of DW-processes. We write L ( ) = P f 2 g for the distribution of the process with initial measure . 
For every xed , the DW-process is in nitely divisible under P and admits a decomposition into a Poisson 28 \forest" of conditionally independent clusters, corresponding to the excursions of the contour process in the ingenious \Brownian snake" representation of Le Gall [32]. In particular, this yields a cluster representation of t for every xed t > 0. More generally, the \ancestors" of t at an earlier time s = t h form a Cox process s directed by h 1 s (meaning that s is conditionally Poisson with intensity h 1 s, given s; cf. [24], p. 226), and the generated clusters ih are conditionally independent and identically distributed apart from shifts. In this paper, a generic cluster of age t> 0 is denoted by t; we write Lx( t) = Pxf t2 g for the distribution of a t-cluster centered at x2Rd and put P f t2 g= R (dx)Pxf t2 g. The rst lemma is about some basic scaling properties of DW-processes and their asso- ciated clusters. Lemma 3.1 Let be a DW-process in Rd with associated clusters t. Then for any measure on Rd, and r;t> 0, (i) L Sr(r2 t) =Lr2 ( r2tSr), (ii) L Sr(r2 t) =L ( r2tSr). Although the above two compact identities look nice, they may not be very intuitive for some people. In order to appreciate better these scaling properties, rst we translate the L notation back to the P notation P Srfr2 t2 g = Pr2 f r2tSr2 g; P Srfr2 t2 g = P f r2tSr2 g: Recall that the evaluation map B : 7! B is a function de ned on the spaceMd of locally nite measures on Rd. According to the de nition of - eld on Md, the set f B1=r 0 > 0g is a measurable set on Md. In the above two identities, take r = 1=", Sr = x, and, =f B1=r 0 > 0g, we get Pxf tB"0 > 0g = P(1="2) x="f t="2B10 > 0g; 29 Pxf tB"0 > 0g = Px="f t="2B10 > 0g: Now these two identities should be intuitive enough for one to appreciate the scaling prop- erties. Next we state a well-known relationship between the hitting probabilities of t and t. Lemma 3.2 Let the DW-process in Rd with associated clusters t be locally nite under P , and x any B2Bd. Then P f tB > 0g = t log (1 P f tB > 0g); P f tB > 0g = 1 exp ( t 1P f tB > 0g): In particular, P f tB > 0g t 1P f tB > 0g as either side tends to 0. The following lemma contains some slight variations of classical hitting estimates for DW-processes of dimension d 3. By Lemma 3.2 it is enough to consider the corresponding clusters t, and by shifting it su ces to consider balls centered at the origin. Lemma 3.3 Let the t be clusters of a DW-process in Rd with d 3, and consider a - nite measure on Rd. Then for 0 <" pt, we have pt <_ t 1"2 dP f tB"0 > 0g<_ p2t; The classical upper bound is pt+". Note that as " ! 0, the upper bound pt+" is approaching the lower bound pt, however the constants before these two bounds are de nitely di erent. Still this suggests that as " ! 0, the normalized hitting probability t 1"2 dP f tB"0 > 0g converges to c pt for some constant c > 0. This is indeed the case. Although the classical upper bound can give us this intuitive impression, for all practical purposes our upper bound p2t is as good, if not better. The reason is that mathematically speaking, p2t is almost the same as pt. 30 Next we need to estimate the probability that a small ball in Rd is hit by more than one subcluster of our DW-process . This result will play a crucial role throughout the remainder of the chapter. Lemma 3.4 Let the DW-process in Rd be locally nite under P . For any t h > 0 and " > 0, let "h be the number of h-clusters hitting B"0 at time t. 
Then for d 3 and as "2 h t, we have E "h( "h 1) <_ "2(d 2) h1 d=2 pt + ( p2t)2 : Here the intuition is that, if compare to h, the radius " is small enough, then most likely there will be only one cluster hitting this tiny ball, or no cluster at all. Actually what we want to control is the discrete quantity ( "h 1)+. However it seems that the only natural way to relate this quantity to the DW-process t is through the following simple inequality ( "h 1)+ "h( "h 1): Then we can relate E "h( "h 1) to E t and E 2t , the rst and second moment of the DW-process t. This is actually a very important point, especially in the next chapter when we are dealing with the (2; )-superprocesses. Since the (2; )-superprocesses have in nite second moment, to control E ( "h 1)+ we have to truncate the (2; )-processes, in order to get the nite second moment. 3.3 Lebesgue approximation In this section we rst state the main result of this chapter, the Lebesgue approximation of DW-processes of dimension d 3, which is Theorem 3.5. In order to give the proof of Theorem 3.5, we then state Lemma 3.6, 3.7, and 3.8, which will be used directly in the proof 31 of Theorem 3.5. However we leave all proofs of lemmas in the next section. At the end of the present section we give the proof of Theorem 3.5. For any measure on Rd and constant " > 0, we de ne the associated neighborhood measure " as the restriction of Lebesgue measure d to the "-neighborhood of supp , so that " has Lebesgue density 1f B"x > 0g. First note that " is a measure de ned from the measure . Then recall that t(!) is a measure for xed t and !, so "t(!) is just the neighborhood measure of t(!). Also recall that ^Md is the space of nite measures on Rd. For random measures n and with values in ^Md, the weak convergence in L1, denoted by n w! in L1; means that nf ! f in L1 for all f in Cdb. Write ~cd = 1=cd for convenience, where cd is such as in (3.1). Now we are ready to state the main result of this chapter, the Lebesgue approximation of DW-processes of dimension d 3. Theorem 3.5 Let be the DW-process in Rd with d 3. Fix any 2 ^Md and t> 0. Then under P , we have as "!0 ~cd"2 d "t w! t a.s. and in L1: Here the a.s. convergence means that for every ! outside a null set, ~cd"2 d "t(!) w! t(!): Note that for xed t and !, both "t(!) and t(!) are deterministic measures. Next we are going to study ( ih)", the neighborhood measures of the clusters. Since we will use the cluster decomposition t = i ih throughout the proof, naturally in order to prove ~cd"2 d "t w! t we also need to study ( ih)". Write ( ih)" = i"h for convenience. 32 Lemma 3.6 Let the ih be conditionally independent h-clusters in Rd, rooted at the points of a Poisson process with E = . Fix any measurable function f 0 on Rd. Then (i) E Pi i"h = ( p"h) d, (ii) Var Pi i"hf <_ h2"2(d 2)kfk2k k for "2 h. In part (i), notice that Pi i"h is a random measure, its expectation is the deterministic measure ( p"h) d, which means that for any measurable f 0 E X i i" hf = (( p " h) d)f where (( p"h) d)f is the integral of the function f with respect to the measure ( p"h) d. In part (ii), notice that Pi i"hf is a real-valued random variable, its variance is bounded above by h2"2(d 2)kfk2k k. Next we compare "t and Pi i"h , and prove that asymptotically they are the same, so that we can just replace "t byPi i"h . 
Intuitively this result is clear: Since the ages of clusters h and the parameter of neighborhood measures " are both going to 0 at some suitable rates, asymptotically there are no overlaps between the neighborhood measures of clusters, so that asymptotically Pi i"h and "t are the same. Lemma 3.7 Let be a DW-process in Rd with d 3, and for xed t> 0, let ih denote the subclusters in t of age h> 0. Fix a 2 ^Md. Then as "2 h!0, E X i i"h "t <_ (" 2=ph)d 2: Recall that for a signed measure on Rd with f as the density with respect to d, the total variation k k satis es k k= djfj= Z jf(x)jdx: 33 Note that Pi i"h and "t have the density Pi1f ihB"x > 0g and 1f B"x > 0g respectively, so E X i i"h "t = E Z X i 1f ihB"x > 0g 1f B"x > 0g dx: Now clearly the integrandjPi 1f ihB"x > 0g 1f B"x > 0gjis related to the multiple hitting of Lemma 3.4. The last lemma is a precise convergence result about the hitting probability Pxf hB"0 > 0g. For a DW-process of dimension d 3, we know from Theorem 3.1 of Dawson, Iscoe, and Perkins [6] (cf. Remark III.5.12 in [42]) that, for xed t > 0, x2Rd, and nite , as "!0 "2 dP f tB"x > 0g!cd ( pt)(x); (3.1) where cd > 0 is a constant depending only on d, and the convergence is uniform for x2Rd and for bounded t 1 andk k. Notice that in this classical result t can change, but it has to be bounded away from 0. By using the scaling property of DW-processes, from the classical result above we can get a precise convergence result about Pxf hB"0 > 0g as both h and " are approaching 0 at some suitable rates. More precisely, after the scaling term (cd) 1h 1"2 d multiplied to Pxf hB"0 > 0g, the measure (cd) 1h 1"2 dPxf hB"0 > 0gdx converges in a certain sense to 0, the Dirac measure at 0 0, as both h and " are approaching 0 at some suitable rates. This result should be easy to understand since if x6= 0, then for small enough h and ", the h stated from x will not be able to reach B"0 before time h. Only the h stated from 0 will be able to reach B"0 before time h, although the probability 34 is decreasing to 0. After the scaling term h 1"2 d multiplied to P0f hB"0 > 0g, it converges to the constant cd. Lemma 3.8 Write p"h(x) = Pxf hB"0 > 0g, where the h are clusters of a DW-process in Rd, and x a bounded, uniformly continuous function f 0 on Rd. Then as 0 < "2 h! 0, we have h 1"2 d (p" h f) cdf !0: The result holds uniformly over any class of uniformly bounded and equicontinuous functions f 0 on Rd. Here h 1"2 d (p"h f) cdf is the supremum norm of the function h 1"2 d (p"h f)(x) cdf(x); as a function of x. Now we are ready to prove Theorem 3.5, but before giving the proof let us discuss the main ideas in the proof carefully. First of all, we have two possible approaches to attack this theorem: one is to prove the L1-convergence rst, then use some interpolation to get the a.s. convergence from the L1-convergence (this is indeed what Tribe did in [48]); the other is to prove the a.s. convergence rst. In the rst approach, we need to get the a.s. convergence from the L1-convergence by the usual Borel-Cantelli argument: If EPjfnj < 1, then fn ! 0 a.s. as n!1. In order to do so, we need an upper bound of the approximating error "2 dP f tB"x > 0g cd ( pt)(x); which we don?t have here. So we will use the second approach: prove the a.s. convergence rst. In order to prove the a.s. convergence, we need to show that a.s. for all f2Cdb, we have that ~cd"2 d "tf ! tf, where Cdb is the class of bounded, continuous functions Rd ! R+. 
35 However since there exists a countable, convergence-determining class of functions f in Cdb, we only need to prove for any xed f2Cdb, we have ~cd"2 d "tf! tf a.s. In order to prove this, we write "2 d " tf cd tf "2 d "tf X i i" hf +"2 d X i i" hf h 1 s(p" h f) +k sk "2 dh 1 (p"h f) cdf +cdj sf tfj: Notice that the last term converges to 0 by the a.s. weak continuity of and the third term converges to 0 by Lemma 3.4. The rst term is related to Lemma 3.3 and the second term is related to Lemma 3.2, however these two lemmas are about the expectations and variances of those terms. In order to get a.s. convergence from results of expectations and variances, we use the usual Borel-Cantelli argument: take a sequence "n and get f("n)!0 as n!1by showing that EPjf("n)j<1. Finally we extend the a.s. convergence from the sequence "n to the whole interval (0;1) by interpolation. As for the L1-convergence, since by (1) we easily get "2 dE "tf!cdE tf; so the L1-convergence follows from the a.s. convergence by an usual proposition. Proof of Theorem 3.1: Proof: (i) Let d 3, and x any t > 0, 2 ^Md, and f 2 CdK. Write ih for the subclusters of t of age h. Since the ancestors of t at time s = t h form a Cox process directed by s=h, Lemma 3.6 (i) yields E hX i i" hf s i = h 1 s(p"h f); 36 and so by Lemma 3.6 (ii) E X i i" hf h 1 s(p" h f) 2 = E Var hX i i" hf s i <_ "2(d 2) h2kfk2E k s=hk = "2(d 2)hkfk2k k: Combining with Lemma 3.7 gives E "tf h 1 s(p"h f) E "tf X i i" hf +E X i i" hf h 1 s(p" h f) <_ "2(d 2)h1 d=2kfk+"d 2h1=2kfk = "d 2 p h+ ("= p h)d 2 kfk: Taking h = " = rn for a xed r2(0;1) and writing sn = t rn, we obtain E X nr n(2 d) rn t f r n s n(p rn rn f) <_ X n rn=2 +rn(d 2)=2 kfk<1; which implies rn(2 d) rnt f r n sn(prnrn f) !0 a.s. P : (3.2) Now we write "2 d " tf cd tf "2 d " tf h 1 s(p" h f) +c dj sf tfj +k sk "2 dh 1 (p"h f) cdf : Using (3.2), Lemma 3.8, and the a.s. weak continuity of (cf. Proposition 2.15 in [17]), we see that the right-hand side tends a.s. to 0 as n!1, which implies "2 d "tf cd tf a.s. 37 as "! 0 along the sequence (rn) for any xed r2 (0;1). Since this holds simultaneously, outside a xed null set, for all rational r2 (0;1), the a.s. convergence extends by Lemma 2.3 in [25] to the entire interval (0;1). Now let 2Md be arbitrary with pt <1for all t> 0. Write = 0+ 00 for bounded 0, and let = 0+ 00 be the corresponding decomposition of into independent components with initial measures 0 and 00. Fixing an r > 1 with suppf Br 10 and using the result for bounded , we get a.s. on f 00tBr0 = 0g "2 d "tf = "2 d 0"t f!cd 0tf = cd tf: As 0" , we get by Lemma 4.3 in [25] P f 00tBr0 = 0g= P 00f tBr0 = 0g!1; and the a.s. convergence extends to . Applying this result to a countable, convergence- determining class of functions f (cf. Lemma 3.2.1 in [5]), we obtain the required a.s. vague convergence. If is bounded, then t has a.s. bounded support (cf. Corollary 6.8 in [17]), and the a.s. convergence remains valid in the weak sense. To prove the convergence in L1, we note that for any f2CdK "2 dE "tf = "2 d Z P f tB"x > 0gf(x)dx ! Z cd ( pt)(x)f(x)dx = cdE tf; (3.3) by Theorem 5.3.(i) in [25]. Combining this with the a.s. convergence under P and using Proposition 4.12 in [24], we obtain E j"2 d "tf cd tfj!0. For bounded , (5.14) extends to any f2Cdb by dominated convergence based on Lemmas 4.1 and 4.2 (i) in [25], together with the fact that d( pt) =k k<1 by Fubini?s theorem. 
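As an informal numerical illustration of Theorem 3.5, one can probe the normalization ε^{2−d} through the standard branching-particle caricature of the DW-process: N particles of mass 1/N, performing independent Brownian motions and branching critically at rate N. The sketch below is only a sketch: all parameter choices are ad hoc, the constants are not tuned to match (3.1), and the test function is simply f ≡ 1. It builds such a particle cloud at time t and prints ε^{2−d} times the volume of its ε-neighborhood divided by the total mass; by Theorem 3.5 this ratio should be roughly independent of ε for small ε, although a finite particle system resolves the scaling only over a limited range of ε, so fairly large N is needed before the effect becomes visible.

    import numpy as np
    from scipy.spatial import cKDTree

    rng = np.random.default_rng(0)
    d, N, t, dt = 3, 2000, 0.25, 5e-5            # ad hoc; N*dt must stay well below 1

    # Critical branching Brownian particles: N initial particles of mass 1/N at the
    # origin; in each step a particle branches with probability N*dt into 0 or 2 copies.
    pos = np.zeros((N, d))
    for _ in range(int(round(t / dt))):
        pos = pos + np.sqrt(dt) * rng.standard_normal(pos.shape)
        branch = rng.random(len(pos)) < N * dt
        two = rng.random(len(pos)) < 0.5         # two offspring, otherwise none
        pos = np.concatenate([pos[~branch | two], pos[branch & two]])
        if len(pos) == 0:
            raise RuntimeError("population died out; rerun with another seed or larger N")

    mass = len(pos) / N                          # stand-in for xi_t(R^d)

    # Monte Carlo volume of the eps-neighborhood of the particle cloud, a stand-in
    # for xi_t^eps(R^d); Theorem 3.5 suggests eps^(2-d)*volume/mass should be roughly
    # constant (about c_d) once eps is small but still resolved by the cloud.
    tree = cKDTree(pos)
    lo, hi = pos.min(axis=0) - 0.3, pos.max(axis=0) + 0.3
    samples = lo + (hi - lo) * rng.random((300_000, d))
    dist, _ = tree.query(samples)
    for eps in (0.10, 0.15, 0.20):
        vol = np.prod(hi - lo) * np.mean(dist < eps)
        print(f"eps={eps:.2f}  eps^(2-d) * vol / mass = {eps ** (2 - d) * vol / mass:.3f}")

Tying the branching rate N to the particle mass 1/N is what produces a random measure-valued limit; with a fixed branching rate the same particle system would merely approximate the deterministic heat flow.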
38 3.4 Proofs of lemmas Proof of Lemma 3.1: (i) If v solves the evolution equation for , that is, _v = 12 v v2 then so does ~v(t;x) = r2v(r2t;rx). Writing ~ t = r 2 r2tSr, ~ = r 2 Sr, and ~f(x) = r2f(rx), we get E e ~ t ~f = E e r2tf = e vr2t = e ~ ~vt = E~ e t ~f; and so L (~ ) =L~ ( ), which is equivalent to (i). (ii) De ne the cluster kernel by x = Lx( ), x 2 Rd, and consider the cluster de- composition = R m (dm), where is a Poisson process with intensity when 0 = . Here r 2 r2tSr = Z (r 2mr2tSr) (dm); r;t> 0: Using (i) and the uniqueness of the L evy measure, we obtain (r 2 Sr) = ( fr 2 ^mr2Sr2 g); which is equivalent to r 2L Sr( ) =Lr 2 Sr( ) =L (r 2^ r2Sr): Proof of Lemma 3.2: Under P we have t = Pi it, where the it are conditionally independent clusters of age t rooted at the points of a Poisson process with intensity =t. For a cluster rooted at x, the hit- ting probability is bx = Pxf tB > 0g. Hence (e.g. by Proposition 12.3 in [24]), the number of clusters hitting B is Poisson distributed with mean b=t, and so P f tB = 0g= exp( b=t), 39 which yields the asserted formulas. Proof of Lemma 3.3: Proof of Lemma 3.4: Let s be the Cox process of ancestors to t at time s = t h, and write ih for the associated h-clusters. Using Lemma 3.3, the conditional independence of the clusters, and the fact that E 2s = h 2E 2s outside the diagonal, we get with p"h(x) = Pxf hB"0 > 0g E "h( "h 1) = E XX i6=j1f i hB " 0^ j hB " 0 > 0g = ZZ x6=y p"h(x)p"h(y)E 2s(dxdy) <_ "2(d 2) ZZ ph(")(x)ph(")(y)E 2s(dxdy): By the formula of rst moment, Fubini?s theorem, and the semigroup property of (pt), we get Z ph(")(x)E s(dx) = Z ph(")(x) ( ps)(x)dx = Z (du) (ph(") ps)(u) = pt("): Next, we get by the formula of second moment, Fubini?s theorem, the properties of (pt), and the relations t t" 2t s ZZ ph(")(x)ph(")(y) Cov s(dxdy) = 2 ZZ ph(")(x)ph(")(y)dxdy Z (du) Z s 0 dr Z pr(v u)ps r(x v)ps r(y v)dv 40 = 2 Z (du) Z s 0 dr Z pr(u v) pt(") r(v) 2dv <_ Z (du) Z s 0 (t r) d=2 (pr p(t(") r)=2)(u)dr = Z (du) Z s 0 (t r) d=2p(t(")+r)=2(u)dr <_ Z pt(u) (du) Z t h r d=2dr<_ pth1 d=2: The assertion follows by combination of these estimates. Proof of Lemma 3.6: (i) By Fubini?s theorem and the de nitions of "h and p"h, we have Ex "hf = Ex Z 1f hB"u > 0gf(u)du = (p"h f)(x); and so by independence E hX i i" hf i = Z (dx)Ex "hf = (p"h f): (3.4) Hence, by Fubini?s theorem E X i i" hf = E (p " h f) = (p " h f) = (( p " h) d)f: (ii) First, Varx( K"h f) Ex( K"h f)2 Exk K"h k2kfk2 =kfk2Exk K"h k2: For Exk K"h k2, using Cauchy inequality and Lemma 3.3, we get Exk K"h k2 = Ex Z 1f Kh B"y > 0gdy Z 1f Kh B"z > 0gdz 41 = Z Z Px f Kh B"y > 0g\f Kh B"z > 0g dydz Z Z (Pxf Kh B"y > 0gPxf Kh B"z > 0g)1=2dydz <_ ah"d 2= Z Z (p2h(y x)p2h(z x))1=2dydz =_ ah"d 2= hd=2 Z Z p4h(y x)p4h(z x)dydz = ah"d 2= hd=2: Hence, by independence E Var hX i Ki" h fj i = E Z (dx)Varx( K"h f) <_ ah"d 2= hd=2kfk2k k: Proof of Lemma 3.7: Let "h(x) denote the number of subclusters of age h hitting B"x at time t. 
Then Lemma 3.4 yields, E X i i" h " t = E Z X i1f i hB " x > 0g 1f B " x > 0g dx = Z E ( "h(x) 1)+dx <_ "2(d 2) d h1 d=2( pt) + ( p2t)2 <_ "2(d 2) h1 d=2k k+t d=2k k2 : Proof of Lemma 3.8: 42 Using (3.1) and Lemmas 3.1 (ii), 3.2, and 3.3, we get by dominated convergence dp"h = hd=2 dp"= ph 1 cdh d=2 ("=ph)d 2 dp1 = cd"d 2h: (3.5) Similarly, Lemma 3.3 yields for xed r> 0 and a standard normal random vector in Rd "2 dh 1 Z jxj>r p"h(x)dx <_ Z juj>r=ph pl(")(u)du = P n j jl1=2" >r= p h o !0: (3.6) By (3.5) it is enough to show thatk^p"h f fk!0 as h, "2=h!0, where ^p"h = p"h= dp"h. Writing wf for the modulus of continuity of f, we get k^p"h f fk = supx Z ^p"h(u) (f(x u) f(x))du Z ^p"h(u)wf(juj)du wf(r) + 2kfk Z juj>r ^p"h(u)du; which tends to 0 as h, "2=h!0 and then r!0, by (3.6) and the uniform continuity of f. 43 Chapter 4 Lebesgue Approximation of (2; )-Superprocesses 4.1 Introduction Throughout this chapter, we use f to denote the integral of the function f with respect to the measure . By an ( ; )-superprocess (or ( ; )-process, for short) in Rd we mean a vaguely rcll, measure-valued strong Markov process = ( t) in Rd satisfying E e tf = e vt for suitable functions f 0, where v = (vt) is the unique solution to the evolution equation _v = 12 v v1+ with initial condition v0 = f. Here = ( ) =2 is the fractional Laplacian, 2 (0;2] refers to the spatial motion, and 2 (0;1] refers to the branching mechanism. When = 2 and = 1 we get the Dawson{Watanabe superprocess (DW- process for short), where the spatial motion is standard Brownian motion. General surveys of superprocesses include the excellent monographs and lecture notes [5, 15, 17, 32, 35, 42]. In this chapter we consider superprocesses with possibly in nite initial measures. Indeed, by the additivity property of superprocesses, we can construct the ( ; )-process with any - nite initial measure . In Lemma 4.5 we show that t is a.s. locally nite for every t> 0 i p (t; ) <1for all t, where p (t;x) denotes the transition density of a symmetric -stable process in Rd. Note that when = 2, p2(t;x) = pt(x) is the normal density in Rd. For any measure on Rd and constant " > 0, write " for the restriction of Lebesgue measure d to the "-neighborhood of supp . For a DW-process in Rd with any nite initial measure, Tribe [48] showed that "2 d "t w!cd t a.s. as "!0 when d 3, where w! denotes weak convergence and cd > 0 is a constant depending on d. For a locally nite DW-process in R2, Kallenberg [25] showed that ~m(")jlog"j "t v! t a.s. as "!0, where v!denotes vague convergence and ~m is a suitable normalizing function. Our main result in this chapter is Theorem 4.18, where we prove that, for a locally nite (2; )-process in Rd with < 1 and 44 d> 2= , "2= d "t v!c ;d t a.s. as "!0, where c ;d > 0 is a constant depending on and d. In particular, the (2; )-process t distributes its mass over supp t in a deterministic manner, which extends the corresponding property of DW-processes (cf. [17], page 115, or [42], page 212). See the end of the present chapter for a detailed explanation of this deterministic distribution property. For DW-processes, this property can also be inferred from some deep results involving the exact Hausdor measure (cf. [7]). However, for any ( ; )-process with < 2, supp t = Rd or ; a.s. (cf. [18, 40]), and so the corresponding property fails. Our result shows that this property depends only on the spatial motion. To prove our main result, we adapt the probabilistic approach for DW-processes from [25]. 
However, the nite variance of DW-processes plays a crucial role there. In order to deal with the in nite variance of (2; )-processes with < 1, we use a truncation of ( ; )- processes from [38], which will be further developed in Section 2 of the present chapter. By this truncation we may reduce our discussion to the truncated processes, where the variance is nite. To adapt the probabilistic approach from [25] to study the truncated processes, we also need to develop some technical tools. Thus, in Section 3 we improve the upper bounds of hitting probabilities for (2; )-processes with < 1 and their truncated processes. As an immediate application, in Theorem 4.8 we improve some known extinction criteria of the (2; )-process by showing that the local extinction property t d!0 and the seemingly stronger support property supp t d!; are equivalent. Then in Section 4 we derive some asymptotic results of these hitting probabilities. In particular, for the (2; )-process we show in Theorem 4.15 that "2= dP f tB"x > 0g! c ;d ( pt)(x), where Brx denotes an open ball around x of radius r, which extends the corresponding result for DW-processes (cf. Theorem 3.1(b) in [6]). Since the truncated processes do not have the scaling properties of the (2; )-process, our general method is rst to study the (2; )-process, then to estimate the truncated processes by the (2; )-process, in order to get the needed results for the truncated processes. 45 The extension of results of DW-processes to general ( ; )-processes is one of the major themes in the research of superprocesses. Since the spatial motion of the ( ; )-process is not continuous when < 2 and the ( ; )-process has in nite variance when < 1, many extensions are not straightforward, and some may not even be valid. However, it turns out that several properties of the support of (2; )-processes depend only on the spatial motion. These properties include short-time propagation of the support (cf. Theorem 9.3.2.2 in [5]) and Hausdor dimension of the support (cf. Theorem 9.3.3.5 in [5]). Our result also belongs to that category. In this chapter we are mainly using the notations in [25]. Recall that the double bars k kdenote the supremum norm when applied to functions and total variation when applied to signed measures. We also use relations such as =_ , <_ , and , where the rst two mean equality and inequality up to a constant factor, and the last one is the combination of <_ and >_ . Other notation will be explained whenever it occurs. 4.2 Truncated superprocesses and local niteness Although our main result of the present chapter is about (2; )-processes, in this section we discuss the truncation and local niteness of all ( ; )-processes, due to their independent interests. It is well known that the ( ;1)-process has weakly continuous sample paths. By contrast, the ( ; )-process with < 1 has only weakly rcll sample paths with jumps of the form t = r x, for some t> 0, r> 0, and x2Rd. Let N (dt;dr;dx) = X (t;r;x): t=r x (t;r;x): 46 Clearly the point process N on R+ R+ Rd records all information about the jumps of . By the proof of Theorem 6.1.3 in [5], we know that N has compensator measure ^N (dt;dr;dx) = c (dt)r 2 (dr) t(dx); (4.1) where c is a constant depending on . Due to all the \big" jumps, t has in nite variance. Some methods for ( ;1)-processes, which rely on the nite variance of the processes, are not directly applicable to ( ; )-processes with < 1. 
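To see heuristically why the "big" jumps destroy the second moment, and why capping them restores it, integrate the square of the jump size against the Lévy density r^{−2−β} appearing in (4.1): for β < 1,

∫_1^∞ r² · r^{−2−β} dr = ∫_1^∞ r^{−β} dr = ∞,   while   ∫_0^K r² · r^{−2−β} dr = K^{1−β}/(1−β) < ∞ for every K > 0.

Thus the jumps of unbounded size are exactly what ruins the variance, whereas a process whose jumps are capped at a level K has finite variance; this is what the truncation discussed next achieves. In the same spirit, (4.1) together with E_μ ||ξ_s|| = ||μ|| shows that the expected number of jumps of size greater than K up to time t is of the order

c_β ∫_K^∞ r^{−2−β} dr · E_μ ∫_0^t ||ξ_s|| ds = c_β t ||μ|| K^{−1−β}/(1 + β),

which tends to 0 as K → ∞; this back-of-the-envelope computation is made precise in Lemma 4.1 below.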
In [38], Mytnik and Villa introduced a truncation method for ( ; )-processes with < 1, which can be used to study ( ; )-processes with < 1, especially to extend results of ( ;1)-processes to ( ; )-processes with < 1. Speci cally, for the ( ; )-process with < 1, we de ne the stopping time K = infft > 0 : k tk> Kg for any constant K > 0, where inf;=1 as usual. When t = r x, we see that k tk= r. Clearly K is the time when has the rst jump greater than K. For any nite initial measure , they proved that one can de ne and a weakly rcll, measure-valued Markov process K (which is YK on page 485 of [38]) on a common probability space such that t = Kt for t< K. Intuitively, K euqals minus all masses produced by jumps greater than K along with the future evolution of those masses. In this chapter, we call K the truncated K-process of . Since all \big" jumps are omitted, Kt has nite variance. They also proved that Kt and t agree asymptotically as K!1. We give a di erent proof of this result, since similar ideas will also be used at several crucial stages later. We write P f 2 gfor the distribution of with initial measure . Lemma 4.1 Fix any nite and t> 0. Then P f K >tg!1 as K!1. Proof: If K t, then has at least one jump greater than K before time t. Noting that N ([0;t];(K;1);Rd) is the number of jumps greater than K before time t, we get by 47 Theorem 25.22 of [24] and (4.1), P f K tg E N [0;t];(K;1);Rd = E ^N [0;t];(K;1);Rd =_ K 1 E Z t 0 k skds = tk kK 1 !0 as K!1, where the last equation holds by E k sk=k k. Using Lemma 1 of [38] and a recursive construction, we can prove that Kt (!) t(!) for any t and !. So indeed, K is a \truncation" of . Lemma 4.2 We can de ne and K on a common probability space such that: (i) is an ( ; )-process with < 1 and a nite initial measure , and K is its truncated K-process, which has no jumps greater than K, (ii) t(!) Kt (!) for any t and !, (iii) t(!) = Kt (!) for t< K(!). Proof: Let m;n(t) denote the process m;n at time t. Use D([0;1); ^Md) as our , the space of rcll functions from [0;1) to ^Md, where ^Md is the set of nite measures on Rd. We endow with the Skorohod J1-topology. Let A= B( ). Let 1(t;!) = !(t) be an ( ; )-process de ned on ( ;A;P) with initial measure , and de ne K1 = infft > 0 : k 1(t)k> Kg. Then de ne a kernel u from ^Md to such that u( ; ) is the distribution of an ( ; )-process with initial measure , and a kernel uK from ^Md to such that uK( ; ) is the distribution of the truncated K-process of an ( ; )-process with initial measure . By Lemma 6.9 in [24], we can de ne 1;1 to be an ( ; )-process with initial measure 1( K1) on an extension of ( ;A;P), and 01;1 to be the truncated K-process 48 of an ( ; )-process with initial measure 1( K1). Now de ne 1 and K1 by 1(t) = 8 >< >: 1(t); t< K1; 1;1(t K1); t K1; K1 (t) = 8 >< >: 1(t); t< K1; 01;1(t K1); t K1: By the strong Markov property of ( ; )-processes and the above construction, we can verify that 1 is an ( ; )-process. By Lemma 1 in [38], K1 is the truncated K-process of an ( ; )-process. Moreover, 1 and K1 satisfy conditions (ii) and (iii) on [0; K1). Let u0 be a kernel from ^Md ^Md toA Asuch that u0( ; 0; ; ) is the distribution of a pair of two independent ( ; )-processes with initial measures and 0 respectively. De ne ( 2;0; 2;1) with distribution u0 K1 ( K1); 1( K1) K1 ( K1); ; : Let 2 = 2;0 + 2;1, 02 = 2;0, and K2 = infft > 0 : k 2(t)k > Kg. Let 2;1 be an ( ; )-process with initial measure 2( K2), and let 02;1 be the truncated K-process of an ( ; )-process with initial measure 02( K2). 
Now de ne 2 and K2 by 2(t) = 8> >>> < >>> >: 1(t); t< K1; 2(t K1); K1 t< K1 + K2; 2;1(t K1 K2); t K1 + K2; K2 (t) = 8 >>> >< >>> >: K1 (t); t< K1; 02(t K1); K1 t< K1 + K2; 02;1(t K1 K2); t K1 + K2: 49 Similarly, 2 is an ( ; )-process and K2 is the truncated K-process of an ( ; )-process. They satisfy conditions (ii) and (iii) on [0; K1 + K2). Continue the above construction: For every n, de ne n and Kn such that n is an ( ; )- process, Kn it the truncated K-process of an ( ; )-process, and they satisfy conditions (ii) and (iii) on [0;Pnk=1 Kk). It su ces to prove thatP1k=1 Kk =1a.s. Suppose P(P1k=1 Kk <1) > 0. Then there exist t and a such that P(P1k=1 Kk 0. Since for every n, n is an ( ; )-process with initial measure , we get an E ^N n [0;t];(K;1);Rd : Noting that by (4.1) E ^N n([0;t];(K;1);Rd) is the same nite constant for di erent n, we get a contradiction. So P1k=1 Kk =1 a.s. Just as the DW-process, the ( ; )-process and its truncated K-process K also have cluster structures (cf. Corollary 11.5.3 in [5], or Section 3 in [7], especially page 41 there). Speci cally, for any xed t, t is a Cox cluster process, such that the \ancestors" of t at time s = t h form a Cox process directed by ( h) 1= s, and the generated h-clusters ih are conditionally independent and identically distributed apart from shifts. For the truncated K-process K, the situation is similar, except that the clusters are di erent (because of the truncation) and the term ( h) 1= for needs to be replaced by aK(h) (or ah, when K is xed). Use K;ih (or Kih ) to denote the generated h-clusters of K. Write Pxf t 2 g for the distribution of t centered at x2Rd, and de ne P f t 2 g = R (dx)Pxf t 2 g. The following comparison of aK(h) and ( h) 1= , although not used explicitly in the present chapter, should be useful in other applications of the truncation method. 50 Lemma 4.3 Fix any K > 0. Then as h!0, ( h)1= aK(h) 2( h)1= : Proof: From Lemma 3.4 of [7] we know that ( h)1= = lim !1 1=v0(h; ); where v0(h; ) is the solution of _v = v1+ with initial condition v , and aK(h) = lim !1 1=v1(h; ); where v1(h; ) is the solution of (1.12) in [38] with initial condition v . De ne MK( ) = C (K) + K( ), where C (K) and K are such as in (1.12) of [38]. Then MK satis es 1+ MK( ) and lim !1 MK( ) 1+ = 1: Clearly it is enough to show that (1=2)v0(h; ) v1(h; ) v0(h; ) as h! 0 and !1. This follows from the above properties of MK. Unlike the normal densities, we have no explicit expressions for the transition densities of symmetric -stable processes when < 2. However, a simple estimate of p (t;x) is enough for our needs. Lemma 4.4 Let p (t;x), 2(0;2], t> 0, and x2Rd, denote the transition densities of a symmetric -stable process on Rd. Then for any xed and d, p (t;x+y) <_ p (2t;x); jyj t: 51 Proof: First let = 2. Note that p2(t;x) = pt(x) is the standard normal density on Rd. For jxj 4pt, trivially pt(x+y) <_ p2t(x). For jxj> 4pt, it su ces to check that jx+yj 2 2t jxj2 4t ; that is, 2jx+yj2 jxj2, which follows easily from jxj 4jyj. Now let < 2. By the arguments after Remark 5.3 of [2], p (t;x) t d= ^ tjxjd+ : (4.2) Choose K > 21= to satisfy 1 2(1 1=K)d+ . Since jyj t1= , we have for jxj>Kt1= , t jx+yjd+ 2t jxjd+ : Noticing also that (2t)=jxjd+ < (2t) d= for jxj>Kt1= , we get p (t;x + y) <_ p (2t;x) for jyj t1= andjxj>Kt1= . The same inequality holds trivially forjyj t1= andjxj Kt1= . Using Lemma 4.2 and Lemma 4.4, we can generalize Lemma 3.2 in [25] to any ( ; )- process. 
Lemma 4.5 Let be an ( ; )-process in Rd, 2(0;2] and 2(0;1], and x any - nite measure . Then for any xed t> 0, the following two conditions are equivalent: (i) t is locally nite a.s. P , (ii) E t is locally nite. Furthermore, (i) and (ii) hold for every t> 0 i (iii) p (t; ) <1 for all t> 0, and if < 2, then (iii) is equivalent to 52 (iv) p (t; ) <1 for some t> 0. Proof: The formulas for E t and E 2t (when < 1), well known for nite , as well as the formulas in Lemma 3 of [38], extend by monotone convergence to any - nite measure . We also need the simple inequality that for any xed < 2, s, and t, p (s;x) p (t;x): (4.3) To prove it, use (4.2) and consider three cases: jxj (s^t)1= , jxj (s_t)1= , and (s^t)1= 0 and x2Rd, p (t;x u) <_ p (jxj1= ;x u) <_ p (2jxj1= ; u) = p (2jxj1= ;u) yields p (t; )(x) <1. Now assume < 1. Condition (ii) clearly implies (i). Conversely, suppose that E tB = 1for some B. Then E Kt B =1for any xed K > 0 by Lemma 3 of [38]. Also, we get by Lemma 3 of [38], P K t B E Kt B >r (1 r)2 (E K t B) 2 E ( Kt B)2 (1 r)2 1 +ct(E Kt B) 1 for any r2(0;1). Arguing as in the proof of Lemma 3.2 in [25], we get Kt B =1 a.s., and so tB =1 a.s. by Lemma 4.2. In particular, this shows that (i) implies (ii). To prove the equivalence of (ii) and (iii), again using Lemma 4.4 and (4.3) we can proceed as in Lemma 3.2 of [25]. The last assertion is obvious from (4.3). 53 4.3 Hitting bounds and neighborhood measures From now on we consider only (2; )-processes. The Lebesgue approximation depends crucially on estimates of the hitting probability P f tB"0 > 0g. In this section, we rst estimate P f tB"0 > 0g and P f Kt B"0 > 0g. Then we use these estimates to study multiple hitting and neighborhood measures of the clusters Kh associated with the truncated K- process K. We begin with a well-known relationship between the hitting probabilities of t and t, which can be proved as in Lemma 4.1 of [25]. Lemma 4.6 Let the (2; )-process in Rd with associated clusters t be locally nite under P , let K be its truncated K-process with associated clusters Kt , and x any B2Bd. Then P f tB > 0g = ( t)1= log (1 P f tB > 0g); P f tB > 0g = 1 exp ( t) 1= P f tB > 0g ; P f Kt B > 0g = at log (1 P f Kt B > 0g); P f Kt B > 0g = 1 exp ( a 1t P f Kt B > 0g): In particular, P f tB > 0g ( t) 1= P f tB > 0g and P f Kt B > 0g a 1t P f Kt B > 0g as either side tends to 0. Upper and lower bounds of P f tB"0 > 0g have been obtained by Delmas [9], using a subordinated Brownian snake approach. However, in this chapter we need the following improved upper bound. Lemma 4.7 Let t be the clusters of a (2; )-process in Rd with < 1 and d > 2= , let Kt be the clusters of K, the truncated K-process of , and consider a - nite measure on Rd. Then for 0 <" pt, (i) pt0 <_ "2= d( t) 1= P f tB"0 > 0g<_ p2t; where t0 = t=(1 + ), (ii) "2= da 1t P f Kt B"0 > 0g<_ p2t: 54 Proof: (i) From the proof of Theorem 2.3 in [9] we know that Pxf tB"0 > 0g= 1 exp( NxfYtB"0 > 0g); where Nx and Yt are de ned in Section 4.2 of [9]. Comparing this with Lemma 5.4 yields ( t) 1= Pxf tB"0 > 0g= NxfYtB"0 > 0g: By Proposition 6.2 in [9] we get the lower bound. For our upper bound, we will now improve the upper bound in Proposition 6.1 of [9]. For 0 <"=2 "=2g [ f(r;y)2R+ Rd; r 0g<_ " 2= P0f s2B"=2x for some s2[t "2=16;t)g; where is a standard Brownian motion. De ne T = inffs t "2=16 : s2B"=2x g; where inf;=1 as usual. 
Then fT 0, y2B"=2x , and s s0 = "2=16, Pyf s =2B"xg Pzf s0 =2B"xg<_ Pzf s02B"xg Pyf s2B"xg; where z is a point on the surface of B"=2x , and the second relation holds since Pzf s0 =2B"xgand Pzf s02B"xg are both positive constants. Now return to P0fT 2= . This extends Theorem 4.5 of [25] for DW-processes of dimension d 2. Note that the special case of convergence of random measures t d!0 is equivalent to tB P!0 for any bounded Borel set B. Convergence of closed random sets is de ned as usual with respect to the Fell topology (cf. [24], pp. 324, 566). However, in this chapter we need only the special case of convergence to the empty set supp t d!;, which is equivalent to 1f tB > 0gP!0 for any bounded Borel set B. 56 Theorem 4.8 Let be a locally nite (2; )-process in Rd, < 1 and d> 2= , with arbitrary initial distribution. Then these conditions are equivalent as t!1: (i) t d!0, (ii) supp t d!;, (iii) 0pt P!0. Proof: By Lemma 5.4 and Lemma 5.5(i) we get for any xed r P f tBr0 > 0g ( t) 1= P f tBr0 > 0g<_ p2t; and so P f tBr0 > 0g<_ p2t^1. For a general initial distribution, Pf tBr0 > 0g<_ E( 0p2t^1); which shows that (iii) implies (ii). Since clearly (ii) implies (i), it remains to prove that (i) implies (iii). Let be locally nite under P . We rst choose f2C++c (Rd) with suppf2B10, where C++c (Rd) is such as in Proposition 2.6 of [29]. Clearly tf P!0 if tB10 P!0. By dominated convergence exp( vt) = E exp( tf)!1; and so vt!0. By Proposition 2.6 of [29], we have for t large enough pt=2(x) (t=2;x) <_ vt(x); where is de ned in (1.15) of [29] (on page 1061, see also (1.17) and (1.18) there). So pt=2 !0. For general 0, we may proceed as in the proof of Theorem 4.5 in [25]. 57 The following simple fact is often useful to extend results for nite initial measures to the general case. Here ^Bd denotes the space of bounded sets in the Borel -algebra Bd. Lemma 4.9 Let the (2; )-process in Rd with < 1 and d > 2= be locally nite under P , and suppose that n # 0. Then P nf tB > 0g! 0 as n!1 for any xed t > 0 and B2 ^Bd. Proof: Follow the proof of Lemma 4.3 in [25], then use Lemma 4.5, Lemma 5.4, and Lemma 5.5(i). As in [25] we need to estimate the probability that a ball in Rd is hit by more than one subcluster of the truncated K-process K. This is where the truncation of is needed. Lemma 4.10 Fix any K > 0. Let K be the truncated K-process of a (2; )-process in Rd with < 1 and d> 2= . For any t h> 0 and "> 0, let "h be the number of h-clusters of Kt hitting B"0 at time t. Then for "2 h t, E "h( "h 1) <_ "2(d 2= ) h1 d=2 pt + ( p2t)2 : Proof: Follow Lemma 4.4 in [25], then use Lemma 3 of [38] and Lemma 5.5(ii). Now we consider the neighborhood measures of the clusters Kh associated with the trun- cated K-process K. For any measure on Rd and constant "> 0, we de ne the associated neighborhood measure " as the restriction of Lebesgue measure d to the "-neighborhood of supp , so that " has Lebesgue density 1f B"x > 0g. Let pK;"h (x) = Pxf Kh B"0 > 0g, where the Kh are clusters of K. Write pK;"h (x) = pK"h (x) and ( K;ih )" = Ki"h for convenience. Lemma 4.11 Let K be the truncated K-process of a (2; )-process in Rd with < 1 and d> 2= . Let the Kih be conditionally independent h-clusters of K, rooted at the points of a Poisson process with E = . Fix any measurable function f 0 on Rd. Then, 58 (i) E Pi Ki"h = ( pK"h ) d , (ii) E Var Pi Ki"h fj <_ ah"d 2= hd=2kfk2k k for "2 h. Proof: (i) Follow the proof of Lemma 6.2 (i) in [25]. 
(ii) First, Varx( K"h f) Ex( K"h f)2 Exk K"h k2kfk2 =kfk2Exk K"h k2: For Exk K"h k2, using Cauchy inequality and Lemma 5.5(ii), we get Exk K"h k2 = Ex Z 1f Kh B"y > 0gdy Z 1f Kh B"z > 0gdz = Z Z Px f Kh B"y > 0g\f Kh B"z > 0g dydz Z Z (Pxf Kh B"y > 0gPxf Kh B"z > 0g)1=2dydz <_ ah"d 2= Z Z (p2h(y x)p2h(z x))1=2dydz =_ ah"d 2= hd=2 Z Z p4h(y x)p4h(z x)dydz = ah"d 2= hd=2: Hence, by independence E Var hX i Ki" h fj i = E Z (dx)Varx( K"h f) <_ ah"d 2= hd=2kfk2k k: We also need to estimate the overlap between subclusters. Lemma 4.12 Let K be the truncated K-process of a (2; )-process in Rd with < 1 and d > 2= . For any xed t > 0, let Kih denote the subclusters in K of age h > 0. Fix any 59 2 ^Md. Then as "2 h!0, E X i Ki" h K" t <_ "2(d 2= )h1 d=2: Proof: Follow the proof of Lemma 6.3(i) in [25], then use Lemma 5.5(ii). 4.4 Hitting asymptotics For a DW-process of dimension d 3, we know from Theorem 3.1(b) of Dawson, Iscoe, and Perkins [6] that, as "!0, "2 dP f tB"x > 0g!cd ( pt)(x); uniformly for bounded k k, bounded t 1, and x2Rd. A similar result for DW-processes of dimension d = 2 is Theorem 5.3(ii) of [25]. In this section, using Lemma 5.5(i), we can prove the corresponding result for (2; )-processes in Rd with < 1 and d> 2= . First we x a continuous function f on Rd such that 0 < f(x) 1 for x 2 B10 and f(x) = 0 otherwise. Let v be the solution of _v = 12 v v1+ with initial condition v(0) = f. Since v is increasing in , we can de ne v1 = lim !1v . Using Lemma 5.5(i), we can get an upper bound of v1, similar to Lemma 3.2 in [6]. Lemma 4.13 For any t 1 and x2Rd, v1(t;x) <_ p(2t;x). Proof: Letting !1 in Ex exp( t f) = exp[ v (t;x)], we get Pxf tB10 > 0g= 1 exp[ v1(t;x)]: Comparing this with Lemma 5.4 yields v1(t;x) = ( t) 1= Pxf tB10 > 0g: (4.4) 60 Now Lemma 4.13 follows from Lemma 5.5(i). As in Lemma 3.3 of [6], we can apply a PDE result to get the uniform convergence of v1. Notice that the improved upper bound in Lemma 5.5(i) is crucial here. Lemma 4.14 There exists a constant c ;d > 0 such that lim"!0" dv1(" 2t;" 1x) = c ;d p(t;x): The convergence is uniform for bounded t 1 and x2Rd. Proof: We follow the proof of Lemma 3.3 in [6]. By Lemma 4.13, v1(t;x) is nite for any t 1 and x2Rd. Then by a standard regularity argument in PDE theory, _v1 = 12 v1 v1+ 1 (4.5) on [1;1) Rd. By Lemma 4.13, v1(1)2L1(Rd). Set w"(t;x) = " dv1(1 +" 2t;" 1x): Then by (4.5), _w" = 12 w" " d 2w1+ " with initial condition w"(0;x) = " dv1(1;" 1x). Applying Proposition 3.1 in [21] gives lim"!0" dv1(1 +" 2t;" 1x) = c ;d p(t;x); uniformly on compact subsets of (0;1) Rd. Together with Lemma 4.13 this yields the uniform convergence on [a;1) Rd for any a> 0. Moreover, letting t = t0 "2, we get lim"!0" dv1(" 2t0;" 1x) = c ;d p(t0;x); 61 uniformly on [a;1) Rd for any a> 0. It remains to prove that c ;d > 0. Using (4.4) and the lower bound in Lemma 5.5(i), we obtain " dv1(" 2t;" 1x) = " d( t) 1= P" 1xf " 2tB10 > 0g >_ " dp " 2t 1 + ;" 1x = p t 1 + ;x ; and so c ;d > 0. Now we can derive the asymptotic hitting rate for a (2; )-process. Theorem 4.15 Let the (2; )-process in Rd with < 1 and d> 2= be locally nite under P . Fix any t> 0 and x2Rd. Then as "!0, "2= dP f tB"x > 0g!c ;d( pt)(x): The convergence is uniform for bounded k k, bounded t 1, and x2Rd. Similar results hold for the clusters t with pt replaced by ( t)1= pt. Proof: We rst prove that as "!0, "2= d( t) 1= P f tB"x > 0g!c ;d( pt)(x); (4.6) uniformly for bounded k k, bounded t 1, and x2Rd. Use x to denote the measure shifted by x. 
If is nite, then by the scaling of , (4.4), and Lemma 4.14, we can get the following chain of relations, which proves the uniform convergence of (4.6): "2= d( t) 1= P f tB"x > 0g 62 = "2= d( t) 1= Z Pyf tB"0 > 0g( x)(dy) = "2= d( t) 1= Z Py="f t="2B10 > 0g( x)(dy) = " d Z v1(" 2t;" 1y)( x)(dy)!c ;d( pt)(x): Let be an in nite - nite measure satisfying pt <1 for all t. From the proof of Lemma 4.5, we know that ( p2t)(x) <1for any x2Rd. Then by dominated convergence based on Lemma 5.5(i), we can still get (4.6). Now we turn to t. First note that by Lemma 5.4, as "!0, "2= dP f tB"x > 0g!c , "2= d( t) 1= P f tB"x > 0g!c; (4.7) "2= dP f Kt B"x > 0g!c , "2= da 1t P f Kt B"x > 0g!c: (4.8) It remains to prove the uniform convergence for t. Since ( pt)(x) t d=2k k, we know that by (4.6), ( t) 1= P f tB"x > 0g! 0, uniformly for bounded k k, bounded t 1, and x2Rd. Then we may use Lemma 5.4 to get the uniform convergence for t. The following result, especially part (ii), will play a crucial role in Section 5. Here we approximate the hitting probabilities pK"h by suitably normalized Dirac functions. This will be used in Lemma 4.17 to prove the Lebesgue approximation of K. Lemma 4.16 Let p"h(x) = Pxf hB"0 > 0g, where the h are clusters of a (2; )-process in Rd with < 1 and d> 2= . Recall that pK"h (x) = Pxf Kh B"0 > 0g, where the Kh are clusters of K, the truncated K-process of . Fix any bounded, uniformly continuous function f 0 on Rd. (i) As 0 <"2 h!0, "2= d( h) 1= (p" h f) c ;df !0: 63 (ii) Fix any b2(0;1=2). Then as 0 <"2 h!0 with "2= dh1+bd!0, "2= da 1 h (p K" h f) c ;df !0: Both results hold uniformly over any class of uniformly bounded and equicontinuous functions f 0 on Rd. Proof: (i) We follow the proof of Lemma 5.2(i) in [25]. By scaling of and (4.6), "2= d( h) 1= dp"h = ("= p h)2= d( ) 1= dp"= ph 1 !c ;d: (4.9) De ning ^p"h = p"h= dp"h, we need to show that k^p"h f fk!0. Write wf for the modulus of continuity of f, that is, a function wf = w(f; ) de ned by wf(r) = supfjf(s) f(t)j;s;t2Rd;js tj rg; r> 0: Clearly wf(r)!0 as r!0 since f is uniformly continuous. Now we get k^p"h f fk = supx Z ^p"h(u) (f(x u) f(x)) du Z ^p"h(u)wf(juj)du wf(r) + 2kfk Z juj>r ^p"h(u)du: It remains to show that Rjuj>r ^p"h(u)du! 0 for any xed r > 0. Then notice that for any xed r> 0 by Lemma 5.5(i), "2= d( h) 1= Z juj>r p"h(u)du<_ Z juj>r p2h(u)du!0: 64 (ii) For pK"h , Lemma 5.5(ii) yields for any xed r> 0, "2= da 1h Z juj>r pK"h (u)du<_ Z juj>r p2h(u)du!0: Following the steps of the previous proof, it is enough to show that "2= da 1h dpK"h !c ;d: (4.10) Since Rjuj>hbp2h(u)du!0, Lemma 5.5 yields "2= d( h) 1= 1f(Bhb0 )cg dp"h!0; "2= da 1h 1f(Bhb0 )cg dpK"h !0: By (5.12), to prove (4.10) it su ces to show that "2= d( h) 1= 1fBhb0 g dp"h "2= da 1h 1fBhb0 g dpK"h !0; or equivalently (by (4.7) and (4.8)), "2= d P1fBhb 0 g d f hB"0 > 0g P1fBhb 0 g d f Kh B"0 > 0g !0: By Theorem 25.22 of [24] and (4.1), "2= d P1fBhb 0 g d f hB"0 > 0g P1fBhb 0 g d f Kh B"0 > 0g "2= dE1fBhb 0 g d N [0;h];(K;1);Rd = "2= dE1fBhb 0 g d ^N [0;h];(K;1);Rd =_ "2= dE Z h 0 k skds =_ "2= dh1+bd!0: 65 4.5 Lebesgue approximations To prove the Lebesgue approximation for a (2; )-process in Rd with < 1 and d > 2= , we begin with the Lebesgue approximation for K, the truncated K-process of . Since and K agree asymptotically as K !1, we have thus proved the Lebesgue approximation for . Write ~c ;d = 1=c ;d for convenience, where c ;d is such as in Lemma 4.14. Recall that K"t = ( Kt )", the "-neighborhood measure of Kt . 
Lemma 4.17 Let K be the truncated K-process of a (2; )-process in Rd with < 1 and d> 2= . Fix any 2 ^Md and t> 0. Then under P , we have as "!0: ~c ;d"2= d K"t w! Kt a.s. Proof: We follow the proof of Theorem 7.1 in [25]. Fix any f 2CdK. Write Kih for the subclusters of Kt of age h. Since the ancestors of Kt at time s = t h form a Cox process directed by Ks =ah, Lemma 5.7(i) yields E hX i Ki" h f Ks i = a 1h Ks (pK"h f); and so by Lemma 5.7(ii) E X i Ki" h f a 1 h K s (p K" h f) 2 = E Var hX i Ki" h f Ks i <_ ah"d 2= hd=2kfk2E k Ks =ahk "d 2= hd=2kfk2k k; where the last inequality follows from E k Ks k k k. Combining with Lemma 5.8 gives E K"t f a 1h Ks (pK"h f) E K"t f X i Ki" h f +E X i Ki" h f a 1 h K s (p K" h f) 66 <_ "2(d 2= ) h1 d=2kfk+"1=2(d 2= ) hd=4kfk = "d 2= "d 2= h1 d=2 +" 1=2(d 2= )hd=4 kfk: Let c satisfy (d 2= ) + ( d=2 + 1=2)c = 0: (4.11) Clearly c 2 (0;2). Taking " = rn for a xed r 2 (0;1) and h = "c = rcn, and writing sn = t h = t rcn, we obtain E X n rn(2= d) Krnt f a 1rcn Ksn(pKrnrcn f) <_ X n r[(d 2= )+( d=2+1)c]n +r[ 1=2(d 2= )+(d=4)c]n kfk<1; since (d 2= ) + ( d=2 + 1)c> 0 and 1=2(d 2= ) + (d=4)c> 0 by (4.11). This implies rn(2= d) Krnt f a 1rcn Ksn(pKrnrcn f) !0 a.s. P : (4.12) Now we write "2= d K" t f c ;d K t f "2= d K"t f a 1h Ks (pK"h f) +c ;dj Ks f Kt fj +k Ks k "2= da 1h (pK"h f) c ;df : For the last term, we rst x b = 1=2 1=d, then apply Lemma 4.16. Noting that by (4.11) (2= d) + (1 +bd)c = (2= d) + (d=2)c> 0; we get by Lemma 4.16 "2= da 1 h (p K" h f) c ;df !0 67 along the sequence (rn). Using (5.13) and the a.s. weak continuity of K at the xed timet, we see that the right-hand side tends a.s. to 0 as n!1, which implies "2= d K"t f!c ;d Kt f a.s. as "!0 along the sequence (rn) for any xed r2(0;1). Since this holds simultaneously, outside a xed null set, for all rational r2 (0;1), the a.s. convergence extends by Lemma 2.3 in [25] to the entire interval (0;1). Applying this result to a countable, convergence-determining class of functions f (cf. Lemma 3.2.1 in [5]), we obtain the required a.s. vague convergence. Since is nite, the (2; )-process t has a.s. compact support (cf. Theorem 9.3.2.2 of [5] and the proof of Theo- rem 1.2 in [6]). By Lemma 4.2, Kt also has a.s. compact support, and so the a.s. convergence remains valid in the weak sense. Now we may prove our main result, the Lebesgue approximation of (2; )-processes. Again, we write ~c ;d = 1=c ;d for convenience, where c ;d is such as in Lemma 4.14. Also recall that "t = ( t)" denotes the "-neighborhood measure of t. For random measures n and on Rd, n v! (or w!) in L1 means that nf! f in L1 for all f in CdK (or Cdb). Theorem 4.18 Let the (2; )-process in Rd with < 1 and d> 2= be locally nite under P , and x any t> 0. Then under P , we have as "!0: ~c ;d"2= d "t v! t a.s. and in L1: This remains true in the weak sense when is nite. The weak version holds even for the clusters t when k k= 1. Proof: For a nite initial measure , by Lemma 4.17 and Lemma 4.1 we get as "!0 ~c ;d"2= d "t w! t a.s: 68 For a general - nite measure on Rd with pt <1 for all t > 0, write = 0 + 00 for a nite 0, and let = 0 + 00 be the corresponding decomposition of into independent components with initial measures 0 and 00. Fixing an r> 1 with suppf Br 10 and using the result for nite , we get a.s. on f 00tBr0 = 0g "2= d "tf = "2= d 0"t f!c ;d 0tf = c ;d tf: As 0" , we get by Lemma 4.9 P f 00tBr0 = 0g= P 00f tBr0 = 0g!1; and the a.s. convergence extends to . As in the proof of Lemma 4.17, we can obtain the required a.s. 
vague convergence. To prove the convergence in L1, we note that for any f2CdK "2= dE "tf = "2= d Z P f tB"x > 0gf(x)dx ! Z c ;d ( pt)(x)f(x)dx = c ;dE tf; (4.13) by Theorem 4.15. Combining this with the a.s. convergence under P and using Proposition 4.12 in [24], we obtain E j"2= d "tf c ;d tfj! 0. For nite , (4.13) extends to any f 2Cdb by dominated convergence based on Lemmas 5.4 and 5.5(i), together with the fact that d( pt) =k k<1 by Fubini?s theorem. To extend the Lebesgue approximation to the individual clusters t, let 0 denote the process of ancestors of t at time 0, and note that Pxf t2 g= P x[ t2 jk 0k= 1]; 69 where P xfk 0k = 1g = ( t) 1= e ( t) 1= > 0. The a.s. convergence then follows from the corresponding statement for t. Since P f t2 g= Z (dx)Pxf t2 g; the a.s. convergence under any P with k k = 1 also follows. To obtain the weak L1- convergence in this case, we note that for f2Cdb, "2= dE "tf = "2= d Z P f tB"x > 0gf(x)dx ! c ;d ( t)1= Z ( pt)(x)f(x)dx = c ;dE tf; by dominated convergence based on Lemma 5.5(i) and Theorem 4.15. As in Corollary 7.2 of [25], for the intensity measures in Theorem 4.18, we have even convergence in total variation. Corollary 4.19 Let be a (2; )-process in Rd with < 1 and d> 2= . Then for any nite and t> 0, we have as "!0: "2= dE "t c ;dE t !0: This remains true for the clusters t, and it also holds locally for t whenever is locally nite under P . Finally let us give a detailed explanation of the deterministic distribution property of (2; )-processes. Here the deterministic distribution property has two aspects. De ne deterministic functions "; similar to those de ned on page 309 of [41], Theorem 4.18 shows that a.s. (supp t) = t; 70 so a.s. t is a deterministic function of its support supp t. This is the rst aspect of the deterministic distribution property. Now the second aspect. Since d(@Brx) = 0, we get t(@Brx) = 0 a.s. by noting E t = ( pt) d. With the help of Portmanteau Theorem for nite measures, Theorem 4.18 shows that a.s. for all open balls B with rational centers and rational radius, lim"!0 "(supp t)(B) = t(B); so the construction of t(!) from its support supp t(!) is the same everywhere for any xed ! outside a null set. 71 Chapter 5 Lebesgue Approximation of Superprocesses with a Regularly Varying Branching Mechanism 5.1 Introduction Superprocesses are certain measure-valued Markov processes = ( t), whose distri- butions can be characterized by two components: the branching mechanism speci ed by a function (v), and the spatial motion usually given by a Markov process X. If X is a Feller process in Rd with generator L, then the laplace functional E exp( tf) satis es E [exp( tf)j s] = exp( svt s) where vt(x) is the unique nonnegative solution of the so- called evolution equation _v = Lv (v) with initial condition v0 = f. We call this superpro- cess an (L; )-superprocess (or (L; )-process for short). For 2(0;2] and 2(0;1], if X is a rotation invariant -stable L evy process in Rd with generator 12 and (v) = v1+ , we get a superprocess corresponding to the PDE _v = 12 v v1+ . We call it an ( ; )-superprocess (( ; )-process for short), which is just a (12 ;v1+ )-superprocess in our previous nota- tion. General surveys of superprocesses include the excellent monographs and lecture notes [5, 15, 17, 32, 35, 42]. For any measure on Rd and constant " > 0, write " for the restriction of Lebesgue measure d to the "-neighborhood of supp . For a (2,1)-process in Rd, Tribe [48] showed that "2 d "t w!cd t a.s. as "! 
0 for xed time t > 0 when d 3, where w! denotes weak convergence. Perkins [41] improved Tribe?s result by showing that the Lebesgue approxima- tion actually holds for all time t > 0 simultaneously. Kallenberg [25] proved the Lebesgue approximation of 2-dimensional (2,1)-processes. In [22], we showed that, for any (2; )- process in Rd with < 1 and d > 2= , "2= d "t w!c ;d t a.s. as " ! 0 for xed time t > 0. In particular, the Lebesgue approximation result implies that the superprocess t distributes its mass over supp t in a deterministic manner. See the end of [22] for a detailed 72 explanation of this deterministic distribution property. However, for any ( ; )-process with < 2, supp t = Rd or ; a.s. (cf. [18, 40]), and so the corresponding property fails. From all these Lebesgue approximation results, we raise the natural conjecture: Lebesgue approximation holds for superprocesses with Brownian spatial motion and any \reasonable" branching mechanism. As a rst step to prove this general conjecture, in this chapter we study the Lebesgue ap- proximation of superprocesses with Brownian spatial motion and a regularly varying branch- ing mechanism. For a precise description of the branching mechanism we consider in this chapter, refer to the beginning of Section 3. The stable branching mechanism (v) = v1+ with 2 (0;1] is a special case of the regularly varying branching mechanism we consider here. Our main result in this chapter is Theorem 5.5, where we prove that the Lebesgue approximation still holds for these more general superprocesses. Speci cally, ~m(") "t w! t a.s. as "! 0 for xed time t > 0, where m(") is a suitable normalizing function. In par- ticular, if the branching mechanism is the stable one, we may recover all previous Lebesgue approximation results for xed time t> 0. Although the previous conjecture may seems very natural, technically we have limited tools to support some rigorous arguments needed. One such boundary is imposed by the availability of the very important cluster representation of superprocesses. Luckily the super- processes we consider here do have the cluster representation. Another boundary is imposed by the availability of the lower and upper bounds of the hitting probabilities P f tB"x > 0g, which is fundamental for the Lebesgue approximation. The restriction on the branching mechanism we consider actually follows from Theorem 2.3 in [9], which is exactly the lower and upper bounds of the hitting probabilities. Armed with the hitting estimates, then we are able to overcome the main di culty in this chapter, that is, to obtain an asymptotic result of the hitting probabilities P f tB"x > 0g, which is Theorem 5.11. Note that for a (2; )-process, such a result is obtained by using the strong scaling property. Since the regularly varying branching mechanism we consider here 73 has much weaker scaling property, we then have to rely on only the cluster representation and the hitting estimations. Also the form of the asymptotic result of the hitting probabilities is not clear in our general setting. By adapting an idea in Section 5 of [25], we can get the correct form of our asymptotic result, which determines the form of the Lebesgue approximation. This chapter is organized as follows. In Section 2 we review the truncation of super- processes in a more general setting. In Section 3, we develop some lemmas about hitting bounds and neighborhood measures of the more general superprocesses. In Section 4, we derive some asymptotic results of these hitting probabilities. 
Finally in Section 5 we state and prove the Lebesgue approximation of superprocesses with a regularly varying branching mechanism and their truncated processes. This general result contains all previous Lebesgue approximation of superprocesses as special cases. 5.2 Truncation of superprocesses In this section we discuss the truncation of superprocesses with a general branching mechanism, due to their independent interests. We consider a general branching mechanism function de ned on R+ as (v) = av +bv2 + Z (0;1) (e rv 1 +rv) (dr); where b 0 and is a measure on (0;1) such that R10 (r^r2) (dr) <1. It is well known that the (L;1)-process has weakly continuous sample paths. By contrast, when 6= 0, the corresponding superprocess has only weakly rcll sample paths with jumps of the form t = r x, for some t> 0, r> 0, and x2Rd. Let N (dt;dr;dx) = X (t;r;x): t=r x (t;r;x): 74 Clearly the point process N on R+ R+ Rd records all information about the jumps of . By the proof of Theorem 6.1.3 in [5], we know that N has compensator measure ^N (dt;dr;dx) = (dt) (dr) t(dx): (5.1) Due to all the \big" jumps, t has in nite variance. Some methods for (L;1)-processes, which rely on the nite variance of the processes, are not directly applicable to superprocesses with a branching mechanism having 6= 0. Mt(f) = Mct (f) +Mdt (f) = tf 0f Z t 0 s(Lf)ds; tf = 0f + Z t 0 s(Lf)ds+Mct (f) +Mdt (f) where Mct (f) is a continuous martingale with quadratic variation process [Mc(f)]t = Z t 0 s(bf2)ds; (5.2) and Mdt (f) is a purely discontinuous martingale, which can be written as follows Mdt (f) = Z t 0 Z (0;1) Z Rd rf(x) ^N (dt;dr;dx) = Z t 0 Z (0;K] Z Rd rf(x) ^N (dt;dr;dx) + Z t 0 Z (K;1) Z Rd rf(x) ^N (dt;dr;dx) = Z t 0 Z (0;K] Z Rd rf(x) ^N (dt;dr;dx) + Z t 0 Z (K;1) Z Rd rf(x)N (dt;dr;dx) (K;1) Z t 0 sfds E [exp( Kt f)j Ks ] = exp( Ks vt s) (5.3) 75 _v = Lv K(v); (5.4) where K = (a (K;1))v +bv2 +R(0;K](e rv 1 +rv) (dr) Kt f = K0 f + Z t 0 Ks (Lf)ds+Mct (f) +Mdt (f) [K;1) Z t 0 Ks fds where Mct (f) is a continuous martingale with quadratic variation process [Mc(f)]t = Z t 0 Ks (bf2)ds; (5.5) and Mdt (f) is a purely discontinuous martingale, which can be written as follows Mdt (f) = Z t 0 Z (0;1) Z Rd rf(x) ^N K(dt;dr;dx) = Z t 0 Z (0;K) Z Rd rf(x) ^N K(dt;dr;dx) N K(dt;dr;dx) = X (t;r;x): Kt =r x (t;r;x): ^N K(dt;dr;dx) = (dt)1(0;K)(r) (dr) Kt (dx): (5.6) In [38], Mytnik and Villa introduced a truncation method for ( ; )-processes with < 1, which can be used to study ( ; )-processes with < 1, especially to extend results of ( ;1)-processes to ( ; )-processes with < 1. Speci cally, for the ( ; )-process with < 1, we de ne the stopping time K = infft > 0 : k tk> Kg for any constant K > 0. Clearly K is the time when has the rst jump greater than K. For any nite initial measure , they proved that one can de ne and a weakly rcll, measure-valued Markov process K on a common probability space such that t = Kt for t < K. Intuitively, K euqals minus all masses produced by jumps greater than K along with the future evolution 76 of those masses. In this paper, we call K the truncated K-process of . Since all \big" jumps are omitted, Kt has nite variance. They also proved that Kt and t agree asymptotically as K!1. We give a di erent proof of this result, since similar ideas will also be used at several crucial stages later. We write P f 2 gfor the distribution of with initial measure . Using the same proof of Lemma 1 in [38], we can construct and K on a common probability space such that t(!) = Kt (!) for t < K(!). 
This con rms our intuition that K euqals minus all masses produced by jumps greater than K along with the future evolution of those masses. Lemma 5.1 We can de ne and K on a common probability space such that: (i) is an ( ; )-process with < 1 and a nite initial measure , and K is its truncated K-process, (ii) t(!) = Kt (!) for t< K(!). Now we can prove that Kt and t agree asymptotically as K!1. We choose to give a complete proof of this result, since similar ideas will also be used at several crucial stages later. We write P f 2 g for the distribution of with initial measure . Lemma 5.2 Fix any nite and t> 0. Then P f K >tg!1 as K!1. Proof: If K t, then has at least one jump greater than K before time t. Noting that N ([0;t];(K;1);Rd) is the number of jumps greater than K before time t, we get by Theorem 25.22 of [24] and (4.1), P f K tg E N [0;t];(K;1);Rd = E ^N [0;t];(K;1);Rd =_ [K;1)E Z t 0 k skds = tk k [K;1)!0 77 as K!1, where the last equation holds by E k sk=k k. Using the same proof of Lemma 2.2 in [22], we can prove that Kt (!) t(!) for any t and !. So indeed, K is a \truncation" of . Lemma 5.3 We can de ne and K on a common probability space such that: (i) is an ( ; )-process with < 1 and a nite initial measure , and K is its truncated K-process, (ii) t(!) Kt (!) for any t and !, (iii) t(!) = Kt (!) for t< K(!). 5.3 Hitting bounds First we specify the regularly varying branching mechanism we consider for the Lebesgue approximation. We consider the increasing function de ned on R+ by (v) = bv2 + Z (0;1) 2rv2 1 + 2rv 0(dr); where b 0 and 0 is a measure on (0;1) such that R(0;1)(1^r) 0(dr) <1. To avoid trivial cases, we assume either b > 0 or 0((0;1)) = 1. The function can be expressed in the usual form for branching mechanism functions, (v) = bv2 + Z (0;1) (e rv 1 +rv) (dr); where (dr) = [R(0;1)e r=(2u)=(4u2) 0(du)]dr satis es R(0;1)(r^r2) (dr) <1. Notice that if we take b = 0 and 0(dr) = c0r (1+ )dr then we get the stable case (v) = cv1+ . We consider the following two assumptions: 78 (A1) The function is regularly varying at 1 with index 1 + where 2(0;1]; that is to say, limu!1 (ru) (u) = r1+ for every r> 0 (A2) lim supr!0+r (1+ ) (r) <1. The stable case (v) = v1+ satis es all these assumptions. The Lebesgue approximation depends crucially on estimates of the hitting probability P f tB"0 > 0g. In this section, we rst estimate P f tB"0 > 0g and P f Kt B"0 > 0g. Then we use these estimates to study multiple hitting and neighborhood measures of the clusters Kh associated with the truncated K-process K. We begin with a well-known relationship between the hitting probabilities of superprocesses and their clusters, which can be proved as in Lemma 4.1 of [25]. Lemma 5.4 Let the ( ; )-process in Rd with associated clusters t be locally nite under P , let K be its truncated K-process with associated clusters Kt , and x any B2Bd. Then P f tB > 0g = at log (1 P f tB > 0g); P f tB > 0g = 1 exp a 1t P f tB > 0g ; P f Kt B > 0g = aKt log (1 P f Kt B > 0g); P f Kt B > 0g = 1 exp ( (aKt ) 1P f Kt B > 0g): In particular, P f tB > 0g a 1t P f tB > 0g and P f Kt B > 0g (aKt ) 1P f Kt B > 0g as either side tends to 0. Upper and lower bounds of P f tB"0 > 0g have been obtained by Delmas [9], using the Brownian snake. However, in this paper we need the following improved upper bound. Lemma 5.5 Let t be the clusters of a (2; )-process in Rd with < 1 and d > 2= , let Kt be the clusters of K, the truncated K-process of , and consider a - nite measure on Rd. 
Then for 0 <" pt, 79 (i) l2(") pt0 <_ "2= da 1t P f tB"0 > 0g<_ l1(") p2t; where t0 = t=(1 + ), (ii) "2= d(aKt ) 1P f Kt B"0 > 0g<_ l1(") p2t: Proof: (i) Follow the proof of Lemma 6.3(i) in [25], then use Lemma 5.5(ii). (ii) This is obvious from (i), Lemma 4.2, and Lemma 5.4. As in [25] we need to estimate the probability that a ball in Rd is hit by more than one subcluster of the truncated K-process K. This is where the truncation of is needed. Lemma 5.6 Fix any K > 0. Let K be the truncated K-process of a (2; )-process in Rd with < 1 and d> 2= . For any t h> 0 and "> 0, let K"h be the number of h-clusters of Kt hitting B"0 at time t. Then for "2 h t, E K"h ( K"h 1) <_ l21(")"2(d 2= ) h1 d=2 pt + ( p2t)2 : Proof: Follow Lemma 4.4 in [25], then use Lemma 3 of [38] and Lemma 5.5(ii). Now we consider the neighborhood measures of the clusters Kh associated with the trun- cated K-process K. For any measure on Rd and constant "> 0, we de ne the associated neighborhood measure " as the restriction of Lebesgue measure d to the "-neighborhood of supp , so that " has Lebesgue density 1f B"x > 0g. Let pK;"h (x) = Pxf Kh B"0 > 0g, where the Kh are clusters of K. Write pK;"h (x) = pK"h (x) and ( K;ih )" = Ki"h for convenience. Lemma 5.7 Let K be the truncated K-process of a (2; )-process in Rd with < 1 and d> 2= . Let the Kih be conditionally independent h-clusters of K, rooted at the points of a Poisson process with E = . Fix any measurable function f 0 on Rd. Then, (i) E Pi Ki"h = ( pK"h ) d , (ii) E Var Pi Ki"h fj <_ l1(")aKh "d 2= hd=2kfk2k k for "2 h. 80 Proof: (i) Follow the proof of Lemma 6.2(i) in [25]. (ii)Follow the proof of Lemma 4.4(ii) in [22]. We also need to estimate the overlap between subclusters. Lemma 5.8 Let K be the truncated K-process of a (2; )-process in Rd with < 1 and d > 2= . For any xed t > 0, let Kih denote the subclusters in K of age h > 0. Fix any 2 ^Md. Then as "2 h!0, E X i Ki" h K" t <_ l21(")"2(d 2= )h1 d=2: Proof: Follow the proof of Lemma 6.3(i) in [25], then use Lemma 5.5(ii). 5.4 Hitting asymptotics Write p"h(x) = Pxf hB"0 > 0g and pK;"h (x) = Pxf Kh B"0 > 0g, where h and Kh denote an h-cluster associated with the superprocess in Rd and its truncated K-process K re- spectively. Recall that dp"h = P df hB"0 > 0g. Write pK"h = pK;"h for convenience. For the functions p"h and pK"h , we have the following basic asymptotic property. Since we do not have a lower bound for Pxf Kh B"0 > 0g in Lemma, this asymptotic property is crucial to us by showing that essentially Pxf hB"0 > 0g and Pxf Kh B"0 > 0g share the same lower bound. Lemma 5.9 As 0 <"2 h!0 with "2= d b0h1+bd!0 for some b0> 0 and b2(0;1=2), a 1h dp"h (aKh ) 1 dpK"h : Proof: We just need to show that a 1h dp"h (aKh ) 1 dpK"h a 1h dp"h !0: 81 By Lemma 5.5(i), we get a 1h dp"h l2(") dph0"d 2= = l2(")"d 2= ; a 1h P1f(Bhb 0 )cg d f hB"0 > 0g l1(")1f(Bhb0 )cg dph"d 2= (5.7) = l1(")"d 2= Z jxj hb ph(x)dx (5.8) l1(")"d 2= hc; (5.9) for some c> 0. 
5.4 Hitting asymptotics

Write $p^\varepsilon_h(x) = P_x\{\eta_h B^\varepsilon_0 > 0\}$ and $p^{K,\varepsilon}_h(x) = P_x\{\eta^K_h B^\varepsilon_0 > 0\}$, where $\eta_h$ and $\eta^K_h$ denote an $h$-cluster associated with the superprocess $\xi$ in $\mathbb{R}^d$ and with its truncated $K$-process $\xi^K$, respectively. Recall that $\lambda^d p^\varepsilon_h = P_{\lambda^d}\{\eta_h B^\varepsilon_0 > 0\}$. Write $p^{K\varepsilon}_h = p^{K,\varepsilon}_h$ for convenience. For the functions $p^\varepsilon_h$ and $p^{K\varepsilon}_h$, we have the following basic asymptotic property. Since we do not have a lower bound for $P_x\{\eta^K_h B^\varepsilon_0 > 0\}$ in Lemma 5.5, this asymptotic property is crucial for us, since it shows that $P_x\{\eta_h B^\varepsilon_0 > 0\}$ and $P_x\{\eta^K_h B^\varepsilon_0 > 0\}$ essentially share the same lower bound.

Lemma 5.9 As $0 < \varepsilon^2 \le h \to 0$ with $\varepsilon^{2/\beta-d}\, b_0\, h^{1+bd} \to 0$ for some $b_0 > 0$ and $b \in (0,1/2)$,
$$a_h^{-1}\,\lambda^d p^\varepsilon_h \;\sim\; (a^K_h)^{-1}\,\lambda^d p^{K\varepsilon}_h.$$

Proof: We just need to show that
$$\frac{a_h^{-1}\,\lambda^d p^\varepsilon_h - (a^K_h)^{-1}\,\lambda^d p^{K\varepsilon}_h}{a_h^{-1}\,\lambda^d p^\varepsilon_h} \;\to\; 0.$$
By Lemma 5.5(i), we get
$$a_h^{-1}\,\lambda^d p^\varepsilon_h \;\gtrsim\; l_2(\varepsilon)\,\lambda^d p_{h'}\,\varepsilon^{d-2/\beta} \;=\; l_2(\varepsilon)\,\varepsilon^{d-2/\beta},$$
$$a_h^{-1}\,P_{1_{(B^{h^b}_0)^c}\lambda^d}\{\eta_h B^\varepsilon_0 > 0\} \;\lesssim\; l_1(\varepsilon)\,\big(1_{(B^{h^b}_0)^c}\lambda^d\big)\, p_h\;\varepsilon^{d-2/\beta} \tag{5.7}$$
$$=\; l_1(\varepsilon)\,\varepsilon^{d-2/\beta} \int_{|x| \ge h^b} p_h(x)\,dx \tag{5.8}$$
$$\lesssim\; l_1(\varepsilon)\,\varepsilon^{d-2/\beta}\,h^{c'} \tag{5.9}$$
for some $c' > 0$. Since $\varepsilon^{2/\beta-d}\, b_0\, h^{1+bd} \to 0$, we get
$$\frac{a_h^{-1}\,P_{1_{(B^{h^b}_0)^c}\lambda^d}\{\eta_h B^\varepsilon_0 > 0\}}{a_h^{-1}\,\lambda^d p^\varepsilon_h} \;\to\; 0.$$
Similarly, we get
$$\frac{(a^K_h)^{-1}\,P_{1_{(B^{h^b}_0)^c}\lambda^d}\{\eta^K_h B^\varepsilon_0 > 0\}}{a_h^{-1}\,\lambda^d p^\varepsilon_h} \;\to\; 0.$$
By Lemma 5.4, it finally suffices to show that
$$\frac{P_{1_{B^{h^b}_0}\lambda^d}\{\xi_h B^\varepsilon_0 > 0\} - P_{1_{B^{h^b}_0}\lambda^d}\{\xi^K_h B^\varepsilon_0 > 0\}}{a_h^{-1}\,\lambda^d p^\varepsilon_h} \;\to\; 0. \tag{5.10}$$
By Theorem 25.22 of [24] and (4.1),
$$\varepsilon^{2/\beta-d}\Big(P_{1_{B^{h^b}_0}\lambda^d}\{\xi_h B^\varepsilon_0 > 0\} - P_{1_{B^{h^b}_0}\lambda^d}\{\xi^K_h B^\varepsilon_0 > 0\}\Big) \;\le\; \varepsilon^{2/\beta-d}\,E_{1_{B^{h^b}_0}\lambda^d}\, N\big([0,h] \times (K,\infty) \times \mathbb{R}^d\big)$$
$$=\; \varepsilon^{2/\beta-d}\,E_{1_{B^{h^b}_0}\lambda^d}\, \hat N\big([0,h] \times (K,\infty) \times \mathbb{R}^d\big) \;\asymp\; \varepsilon^{2/\beta-d}\,E_{1_{B^{h^b}_0}\lambda^d} \int_0^h \|\xi_s\|\,ds \;\asymp\; \varepsilon^{2/\beta-d}\,h^{1+bd} \;\to\; 0.$$

Define normalizing functions $m(\varepsilon)$ by
$$m(\varepsilon) = a^{-1}_{\varepsilon^c}\,\lambda^d p^\varepsilon_{\varepsilon^c} \tag{5.11}$$
with a fixed $c$ satisfying
$$(d - 2/\beta) + (1/2 - d/2)\,c = 0. \tag{5.12}$$
Clearly $c \in (0,2)$.

Lemma 5.10 Fix any bounded, uniformly continuous function $f \ge 0$ on $\mathbb{R}^d$, and write $\tilde m(\varepsilon) = 1/m(\varepsilon)$. As $\varepsilon \to 0$,
$$\Big\|\tilde m(\varepsilon)\,(a^K_{\varepsilon^c})^{-1}\big(p^{K\varepsilon}_{\varepsilon^c} * f\big) - f\Big\| \;\to\; 0.$$
The result holds uniformly over any class of uniformly bounded and equicontinuous functions $f \ge 0$ on $\mathbb{R}^d$.

Proof: By Lemma 5.9, we get $\tilde m(\varepsilon)\,(a^K_{\varepsilon^c})^{-1}\,\lambda^d p^{K\varepsilon}_{\varepsilon^c} \to 1$. Defining $\hat p^{K\varepsilon}_h = p^{K\varepsilon}_h / \lambda^d p^{K\varepsilon}_h$, we then only need to show that $\|\hat p^{K\varepsilon}_h * f - f\| \to 0$. Now follow the proof of Lemma 4.4(i) in [22] and use Lemma 5.5(ii).
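The rest of the argument is largely bookkeeping with the exponents attached to $\varepsilon$ and $h = \varepsilon^c$. As a quick arithmetic check (illustrative only; the pairs $(d,\beta)$ below are arbitrary admissible values), the following Python sketch solves (5.12) for $c$ and verifies the exponent inequalities that are used with this choice of $c$ in the proofs below.

# Exponent bookkeeping for m(eps) = a_{eps^c}^{-1} lambda^d p^eps_{eps^c}:
# c solves (5.12), (d - 2/beta) + (1/2 - d/2) c = 0, and the proofs below
# use the listed inequalities.  The (d, beta) pairs are arbitrary admissible values.
from fractions import Fraction

def exponent_report(d, beta):
    d, beta = Fraction(d), Fraction(beta)
    assert d > 2 / beta, "the standing assumption d > 2/beta must hold"
    c = (d - 2 / beta) / (d / 2 - Fraction(1, 2))       # solves (5.12)
    checks = {
        "c in (0,2)": 0 < c < 2,
        "(d - 2/beta) + (1 - d/2) c > 0": (d - 2 / beta) + (1 - d / 2) * c > 0,
        "-(d - 2/beta)/2 + (d/4) c > 0": -(d - 2 / beta) / 2 + (d / 4) * c > 0,
        "(2/beta - d) + (d/2) c > 0": (2 / beta - d) + (d / 2) * c > 0,
    }
    return c, checks

for d, beta in [(3, 1), (5, Fraction(1, 2)), (7, Fraction(3, 4))]:
    c, checks = exponent_report(d, beta)
    print(f"d={d}, beta={beta}: c={c} ({float(c):.4f})", checks)

In each case $c \in (0,2)$ and all three inequalities hold, as used below.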
Theorem 5.11 Let $\xi$ be a superprocess in $\mathbb{R}^d$ as above. Then for any $t > 0$ and bounded $\mu$, we have as $\varepsilon \to 0$
$$\Big\|\tilde m(\varepsilon)\,P_\mu\{\xi^K_t B^\varepsilon_\cdot > 0\} - e^{-b_K t}\,\mu p_t\Big\| \;\to\; 0, \qquad \Big\|\tilde m(\varepsilon)\,P_\mu\{\xi_t B^\varepsilon_\cdot > 0\} - \mu p_t\Big\| \;\to\; 0.$$

Proof: Writing $s = t - h$ with $h = \varepsilon^c$, we get
$$P_\mu\{\xi^K_t B^\varepsilon_x > 0\} \;\le\; (a^K_h)^{-1}\,E_\mu\big(\xi^K_s\, p^{K\varepsilon}_h\big)(x) \;=\; e^{-b_K s}\,(a^K_h)^{-1}\,\big(\mu p_s\, p^{K\varepsilon}_h\big)(x) \;\lesssim\; e^{-b_K s}\, m(\varepsilon)\,\mu p_s(x) \;\lesssim\; e^{-b_K t}\, m(\varepsilon)\,\mu p_t(x).$$
Similarly, $P_\mu\{\xi_t B^\varepsilon_x > 0\} \le a_h^{-1}\,E_\mu\big(\xi_s\, p^\varepsilon_h\big)(x)$, and $\big\|\tilde m(\varepsilon)\,a_h^{-1}\,E_\mu(\xi_s\, p^\varepsilon_h) - \mu p_t\big\| \to 0$. Finally, the difference $P_\mu\{\xi_t B^\varepsilon_x > 0\} - P_\mu\{\xi^K_t B^\varepsilon_x > 0\}$ is controlled by
$$\big\|e^{-b_K t}\,\mu p_t - \mu p_t\big\| \;\to\; 0$$
as $K \to \infty$, since $b_K \to 0$.

5.5 Lebesgue approximations

As in Section 5 of [22], we begin here with the Lebesgue approximation for $\xi^K$, the truncated $K$-process of $\xi$. The Lebesgue approximation for $\xi$ then follows immediately by Lemma 5.2. Write $\tilde m(\varepsilon) = 1/m(\varepsilon)$ for convenience, where $m(\varepsilon)$ is defined in (5.11). Recall that $\xi^{K\varepsilon}_t = (\xi^K_t)^\varepsilon$, the $\varepsilon$-neighborhood measure of $\xi^K_t$. For random measures $\nu_n$ and $\nu$ on $\mathbb{R}^d$, $\nu_n \xrightarrow{w} \nu$ in $L^1$ means that $\nu_n f \to \nu f$ in $L^1$ for all $f$ in $C^d_b$.

Theorem 5.12 Let $\xi^K$ be the truncated $K$-process of a superprocess $\xi$ in $\mathbb{R}^d$ satisfying assumptions (A1) and (A2) with $\beta < 1$ and $d > 2/\beta$. Fix any $\mu \in \hat{\mathcal{M}}_d$ and $t > 0$. Then under $P_\mu$, we have as $\varepsilon \to 0$:
$$\tilde m(\varepsilon)\,\xi^{K\varepsilon}_t \;\xrightarrow{w}\; \xi^K_t \quad \text{a.s. and in } L^1.$$

Proof: Fix any $f \in C^d_K$. We first prove that $\tilde m(\varepsilon)\,\xi^{K\varepsilon}_t f \to \xi^K_t f$ a.s. as $\varepsilon \to 0$. For this, it is enough to show that every sequence $\varepsilon_n \to 0$ has a subsequence (still denoted by $\varepsilon_n$) such that $\tilde m(\varepsilon_n)\,\xi^{K\varepsilon_n}_t f \to \xi^K_t f$ a.s. To do this we fix an $r \in (0,1)$ and, given a sequence $\varepsilon_n \to 0$, pick a subsequence satisfying $\varepsilon_n \le r^n$.

We follow the proof of Lemma 5.1 in [22]. Write $\eta^{Ki}_h$ for the subclusters of $\xi^K_t$ of age $h$. Since the ancestors of $\xi^K_t$ at time $s = t - h$ form a Cox process directed by $\xi^K_s / a^K_h$, Lemma 5.7(i) yields
$$E\Big[\sum_i \eta^{Ki\varepsilon}_h f \,\Big|\, \xi^K_s\Big] = (a^K_h)^{-1}\,\xi^K_s\big(p^{K\varepsilon}_h * f\big),$$
and so by Lemma 5.7(ii),
$$E\Big|\sum_i \eta^{Ki\varepsilon}_h f - (a^K_h)^{-1}\,\xi^K_s\big(p^{K\varepsilon}_h * f\big)\Big|^2 = E\,\mathrm{Var}\Big[\sum_i \eta^{Ki\varepsilon}_h f \,\Big|\, \xi^K_s\Big] \;\lesssim\; l_1(\varepsilon)\,a^K_h\,\varepsilon^{d-2/\beta}\,h^{d/2}\,\|f\|^2\,E\big\|\xi^K_s / a^K_h\big\| \;\le\; l_1(\varepsilon)\,\varepsilon^{d-2/\beta}\,h^{d/2}\,\|f\|^2\,\|\mu\|,$$
where the last inequality follows from $E_\mu\|\xi^K_s\| \le \|\mu\|$. Combining this with Lemma 5.8 gives
$$E\Big|\xi^{K\varepsilon}_t f - (a^K_h)^{-1}\,\xi^K_s\big(p^{K\varepsilon}_h * f\big)\Big| \;\le\; E\Big|\xi^{K\varepsilon}_t f - \sum_i \eta^{Ki\varepsilon}_h f\Big| + E\Big|\sum_i \eta^{Ki\varepsilon}_h f - (a^K_h)^{-1}\,\xi^K_s\big(p^{K\varepsilon}_h * f\big)\Big|$$
$$\lesssim\; l_1^2(\varepsilon)\,\varepsilon^{2(d-2/\beta)}\,h^{1-d/2}\,\|f\| + l_1^{1/2}(\varepsilon)\,\varepsilon^{(d-2/\beta)/2}\,h^{d/4}\,\|f\| \;=\; \varepsilon^{d-2/\beta}\Big(l_1^2(\varepsilon)\,\varepsilon^{d-2/\beta}\,h^{1-d/2} + l_1^{1/2}(\varepsilon)\,\varepsilon^{-(d-2/\beta)/2}\,h^{d/4}\Big)\|f\|.$$
Taking $h_n = \varepsilon_n^c$, where $c$ is defined in (5.12), and writing $s_n = t - h_n = t - \varepsilon_n^c$, we obtain
$$E \sum_n \tilde m(\varepsilon_n)\Big|\xi^{K\varepsilon_n}_t f - (a^K_{h_n})^{-1}\,\xi^K_{s_n}\big(p^{K\varepsilon_n}_{h_n} * f\big)\Big| \;\lesssim\; \sum_n \Big(l_2(r^n)\,l_1^2(r^n)\,r^{[(d-2/\beta)+(1-d/2)c]\,n} + l_2(r^n)\,l_1^{1/2}(r^n)\,r^{[-(d-2/\beta)/2+(d/4)c]\,n}\Big)\|f\| < \infty,$$
since $(d-2/\beta) + (1-d/2)\,c > 0$ and $-(d-2/\beta)/2 + (d/4)\,c > 0$ by (5.12). Note that in the previous inequality we also used the fact that the chosen subsequence satisfies $\varepsilon_n \le r^n$. The inequality above for the expectations clearly implies
$$\tilde m(\varepsilon_n)\Big|\xi^{K\varepsilon_n}_t f - (a^K_{h_n})^{-1}\,\xi^K_{s_n}\big(p^{K\varepsilon_n}_{h_n} * f\big)\Big| \;\to\; 0 \quad \text{a.s. } P_\mu. \tag{5.13}$$
Now we write
$$\big|\tilde m(\varepsilon)\,\xi^{K\varepsilon}_t f - \xi^K_t f\big| \;\le\; \tilde m(\varepsilon)\Big|\xi^{K\varepsilon}_t f - (a^K_h)^{-1}\,\xi^K_s\big(p^{K\varepsilon}_h * f\big)\Big| + \big|\xi^K_s f - \xi^K_t f\big| + \big\|\xi^K_s\big\|\,\Big\|\tilde m(\varepsilon)\,(a^K_h)^{-1}\big(p^{K\varepsilon}_h * f\big) - f\Big\|.$$
For the last term, we first fix $b = 1/2 - 1/d$. Noting that by (5.12),
$$(2/\beta - d) + (1+bd)\,c = (2/\beta - d) + (d/2)\,c > 0,$$
we get by Lemma 5.10
$$\Big\|\tilde m(\varepsilon)\,(a^K_h)^{-1}\big(p^{K\varepsilon}_h * f\big) - f\Big\| \;\to\; 0$$
along the sequence $(r^n)$. Using (5.13) and the a.s. weak continuity of $\xi^K$ at the fixed time $t$, we see that the right-hand side tends to 0 a.s. as $n \to \infty$, which implies $\tilde m(\varepsilon)\,\xi^{K\varepsilon}_t f \to \xi^K_t f$ a.s. as $\varepsilon \to 0$ along the sequence $(r^n)$, for any fixed $r \in (0,1)$. Since this holds simultaneously, outside a fixed null set, for all rational $r \in (0,1)$, the a.s. convergence extends by Lemma 2.3 in [25] to the entire interval $(0,1)$. Applying this result to a countable, convergence-determining class of functions $f$ (cf. Lemma 3.2.1 in [5]), we obtain the required a.s. vague convergence. Since $\mu$ is finite, the $(2,\beta)$-process $\xi_t$ has a.s. compact support (cf. Theorem 9.3.2.2 of [5] and the proof of Theorem 1.2 in [6]). By Lemma 4.2, $\xi^K_t$ also has a.s. compact support, and so the a.s. convergence remains valid in the weak sense.

Now we may prove our main result, the Lebesgue approximation of superprocesses with a regularly varying branching mechanism. Again, we write $\tilde m(\varepsilon) = 1/m(\varepsilon)$ for convenience, where $m(\varepsilon)$ is defined in (5.11). Also recall that $\xi^\varepsilon_t = (\xi_t)^\varepsilon$, the $\varepsilon$-neighborhood measure of $\xi_t$. For random measures $\nu_n$ and $\nu$ on $\mathbb{R}^d$, $\nu_n \xrightarrow{w} \nu$ in $L^1$ means that $\nu_n f \to \nu f$ in $L^1$ for all $f$ in $C^d_b$.

Theorem 5.13 Let the superprocess $\xi$ in $\mathbb{R}^d$ satisfy assumptions (A1) and (A2) with $\beta < 1$ and $d > 2/\beta$. Fix any $\mu \in \hat{\mathcal{M}}_d$ and $t > 0$. Then under $P_\mu$, we have as $\varepsilon \to 0$:
$$\tilde m(\varepsilon)\,\xi^\varepsilon_t \;\xrightarrow{w}\; \xi_t \quad \text{a.s. and in } L^1.$$

Proof: By Theorem 5.12 and Lemma 5.2, we get as $\varepsilon \to 0$
$$\tilde m(\varepsilon)\,\xi^\varepsilon_t \;\xrightarrow{w}\; \xi_t \quad \text{a.s.}$$
To prove the convergence in $L^1$, we note that for any $f \in C^d_b$,
$$\tilde m(\varepsilon)\,E_\mu\,\xi^\varepsilon_t f = \tilde m(\varepsilon) \int P_\mu\{\xi_t B^\varepsilon_x > 0\}\,f(x)\,dx \;\to\; \int \mu p_t(x)\,f(x)\,dx = E_\mu\,\xi_t f \tag{5.14}$$
by Theorem 5.11. Combining this with the a.s. convergence under $P_\mu$ and using Proposition 4.12 in [24], we obtain $E_\mu\big|\tilde m(\varepsilon)\,\xi^\varepsilon_t f - \xi_t f\big| \to 0$.

If $\xi$ is a $(2,1)$-process in $\mathbb{R}^d$ with $d \ge 3$, then $a_t = t$. By (4) in [25], we get
$$m(\varepsilon) = \varepsilon^{-c}\,\lambda^d p^\varepsilon_{\varepsilon^c} \approx \varepsilon^{-c}\,c_d\,\varepsilon^{d-2}\,\varepsilon^c = c_d\,\varepsilon^{d-2}.$$
So we recover the Lebesgue approximation of $(2,1)$-processes, that is,
$$\tilde c_d\,\varepsilon^{2-d}\,\xi^\varepsilon_t \;\xrightarrow{w}\; \xi_t \quad \text{a.s. and in } L^1.$$
Similarly, if $\xi$ is a $(2,\beta)$-process in $\mathbb{R}^d$ with $\beta < 1$ and $d > 2/\beta$, then $a_t = (\beta t)^{1/\beta}$. By (9) in [22], we get
$$m(\varepsilon) = (\beta \varepsilon^c)^{-1/\beta}\,\lambda^d p^\varepsilon_{\varepsilon^c} \approx c_{\beta,d}\,\varepsilon^{d-2/\beta}.$$
Again, we recover the Lebesgue approximation of $(2,\beta)$-processes, that is,
$$\tilde c_{\beta,d}\,\varepsilon^{2/\beta-d}\,\xi^\varepsilon_t \;\xrightarrow{w}\; \xi_t \quad \text{a.s. and in } L^1.$$

Bibliography

[1] Bertoin, J., Le Gall, J.-F., Le Jan, Y. (1999). Spatial branching processes and subordination. Canad. J. Math. 49, 24–54.
[2] Bass, R.F., Levin, D.A. (2002). Transition probabilities for symmetric jump processes. Trans. Amer. Math. Soc. 354, 2933–2953.
[3] Dawson, D.A. (1975). Stochastic evolution equations and related measure processes. J. Multivariate Anal. 5, 1–52.
[4] Dawson, D.A. (1992). Infinitely divisible random measures and superprocesses. In: Stochastic Analysis and Related Topics. Progr. Probab. 31, 1–129. Birkhäuser, Boston, MA.
[5] Dawson, D.A. (1993). Measure-valued Markov processes. In: École d'Été de Probabilités de Saint-Flour XXI–1991. Lect. Notes in Math. 1541, 1–260. Springer, Berlin.
[6] Dawson, D.A., Iscoe, I., Perkins, E.A. (1989). Super-Brownian motion: path properties and hitting probabilities. Probab. Th. Rel. Fields 83, 135–205.
[7] Dawson, D.A., Perkins, E.A. (1991). Historical processes. Mem. Amer. Math. Soc. 93, #454.
[8] Dawson, D.
A., Vinogradov, V. (1994). Almost-sure path properties of (2, d, β)-superprocesses. Stochastic Process. Appl. 51, 221–258.
[9] Delmas, J.-F. (1999). Path properties of superprocesses with a general branching mechanism. Ann. Probab. 27, 1099–1134.
[10] Delmas, J.-F. (1999). Some properties of the range of super-Brownian motion. Probab. Th. Rel. Fields 114, 505–547.
[11] Duquesne, T. (2009). The packing measure of the range of super-Brownian motion. Ann. Probab. 37, 2431–2458.
[12] Duquesne, T., Le Gall, J.-F. (2002). Random trees, Lévy processes and spatial branching processes. Astérisque 281, vi+147.
[13] Dynkin, E.B. (1991). Branching particle systems and superprocesses. Ann. Probab. 19, 1157–1194.
[14] Dynkin, E.B. (1991). Path processes and historical superprocesses. Probab. Th. Rel. Fields 90, 1–36.
[15] Dynkin, E.B. (1994). An Introduction to Branching Measure-Valued Processes. CRM Monograph Series 6. AMS, Providence, RI.
[16] Dynkin, E.B. (2002). Diffusions, Superdiffusions and Partial Differential Equations. Colloquium Publications 50. AMS, Providence, RI.
[17] Etheridge, A.M. (2000). An Introduction to Superprocesses. University Lecture Series 20. AMS, Providence, RI.
[18] Evans, S.N., Perkins, E. (1991). Absolute continuity results for superprocesses with some applications. Trans. Amer. Math. Soc. 325, 661–681.
[19] Falconer, K. (2007). Fractal Geometry: Mathematical Foundations and Applications. Wiley.
[20] Folland, G.B. (1999). Real Analysis: Modern Techniques and Their Applications. John Wiley & Sons, New York.
[21] Gmira, A., Veron, L. (1984). Large time behaviour of the solutions of a semilinear parabolic equation in R^N. J. Differ. Equations 53, 258–276.
[22] He, X. (2013). Lebesgue approximation of (2, β)-superprocesses. Stochastic Process. Appl. 123, 1802–1819.
[23] He, X. (2013). Lebesgue approximation of superprocesses with a regularly varying branching mechanism. Unpublished manuscript.
[24] Kallenberg, O. (2002). Foundations of Modern Probability, 2nd ed. Springer, New York.
[25] Kallenberg, O. (2008). Some local approximations of Dawson–Watanabe superprocesses. Ann. Probab. 36, 2176–2214.
[26] Kallenberg, O. (2011). Iterated Palm conditioning and some Slivnyak-type theorems for Cox and cluster processes. J. Theor. Probab. 24, 875–893.
[27] Kallenberg, O. (2013). Local conditioning in Dawson–Watanabe superprocesses. Ann. Probab. 41, 385–443.
[28] Kingman, J.F.C. (1973). An intrinsic description of local time. J. London Math. Soc. (2) 6, 725–731.
[29] Klenke, A. (1998). Clustering and invariant measures for spatial branching models with infinite variance. Ann. Probab. 26, 1057–1087.
[30] Le Gall, J.-F. (1991). Brownian excursions, trees and measure-valued branching processes. Ann. Probab. 19, 1399–1439.
[31] Le Gall, J.-F. (1994). A lemma on super-Brownian motion with some applications. In: The Dynkin Festschrift. Progr. Probab. 24, 237–251. Birkhäuser, Boston, MA.
[32] Le Gall, J.-F. (1999). Spatial Branching Processes, Random Snakes and Partial Differential Equations. Lectures in Mathematics, ETH Zürich. Birkhäuser, Basel.
[33] Le Gall, J.-F., Perkins, E. (1995). The Hausdorff measure of the support of two-dimensional super-Brownian motion. Ann. Probab. 23, 1719–1747.
[34] Le Gall, J.-F., Perkins, E.A., Taylor, S.J. (1995). The packing measure of the support of super-Brownian motion. Stochastic Process. Appl. 59, 1–20.
[35] Li, Z. (2011). Measure-Valued Branching Markov Processes. Springer, New York.
[36] Mörters, P., Peres, Y. (2010). Brownian Motion.
Cambridge University Press.
[37] Mytnik, L., Perkins, E. (2003). Regularity and irregularity of (1+β)-stable super-Brownian motion. Ann. Probab. 31, 1413–1440.
[38] Mytnik, L., Villa, J. (2007). Self-intersection local time of (α, d, β)-superprocess. Ann. Inst. Henri Poincaré Probab. Stat. 43, 481–507.
[39] Pazy, A. (1983). Semigroups of Linear Operators and Applications to Partial Differential Equations. Springer, New York.
[40] Perkins, E. (1990). Polar sets and multiple points for super-Brownian motion. Ann. Probab. 18, 453–491.
[41] Perkins, E. (1994). The strong Markov property of the support of super-Brownian motion. In: The Dynkin Festschrift. Progr. Probab. 24, 307–326. Birkhäuser, Boston, MA.
[42] Perkins, E. (2002). Dawson–Watanabe superprocesses and measure-valued diffusions. In: École d'Été de Probabilités de Saint-Flour XXIX–1999. Lect. Notes in Math. 1781, 125–329. Springer, Berlin.
[43] Perkins, E. (2004). Super-Brownian motion and critical spatial stochastic systems. Bull. Can. Math. Soc. 47, 280–297.
[44] Revuz, D., Yor, M. (1999). Continuous Martingales and Brownian Motion. Springer, Berlin.
[45] Sato, K.-I. (1999). Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press.
[46] Slade, G. (2002). Scaling limits and super-Brownian motion. Notices Amer. Math. Soc. 49, 1056–1067.
[47] Stein, E.M., Shakarchi, R. (2005). Real Analysis: Measure Theory, Integration, and Hilbert Spaces. Princeton Lectures in Analysis 3. Princeton University Press.
[48] Tribe, R. (1994). A representation for super Brownian motion. Stochastic Process. Appl. 51, 207–219.
[49] Watanabe, S. (1968). A limit theorem of branching processes and continuous state branching processes. J. Math. Kyoto Univ. 8, 141–167.