LEBESGUE APPROXIMATION OF SUPERPROCESSES by Xin He A dissertation submitted to the Graduate Faculty of Auburn University in partial ful llment of the requirements for the Degree of Doctor of Philosophy Auburn, Alabama August 03, 2013 Keywords: Super-Brownian motion, measure-valued branching processes, deterministic distribution properties Copyright 2013 by Xin He Approved by Olav Kallenberg, Chair, Professor of Mathematics Ming Liao, Professor of Mathematics Jerzy Szulga, Professor of Mathematics Erkan Nane, Associate Professor of Mathematics Abstract Superprocesses are certain measure-valued Markov processes, whose distributions can be characterized by two components: the branching mechanism and the spatial motion. It is well known that some basic superprocesses are scaling limits of various random spatially distributed systems near criticality. We consider the Lebesgue approximation of superprocesses. The Lebesgue approxima- tion means that the processes at a xed time can be approximated by suitably normalized restrictions of Lebesgue measure to the small neighborhoods of their support. From this, we see that the processes distribute their mass over their support in a deterministic and \uniform" manner. It is known that the Lebesgue approximation holds for the most basic Dawson{Watanabe superprocesses but fails for certain superprocesses with discontinuous spatial motion. In this dissertation we rst prove that the Lebesgue approximation holds for superpro- cesses with Brownian spatial motion and a stable branching mechanism. Then we generalize the Lebesgue approximation even further to superprocesses with Brownian spatial motion and a regularly varying branching mechanism. We believe that the Lebesgue approxima- tion holds for superprocesses with Brownian spatial motion and any \reasonable" branching mechanism. Our present results may be regarded as some progress towards a complete proof of this very general conjecture. ii Acknowledgments I owe my deepest gratitude to my advisor Dr. Olav Kallenberg, who has put lots of his time and e ort in trying to develop me into a proper researcher in probability. His taste, style, and work ethic have all in uenced me immensely, and they will continue to in uence me in the future. I also sincerely thank Dr. Ming Liao, Dr. Jerzy Szulga, and Dr. Erkan Nane for serving in my committee and providing me much help when needed. Finally, I am grateful for the moral support of my family: My parents Anqing He and Ruizhen Hao, my sister Ni He, and my wife Yajie Chu. I am extremely lucky to have them and I simply can not imagine my life without them. iii Table of Contents Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii List of Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 A short introduction to superprocesses . . . . . . . . . . . . . . . . . . . . . 1 1.2 Summary of contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2 Some Basics of Superprocesses . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.1 Moment measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.2 Cluster representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.3 Historical superprocesses and random snakes . . . . . . . . . . . . . . . 
. . . 13 2.4 Hausdor dimensions and Hausdor measures . . . . . . . . . . . . . . . . . 16 2.5 Lebesgue approximations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 3 Lebesgue Approximation of Dawson-Watanabe Superprocesses . . . . . . . . . . 26 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 3.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 3.3 Lebesgue approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 3.4 Proofs of lemmas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 4 Lebesgue Approximation of (2; )-Superprocesses . . . . . . . . . . . . . . . . . 44 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 4.2 Truncated superprocesses and local niteness . . . . . . . . . . . . . . . . . . 46 4.3 Hitting bounds and neighborhood measures . . . . . . . . . . . . . . . . . . 54 4.4 Hitting asymptotics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 4.5 Lebesgue approximations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 iv 5 Lebesgue Approximation of Superprocesses with a Regularly Varying Branching Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 5.2 Truncation of superprocesses . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 5.3 Hitting bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 5.4 Hitting asymptotics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 5.5 Lebesgue approximations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 v List of Notation x Dirac measure at x =_ equality up to a constant factor ^Md space of nite measures on Rd d Lebesgue measure on Rd <_ inequality up to a constant factor f integral of function f with respect to measure " neighborhood measure of , de ned as the restriction of Lebesgue measure d to the "-neighborhood of supp A closure of set A kfk supremum norm of function f k k total variation of measure = (v) branching mechanism _ the combination of <_ and >_ = ( t) superprocess or discrete spatial branching process A" "-neighborhood of set A, A" =fx : d(x;A) <"g Ac compliment of set A Brx an open ball around x of radius r vi f g f g!0 f g f=g!1 Md space of - nite measures on Rd v! vague convergence of measures w! weak convergence of measures vii Chapter 1 Introduction 1.1 A short introduction to superprocesses In this section we give a short introduction to superprocesses. Three characterizations of superprocesses will be given. They are the Laplace functional approach, the weak conver- gence approach, and the martingale problem approach. Superprocesses were introduced by Watanabe [49] in 1968 and Dawson [3] in 1975, and have been studied extensively ever since. General surveys of superprocesses include the following excellent monographs and lecture notes: Dawson [4, 5], Dynkin [15, 16], Etheridge [17], Le Gall [32], Li [35], and Perkins [42]. Two extremely informative yet concise and very accessible introductions of superprocesses are Perkins [43] and Slade [46]. First let us explain the two de ning components of a superprocess: the branching mech- anism and the spatial motion. 
We begin with branching processes, which contain only one component of superprocesses: the branching mechanism. Galton-Watson processes are discrete branching processes. They describe the evolution in discrete time of a population of individuals who reproduce according to an offspring distribution, which is a probability measure on the nonnegative integers with expectation 1 (we only consider the critical case in this introduction). The distribution of a Galton-Watson process is determined by this offspring distribution. Continuous-state branching processes are continuous analogues of the Galton-Watson branching processes. Roughly speaking, they describe the evolution in continuous time of a "population" with values in the positive real line R₊. The "population" consists of uncountably many "individuals", if its value is not 0. The distribution of a continuous-state branching process is determined by a function of the following type (again, we only consider the critical case in this introduction, so there is no drift term here)

    ψ(v) = a v² + ∫₀^∞ (e^{−rv} − 1 + rv) ν(dr),    (1.1)

where a ≥ 0 and ν is a σ-finite measure on (0,∞) such that ∫₀^∞ (r ∧ r²) ν(dr) < ∞. This function ψ is called the branching mechanism. Continuous-state branching processes may also be obtained as weak limits of rescaled Galton-Watson processes, see (1.7). This is closely related to the weak convergence approach to superprocesses, see (1.8).

Spatial branching processes are obtained by combining the branching phenomenon with a spatial motion, which is usually given by a Markov process X. In the discrete setting, the branching phenomenon is a Galton-Watson process, and the individuals move independently in space according to the law of X. More precisely, when an individual dies at position x, her children begin to move from the initial point x, and they move in space independently according to the law of X. Writing Y¹_t, Y²_t, … for the positions of all individuals alive at time t, we may define

    ξ_t = Σ_i δ_{Y^i_t},    (1.2)

where δ_y denotes the Dirac measure at y. The process ξ = (ξ_t, t ≥ 0) is the spatial branching process corresponding to the branching phenomenon of a Galton-Watson process and the spatial motion X. Note that this is a measure-valued process, whose value at time t records the positions of all individuals alive at time t.

In the continuous setting, the branching phenomenon is a continuous-state branching process with branching mechanism ψ. The construction of the spatial motions is harder, and here we proceed only heuristically. For mathematical support of these heuristics, refer to the weak convergence approach later in this section (see (1.7) and (1.8)), the cluster representation in Section 2.2, and historical superprocesses and random snakes in Section 2.3. Here we let the "individuals" move independently in space according to the law of a Markov process X. Thus when an "individual" dies at the position x, her "children" begin to move from the initial point x, and they move in space independently according to the law of X. Again we get a measure-valued process ξ = (ξ_t, t ≥ 0), whose value at time t records the positions of all "individuals" alive at time t. This measure-valued process ξ = (ξ_t) is called the (X,ψ)-superprocess (or (X,ψ)-process, for short).

Superprocesses are measure-valued Markov processes. We first use the Laplace functional approach to characterize their distributions. For an (X,ψ)-process ξ on R^d, the spatial motion X is a Markov process in R^d. Use μf to denote the integral of the function f with respect to the measure μ.
Write P_μ(ξ ∈ ·) for the distribution of the process ξ with initial measure μ, and E_μ for the expectation corresponding to P_μ. The Laplace functional E_μ exp(−ξ_t f) satisfies

    E_μ[exp(−ξ_t f) | ξ_s] = exp(−ξ_s v_{t−s}),    (1.3)

where v_t(x), t ≥ 0, x ∈ R^d, is the unique nonnegative solution of the integral equation

    v_t(x) + Π_x ∫₀^t ψ(v_{t−s}(X_s)) ds = Π_x f(X_t).    (1.4)

Here we write Π_x(X ∈ ·) for the distribution of the process X starting from x. If X is a Feller process in R^d with generator L, the integral equation (1.4) is the integral form of the following PDE, the so-called evolution equation

    v̇ = Lv − ψ(v)    (1.5)

with initial condition v₀ = f. More explicitly, the PDE (1.5) means

    ∂v_t/∂t (x) = (L v_t)(x) − ψ(v_t(x)).

For the equivalence of the integral equation (1.4) and the differential equation (1.5), see Section 7.1 in [35]. If X is a rotation invariant (or spherically symmetric, or isotropic) α-stable Lévy process in R^d for some α ∈ (0,2] (see Definition 14.12 and Theorem 14.14 in [45]) and ψ(v) = c v^{1+β} for some constant c > 0 and some β ∈ (0,1], we get a superprocess corresponding to the PDE

    v̇ = ½ Δ_α v − c v^{1+β},

where ½ Δ_α is the generator of the rotation invariant α-stable process X (see Theorem 19.10 in [24]), and Δ_α = −(−Δ)^{α/2} is the fractional Laplacian (Δ₂ = Δ is the Laplacian, see Section 2.6 in [39]). Taking c = 1 in the above PDE, we get a superprocess corresponding to the PDE

    v̇ = ½ Δ_α v − v^{1+β}.    (1.6)

We call it the (α,β)-superprocess ((α,β)-process for short). For the most basic and most important superprocess, we take α = 2 and β = 1 to get a (2,1)-process, which is often called the Dawson–Watanabe superprocess (DW-process for short). Clearly a DW-process has Brownian spatial motion and branching mechanism ψ(v) = v². We may abuse the notation further by referring to (α,ψ)-processes and (X,β)-processes. Specifically, an (α,ψ)-process has rotation invariant α-stable spatial motion and branching mechanism ψ, and an (X,β)-process has spatial motion X and branching mechanism ψ(v) = v^{1+β}.

Next we move to the weak convergence approach, which is the most intuitive way to define superprocesses. Just as continuous-state branching processes may be obtained as weak limits of rescaled Galton-Watson processes (see (1.7)), superprocesses can be obtained as weak limits of rescaled discrete spatial branching processes (see (1.8)). Recall that we can get intuition about Brownian motion from rescaled random walks; similarly, here we may get some intuition about superprocesses from rescaled discrete spatial branching processes. We consider a sequence N^k, k ≥ 1, of Galton-Watson processes such that as k → ∞,

    (a_k^{−1} N^k_{[kt]}, t ≥ 0) →_fd (Z_t, t ≥ 0),    (1.7)

where the constants a_k ↑ ∞, Z is a continuous-state branching process with branching mechanism ψ, and the symbol →_fd means weak convergence of finite-dimensional marginals. Then, according to (1.2), we consider a sequence ξ^k, k ≥ 1, of spatial branching processes corresponding to the Galton-Watson processes N^k, k ≥ 1, and the spatial motion X. Clearly ξ^k_t is a random element with values in the space of finite measures on R^d, equipped with the topology of weak convergence. Now, according to (1.7), we consider the sequence of rescaled spatial branching processes a_k^{−1} ξ^k_{[k·]}, k ≥ 1. Suppose that the initial measures converge as k → ∞ (→_w denotes weak convergence):

    a_k^{−1} ξ^k_0 →_w μ,

where μ is a finite measure on R^d. Finally, under adequate regularity assumptions on the spatial motion X, there exists a measure-valued Markov process ξ such that

    (a_k^{−1} ξ^k_{[kt]}, t ≥ 0) →_fd (ξ_t, t ≥ 0),    (1.8)

where ξ is an (X,ψ)-process with initial measure μ.
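To build some intuition for (1.2), (1.7), and (1.8), the following minimal simulation sketch (an illustration added here, not part of the construction above) generates a discrete spatial branching process with a critical binary offspring law and Gaussian displacements, and tracks the normalized total mass. The binary offspring law, the generation length 1/k, and the normalization a_k = k are illustrative assumptions; any critical offspring law with finite variance would serve equally well.

import numpy as np

rng = np.random.default_rng(0)

def spatial_branching(k, t=1.0, d=2):
    """Critical binary branching random walk started from k particles at 0.
    Each generation lasts 1/k time units; every particle makes a Gaussian
    step of variance 1/k and then leaves 0 or 2 children with probability
    1/2 each (an illustrative choice of critical offspring law).
    Returns the particle positions alive at time t."""
    dt = 1.0 / k
    pos = np.zeros((k, d))
    for _ in range(int(t * k)):
        if len(pos) == 0:
            break
        pos = pos + rng.normal(scale=np.sqrt(dt), size=pos.shape)
        offspring = 2 * rng.integers(0, 2, size=len(pos))   # 0 or 2 children
        pos = np.repeat(pos, offspring, axis=0)
    return pos

# The normalized empirical measure k^{-1} sum_i delta_{Y^i_t} plays the role
# of a_k^{-1} xi^k_t in (1.8) with a_k = k; its total mass is the rescaled
# Galton-Watson process of (1.7).
for k in (100, 400, 1600):
    pos = spatial_branching(k)
    print(f"k = {k:5d}   normalized mass at t = 1: {len(pos) / k:.3f}")

As k grows, the law of the normalized mass stabilizes (its limit is a critical continuous-state branching process with ψ(v) proportional to v²), while the positions of the surviving particles approximate a sample from the limiting measure ξ_t.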
Finally, superprocesses can also be characterized as solutions to martingale problems. Chapter 7 in [35] is an excellent reference on martingale problems of very general superpro- cesses. We rst discuss a martingale problem of (X;1)-processes, where X is a Feller process in Rd with generator L. Write ^Md for the space of nite measures on Rd. Then write (D([0;1); ^Md); t;Ft) for the space of rcll ^Md-valued paths, the coordinate process, and the canonical completed right continuous ltration. For any f2D(L) (domain of generator L), de ne the process Mt(f) by Mt(f) = tf 0f Z t 0 s(Lf)ds: (1.9) 5 For any 2 ^Md, useL to denote the distribution of an (X;1)-process with initial measure . This is the unique distribution on F = (St 0Ft) such that the coordinate process satis es the following martingale problem: 0 = , and for any f2D(L), the process Mt(f) de ned in (1.9) is a continuous martingale with quadratic variation process [M(f)]t = Z t 0 s(f2)ds: For a (X; )-process, the corresponding martingale is not continuous in general. In this case, we may split the martingale into two parts: the continuous martingale Mct (f) and the purely discontinuous martingale Mdt (f) (see Theorem 26.14 in [24]). Then we write Mt(f) = Mct (f) +Mdt (f) = tf 0f Z t 0 s(Lf)ds; where Mct (f) is a continuous martingale with quadratic variation process [Mc(f)]t = Z t 0 s(af2)ds; (1.10) andMdt (f) is a purely discontinuous martingale, which can be de ned through a compensated random measure relating to the jumps of . For details, see Section 7.2 in [35]. Note that the jumps of are related to the measure in the branching mechanisam of (1.1), not the jumps of the spatial motion X (see Section 2.6 in [17]). We may also note that the continuous martingale Mct (f) is related to the term av2 in the branching mechanisam of (1.1) through its quadratic variation process [Mc(f)]t in (1.10). 6 1.2 Summary of contents The purpose of this dissertation is to discuss the Lebesgue approximation of superpro- cesses in details. In Section 1 we discussed the de nitions of superprocesses. Here we give a summary of the contents of the following chapters. In Chapter 2, we discuss some basic ingredients of superprocesses in the rst three sections, which are crucial for the Lebesgue approximation of superprocesses. Then we discuss the background of Lebesgue approximation and some related known results in the last two sections. In Section 1 we discuss the rst moment measure E t and the second moment measure E 2t . In particular, the second moment measure does not exist in general, which causes a real di culty for generalizing certain results. In Section 2 we discuss the very important cluster representation of superprocesses, which contains partial information of the whole genealogical evolution underlying superprocesses. This cluster representation is transparent in the discrete setting, however in the continuous setting it is not easy at all to obtain it rigorously. In Section 3 we discuss two approaches to encode the genealogical information and to obtain the cluster representation. They are Historical superprocesses approach and random snakes approach. In Section 4 we discuss some classical results about the Hausdor dimensions and Hausdor measures of superprocesses. The point is that the Hausdor measure approach is a more traditional, more successful way to do what the Lebesgue approximation approach tries to do: Construct nontrivial measures on some random null sets. 
Finally in Section 5 we discuss basic ideas of Lebesgue approximation and review almost all known Lebesgue approximation results. At the end of this section we also discuss some related open problems. In Chapter 3, we discuss the Lebesgue approximation of Dawson-Watanabe superpro- cesses of dimension d 3, which is the most basic and most transparent case. This chapter is based on Kallenberg?s proof of Lebesgue approximation of DW-processes of dimension d 3 in [25], with some technical simpli cations. Note that Tribe rst proved this result in [48]. Extra e orts haae been made to explain Kallenberg?s approach clearly and to make it more 7 accessible. In Section 1 we explain some crucial components in the proof and review some terminology and notation. In Section 2 we rst explain the crucial ideas about cluster rep- resentations, then state several lemmas which will not be used directly in the main proof of Lebesgue approximation, including the important upper bound of the hitting multiplicities. In Section 3 we state and prove the Lebesgue approximation for DW-processes of dimensions d 3. In order to do so, we list several lemmas that are needed in the main proof. Finally, in Section 4, we prove all the lemmas in this chapter. We suggest that the reader read the rst three sections in the linear order, then, when need arises, read the proofs of some lemmas in Section 4. In Chapter 4, we discuss the Lebesgue approximation of (2; )-superprocesses of di- mension d > 2= . This chapter is based on my 2013 paper [22]. In Section 1 we explain the additional di culties for the Lebesgue approximation of (2; )-processes and review our general approach, which overcomes these di culties. In Section 2 we develop further a trun- cation of ( ; )-processes from [38]. We also characterize the local niteness of any ( ; )- superprocess, which can be used to extend certain results to some superprocesses with - nite initial measures. In Section 3, we develop some lemmas about hitting bounds and neigh- borhood measures of (2; )-processes, in particular, we improve the upper bounds of hitting probabilities. In Section 4, we derive some asymptotic results of these hitting probabilities. In particular, for the (2; )-superprocess we show that "2= dP f tB"x > 0g!c ;d ( pt)(x), which extends the corresponding result for DW-processes. Finally in Section 5 we state and prove the Lebesgue approximation of (2; )-processes and their truncated processes. When- ever one feels the lack of details of some results in this chapter, refer back to appropriated places in Chapter 3. In Chapter 5, we discuss the Lebesgue approximation of superprocesses with a regu- larly varying branching mechanism. The branching mechanisms we consider here include the stable branching mechanisms considered in Chapter 4 as special cases. In Section 5.1 we explain the new di culties for the Lebesgue approximation of superprocesses with the 8 more general branching mechanism and review our general approach, which overcomes these di culties. In Section 2 we review the truncation of superprocesses in a more general setting. In Section 3, we develop some lemmas about hitting bounds and neighborhood measures of the more general superprocesses. In Section 4, we derive some asymptotic results of these hitting probabilities. Finally in Section 5 we state and prove the Lebesgue approximation of superprocesses with a regularly varying branching mechanism and their truncated processes. 
This general result contains all previous Lebesgue approximation of superprocesses as special cases. 9 Chapter 2 Some Basics of Superprocesses 2.1 Moment measures Moment measures play an important role in the study of superprocesses. For the Markov process X, write Ttf(x) = x(f(Xt)) for the semigroup of X, where x(X2 ) denotes the distribution of the process X starting from x. Then the rst moment measure of the (X; )- process (see (1.3) and (1.4)) is E ( tf) = (Ttf): (2.1) Note that the branching mechanism of (1.1) plays no role here. Write p t (x) for the transition density of the rotation invariant -stable L evy process with generator 12 (see (1.6)). Then the rst moment measure of the ( ; )-process takes the equivalent measure form E t = ( p t ) d; where p t (x) = R p t (x y) (dy) and f d denotes the measure de ned by (f d)(B) = R Bfd d. The second moment measure depends crucially on the branching mechanism. In fact, second moments do not exist in general. However, they do exist when the measure = 0 in the branching mechanism of (1.1), that is, for the (X; v2)-process . The second moment measure of the (X; v2)-process is E ( tf)2 = ( (Ttf))2 + 2 Z t 0 Ts(Tt sf)2 ds: (2.2) 10 Refer to Section 2.4 in [32] for the proofs of (2.1) and (2.2). For the (X; )-process with < 1, only moments of order less than 1 + exist. A useful inequality along this line is Lemma 2.1 in [37]: For 0 < < < 1, E ( tf)1+ 1 +c( ) ( (Ttf))1+ + Z t 0 Ts(Tt sf)1+ ds ; where c( )!1 as ! . When we need to use the second moments, we may truncate at any level K > 0 to get the truncated process K, which has nite second moments. For details about this truncation method, see pages 484 - 487 and Lemma 3 in [38]. Using series expansions of Laplace functionals, Dynkin [13] gives moment measure for- mulas for very general superprocesses. See Section 14.7 in [16] for a concise review of these formulas. Finally we mention that, for DW-processes, Theorem 4.2 of Kallenberg [27] con- tains a basic cluster decomposition of moment measures. Theorem 4.4 of that paper gives a fundamental connection between moment measures and certain uniform Brownian trees, rst noted by Etheridge in Section 2.1 of [17]. It would be interesting to study this connection for more general superprocesses. For details about the cluster decomposition of moment measures, See Theorem 5.1 in Kallenberg [26]. 2.2 Cluster representation In this section we discuss the very important cluster representation of superprocesses. Note that although a superprocess records the positions of all \individuals" alive at time t, they do not keep track of all the genealogy of these \individuals". More precisely, let us pick an \individual" alive at time t, then try to identify her \ancestor" at an earlier time s. Although we know from s the positions of all \individuals" alive at time s, we don?t know which speci c \individual" at time s is the \ancestor" of the \individual" we picked at time t. However, in the study of some deep properties of superprocesses, the genealogical structure underlying the evolution can be extremely useful, even when the nal results have 11 nothing directly to do with the genealogy. The cluster representation of superprocesses, while containing only partial information of the genealogy, is enough for many purposes. In order to discuss the cluster representation, let us rst recall the de nition of Poisson cluster processes. To de ne a cluster process, we start with a point process = Pi i on some space T. 
For a suitable classMS of measures on S, we consider a probability kernel from T to MS. Choosing the random measures i to be conditionally independent of the i with distributions i, we may introduce a random measure = Pi i on S. This random measure is called a -cluster process generated by . If is Poisson or Cox, we call a Poisson or Cox cluster process. Due to the underlying independence structure, superprocesses have the following branch- ing property: If and 0 are two independent (X; )-processes with initial measures and 0 respectively, then + 0 is an (X; )-process with initial measure + 0. This can be veri ed by using any of the three characterizations in Section 1.1. From this branching property, we see that, for any t, the superprocess t is an in nitely divisible random measure. A random measure is in nitely divisible i it is the sum of a Poisson cluster process and a deterministic measure (see Theorem 1.28 in [17]). Since Pf t = 0g > 0, the superprocess t is just a Poisson cluster process. The cluster representation of (X; )-processes depends crucially on the branching mech- anism of (1.1). For convenience, we rst discuss the cluster representation of (X;1)- processes (see Section 3.2 and 6.1 in [17]). For a (X;1)-process , at time 0, there are actually uncountably many \individuals". All \individuals" produce \o spring" randomly. However almost all \individuals" have no \o spring" alive at time t> 0, except nitely many \lucky" ones. In other words, the superprocess at time t is actually \o spring" of nitely many \ancestors". The point process records the locations of these nite many \ancestors" is a Poisson process 0 with intensity measure t 1 . This is the generating process in the Poisson cluster representation of t. Each one of these nitely many \ancestors" generates a random cluster at time t. Clearly this cluster is just her \o spring" at time t. These clusters 12 are \the same", means that they have the same distribution if we move their \ancestors" to a common point. In summary, t being a Poisson cluster process, is a nite sum of condi- tionally independent clusters, equally distributed apart from shifts and rooted at the points of a Poisson process 0 of \ancestors" with intensity measure t 1 . By the Markov property of , we have a similar representation of t for every s = t h2(0;t) as a countable sum of conditionally independent h-clusters (clusters of age h), rooted at the points of a Cox pro- cess s directed by h 1 s. In other words, s is conditionally Poisson given s with intensity measure h 1 s (see page 226 in [24]). Under some restrictions of the branching mechanism of (1.1), (X; )-processes also have a similar cluster representation (see Section 11.5 in [5] and Section 3 in [7]). The function t 1 in the above intensity measure t 1 should be replaced by another function of t, determined by the branching mechanism . The cluster distributions are also di erent, determined by both X and . 2.3 Historical superprocesses and random snakes It is clear that the cluster representation of the previous section cannot be recovered from the superprocess itself, since t records only the positions of all \individuals" alive at time t. A complete picture of the evolution underlying a superprocess is given by a random tree composed from the paths of all individuals. Two approaches to encode this picture are provided by historical superprocesses and by random snakes. Both approaches can be used to verify the cluster representation. 
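Before turning to these two encodings, the cluster representation of the previous section can be made concrete by a small numerical sketch (added for illustration, not taken from the text). For the (X,1)-process, that is ψ(v) = v², the total mass Z_t = ξ_t(R^d) is a continuous-state branching process with Laplace transform E_μ e^{−θZ_t} = exp(−‖μ‖ θ/(1+θt)); in the cluster picture this arises from a Poisson number of surviving ancestors with mean ‖μ‖/t, each contributing an independent cluster whose total mass at age t is exponential with mean t. The exponential cluster-mass law is the standard computation for ψ(v) = v² and should be treated as an assumption of this sketch.

import numpy as np

rng = np.random.default_rng(1)

def cluster_mass_samples(mu_mass, t, n_samples=100000):
    """Total mass xi_t(R^d) sampled from the cluster representation for
    psi(v) = v^2: a Poisson(mu_mass/t) number of surviving ancestors, each
    contributing an independent exponential cluster mass with mean t."""
    n_anc = rng.poisson(mu_mass / t, size=n_samples)
    return np.array([rng.exponential(t, size=n).sum() for n in n_anc])

mu_mass, t, theta = 1.0, 0.5, 2.0      # illustrative parameter choices
masses = cluster_mass_samples(mu_mass, t)

# Laplace transform of the total mass: cluster picture vs CSBP formula
print("simulated:", np.exp(-theta * masses).mean())
print("formula  :", np.exp(-mu_mass * theta / (1.0 + theta * t)))

The close agreement of the two numbers reflects the fact that ξ_t is a Poisson cluster process; for a general branching mechanism ψ only the intensity function and the cluster law change, as explained at the end of the previous section.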
The basic idea of historical superprocesses is very simple (see Section 1.9 in [17]). Let us explain the idea in the discrete setting, to make it even more transparent. For a discrete spatial branching process , pick two individuals alive at time t> 0, and assume that they have their last common ancestor at time s2 (0;t). Based on the Markov properties of the spatial motion X and the independence structures of spatial branching processes, clearly we can think that these two individuals perform the spatial motion X together as a single 13 individual before time s, then separate at time s and begin to perform independent spatial motion X ever since. In other words, we can think of these two individuals as a single path before time s, and this path splits into two independent paths at time s. The same idea still holds in the continuous setting, that is, for the superprocesses. Note that in the construction of superprocesses, the spatial motion Xt is only the location of an individual at time t. In order to remember the spatial locations of all the members in her genealogy line before her, we may just replace Xt by the corresponding path process ^Xt, which is a path-valued process. The value of ^Xt is the path of X over the time interval [0;t]. Now we construct the ( ^X; )-superprocess ^ , which corresponds to the (X; )-superprocess . We call ^ the (X; )-historical superprocess. Note that the way we de ne ^ from is di erent from the naive way we de ne ^X from X. If we de ne ^ as the corresponding path process of , then we would not be able to specify the ancestors of any individual. Denote the space of all rcll paths over the time interval [0;t] by Wt. Then the state space of ^Xt is Wt, and so ^Xt is a time-inhomogeneous Markov process. Write r;w for the probability measure under which ^X starts from the path w at time r. Clearly w is an rcll path over the time interval [0;r]. Let Mt be the space of all nite measures on Wt. This is the state space of ^ t, and so ^ t is also a time-inhomogeneous Markov process. The (X; )-historical superprocess ^ can be characterized by all three approaches in Section 1.1. It is obvious how to carry out the weak convergence approach. For the Laplace functional approach, we have Er; e ^ tf = e vrt; where is a nite measure on Wr, ^ t is a nite measure on Wt, f is a function on Wt, and vrt is a function on Wr. The function vrt(w) with r t and w2Wr is uniquely determined by the integral equation vrt(w) + r;w Z t r vst( ^Xs) ds = r;w f( ^Xt) : 14 This may be compared with (1.4), the integral equation of . The only di erence is that ^ is a superprocess of historical paths, while is a superprocess of spatial positions. The concept of historical superprocesses was developed in Dawson and Perkins [7] and Dynkin [14]. We may also refer to Chapter 12 in [5] and Section II.8 in [42]. From the construction of historical superprocesses it is clear that ^ t encodes the geneal- ogy information of all \individuals" alive at time t. The random snake approach developed by Le Gall and his co-authors allows to give a complete description of the genealogy. Here we only focus on the basic ideas, since the technical details and notation can be overwhelming. The basic idea of random snakes stems from an important fact of branching processes: The genealogical structures of branching processes can be completely encoded by a (random) function on R+. 
More precisely, the genealogical structure of Galton-Watson processes can be completely encoded by a discrete (random) function de ned on nonnegative integers. These nonnegative integers correspond to all the individuals and the function values are the generations of these individuals. For continuous-state branching processes, similarly the genealogical structure can be completely encoded by a continuous (random) function on R+. Again, any number in R+ corresponds to an \individual", and the function value is the \generation" or lifetime of this \individual". This continuous coding function is called the lifetime process. In fact, continuous coding function can be obtained from a sequence of rescaled discrete coding functions. Clearly this is closed related to the fact that continuous-state branching processes may be obtained as weak limits of rescaled Galton- Watson processes. For a continuous-state branching process with the branching mechanism = v2, the lifetime process &t is actually just the re ected one dimensional Brownian motion. The time parameter of the lifetime process &t is a labeling of all individuals in a certain order. For the complete evolution of (X;1)-processes, we then need to somehow combine the paths of individuals with this coding continuous random function. This is done by the so called Brownian snake Wt, which is a path-valued Markov process evolving according to both 15 the spatial motion X and the lifetime process &t. Note that the term \Brownian" refers to the branching mechanism, actually the lifetime process, not the spatial motion. The behavior of the Brownian snake is actually not hard to explain, at least informally. The value Wt at time t of the Brownian snake is a path of the underlying spatial motion X (started at a xed initial point) with the random lifetime &t. Informally, when &t decreases, the path Wt is shortened from its tip, and when &t increases, the path Wt is extended by adding (independently of the past) small \pieces of paths" following the law of the spatial motion X. In this way, we can generate the full set of historical paths of a (X;1)-process by running the Brownian snake according to the lifetime process, in this way we are visiting all the \individuals" one by one. For superprocesses with a general branching mechanism , similarly the so called L evy snakes can be de ned. The basic ideas are similar, but technically it is much more com- plicated. The main reason is that the corresponding lifetime process is not Markov and its de nition is quite involved. Actually part of the beauty, and the power, of the Brownian snake is that the lifetime process is itself a Markov process. The standard reference of Brow- nian snake is the excellent lecture notes [32] by Le Gall in 1999. For L evy snakes, refer to the excellent monograph [12] by Duquesne and Le Gall in 2002. 2.4 Hausdor dimensions and Hausdor measures In this section we review some classical results about the Hausdor dimensions and Hausdor measures of superprocesses. First let us review the de nitions of Hausdor di- mension and Hausdor measure. For a nice introduction of this topic in a probabilistic setting, see Chapter 4 and Section 6.4 in [36]. We rst de ne Hausdor measure, then Hausdor dimension. Assume A to be a metric space with the metric . 
Use |A| to denote the diameter of the set A, which is defined by

    |A| = sup{ρ(x,y) : x, y ∈ A}.

For every α ≥ 0 and δ > 0, define

    H^α_δ(A) = inf{ Σ_{i=1}^∞ |A_i|^α : A ⊂ ∪_{i=1}^∞ A_i, |A_i| ≤ δ for all i }.    (2.3)

It is easy to see that the quantity H^α_δ(A) increases as δ decreases, so that the limit H^α(A) = lim_{δ→0} H^α_δ(A) is well defined, although it may be infinite. We call the limit H^α(A) = lim_{δ→0} H^α_δ(A) the α-Hausdorff measure of A. Since subsets of a metric space are metric spaces in their own right, the α-Hausdorff measure H^α can be defined for all subsets of the space A. Using the definition of H^α, we can check that the function H^α defined for all subsets satisfies all the properties of a metric Carathéodory exterior measure (see Section 7.1 in [47], or Section 11.2 in [20]). Thus H^α is a countably additive measure when restricted to the Borel sets of A. So indeed, the α-Hausdorff measure defined on all Borel sets is a measure.

Let us define the Hausdorff dimension. The α-Hausdorff measure H^α(A) has the following natural properties: if 0 ≤ α < β and H^α(A) < ∞, then H^β(A) = 0; if 0 ≤ α < β and H^β(A) > 0, then H^α(A) = ∞. So there exists a unique number, denoted by dim A, such that H^α(A) = ∞ for α < dim A, and H^α(A) = 0 for α > dim A. We call this unique number the Hausdorff dimension of the set A. In other words, we define the Hausdorff dimension of the set A by

    dim A = sup{α : H^α(A) = ∞} = inf{α : H^α(A) = 0}.

Using the Hausdorff dimension, we can associate a nonnegative number to any set, which generalizes the usual integer dimensions. For example, the classical Cantor set has Hausdorff dimension log 2 / log 3. The graph of a one dimensional Brownian motion, which is a continuous (random) curve in R², has Hausdorff dimension 3/2 a.s. (see Theorem 16.4 in [19]). This is related to the fact that a one dimensional Brownian path is a.s. locally Hölder continuous with exponent c for any c ∈ (0, 1/2).

Now we turn to superprocesses. For a DW-process ξ in R^d, we denote the support of ξ_t by supp ξ_t, which is a random closed set in R^d. Actually this is even a random compact set, assuming ξ₀ = μ is a finite measure (see Theorem 1.2 in [6]). For fixed t > 0, if d ≥ 2, then a.s. this is a null set (meaning that it has Lebesgue measure 0). Here the Hausdorff dimension is useful for us to get some more understanding of the size of supp ξ_t. It is well known that a.s.

    dim(supp ξ_t) = 2 ∧ d   on {ξ_t ≠ 0}.

Note that if ξ_t = 0, then supp ξ_t = ∅. More generally, if ξ is a (2,β)-process in R^d with β ∈ (0,1], then for fixed t > 0, if d ≥ 2/β, a.s. supp ξ_t is a null set (again, this is a random compact set; adapt the proof of Theorem 1.2 in [6] to generalize Theorem 1.1 in [8], or see Section 4.3 in [1]) and

    dim(supp ξ_t) = (2/β) ∧ d   on {ξ_t ≠ 0}.

For this result and even more, see Theorem 2.1 in [9]. The situation for (2,β)-processes is in stark contrast to (α,β)-processes in R^d with α ∈ (0,2) and β ∈ (0,1], where the spatial motion has jumps. In this case, Evans and Perkins [18, 40] showed that for fixed t > 0, a.s.

    supp ξ_t = R^d   on {ξ_t ≠ 0}.    (2.4)

We can also discuss the Hausdorff dimension of the range of superprocesses. First, for I ⊂ R₊, define the range of ξ on I by

    R(I) = ∪_{t∈I} supp ξ_t,    (2.5)

and the closed range of ξ on I by R̄(I), the closure of R(I). Then the range of ξ is defined by

    R = ∪_{ε>0} R̄([ε,∞)).

For a DW-process ξ in R^d, if d ≥ 4, a.s. R is a null set and dim R = 4 ∧ d. More generally, if ξ is a (2,β)-process in R^d with β ∈ (0,1] and d ≥ (2/β) + 2, a.s. R is a null set and

    dim R = [(2/β) + 2] ∧ d;

see Corollary 2.2 in [9]. Again, for (α,β)-processes in R^d with α ∈ (0,2) and β ∈ (0,1], a.s. R = R^d.
This is immediate from (2.4) and the definition of the range.

Let us turn back to the α-Hausdorff measures. Although for any nonnegative α the α-Hausdorff measure is a Borel measure, for some metric spaces it is a trivial measure for every α, meaning that the α-Hausdorff measure H^α(B) can only be 0 or ∞ for any B ∈ B(A). For example, if ξ is a DW-process in R^d with d ≥ 2, then for a fixed t > 0, a.s. H²(supp ξ_t) = 0 and H^α(B ∩ supp ξ_t) = ∞ or 0 for any α < 2 and B ∈ B(R^d) (see (2.7), (2.8), and (2.9)). So we need to generalize the α-Hausdorff measures if we want to construct a nontrivial measure on supp ξ_t.

The definition of Hausdorff dimension still makes sense if we evaluate coverings by applying, instead of a simple power, an arbitrary non-decreasing function to the diameters of the sets in a covering. We call this function a gauge function. By a gauge function we mean a non-decreasing function φ : [0,ε) → [0,∞), for some ε > 0, with φ(0) = 0. As before, we define

    H^φ_δ(A) = inf{ Σ_{i=1}^∞ φ(|A_i|) : A ⊂ ∪_{i=1}^∞ A_i, |A_i| ≤ δ for all i }.    (2.6)

Clearly the α-Hausdorff measure H^α in (2.3) is just the special case of H^φ with φ(x) = x^α. Then define the φ-Hausdorff measure of A by

    H^φ(A) = lim_{δ→0} H^φ_δ(A).

As before, H^φ is a measure on the Borel sets. Under this more general framework it is more likely that nontrivial measures can be constructed on a metric space, although this is still not always possible. For a DW-superprocess ξ, this approach is extremely successful. Perkins and his co-authors determined the exact Hausdorff measure of the support at a fixed time and of the range of the process. First, about the support supp ξ_t: for a fixed t > 0, a.s. we have

    H^φ(· ∩ supp ξ_t) = ξ_t(·),    (2.7)

where for d ≥ 3 (see Theorem 5.2 in [7]),

    φ(x) = x² log log(1/x),    (2.8)

and for d = 2 (see Theorem 1.1 in [33]),

    φ(x) = x² log(1/x) log log log(1/x).    (2.9)

Next, the range R(0,t]: for a fixed t > 0, a.s. we have

    H^φ(· ∩ R(0,t]) = ∫₀^t ξ_s ds (·),    (2.10)

where for d ≥ 5,

    φ(x) = x⁴ log log(1/x),    (2.11)

and for d = 4,

    φ(x) = x⁴ log(1/x) log log log(1/x).    (2.12)

Note that ∫₀^t ξ_s ds is a measure on B(R^d), defined by

    (∫₀^t ξ_s ds)(B) = ∫₀^t ξ_s(B) ds   for any B ∈ B(R^d).    (2.13)

It is easy to see that the Hausdorff measure results here contain the Hausdorff dimension results that we reviewed previously. One obvious remaining question is the exact Hausdorff measure function of (2,β)-processes, but this may be technically too challenging. It is then also interesting to try to obtain good upper and lower bounds on the exact Hausdorff measure function.

2.5 Lebesgue approximations

From the previous section, we see that by carefully choosing a suitable gauge function φ, we can define some nontrivial random measures on certain random null sets. Since we know from the beginning that there are naturally defined nontrivial random measures on these random null sets (the DW-process ξ_t on supp ξ_t, and the local time measure of one dimensional Brownian motion on its level set, see below), the Hausdorff measure approach in fact gives representations of these measures in terms of their supports alone. So in order to recover these measures, we can forget about the related stochastic processes; only the support of these measures is needed. In this regard, we also have the packing measure approach (see [11, 34]), which is, generally speaking, similar to the Hausdorff measure approach. A quite different approach to this problem is the so-called Lebesgue approximation approach. Kingman [28] explained this approach in a very accessible manner and also used it to recover the local time measure of certain Markov processes intrinsically from the level set.
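Kingman's idea is explained in detail in the next paragraphs. As a quick numerical preview (a toy example added here, not from the text), the sketch below estimates the Lebesgue measure of the ε-neighborhood of a fixed smooth set, a unit segment in R², and exhibits the ε^{d−d₀} decay (here d = 2, d₀ = 1) that makes a normalized neighborhood volume informative. The box, sample size, and set are illustrative choices.

import numpy as np

rng = np.random.default_rng(2)

def neighborhood_area(eps, n=200000):
    """Monte Carlo estimate of lambda_2(E^eps) for the unit segment
    E = [0,1] x {0} in R^2, sampling uniformly from the box [-1,2] x [-1,1]
    (chosen large enough to contain the eps-neighborhood)."""
    pts = rng.uniform([-1.0, -1.0], [2.0, 1.0], size=(n, 2))
    nearest_x = np.clip(pts[:, 0], 0.0, 1.0)             # closest point of E
    dist = np.hypot(pts[:, 0] - nearest_x, pts[:, 1])    # distance to E
    box_area = 3.0 * 2.0
    return box_area * np.mean(dist < eps)

for eps in (0.2, 0.1, 0.05, 0.025):
    area = neighborhood_area(eps)
    # For this 1-dimensional set, lambda_2(E^eps) = 2*eps + pi*eps^2, so the
    # ratio area/eps should approach 2, twice the length of E.
    print(f"eps = {eps:0.3f}   area = {area:0.4f}   area/eps = {area/eps:0.3f}")

For the support of a superprocess the set is random and fractal, and in the critical dimension the correct normalization is no longer a plain power of ε, but the principle is the same: a suitably normalized volume of the ε-neighborhood recovers a measure carried by the set.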
Let us first explain Kingman's idea. For a subset A ⊂ R^d, we use A^ε to denote the ε-neighborhood of A, that is,

    A^ε = {x : d(x,A) < ε}.

It is easy to see that A^ε = (Ā)^ε, where Ā is the closure of A. For the Lebesgue measures of these neighborhoods, clearly λ_d E^ε ∈ [0,∞] and λ_d E^ε → λ_d Ē as ε → 0, for any set E ⊂ R^d. So when Ē is a null set, we get λ_d E^ε → 0. The interesting point is that the rate at which λ_d E^ε converges to zero is an indication of the size of E. For example, if E is part of a sufficiently smooth d₀-dimensional surface in R^d, where d₀ ≤ d, then λ_d E^ε decays like ε^{d−d₀}, so that a suitably normalized version of λ_d E^ε carries d₀-dimensional information about E.

For a measure μ on R^d and ε > 0, write μ^ε for the restriction of Lebesgue measure λ_d to the ε-neighborhood of supp μ. Note that in our notation the ε-neighborhood of supp μ is denoted by (supp μ)^ε. So we may write μ^ε explicitly as

    μ^ε(·) = λ_d(· ∩ (supp μ)^ε).

For a DW-process ξ in R^d with d ≥ 3, Tribe [48] showed that for any fixed t > 0 and any bounded Borel set B in R^d, a.s. as ε → 0,

    ε^{2−d} ξ_t^ε(B) → c_d ξ_t(B),

where c_d > 0 is a constant depending on d. Shortly after, in order to prove the strong Markov property of the support process supp ξ_t, Perkins [41] showed that the Lebesgue approximation result holds simultaneously for all times t > 0. More precisely, for a DW-process ξ in R^d with d ≥ 3, a.s. as ε → 0,

    ε^{2−d} ξ_t^ε →_w c_d ξ_t   for all t > 0,    (2.14)

where →_w denotes weak convergence of measures. The corresponding Lebesgue approximation of two dimensional DW-processes was still open at that time, even for fixed t > 0. However, Perkins conjectured that for fixed t > 0 and bounded Borel sets B in R², a.s. as ε → 0,

    |log ε| ξ_t^ε(B) → c ξ_t(B).

Later, Kallenberg [25] essentially confirmed the above conjecture. More precisely, for a DW-process ξ in R², Kallenberg showed that for fixed t > 0, a.s. as ε → 0,

    m̃(ε) |log ε| ξ_t^ε →_w ξ_t,

where m̃ is a suitable normalizing function bounded below and above by two positive constants. Note that both the conjecture of Perkins and the proof of Kallenberg depend crucially on the hitting bounds for DW-processes in R² from Le Gall [31]. Kallenberg's approach also works for DW-processes in R^d with d ≥ 3, and results in a more probabilistic proof of the Lebesgue approximation of ξ_t.

In [22], we adapted Kallenberg's probabilistic approach in [25] to prove the Lebesgue approximation of (2,β)-processes with β < 1, combined with a truncation method for superprocesses from Mytnik and Villa [38], in order to overcome the additional difficulty imposed by the infinite variance of (2,β)-processes. More precisely, for a (2,β)-process ξ in R^d with β < 1 and d > 2/β, we proved that, for fixed t > 0, a.s. as ε → 0,

    ε^{2/β−d} ξ_t^ε →_w c_{β,d} ξ_t,

where c_{β,d} > 0 is a constant depending on β and d.

In view of the Hausdorff measure results (2.10), (2.11), and (2.12), we may ask about the Lebesgue approximation of the range of a superprocess. Here, Delmas [10] proved the Lebesgue approximation of the range of DW-processes in R^d with d ≥ 4, using Le Gall's Brownian snake. More precisely, for a DW-process ξ in R^d with d ≥ 4, Delmas showed that, for fixed t > 0 and bounded Borel sets B in R^d, a.s. as ε → 0,

    φ(ε) R_t^ε(B) → c_d ∫_t^∞ ξ_s ds (B),    (2.15)

where R_t is the range R([t,∞)) defined in (2.5), R_t^ε is the corresponding neighborhood measure, and ∫_t^∞ ξ_s ds is defined as in (2.13). About the normalizing function φ, it is shown that

    φ(ε) = ε^{4−d} for d ≥ 5,   and   φ(ε) = |log ε| for d = 4.

These known results lead to a couple of immediate open problems. First, we may ask whether the Lebesgue approximation of (2,β)-processes holds simultaneously for all times t > 0, in view of (2.14). Since intuitively the (2,β)-process ξ and its support process supp ξ do not jump at the same time, an immediate guess should be no. We may then ask if it is possible to prove some results supporting this guess.
What about the strong Markov property of the (2; )-support process supp t? It seems that we need to nd new approaches to prove it (or disprove it). The second question is that, whether it is possible to prove the Lebesgue approximation of the range of (2; )-processes, in view of (2.15). More generally, we may try to \translate" all Hausdor measure results into corre- sponding Lebesgue approximation ones. Here the challenge is that while there is a solid theory behind Hausdor measures which one could rely on, there is no such support for Lebesgue approximation results. One has to \invent" some approaches when trying to es- tablish Lebesgue approximation results. Still it is very interesting to see that whenever we can get the Lebesgue approximation results, the results are always shorter and cleaner then the corresponding Hausdor measure results. 25 Chapter 3 Lebesgue Approximation of Dawson-Watanabe Superprocesses 3.1 Introduction In this chapter we discuss the Lebesgue approximation of Dawson-Watanabe superpro- cesses in detail. The Lebesgue approximation of DW-processes of dimension d 3 was rst proved by Tribe [48], using both probabilistic and analytic techniques. The case of critical dimension d = 2 is more di cult. However, Kallenberg [25] obtained a similar result for DW-processes in R2 using a more probabilistic approach. His approach can also be applied to DW-processes of dimension d 3, and indeed this was done in [25]. The present chapter is based on Kallenberg?s proof of Lebesgue approximation of DW-processes of dimension d 3 in [25], with some technical simpli cations. Extra e orts have been made to explain Kallenberg?s approach clearly and to make it more accessible. We use = ( t) to denote the DW-process of dimension d 3. Recall that is a measure-valued Markov process, so for xed t and !, the value t(!) is a measure on Rd. We write "t for the restriction of Lebesgue measure d to the "-neighborhood of supp t, the support of the measure t, which for xed t and !, is a compact set in Rd (see Theorem 1.2 in [6]). The Lebesgue approximation of DW-processes of dimension d 3, which is Theorem 3.5 in this chapter, states that for xed t> 0, "2 d "t w!cd t a.s. as "!0, where w! denotes weak convergence of measures and c d > 0 is a universal constant depending on d. In particular, this con rms that t \distributes its mass over supp t in a deterministic manner" (cf. [17], p. 115, or [42], p. 212), as previously inferred from some deep results involving the exact Hausdor measure (cf. [7]). The proof depends crucially on some basic hitting estimates, due to Dawson, Iscoe, and Perkins [6]. Here we need the lower bound and upper bound of P f tB"0 > 0g (Theorem 26 3.1.(a) in [6]), and also the precise convergence result "2 dP f tB"0 > 0g!cd pt for d 3 as "!0 (Theorem 3.1.(b) in [6]), where Brx denotes an open ball around x of radius r. The proof also depends crucially on the representation of the DW-process as a countable sum of conditionally independent clusters. Precisely, each t can be expressed as a countable sum of conditionally independent clusters of age h2(0;t], where the generating ancestors at time s = t h form a Cox process s directed by h 1 s (cf. [7, 30]). Typically we let h!0 at a suitable rate depending on ". However, a technical complication when dealing with cluster representations is the possibility of multiple hits. 
More speci cally, a single cluster may hit (charge) several of the "-neighborhoods of n distinct points x1;:::;xn, or one of those neighborhoods may be hit by several clusters. In particular, Lemma 2.4 deals with this multiple hitting of a single neighborhood by several clusters. To minimize the e ect of such multiplicities, we need the cluster age h to be su ciently small. On the other hand, it needs to be large enough for the mentioned hitting estimates to apply to the individual clusters. Notice that we can translate the hitting estimates for the superprocess to the hitting estimates for the cluster , based on the connection between the superprocess and its clusters. The reason we don?t cover the case of critical dimension d = 2 is that, although the two cases of d = 2 and d 3 use the same general approach, technically the case of d = 2 is much more involved, since we then have to deal with the Logarithm normalizing function jlog(")j rather than the power normalizing function "2 d as in the case of d 3. Also when d = 2, a corresponding crucial result to the precise convergence result for d 3, as "! 0, "2 dP f tB"0 > 0g! cd pt, is not readily available. So in this chapter, we restrict our attention to the case of d 3. We proceed with some general remarks on terminology and notation. A random measure on Rd is de ned as a measurable function from to the spaceMd of locally nite measures on Rd, equipped by the - eld generated by all evaluation maps B : 7! B with B2Bd, where Bd denotes the Borel - eld on Rd. The subclasses of measures and bounded sets 27 are denoted by ^Md and ^Bd, respectively. The weak topology in Md is generated by all integration maps f : 7! f = R fd with f belonging to the space Cdb of bounded, continuous functions Rd!R+. Thus, n w! in Md i nf! f for all f2Cdb. Throughout the chapter we use relations such as =_ , <_ , _ , and , where the rst three mean equality, inequality, and asymptotic equality up to a constant factor, and the last one is the combination of <_ and >_ . We often write a b to mean a=b! 0. The double bars k k denote the supremum norm when applied to functions and total variation when applied to signed measures. In any Euclidean space Rd, we write Brx for the open ball of radius r > 0 centered at x2Rd. The shift and scaling operators x and Sr are given by xy = x+y and Srx = rx, respectively, and for measures on Rd we de ne x and Sr by ( x)B = ( xB) and ( Sr)B = (SrB), respectively. In particular, ( Sr)f = (f S 1r ) for measurable functions f on Rd. Convolutions of measures with functions f are given by ( f)(x) = R f(x u) (du). This chapter is organized as follows. In Section 2 we rst explain the crucial ideas about cluster representations, then state several lemmas which will not be used directly in the main proof of Lebesgue approximation, including the important upper bound of the hitting mul- tiplicities. In Section 3 we state and prove the Lebesgue approximation for DW-processes of dimensions d 3. In order to do so, we list several lemmas that are needed in the main proof. Finally, in Section 4 we prove all the lemmas in this chapter. We suggest that the reader read the rst three sections in the linear order, then, when need arises, read the proofs of some lemmas in Section 4. 3.2 Preliminaries Let us rst explain the cluster representations of DW-processes. We write L ( ) = P f 2 g for the distribution of the process with initial measure . 
For every xed , the DW-process is in nitely divisible under P and admits a decomposition into a Poisson 28 \forest" of conditionally independent clusters, corresponding to the excursions of the contour process in the ingenious \Brownian snake" representation of Le Gall [32]. In particular, this yields a cluster representation of t for every xed t > 0. More generally, the \ancestors" of t at an earlier time s = t h form a Cox process s directed by h 1 s (meaning that s is conditionally Poisson with intensity h 1 s, given s; cf. [24], p. 226), and the generated clusters ih are conditionally independent and identically distributed apart from shifts. In this paper, a generic cluster of age t> 0 is denoted by t; we write Lx( t) = Pxf t2 g for the distribution of a t-cluster centered at x2Rd and put P f t2 g= R (dx)Pxf t2 g. The rst lemma is about some basic scaling properties of DW-processes and their asso- ciated clusters. Lemma 3.1 Let be a DW-process in Rd with associated clusters t. Then for any measure on Rd, and r;t> 0, (i) L Sr(r2 t) =Lr2 ( r2tSr), (ii) L Sr(r2 t) =L ( r2tSr). Although the above two compact identities look nice, they may not be very intuitive for some people. In order to appreciate better these scaling properties, rst we translate the L notation back to the P notation P Srfr2 t2 g = Pr2 f r2tSr2 g; P Srfr2 t2 g = P f r2tSr2 g: Recall that the evaluation map B : 7! B is a function de ned on the spaceMd of locally nite measures on Rd. According to the de nition of - eld on Md, the set f B1=r 0 > 0g is a measurable set on Md. In the above two identities, take r = 1=", Sr = x, and, =f B1=r 0 > 0g, we get Pxf tB"0 > 0g = P(1="2) x="f t="2B10 > 0g; 29 Pxf tB"0 > 0g = Px="f t="2B10 > 0g: Now these two identities should be intuitive enough for one to appreciate the scaling prop- erties. Next we state a well-known relationship between the hitting probabilities of t and t. Lemma 3.2 Let the DW-process in Rd with associated clusters t be locally nite under P , and x any B2Bd. Then P f tB > 0g = t log (1 P f tB > 0g); P f tB > 0g = 1 exp ( t 1P f tB > 0g): In particular, P f tB > 0g t 1P f tB > 0g as either side tends to 0. The following lemma contains some slight variations of classical hitting estimates for DW-processes of dimension d 3. By Lemma 3.2 it is enough to consider the corresponding clusters t, and by shifting it su ces to consider balls centered at the origin. Lemma 3.3 Let the t be clusters of a DW-process in Rd with d 3, and consider a - nite measure on Rd. Then for 0 <" pt, we have pt <_ t 1"2 dP f tB"0 > 0g<_ p2t; The classical upper bound is pt+". Note that as " ! 0, the upper bound pt+" is approaching the lower bound pt, however the constants before these two bounds are de nitely di erent. Still this suggests that as " ! 0, the normalized hitting probability t 1"2 dP f tB"0 > 0g converges to c pt for some constant c > 0. This is indeed the case. Although the classical upper bound can give us this intuitive impression, for all practical purposes our upper bound p2t is as good, if not better. The reason is that mathematically speaking, p2t is almost the same as pt. 30 Next we need to estimate the probability that a small ball in Rd is hit by more than one subcluster of our DW-process . This result will play a crucial role throughout the remainder of the chapter. Lemma 3.4 Let the DW-process in Rd be locally nite under P . For any t h > 0 and " > 0, let "h be the number of h-clusters hitting B"0 at time t. 
Then for d 3 and as "2 h t, we have E "h( "h 1) <_ "2(d 2) h1 d=2 pt + ( p2t)2 : Here the intuition is that, if compare to h, the radius " is small enough, then most likely there will be only one cluster hitting this tiny ball, or no cluster at all. Actually what we want to control is the discrete quantity ( "h 1)+. However it seems that the only natural way to relate this quantity to the DW-process t is through the following simple inequality ( "h 1)+ "h( "h 1): Then we can relate E "h( "h 1) to E t and E 2t , the rst and second moment of the DW-process t. This is actually a very important point, especially in the next chapter when we are dealing with the (2; )-superprocesses. Since the (2; )-superprocesses have in nite second moment, to control E ( "h 1)+ we have to truncate the (2; )-processes, in order to get the nite second moment. 3.3 Lebesgue approximation In this section we rst state the main result of this chapter, the Lebesgue approximation of DW-processes of dimension d 3, which is Theorem 3.5. In order to give the proof of Theorem 3.5, we then state Lemma 3.6, 3.7, and 3.8, which will be used directly in the proof 31 of Theorem 3.5. However we leave all proofs of lemmas in the next section. At the end of the present section we give the proof of Theorem 3.5. For any measure on Rd and constant " > 0, we de ne the associated neighborhood measure " as the restriction of Lebesgue measure d to the "-neighborhood of supp , so that " has Lebesgue density 1f B"x > 0g. First note that " is a measure de ned from the measure . Then recall that t(!) is a measure for xed t and !, so "t(!) is just the neighborhood measure of t(!). Also recall that ^Md is the space of nite measures on Rd. For random measures n and with values in ^Md, the weak convergence in L1, denoted by n w! in L1; means that nf ! f in L1 for all f in Cdb. Write ~cd = 1=cd for convenience, where cd is such as in (3.1). Now we are ready to state the main result of this chapter, the Lebesgue approximation of DW-processes of dimension d 3. Theorem 3.5 Let be the DW-process in Rd with d 3. Fix any 2 ^Md and t> 0. Then under P , we have as "!0 ~cd"2 d "t w! t a.s. and in L1: Here the a.s. convergence means that for every ! outside a null set, ~cd"2 d "t(!) w! t(!): Note that for xed t and !, both "t(!) and t(!) are deterministic measures. Next we are going to study ( ih)", the neighborhood measures of the clusters. Since we will use the cluster decomposition t = i ih throughout the proof, naturally in order to prove ~cd"2 d "t w! t we also need to study ( ih)". Write ( ih)" = i"h for convenience. 32 Lemma 3.6 Let the ih be conditionally independent h-clusters in Rd, rooted at the points of a Poisson process with E = . Fix any measurable function f 0 on Rd. Then (i) E Pi i"h = ( p"h) d, (ii) Var Pi i"hf <_ h2"2(d 2)kfk2k k for "2 h. In part (i), notice that Pi i"h is a random measure, its expectation is the deterministic measure ( p"h) d, which means that for any measurable f 0 E X i i" hf = (( p " h) d)f where (( p"h) d)f is the integral of the function f with respect to the measure ( p"h) d. In part (ii), notice that Pi i"hf is a real-valued random variable, its variance is bounded above by h2"2(d 2)kfk2k k. Next we compare "t and Pi i"h , and prove that asymptotically they are the same, so that we can just replace "t byPi i"h . 
Intuitively this result is clear: Since the ages of clusters h and the parameter of neighborhood measures " are both going to 0 at some suitable rates, asymptotically there are no overlaps between the neighborhood measures of clusters, so that asymptotically Pi i"h and "t are the same. Lemma 3.7 Let be a DW-process in Rd with d 3, and for xed t> 0, let ih denote the subclusters in t of age h> 0. Fix a 2 ^Md. Then as "2 h!0, E X i i"h "t <_ (" 2=ph)d 2: Recall that for a signed measure on Rd with f as the density with respect to d, the total variation k k satis es k k= djfj= Z jf(x)jdx: 33 Note that Pi i"h and "t have the density Pi1f ihB"x > 0g and 1f B"x > 0g respectively, so E X i i"h "t = E Z X i 1f ihB"x > 0g 1f B"x > 0g dx: Now clearly the integrandjPi 1f ihB"x > 0g 1f B"x > 0gjis related to the multiple hitting of Lemma 3.4. The last lemma is a precise convergence result about the hitting probability Pxf hB"0 > 0g. For a DW-process of dimension d 3, we know from Theorem 3.1 of Dawson, Iscoe, and Perkins [6] (cf. Remark III.5.12 in [42]) that, for xed t > 0, x2Rd, and nite , as "!0 "2 dP f tB"x > 0g!cd ( pt)(x); (3.1) where cd > 0 is a constant depending only on d, and the convergence is uniform for x2Rd and for bounded t 1 andk k. Notice that in this classical result t can change, but it has to be bounded away from 0. By using the scaling property of DW-processes, from the classical result above we can get a precise convergence result about Pxf hB"0 > 0g as both h and " are approaching 0 at some suitable rates. More precisely, after the scaling term (cd) 1h 1"2 d multiplied to Pxf hB"0 > 0g, the measure (cd) 1h 1"2 dPxf hB"0 > 0gdx converges in a certain sense to 0, the Dirac measure at 0 0, as both h and " are approaching 0 at some suitable rates. This result should be easy to understand since if x6= 0, then for small enough h and ", the h stated from x will not be able to reach B"0 before time h. Only the h stated from 0 will be able to reach B"0 before time h, although the probability 34 is decreasing to 0. After the scaling term h 1"2 d multiplied to P0f hB"0 > 0g, it converges to the constant cd. Lemma 3.8 Write p"h(x) = Pxf hB"0 > 0g, where the h are clusters of a DW-process in Rd, and x a bounded, uniformly continuous function f 0 on Rd. Then as 0 < "2 h! 0, we have h 1"2 d (p" h f) cdf !0: The result holds uniformly over any class of uniformly bounded and equicontinuous functions f 0 on Rd. Here h 1"2 d (p"h f) cdf is the supremum norm of the function h 1"2 d (p"h f)(x) cdf(x); as a function of x. Now we are ready to prove Theorem 3.5, but before giving the proof let us discuss the main ideas in the proof carefully. First of all, we have two possible approaches to attack this theorem: one is to prove the L1-convergence rst, then use some interpolation to get the a.s. convergence from the L1-convergence (this is indeed what Tribe did in [48]); the other is to prove the a.s. convergence rst. In the rst approach, we need to get the a.s. convergence from the L1-convergence by the usual Borel-Cantelli argument: If EPjfnj < 1, then fn ! 0 a.s. as n!1. In order to do so, we need an upper bound of the approximating error "2 dP f tB"x > 0g cd ( pt)(x); which we don?t have here. So we will use the second approach: prove the a.s. convergence rst. In order to prove the a.s. convergence, we need to show that a.s. for all f2Cdb, we have that ~cd"2 d "tf ! tf, where Cdb is the class of bounded, continuous functions Rd ! R+. 
35 However since there exists a countable, convergence-determining class of functions f in Cdb, we only need to prove for any xed f2Cdb, we have ~cd"2 d "tf! tf a.s. In order to prove this, we write "2 d " tf cd tf "2 d "tf X i i" hf +"2 d X i i" hf h 1 s(p" h f) +k sk "2 dh 1 (p"h f) cdf +cdj sf tfj: Notice that the last term converges to 0 by the a.s. weak continuity of and the third term converges to 0 by Lemma 3.4. The rst term is related to Lemma 3.3 and the second term is related to Lemma 3.2, however these two lemmas are about the expectations and variances of those terms. In order to get a.s. convergence from results of expectations and variances, we use the usual Borel-Cantelli argument: take a sequence "n and get f("n)!0 as n!1by showing that EPjf("n)j<1. Finally we extend the a.s. convergence from the sequence "n to the whole interval (0;1) by interpolation. As for the L1-convergence, since by (1) we easily get "2 dE "tf!cdE tf; so the L1-convergence follows from the a.s. convergence by an usual proposition. Proof of Theorem 3.1: Proof: (i) Let d 3, and x any t > 0, 2 ^Md, and f 2 CdK. Write ih for the subclusters of t of age h. Since the ancestors of t at time s = t h form a Cox process directed by s=h, Lemma 3.6 (i) yields E hX i i" hf s i = h 1 s(p"h f); 36 and so by Lemma 3.6 (ii) E X i i" hf h 1 s(p" h f) 2 = E Var hX i i" hf s i <_ "2(d 2) h2kfk2E k s=hk = "2(d 2)hkfk2k k: Combining with Lemma 3.7 gives E "tf h 1 s(p"h f) E "tf X i i" hf +E X i i" hf h 1 s(p" h f) <_ "2(d 2)h1 d=2kfk+"d 2h1=2kfk = "d 2 p h+ ("= p h)d 2 kfk: Taking h = " = rn for a xed r2(0;1) and writing sn = t rn, we obtain E X nr n(2 d) rn t f r n s n(p rn rn f) <_ X n rn=2 +rn(d 2)=2 kfk<1; which implies rn(2 d) rnt f r n sn(prnrn f) !0 a.s. P : (3.2) Now we write "2 d " tf cd tf "2 d " tf h 1 s(p" h f) +c dj sf tfj +k sk "2 dh 1 (p"h f) cdf : Using (3.2), Lemma 3.8, and the a.s. weak continuity of (cf. Proposition 2.15 in [17]), we see that the right-hand side tends a.s. to 0 as n!1, which implies "2 d "tf cd tf a.s. 37 as "! 0 along the sequence (rn) for any xed r2 (0;1). Since this holds simultaneously, outside a xed null set, for all rational r2 (0;1), the a.s. convergence extends by Lemma 2.3 in [25] to the entire interval (0;1). Now let 2Md be arbitrary with pt <1for all t> 0. Write = 0+ 00 for bounded 0, and let = 0+ 00 be the corresponding decomposition of into independent components with initial measures 0 and 00. Fixing an r > 1 with suppf Br 10 and using the result for bounded , we get a.s. on f 00tBr0 = 0g "2 d "tf = "2 d 0"t f!cd 0tf = cd tf: As 0" , we get by Lemma 4.3 in [25] P f 00tBr0 = 0g= P 00f tBr0 = 0g!1; and the a.s. convergence extends to . Applying this result to a countable, convergence- determining class of functions f (cf. Lemma 3.2.1 in [5]), we obtain the required a.s. vague convergence. If is bounded, then t has a.s. bounded support (cf. Corollary 6.8 in [17]), and the a.s. convergence remains valid in the weak sense. To prove the convergence in L1, we note that for any f2CdK "2 dE "tf = "2 d Z P f tB"x > 0gf(x)dx ! Z cd ( pt)(x)f(x)dx = cdE tf; (3.3) by Theorem 5.3.(i) in [25]. Combining this with the a.s. convergence under P and using Proposition 4.12 in [24], we obtain E j"2 d "tf cd tfj!0. For bounded , (5.14) extends to any f2Cdb by dominated convergence based on Lemmas 4.1 and 4.2 (i) in [25], together with the fact that d( pt) =k k<1 by Fubini?s theorem. 
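As an informal numerical illustration of Theorem 3.5, one can probe the normalization ε^{2−d} through the standard branching-particle caricature of the DW-process: N particles of mass 1/N, performing independent Brownian motions and branching critically at rate N. The sketch below is only a sketch: all parameter choices are ad hoc, the constants are not tuned to match (3.1), and the test function is simply f ≡ 1. It builds such a particle cloud at time t and prints ε^{2−d} times the volume of its ε-neighborhood divided by the total mass; by Theorem 3.5 this ratio should be roughly independent of ε for small ε, although a finite particle system resolves the scaling only over a limited range of ε, so fairly large N is needed before the effect becomes visible.

    import numpy as np
    from scipy.spatial import cKDTree

    rng = np.random.default_rng(0)
    d, N, t, dt = 3, 2000, 0.25, 5e-5            # ad hoc; N*dt must stay well below 1

    # Critical branching Brownian particles: N initial particles of mass 1/N at the
    # origin; in each step a particle branches with probability N*dt into 0 or 2 copies.
    pos = np.zeros((N, d))
    for _ in range(int(round(t / dt))):
        pos = pos + np.sqrt(dt) * rng.standard_normal(pos.shape)
        branch = rng.random(len(pos)) < N * dt
        two = rng.random(len(pos)) < 0.5         # two offspring, otherwise none
        pos = np.concatenate([pos[~branch | two], pos[branch & two]])
        if len(pos) == 0:
            raise RuntimeError("population died out; rerun with another seed or larger N")

    mass = len(pos) / N                          # stand-in for xi_t(R^d)

    # Monte Carlo volume of the eps-neighborhood of the particle cloud, a stand-in
    # for xi_t^eps(R^d); Theorem 3.5 suggests eps^(2-d)*volume/mass should be roughly
    # constant (about c_d) once eps is small but still resolved by the cloud.
    tree = cKDTree(pos)
    lo, hi = pos.min(axis=0) - 0.3, pos.max(axis=0) + 0.3
    samples = lo + (hi - lo) * rng.random((300_000, d))
    dist, _ = tree.query(samples)
    for eps in (0.10, 0.15, 0.20):
        vol = np.prod(hi - lo) * np.mean(dist < eps)
        print(f"eps={eps:.2f}  eps^(2-d) * vol / mass = {eps ** (2 - d) * vol / mass:.3f}")

Tying the branching rate N to the particle mass 1/N is what produces a random measure-valued limit; with a fixed branching rate the same particle system would merely approximate the deterministic heat flow.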
38 3.4 Proofs of lemmas Proof of Lemma 3.1: (i) If v solves the evolution equation for , that is, _v = 12 v v2 then so does ~v(t;x) = r2v(r2t;rx). Writing ~ t = r 2 r2tSr, ~ = r 2 Sr, and ~f(x) = r2f(rx), we get E e ~ t ~f = E e r2tf = e vr2t = e ~ ~vt = E~ e t ~f; and so L (~ ) =L~ ( ), which is equivalent to (i). (ii) De ne the cluster kernel by x = Lx( ), x 2 Rd, and consider the cluster de- composition = R m (dm), where is a Poisson process with intensity when 0 = . Here r 2 r2tSr = Z (r 2mr2tSr) (dm); r;t> 0: Using (i) and the uniqueness of the L evy measure, we obtain (r 2 Sr) = ( fr 2 ^mr2Sr2 g); which is equivalent to r 2L Sr( ) =Lr 2 Sr( ) =L (r 2^ r2Sr): Proof of Lemma 3.2: Under P we have t = Pi it, where the it are conditionally independent clusters of age t rooted at the points of a Poisson process with intensity =t. For a cluster rooted at x, the hit- ting probability is bx = Pxf tB > 0g. Hence (e.g. by Proposition 12.3 in [24]), the number of clusters hitting B is Poisson distributed with mean b=t, and so P f tB = 0g= exp( b=t), 39 which yields the asserted formulas. Proof of Lemma 3.3: Proof of Lemma 3.4: Let s be the Cox process of ancestors to t at time s = t h, and write ih for the associated h-clusters. Using Lemma 3.3, the conditional independence of the clusters, and the fact that E 2s = h 2E 2s outside the diagonal, we get with p"h(x) = Pxf hB"0 > 0g E "h( "h 1) = E XX i6=j1f i hB " 0^ j hB " 0 > 0g = ZZ x6=y p"h(x)p"h(y)E 2s(dxdy) <_ "2(d 2) ZZ ph(")(x)ph(")(y)E 2s(dxdy): By the formula of rst moment, Fubini?s theorem, and the semigroup property of (pt), we get Z ph(")(x)E s(dx) = Z ph(")(x) ( ps)(x)dx = Z (du) (ph(") ps)(u) = pt("): Next, we get by the formula of second moment, Fubini?s theorem, the properties of (pt), and the relations t t" 2t s ZZ ph(")(x)ph(")(y) Cov s(dxdy) = 2 ZZ ph(")(x)ph(")(y)dxdy Z (du) Z s 0 dr Z pr(v u)ps r(x v)ps r(y v)dv 40 = 2 Z (du) Z s 0 dr Z pr(u v) pt(") r(v) 2dv <_ Z (du) Z s 0 (t r) d=2 (pr p(t(") r)=2)(u)dr = Z (du) Z s 0 (t r) d=2p(t(")+r)=2(u)dr <_ Z pt(u) (du) Z t h r d=2dr<_ pth1 d=2: The assertion follows by combination of these estimates. Proof of Lemma 3.6: (i) By Fubini?s theorem and the de nitions of "h and p"h, we have Ex "hf = Ex Z 1f hB"u > 0gf(u)du = (p"h f)(x); and so by independence E hX i i" hf i = Z (dx)Ex "hf = (p"h f): (3.4) Hence, by Fubini?s theorem E X i i" hf = E (p " h f) = (p " h f) = (( p " h) d)f: (ii) First, Varx( K"h f) Ex( K"h f)2 Exk K"h k2kfk2 =kfk2Exk K"h k2: For Exk K"h k2, using Cauchy inequality and Lemma 3.3, we get Exk K"h k2 = Ex Z 1f Kh B"y > 0gdy Z 1f Kh B"z > 0gdz 41 = Z Z Px f Kh B"y > 0g\f Kh B"z > 0g dydz Z Z (Pxf Kh B"y > 0gPxf Kh B"z > 0g)1=2dydz <_ ah"d 2= Z Z (p2h(y x)p2h(z x))1=2dydz =_ ah"d 2= hd=2 Z Z p4h(y x)p4h(z x)dydz = ah"d 2= hd=2: Hence, by independence E Var hX i Ki" h fj i = E Z (dx)Varx( K"h f) <_ ah"d 2= hd=2kfk2k k: Proof of Lemma 3.7: Let "h(x) denote the number of subclusters of age h hitting B"x at time t. 
Then Lemma 3.4 yields, E X i i" h " t = E Z X i1f i hB " x > 0g 1f B " x > 0g dx = Z E ( "h(x) 1)+dx <_ "2(d 2) d h1 d=2( pt) + ( p2t)2 <_ "2(d 2) h1 d=2k k+t d=2k k2 : Proof of Lemma 3.8: 42 Using (3.1) and Lemmas 3.1 (ii), 3.2, and 3.3, we get by dominated convergence dp"h = hd=2 dp"= ph 1 cdh d=2 ("=ph)d 2 dp1 = cd"d 2h: (3.5) Similarly, Lemma 3.3 yields for xed r> 0 and a standard normal random vector in Rd "2 dh 1 Z jxj>r p"h(x)dx <_ Z juj>r=ph pl(")(u)du = P n j jl1=2" >r= p h o !0: (3.6) By (3.5) it is enough to show thatk^p"h f fk!0 as h, "2=h!0, where ^p"h = p"h= dp"h. Writing wf for the modulus of continuity of f, we get k^p"h f fk = supx Z ^p"h(u) (f(x u) f(x))du Z ^p"h(u)wf(juj)du wf(r) + 2kfk Z juj>r ^p"h(u)du; which tends to 0 as h, "2=h!0 and then r!0, by (3.6) and the uniform continuity of f. 43 Chapter 4 Lebesgue Approximation of (2; )-Superprocesses 4.1 Introduction Throughout this chapter, we use f to denote the integral of the function f with respect to the measure . By an ( ; )-superprocess (or ( ; )-process, for short) in Rd we mean a vaguely rcll, measure-valued strong Markov process = ( t) in Rd satisfying E e tf = e vt for suitable functions f 0, where v = (vt) is the unique solution to the evolution equation _v = 12 v v1+ with initial condition v0 = f. Here = ( ) =2 is the fractional Laplacian, 2 (0;2] refers to the spatial motion, and 2 (0;1] refers to the branching mechanism. When = 2 and = 1 we get the Dawson{Watanabe superprocess (DW- process for short), where the spatial motion is standard Brownian motion. General surveys of superprocesses include the excellent monographs and lecture notes [5, 15, 17, 32, 35, 42]. In this chapter we consider superprocesses with possibly in nite initial measures. Indeed, by the additivity property of superprocesses, we can construct the ( ; )-process with any - nite initial measure . In Lemma 4.5 we show that t is a.s. locally nite for every t> 0 i p (t; ) <1for all t, where p (t;x) denotes the transition density of a symmetric -stable process in Rd. Note that when = 2, p2(t;x) = pt(x) is the normal density in Rd. For any measure on Rd and constant " > 0, write " for the restriction of Lebesgue measure d to the "-neighborhood of supp . For a DW-process in Rd with any nite initial measure, Tribe [48] showed that "2 d "t w!cd t a.s. as "!0 when d 3, where w! denotes weak convergence and cd > 0 is a constant depending on d. For a locally nite DW-process in R2, Kallenberg [25] showed that ~m(")jlog"j "t v! t a.s. as "!0, where v!denotes vague convergence and ~m is a suitable normalizing function. Our main result in this chapter is Theorem 4.18, where we prove that, for a locally nite (2; )-process in Rd with < 1 and 44 d> 2= , "2= d "t v!c ;d t a.s. as "!0, where c ;d > 0 is a constant depending on and d. In particular, the (2; )-process t distributes its mass over supp t in a deterministic manner, which extends the corresponding property of DW-processes (cf. [17], page 115, or [42], page 212). See the end of the present chapter for a detailed explanation of this deterministic distribution property. For DW-processes, this property can also be inferred from some deep results involving the exact Hausdor measure (cf. [7]). However, for any ( ; )-process with < 2, supp t = Rd or ; a.s. (cf. [18, 40]), and so the corresponding property fails. Our result shows that this property depends only on the spatial motion. To prove our main result, we adapt the probabilistic approach for DW-processes from [25]. 
However, the nite variance of DW-processes plays a crucial role there. In order to deal with the in nite variance of (2; )-processes with < 1, we use a truncation of ( ; )- processes from [38], which will be further developed in Section 2 of the present chapter. By this truncation we may reduce our discussion to the truncated processes, where the variance is nite. To adapt the probabilistic approach from [25] to study the truncated processes, we also need to develop some technical tools. Thus, in Section 3 we improve the upper bounds of hitting probabilities for (2; )-processes with < 1 and their truncated processes. As an immediate application, in Theorem 4.8 we improve some known extinction criteria of the (2; )-process by showing that the local extinction property t d!0 and the seemingly stronger support property supp t d!; are equivalent. Then in Section 4 we derive some asymptotic results of these hitting probabilities. In particular, for the (2; )-process we show in Theorem 4.15 that "2= dP f tB"x > 0g! c ;d ( pt)(x), where Brx denotes an open ball around x of radius r, which extends the corresponding result for DW-processes (cf. Theorem 3.1(b) in [6]). Since the truncated processes do not have the scaling properties of the (2; )-process, our general method is rst to study the (2; )-process, then to estimate the truncated processes by the (2; )-process, in order to get the needed results for the truncated processes. 45 The extension of results of DW-processes to general ( ; )-processes is one of the major themes in the research of superprocesses. Since the spatial motion of the ( ; )-process is not continuous when < 2 and the ( ; )-process has in nite variance when < 1, many extensions are not straightforward, and some may not even be valid. However, it turns out that several properties of the support of (2; )-processes depend only on the spatial motion. These properties include short-time propagation of the support (cf. Theorem 9.3.2.2 in [5]) and Hausdor dimension of the support (cf. Theorem 9.3.3.5 in [5]). Our result also belongs to that category. In this chapter we are mainly using the notations in [25]. Recall that the double bars k kdenote the supremum norm when applied to functions and total variation when applied to signed measures. We also use relations such as =_ , <_ , and , where the rst two mean equality and inequality up to a constant factor, and the last one is the combination of <_ and >_ . Other notation will be explained whenever it occurs. 4.2 Truncated superprocesses and local niteness Although our main result of the present chapter is about (2; )-processes, in this section we discuss the truncation and local niteness of all ( ; )-processes, due to their independent interests. It is well known that the ( ;1)-process has weakly continuous sample paths. By contrast, the ( ; )-process with < 1 has only weakly rcll sample paths with jumps of the form t = r x, for some t> 0, r> 0, and x2Rd. Let N (dt;dr;dx) = X (t;r;x): t=r x (t;r;x): 46 Clearly the point process N on R+ R+ Rd records all information about the jumps of . By the proof of Theorem 6.1.3 in [5], we know that N has compensator measure ^N (dt;dr;dx) = c (dt)r 2 (dr) t(dx); (4.1) where c is a constant depending on . Due to all the \big" jumps, t has in nite variance. Some methods for ( ;1)-processes, which rely on the nite variance of the processes, are not directly applicable to ( ; )-processes with < 1. 
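To see heuristically why the "big" jumps destroy the second moment, and why capping them restores it, integrate the square of the jump size against the Lévy density r^{−2−β} appearing in (4.1): for β < 1,

∫_1^∞ r² · r^{−2−β} dr = ∫_1^∞ r^{−β} dr = ∞,   while   ∫_0^K r² · r^{−2−β} dr = K^{1−β}/(1−β) < ∞ for every K > 0.

Thus the jumps of unbounded size are exactly what ruins the variance, whereas a process whose jumps are capped at a level K has finite variance; this is what the truncation discussed next achieves. In the same spirit, (4.1) together with E_μ ||ξ_s|| = ||μ|| shows that the expected number of jumps of size greater than K up to time t is of the order

c_β ∫_K^∞ r^{−2−β} dr · E_μ ∫_0^t ||ξ_s|| ds = c_β t ||μ|| K^{−1−β}/(1 + β),

which tends to 0 as K → ∞; this back-of-the-envelope computation is made precise in Lemma 4.1 below.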
In [38], Mytnik and Villa introduced a truncation method for ( ; )-processes with < 1, which can be used to study ( ; )-processes with < 1, especially to extend results of ( ;1)-processes to ( ; )-processes with < 1. Speci cally, for the ( ; )-process with < 1, we de ne the stopping time K = infft > 0 : k tk> Kg for any constant K > 0, where inf;=1 as usual. When t = r x, we see that k tk= r. Clearly K is the time when has the rst jump greater than K. For any nite initial measure , they proved that one can de ne and a weakly rcll, measure-valued Markov process K (which is YK on page 485 of [38]) on a common probability space such that t = Kt for t< K. Intuitively, K euqals minus all masses produced by jumps greater than K along with the future evolution of those masses. In this chapter, we call K the truncated K-process of . Since all \big" jumps are omitted, Kt has nite variance. They also proved that Kt and t agree asymptotically as K!1. We give a di erent proof of this result, since similar ideas will also be used at several crucial stages later. We write P f 2 gfor the distribution of with initial measure . Lemma 4.1 Fix any nite and t> 0. Then P f K >tg!1 as K!1. Proof: If K t, then has at least one jump greater than K before time t. Noting that N ([0;t];(K;1);Rd) is the number of jumps greater than K before time t, we get by 47 Theorem 25.22 of [24] and (4.1), P f K tg E N [0;t];(K;1);Rd = E ^N [0;t];(K;1);Rd =_ K 1 E Z t 0 k skds = tk kK 1 !0 as K!1, where the last equation holds by E k sk=k k. Using Lemma 1 of [38] and a recursive construction, we can prove that Kt (!) t(!) for any t and !. So indeed, K is a \truncation" of . Lemma 4.2 We can de ne and K on a common probability space such that: (i) is an ( ; )-process with < 1 and a nite initial measure , and K is its truncated K-process, which has no jumps greater than K, (ii) t(!) Kt (!) for any t and !, (iii) t(!) = Kt (!) for t< K(!). Proof: Let m;n(t) denote the process m;n at time t. Use D([0;1); ^Md) as our , the space of rcll functions from [0;1) to ^Md, where ^Md is the set of nite measures on Rd. We endow with the Skorohod J1-topology. Let A= B( ). Let 1(t;!) = !(t) be an ( ; )-process de ned on ( ;A;P) with initial measure , and de ne K1 = infft > 0 : k 1(t)k> Kg. Then de ne a kernel u from ^Md to such that u( ; ) is the distribution of an ( ; )-process with initial measure , and a kernel uK from ^Md to such that uK( ; ) is the distribution of the truncated K-process of an ( ; )-process with initial measure . By Lemma 6.9 in [24], we can de ne 1;1 to be an ( ; )-process with initial measure 1( K1) on an extension of ( ;A;P), and 01;1 to be the truncated K-process 48 of an ( ; )-process with initial measure 1( K1). Now de ne 1 and K1 by 1(t) = 8 >< >: 1(t); t< K1; 1;1(t K1); t K1; K1 (t) = 8 >< >: 1(t); t< K1; 01;1(t K1); t K1: By the strong Markov property of ( ; )-processes and the above construction, we can verify that 1 is an ( ; )-process. By Lemma 1 in [38], K1 is the truncated K-process of an ( ; )-process. Moreover, 1 and K1 satisfy conditions (ii) and (iii) on [0; K1). Let u0 be a kernel from ^Md ^Md toA Asuch that u0( ; 0; ; ) is the distribution of a pair of two independent ( ; )-processes with initial measures and 0 respectively. De ne ( 2;0; 2;1) with distribution u0 K1 ( K1); 1( K1) K1 ( K1); ; : Let 2 = 2;0 + 2;1, 02 = 2;0, and K2 = infft > 0 : k 2(t)k > Kg. Let 2;1 be an ( ; )-process with initial measure 2( K2), and let 02;1 be the truncated K-process of an ( ; )-process with initial measure 02( K2). 
Now de ne 2 and K2 by 2(t) = 8> >>> < >>> >: 1(t); t< K1; 2(t K1); K1 t< K1 + K2; 2;1(t K1 K2); t K1 + K2; K2 (t) = 8 >>> >< >>> >: K1 (t); t< K1; 02(t K1); K1 t< K1 + K2; 02;1(t K1 K2); t K1 + K2: 49 Similarly, 2 is an ( ; )-process and K2 is the truncated K-process of an ( ; )-process. They satisfy conditions (ii) and (iii) on [0; K1 + K2). Continue the above construction: For every n, de ne n and Kn such that n is an ( ; )- process, Kn it the truncated K-process of an ( ; )-process, and they satisfy conditions (ii) and (iii) on [0;Pnk=1 Kk). It su ces to prove thatP1k=1 Kk =1a.s. Suppose P(P1k=1 Kk <1) > 0. Then there exist t and a such that P(P1k=1 Kk 0. Since for every n, n is an ( ; )-process with initial measure , we get an E ^N n [0;t];(K;1);Rd : Noting that by (4.1) E ^N n([0;t];(K;1);Rd) is the same nite constant for di erent n, we get a contradiction. So P1k=1 Kk =1 a.s. Just as the DW-process, the ( ; )-process and its truncated K-process K also have cluster structures (cf. Corollary 11.5.3 in [5], or Section 3 in [7], especially page 41 there). Speci cally, for any xed t, t is a Cox cluster process, such that the \ancestors" of t at time s = t h form a Cox process directed by ( h) 1= s, and the generated h-clusters ih are conditionally independent and identically distributed apart from shifts. For the truncated K-process K, the situation is similar, except that the clusters are di erent (because of the truncation) and the term ( h) 1= for needs to be replaced by aK(h) (or ah, when K is xed). Use K;ih (or Kih ) to denote the generated h-clusters of K. Write Pxf t 2 g for the distribution of t centered at x2Rd, and de ne P f t 2 g = R (dx)Pxf t 2 g. The following comparison of aK(h) and ( h) 1= , although not used explicitly in the present chapter, should be useful in other applications of the truncation method. 50 Lemma 4.3 Fix any K > 0. Then as h!0, ( h)1= aK(h) 2( h)1= : Proof: From Lemma 3.4 of [7] we know that ( h)1= = lim !1 1=v0(h; ); where v0(h; ) is the solution of _v = v1+ with initial condition v , and aK(h) = lim !1 1=v1(h; ); where v1(h; ) is the solution of (1.12) in [38] with initial condition v . De ne MK( ) = C (K) + K( ), where C (K) and K are such as in (1.12) of [38]. Then MK satis es 1+ MK( ) and lim !1 MK( ) 1+ = 1: Clearly it is enough to show that (1=2)v0(h; ) v1(h; ) v0(h; ) as h! 0 and !1. This follows from the above properties of MK. Unlike the normal densities, we have no explicit expressions for the transition densities of symmetric -stable processes when < 2. However, a simple estimate of p (t;x) is enough for our needs. Lemma 4.4 Let p (t;x), 2(0;2], t> 0, and x2Rd, denote the transition densities of a symmetric -stable process on Rd. Then for any xed and d, p (t;x+y) <_ p (2t;x); jyj t: 51 Proof: First let = 2. Note that p2(t;x) = pt(x) is the standard normal density on Rd. For jxj 4pt, trivially pt(x+y) <_ p2t(x). For jxj> 4pt, it su ces to check that jx+yj 2 2t jxj2 4t ; that is, 2jx+yj2 jxj2, which follows easily from jxj 4jyj. Now let < 2. By the arguments after Remark 5.3 of [2], p (t;x) t d= ^ tjxjd+ : (4.2) Choose K > 21= to satisfy 1 2(1 1=K)d+ . Since jyj t1= , we have for jxj>Kt1= , t jx+yjd+ 2t jxjd+ : Noticing also that (2t)=jxjd+ < (2t) d= for jxj>Kt1= , we get p (t;x + y) <_ p (2t;x) for jyj t1= andjxj>Kt1= . The same inequality holds trivially forjyj t1= andjxj Kt1= . Using Lemma 4.2 and Lemma 4.4, we can generalize Lemma 3.2 in [25] to any ( ; )- process. 
Lemma 4.5 Let be an ( ; )-process in Rd, 2(0;2] and 2(0;1], and x any - nite measure . Then for any xed t> 0, the following two conditions are equivalent: (i) t is locally nite a.s. P , (ii) E t is locally nite. Furthermore, (i) and (ii) hold for every t> 0 i (iii) p (t; ) <1 for all t> 0, and if < 2, then (iii) is equivalent to 52 (iv) p (t; ) <1 for some t> 0. Proof: The formulas for E t and E 2t (when < 1), well known for nite , as well as the formulas in Lemma 3 of [38], extend by monotone convergence to any - nite measure . We also need the simple inequality that for any xed < 2, s, and t, p (s;x) p (t;x): (4.3) To prove it, use (4.2) and consider three cases: jxj (s^t)1= , jxj (s_t)1= , and (s^t)1= 0 and x2Rd, p (t;x u) <_ p (jxj1= ;x u) <_ p (2jxj1= ; u) = p (2jxj1= ;u) yields p (t; )(x) <1. Now assume < 1. Condition (ii) clearly implies (i). Conversely, suppose that E tB = 1for some B. Then E Kt B =1for any xed K > 0 by Lemma 3 of [38]. Also, we get by Lemma 3 of [38], P K t B E Kt B >r (1 r)2 (E K t B) 2 E ( Kt B)2 (1 r)2 1 +ct(E Kt B) 1 for any r2(0;1). Arguing as in the proof of Lemma 3.2 in [25], we get Kt B =1 a.s., and so tB =1 a.s. by Lemma 4.2. In particular, this shows that (i) implies (ii). To prove the equivalence of (ii) and (iii), again using Lemma 4.4 and (4.3) we can proceed as in Lemma 3.2 of [25]. The last assertion is obvious from (4.3). 53 4.3 Hitting bounds and neighborhood measures From now on we consider only (2; )-processes. The Lebesgue approximation depends crucially on estimates of the hitting probability P f tB"0 > 0g. In this section, we rst estimate P f tB"0 > 0g and P f Kt B"0 > 0g. Then we use these estimates to study multiple hitting and neighborhood measures of the clusters Kh associated with the truncated K- process K. We begin with a well-known relationship between the hitting probabilities of t and t, which can be proved as in Lemma 4.1 of [25]. Lemma 4.6 Let the (2; )-process in Rd with associated clusters t be locally nite under P , let K be its truncated K-process with associated clusters Kt , and x any B2Bd. Then P f tB > 0g = ( t)1= log (1 P f tB > 0g); P f tB > 0g = 1 exp ( t) 1= P f tB > 0g ; P f Kt B > 0g = at log (1 P f Kt B > 0g); P f Kt B > 0g = 1 exp ( a 1t P f Kt B > 0g): In particular, P f tB > 0g ( t) 1= P f tB > 0g and P f Kt B > 0g a 1t P f Kt B > 0g as either side tends to 0. Upper and lower bounds of P f tB"0 > 0g have been obtained by Delmas [9], using a subordinated Brownian snake approach. However, in this chapter we need the following improved upper bound. Lemma 4.7 Let t be the clusters of a (2; )-process in Rd with < 1 and d > 2= , let Kt be the clusters of K, the truncated K-process of , and consider a - nite measure on Rd. Then for 0 <" pt, (i) pt0 <_ "2= d( t) 1= P f tB"0 > 0g<_ p2t; where t0 = t=(1 + ), (ii) "2= da 1t P f Kt B"0 > 0g<_ p2t: 54 Proof: (i) From the proof of Theorem 2.3 in [9] we know that Pxf tB"0 > 0g= 1 exp( NxfYtB"0 > 0g); where Nx and Yt are de ned in Section 4.2 of [9]. Comparing this with Lemma 5.4 yields ( t) 1= Pxf tB"0 > 0g= NxfYtB"0 > 0g: By Proposition 6.2 in [9] we get the lower bound. For our upper bound, we will now improve the upper bound in Proposition 6.1 of [9]. For 0 <"=2 "=2g [ f(r;y)2R+ Rd; r 0g<_ " 2= P0f s2B"=2x for some s2[t "2=16;t)g; where is a standard Brownian motion. De ne T = inffs t "2=16 : s2B"=2x g; where inf;=1 as usual. 
Then fT 0, y2B"=2x , and s s0 = "2=16, Pyf s =2B"xg Pzf s0 =2B"xg<_ Pzf s02B"xg Pyf s2B"xg; where z is a point on the surface of B"=2x , and the second relation holds since Pzf s0 =2B"xgand Pzf s02B"xg are both positive constants. Now return to P0fT 2= . This extends Theorem 4.5 of [25] for DW-processes of dimension d 2. Note that the special case of convergence of random measures t d!0 is equivalent to tB P!0 for any bounded Borel set B. Convergence of closed random sets is de ned as usual with respect to the Fell topology (cf. [24], pp. 324, 566). However, in this chapter we need only the special case of convergence to the empty set supp t d!;, which is equivalent to 1f tB > 0gP!0 for any bounded Borel set B. 56 Theorem 4.8 Let be a locally nite (2; )-process in Rd, < 1 and d> 2= , with arbitrary initial distribution. Then these conditions are equivalent as t!1: (i) t d!0, (ii) supp t d!;, (iii) 0pt P!0. Proof: By Lemma 5.4 and Lemma 5.5(i) we get for any xed r P f tBr0 > 0g ( t) 1= P f tBr0 > 0g<_ p2t; and so P f tBr0 > 0g<_ p2t^1. For a general initial distribution, Pf tBr0 > 0g<_ E( 0p2t^1); which shows that (iii) implies (ii). Since clearly (ii) implies (i), it remains to prove that (i) implies (iii). Let be locally nite under P . We rst choose f2C++c (Rd) with suppf2B10, where C++c (Rd) is such as in Proposition 2.6 of [29]. Clearly tf P!0 if tB10 P!0. By dominated convergence exp( vt) = E exp( tf)!1; and so vt!0. By Proposition 2.6 of [29], we have for t large enough pt=2(x) (t=2;x) <_ vt(x); where is de ned in (1.15) of [29] (on page 1061, see also (1.17) and (1.18) there). So pt=2 !0. For general 0, we may proceed as in the proof of Theorem 4.5 in [25]. 57 The following simple fact is often useful to extend results for nite initial measures to the general case. Here ^Bd denotes the space of bounded sets in the Borel -algebra Bd. Lemma 4.9 Let the (2; )-process in Rd with < 1 and d > 2= be locally nite under P , and suppose that n # 0. Then P nf tB > 0g! 0 as n!1 for any xed t > 0 and B2 ^Bd. Proof: Follow the proof of Lemma 4.3 in [25], then use Lemma 4.5, Lemma 5.4, and Lemma 5.5(i). As in [25] we need to estimate the probability that a ball in Rd is hit by more than one subcluster of the truncated K-process K. This is where the truncation of is needed. Lemma 4.10 Fix any K > 0. Let K be the truncated K-process of a (2; )-process in Rd with < 1 and d> 2= . For any t h> 0 and "> 0, let "h be the number of h-clusters of Kt hitting B"0 at time t. Then for "2 h t, E "h( "h 1) <_ "2(d 2= ) h1 d=2 pt + ( p2t)2 : Proof: Follow Lemma 4.4 in [25], then use Lemma 3 of [38] and Lemma 5.5(ii). Now we consider the neighborhood measures of the clusters Kh associated with the trun- cated K-process K. For any measure on Rd and constant "> 0, we de ne the associated neighborhood measure " as the restriction of Lebesgue measure d to the "-neighborhood of supp , so that " has Lebesgue density 1f B"x > 0g. Let pK;"h (x) = Pxf Kh B"0 > 0g, where the Kh are clusters of K. Write pK;"h (x) = pK"h (x) and ( K;ih )" = Ki"h for convenience. Lemma 4.11 Let K be the truncated K-process of a (2; )-process in Rd with < 1 and d> 2= . Let the Kih be conditionally independent h-clusters of K, rooted at the points of a Poisson process with E = . Fix any measurable function f 0 on Rd. Then, 58 (i) E Pi Ki"h = ( pK"h ) d , (ii) E Var Pi Ki"h fj <_ ah"d 2= hd=2kfk2k k for "2 h. Proof: (i) Follow the proof of Lemma 6.2 (i) in [25]. 
(ii) First, Varx( K"h f) Ex( K"h f)2 Exk K"h k2kfk2 =kfk2Exk K"h k2: For Exk K"h k2, using Cauchy inequality and Lemma 5.5(ii), we get Exk K"h k2 = Ex Z 1f Kh B"y > 0gdy Z 1f Kh B"z > 0gdz = Z Z Px f Kh B"y > 0g\f Kh B"z > 0g dydz Z Z (Pxf Kh B"y > 0gPxf Kh B"z > 0g)1=2dydz <_ ah"d 2= Z Z (p2h(y x)p2h(z x))1=2dydz =_ ah"d 2= hd=2 Z Z p4h(y x)p4h(z x)dydz = ah"d 2= hd=2: Hence, by independence E Var hX i Ki" h fj i = E Z (dx)Varx( K"h f) <_ ah"d 2= hd=2kfk2k k: We also need to estimate the overlap between subclusters. Lemma 4.12 Let K be the truncated K-process of a (2; )-process in Rd with < 1 and d > 2= . For any xed t > 0, let Kih denote the subclusters in K of age h > 0. Fix any 59 2 ^Md. Then as "2 h!0, E X i Ki" h K" t <_ "2(d 2= )h1 d=2: Proof: Follow the proof of Lemma 6.3(i) in [25], then use Lemma 5.5(ii). 4.4 Hitting asymptotics For a DW-process of dimension d 3, we know from Theorem 3.1(b) of Dawson, Iscoe, and Perkins [6] that, as "!0, "2 dP f tB"x > 0g!cd ( pt)(x); uniformly for bounded k k, bounded t 1, and x2Rd. A similar result for DW-processes of dimension d = 2 is Theorem 5.3(ii) of [25]. In this section, using Lemma 5.5(i), we can prove the corresponding result for (2; )-processes in Rd with < 1 and d> 2= . First we x a continuous function f on Rd such that 0 < f(x) 1 for x 2 B10 and f(x) = 0 otherwise. Let v be the solution of _v = 12 v v1+ with initial condition v(0) = f. Since v is increasing in , we can de ne v1 = lim !1v . Using Lemma 5.5(i), we can get an upper bound of v1, similar to Lemma 3.2 in [6]. Lemma 4.13 For any t 1 and x2Rd, v1(t;x) <_ p(2t;x). Proof: Letting !1 in Ex exp( t f) = exp[ v (t;x)], we get Pxf tB10 > 0g= 1 exp[ v1(t;x)]: Comparing this with Lemma 5.4 yields v1(t;x) = ( t) 1= Pxf tB10 > 0g: (4.4) 60 Now Lemma 4.13 follows from Lemma 5.5(i). As in Lemma 3.3 of [6], we can apply a PDE result to get the uniform convergence of v1. Notice that the improved upper bound in Lemma 5.5(i) is crucial here. Lemma 4.14 There exists a constant c ;d > 0 such that lim"!0" dv1(" 2t;" 1x) = c ;d p(t;x): The convergence is uniform for bounded t 1 and x2Rd. Proof: We follow the proof of Lemma 3.3 in [6]. By Lemma 4.13, v1(t;x) is nite for any t 1 and x2Rd. Then by a standard regularity argument in PDE theory, _v1 = 12 v1 v1+ 1 (4.5) on [1;1) Rd. By Lemma 4.13, v1(1)2L1(Rd). Set w"(t;x) = " dv1(1 +" 2t;" 1x): Then by (4.5), _w" = 12 w" " d 2w1+ " with initial condition w"(0;x) = " dv1(1;" 1x). Applying Proposition 3.1 in [21] gives lim"!0" dv1(1 +" 2t;" 1x) = c ;d p(t;x); uniformly on compact subsets of (0;1) Rd. Together with Lemma 4.13 this yields the uniform convergence on [a;1) Rd for any a> 0. Moreover, letting t = t0 "2, we get lim"!0" dv1(" 2t0;" 1x) = c ;d p(t0;x); 61 uniformly on [a;1) Rd for any a> 0. It remains to prove that c ;d > 0. Using (4.4) and the lower bound in Lemma 5.5(i), we obtain " dv1(" 2t;" 1x) = " d( t) 1= P" 1xf " 2tB10 > 0g >_ " dp " 2t 1 + ;" 1x = p t 1 + ;x ; and so c ;d > 0. Now we can derive the asymptotic hitting rate for a (2; )-process. Theorem 4.15 Let the (2; )-process in Rd with < 1 and d> 2= be locally nite under P . Fix any t> 0 and x2Rd. Then as "!0, "2= dP f tB"x > 0g!c ;d( pt)(x): The convergence is uniform for bounded k k, bounded t 1, and x2Rd. Similar results hold for the clusters t with pt replaced by ( t)1= pt. Proof: We rst prove that as "!0, "2= d( t) 1= P f tB"x > 0g!c ;d( pt)(x); (4.6) uniformly for bounded k k, bounded t 1, and x2Rd. Use x to denote the measure shifted by x. 
If is nite, then by the scaling of , (4.4), and Lemma 4.14, we can get the following chain of relations, which proves the uniform convergence of (4.6): "2= d( t) 1= P f tB"x > 0g 62 = "2= d( t) 1= Z Pyf tB"0 > 0g( x)(dy) = "2= d( t) 1= Z Py="f t="2B10 > 0g( x)(dy) = " d Z v1(" 2t;" 1y)( x)(dy)!c ;d( pt)(x): Let be an in nite - nite measure satisfying pt <1 for all t. From the proof of Lemma 4.5, we know that ( p2t)(x) <1for any x2Rd. Then by dominated convergence based on Lemma 5.5(i), we can still get (4.6). Now we turn to t. First note that by Lemma 5.4, as "!0, "2= dP f tB"x > 0g!c , "2= d( t) 1= P f tB"x > 0g!c; (4.7) "2= dP f Kt B"x > 0g!c , "2= da 1t P f Kt B"x > 0g!c: (4.8) It remains to prove the uniform convergence for t. Since ( pt)(x) t d=2k k, we know that by (4.6), ( t) 1= P f tB"x > 0g! 0, uniformly for bounded k k, bounded t 1, and x2Rd. Then we may use Lemma 5.4 to get the uniform convergence for t. The following result, especially part (ii), will play a crucial role in Section 5. Here we approximate the hitting probabilities pK"h by suitably normalized Dirac functions. This will be used in Lemma 4.17 to prove the Lebesgue approximation of K. Lemma 4.16 Let p"h(x) = Pxf hB"0 > 0g, where the h are clusters of a (2; )-process in Rd with < 1 and d> 2= . Recall that pK"h (x) = Pxf Kh B"0 > 0g, where the Kh are clusters of K, the truncated K-process of . Fix any bounded, uniformly continuous function f 0 on Rd. (i) As 0 <"2 h!0, "2= d( h) 1= (p" h f) c ;df !0: 63 (ii) Fix any b2(0;1=2). Then as 0 <"2 h!0 with "2= dh1+bd!0, "2= da 1 h (p K" h f) c ;df !0: Both results hold uniformly over any class of uniformly bounded and equicontinuous functions f 0 on Rd. Proof: (i) We follow the proof of Lemma 5.2(i) in [25]. By scaling of and (4.6), "2= d( h) 1= dp"h = ("= p h)2= d( ) 1= dp"= ph 1 !c ;d: (4.9) De ning ^p"h = p"h= dp"h, we need to show that k^p"h f fk!0. Write wf for the modulus of continuity of f, that is, a function wf = w(f; ) de ned by wf(r) = supfjf(s) f(t)j;s;t2Rd;js tj rg; r> 0: Clearly wf(r)!0 as r!0 since f is uniformly continuous. Now we get k^p"h f fk = supx Z ^p"h(u) (f(x u) f(x)) du Z ^p"h(u)wf(juj)du wf(r) + 2kfk Z juj>r ^p"h(u)du: It remains to show that Rjuj>r ^p"h(u)du! 0 for any xed r > 0. Then notice that for any xed r> 0 by Lemma 5.5(i), "2= d( h) 1= Z juj>r p"h(u)du<_ Z juj>r p2h(u)du!0: 64 (ii) For pK"h , Lemma 5.5(ii) yields for any xed r> 0, "2= da 1h Z juj>r pK"h (u)du<_ Z juj>r p2h(u)du!0: Following the steps of the previous proof, it is enough to show that "2= da 1h dpK"h !c ;d: (4.10) Since Rjuj>hbp2h(u)du!0, Lemma 5.5 yields "2= d( h) 1= 1f(Bhb0 )cg dp"h!0; "2= da 1h 1f(Bhb0 )cg dpK"h !0: By (5.12), to prove (4.10) it su ces to show that "2= d( h) 1= 1fBhb0 g dp"h "2= da 1h 1fBhb0 g dpK"h !0; or equivalently (by (4.7) and (4.8)), "2= d P1fBhb 0 g d f hB"0 > 0g P1fBhb 0 g d f Kh B"0 > 0g !0: By Theorem 25.22 of [24] and (4.1), "2= d P1fBhb 0 g d f hB"0 > 0g P1fBhb 0 g d f Kh B"0 > 0g "2= dE1fBhb 0 g d N [0;h];(K;1);Rd = "2= dE1fBhb 0 g d ^N [0;h];(K;1);Rd =_ "2= dE Z h 0 k skds =_ "2= dh1+bd!0: 65 4.5 Lebesgue approximations To prove the Lebesgue approximation for a (2; )-process in Rd with < 1 and d > 2= , we begin with the Lebesgue approximation for K, the truncated K-process of . Since and K agree asymptotically as K !1, we have thus proved the Lebesgue approximation for . Write ~c ;d = 1=c ;d for convenience, where c ;d is such as in Lemma 4.14. Recall that K"t = ( Kt )", the "-neighborhood measure of Kt . 
Lemma 4.17 Let K be the truncated K-process of a (2; )-process in Rd with < 1 and d> 2= . Fix any 2 ^Md and t> 0. Then under P , we have as "!0: ~c ;d"2= d K"t w! Kt a.s. Proof: We follow the proof of Theorem 7.1 in [25]. Fix any f 2CdK. Write Kih for the subclusters of Kt of age h. Since the ancestors of Kt at time s = t h form a Cox process directed by Ks =ah, Lemma 5.7(i) yields E hX i Ki" h f Ks i = a 1h Ks (pK"h f); and so by Lemma 5.7(ii) E X i Ki" h f a 1 h K s (p K" h f) 2 = E Var hX i Ki" h f Ks i <_ ah"d 2= hd=2kfk2E k Ks =ahk "d 2= hd=2kfk2k k; where the last inequality follows from E k Ks k k k. Combining with Lemma 5.8 gives E K"t f a 1h Ks (pK"h f) E K"t f X i Ki" h f +E X i Ki" h f a 1 h K s (p K" h f) 66 <_ "2(d 2= ) h1 d=2kfk+"1=2(d 2= ) hd=4kfk = "d 2= "d 2= h1 d=2 +" 1=2(d 2= )hd=4 kfk: Let c satisfy (d 2= ) + ( d=2 + 1=2)c = 0: (4.11) Clearly c 2 (0;2). Taking " = rn for a xed r 2 (0;1) and h = "c = rcn, and writing sn = t h = t rcn, we obtain E X n rn(2= d) Krnt f a 1rcn Ksn(pKrnrcn f) <_ X n r[(d 2= )+( d=2+1)c]n +r[ 1=2(d 2= )+(d=4)c]n kfk<1; since (d 2= ) + ( d=2 + 1)c> 0 and 1=2(d 2= ) + (d=4)c> 0 by (4.11). This implies rn(2= d) Krnt f a 1rcn Ksn(pKrnrcn f) !0 a.s. P : (4.12) Now we write "2= d K" t f c ;d K t f "2= d K"t f a 1h Ks (pK"h f) +c ;dj Ks f Kt fj +k Ks k "2= da 1h (pK"h f) c ;df : For the last term, we rst x b = 1=2 1=d, then apply Lemma 4.16. Noting that by (4.11) (2= d) + (1 +bd)c = (2= d) + (d=2)c> 0; we get by Lemma 4.16 "2= da 1 h (p K" h f) c ;df !0 67 along the sequence (rn). Using (5.13) and the a.s. weak continuity of K at the xed timet, we see that the right-hand side tends a.s. to 0 as n!1, which implies "2= d K"t f!c ;d Kt f a.s. as "!0 along the sequence (rn) for any xed r2(0;1). Since this holds simultaneously, outside a xed null set, for all rational r2 (0;1), the a.s. convergence extends by Lemma 2.3 in [25] to the entire interval (0;1). Applying this result to a countable, convergence-determining class of functions f (cf. Lemma 3.2.1 in [5]), we obtain the required a.s. vague convergence. Since is nite, the (2; )-process t has a.s. compact support (cf. Theorem 9.3.2.2 of [5] and the proof of Theo- rem 1.2 in [6]). By Lemma 4.2, Kt also has a.s. compact support, and so the a.s. convergence remains valid in the weak sense. Now we may prove our main result, the Lebesgue approximation of (2; )-processes. Again, we write ~c ;d = 1=c ;d for convenience, where c ;d is such as in Lemma 4.14. Also recall that "t = ( t)" denotes the "-neighborhood measure of t. For random measures n and on Rd, n v! (or w!) in L1 means that nf! f in L1 for all f in CdK (or Cdb). Theorem 4.18 Let the (2; )-process in Rd with < 1 and d> 2= be locally nite under P , and x any t> 0. Then under P , we have as "!0: ~c ;d"2= d "t v! t a.s. and in L1: This remains true in the weak sense when is nite. The weak version holds even for the clusters t when k k= 1. Proof: For a nite initial measure , by Lemma 4.17 and Lemma 4.1 we get as "!0 ~c ;d"2= d "t w! t a.s: 68 For a general - nite measure on Rd with pt <1 for all t > 0, write = 0 + 00 for a nite 0, and let = 0 + 00 be the corresponding decomposition of into independent components with initial measures 0 and 00. Fixing an r> 1 with suppf Br 10 and using the result for nite , we get a.s. on f 00tBr0 = 0g "2= d "tf = "2= d 0"t f!c ;d 0tf = c ;d tf: As 0" , we get by Lemma 4.9 P f 00tBr0 = 0g= P 00f tBr0 = 0g!1; and the a.s. convergence extends to . As in the proof of Lemma 4.17, we can obtain the required a.s. 
vague convergence. To prove the convergence in L1, we note that for any f2CdK "2= dE "tf = "2= d Z P f tB"x > 0gf(x)dx ! Z c ;d ( pt)(x)f(x)dx = c ;dE tf; (4.13) by Theorem 4.15. Combining this with the a.s. convergence under P and using Proposition 4.12 in [24], we obtain E j"2= d "tf c ;d tfj! 0. For nite , (4.13) extends to any f 2Cdb by dominated convergence based on Lemmas 5.4 and 5.5(i), together with the fact that d( pt) =k k<1 by Fubini?s theorem. To extend the Lebesgue approximation to the individual clusters t, let 0 denote the process of ancestors of t at time 0, and note that Pxf t2 g= P x[ t2 jk 0k= 1]; 69 where P xfk 0k = 1g = ( t) 1= e ( t) 1= > 0. The a.s. convergence then follows from the corresponding statement for t. Since P f t2 g= Z (dx)Pxf t2 g; the a.s. convergence under any P with k k = 1 also follows. To obtain the weak L1- convergence in this case, we note that for f2Cdb, "2= dE "tf = "2= d Z P f tB"x > 0gf(x)dx ! c ;d ( t)1= Z ( pt)(x)f(x)dx = c ;dE tf; by dominated convergence based on Lemma 5.5(i) and Theorem 4.15. As in Corollary 7.2 of [25], for the intensity measures in Theorem 4.18, we have even convergence in total variation. Corollary 4.19 Let be a (2; )-process in Rd with < 1 and d> 2= . Then for any nite and t> 0, we have as "!0: "2= dE "t c ;dE t !0: This remains true for the clusters t, and it also holds locally for t whenever is locally nite under P . Finally let us give a detailed explanation of the deterministic distribution property of (2; )-processes. Here the deterministic distribution property has two aspects. De ne deterministic functions "; similar to those de ned on page 309 of [41], Theorem 4.18 shows that a.s. (supp t) = t; 70 so a.s. t is a deterministic function of its support supp t. This is the rst aspect of the deterministic distribution property. Now the second aspect. Since d(@Brx) = 0, we get t(@Brx) = 0 a.s. by noting E t = ( pt) d. With the help of Portmanteau Theorem for nite measures, Theorem 4.18 shows that a.s. for all open balls B with rational centers and rational radius, lim"!0 "(supp t)(B) = t(B); so the construction of t(!) from its support supp t(!) is the same everywhere for any xed ! outside a null set. 71 Chapter 5 Lebesgue Approximation of Superprocesses with a Regularly Varying Branching Mechanism 5.1 Introduction Superprocesses are certain measure-valued Markov processes = ( t), whose distri- butions can be characterized by two components: the branching mechanism speci ed by a function (v), and the spatial motion usually given by a Markov process X. If X is a Feller process in Rd with generator L, then the laplace functional E exp( tf) satis es E [exp( tf)j s] = exp( svt s) where vt(x) is the unique nonnegative solution of the so- called evolution equation _v = Lv (v) with initial condition v0 = f. We call this superpro- cess an (L; )-superprocess (or (L; )-process for short). For 2(0;2] and 2(0;1], if X is a rotation invariant -stable L evy process in Rd with generator 12 and (v) = v1+ , we get a superprocess corresponding to the PDE _v = 12 v v1+ . We call it an ( ; )-superprocess (( ; )-process for short), which is just a (12 ;v1+ )-superprocess in our previous nota- tion. General surveys of superprocesses include the excellent monographs and lecture notes [5, 15, 17, 32, 35, 42]. For any measure on Rd and constant " > 0, write " for the restriction of Lebesgue measure d to the "-neighborhood of supp . For a (2,1)-process in Rd, Tribe [48] showed that "2 d "t w!cd t a.s. as "! 
0 for xed time t > 0 when d 3, where w! denotes weak convergence. Perkins [41] improved Tribe?s result by showing that the Lebesgue approxima- tion actually holds for all time t > 0 simultaneously. Kallenberg [25] proved the Lebesgue approximation of 2-dimensional (2,1)-processes. In [22], we showed that, for any (2; )- process in Rd with < 1 and d > 2= , "2= d "t w!c ;d t a.s. as " ! 0 for xed time t > 0. In particular, the Lebesgue approximation result implies that the superprocess t distributes its mass over supp t in a deterministic manner. See the end of [22] for a detailed 72 explanation of this deterministic distribution property. However, for any ( ; )-process with < 2, supp t = Rd or ; a.s. (cf. [18, 40]), and so the corresponding property fails. From all these Lebesgue approximation results, we raise the natural conjecture: Lebesgue approximation holds for superprocesses with Brownian spatial motion and any \reasonable" branching mechanism. As a rst step to prove this general conjecture, in this chapter we study the Lebesgue ap- proximation of superprocesses with Brownian spatial motion and a regularly varying branch- ing mechanism. For a precise description of the branching mechanism we consider in this chapter, refer to the beginning of Section 3. The stable branching mechanism (v) = v1+ with 2 (0;1] is a special case of the regularly varying branching mechanism we consider here. Our main result in this chapter is Theorem 5.5, where we prove that the Lebesgue approximation still holds for these more general superprocesses. Speci cally, ~m(") "t w! t a.s. as "! 0 for xed time t > 0, where m(") is a suitable normalizing function. In par- ticular, if the branching mechanism is the stable one, we may recover all previous Lebesgue approximation results for xed time t> 0. Although the previous conjecture may seems very natural, technically we have limited tools to support some rigorous arguments needed. One such boundary is imposed by the availability of the very important cluster representation of superprocesses. Luckily the super- processes we consider here do have the cluster representation. Another boundary is imposed by the availability of the lower and upper bounds of the hitting probabilities P f tB"x > 0g, which is fundamental for the Lebesgue approximation. The restriction on the branching mechanism we consider actually follows from Theorem 2.3 in [9], which is exactly the lower and upper bounds of the hitting probabilities. Armed with the hitting estimates, then we are able to overcome the main di culty in this chapter, that is, to obtain an asymptotic result of the hitting probabilities P f tB"x > 0g, which is Theorem 5.11. Note that for a (2; )-process, such a result is obtained by using the strong scaling property. Since the regularly varying branching mechanism we consider here 73 has much weaker scaling property, we then have to rely on only the cluster representation and the hitting estimations. Also the form of the asymptotic result of the hitting probabilities is not clear in our general setting. By adapting an idea in Section 5 of [25], we can get the correct form of our asymptotic result, which determines the form of the Lebesgue approximation. This chapter is organized as follows. In Section 2 we review the truncation of super- processes in a more general setting. In Section 3, we develop some lemmas about hitting bounds and neighborhood measures of the more general superprocesses. In Section 4, we derive some asymptotic results of these hitting probabilities. 
Finally in Section 5 we state and prove the Lebesgue approximation of superprocesses with a regularly varying branching mechanism and their truncated processes. This general result contains all previous Lebesgue approximation of superprocesses as special cases. 5.2 Truncation of superprocesses In this section we discuss the truncation of superprocesses with a general branching mechanism, due to their independent interests. We consider a general branching mechanism function de ned on R+ as (v) = av +bv2 + Z (0;1) (e rv 1 +rv) (dr); where b 0 and is a measure on (0;1) such that R10 (r^r2) (dr) <1. It is well known that the (L;1)-process has weakly continuous sample paths. By contrast, when 6= 0, the corresponding superprocess has only weakly rcll sample paths with jumps of the form t = r x, for some t> 0, r> 0, and x2Rd. Let N (dt;dr;dx) = X (t;r;x): t=r x (t;r;x): 74 Clearly the point process N on R+ R+ Rd records all information about the jumps of . By the proof of Theorem 6.1.3 in [5], we know that N has compensator measure ^N (dt;dr;dx) = (dt) (dr) t(dx): (5.1) Due to all the \big" jumps, t has in nite variance. Some methods for (L;1)-processes, which rely on the nite variance of the processes, are not directly applicable to superprocesses with a branching mechanism having 6= 0. Mt(f) = Mct (f) +Mdt (f) = tf 0f Z t 0 s(Lf)ds; tf = 0f + Z t 0 s(Lf)ds+Mct (f) +Mdt (f) where Mct (f) is a continuous martingale with quadratic variation process [Mc(f)]t = Z t 0 s(bf2)ds; (5.2) and Mdt (f) is a purely discontinuous martingale, which can be written as follows Mdt (f) = Z t 0 Z (0;1) Z Rd rf(x) ^N (dt;dr;dx) = Z t 0 Z (0;K] Z Rd rf(x) ^N (dt;dr;dx) + Z t 0 Z (K;1) Z Rd rf(x) ^N (dt;dr;dx) = Z t 0 Z (0;K] Z Rd rf(x) ^N (dt;dr;dx) + Z t 0 Z (K;1) Z Rd rf(x)N (dt;dr;dx) (K;1) Z t 0 sfds E [exp( Kt f)j Ks ] = exp( Ks vt s) (5.3) 75 _v = Lv K(v); (5.4) where K = (a (K;1))v +bv2 +R(0;K](e rv 1 +rv) (dr) Kt f = K0 f + Z t 0 Ks (Lf)ds+Mct (f) +Mdt (f) [K;1) Z t 0 Ks fds where Mct (f) is a continuous martingale with quadratic variation process [Mc(f)]t = Z t 0 Ks (bf2)ds; (5.5) and Mdt (f) is a purely discontinuous martingale, which can be written as follows Mdt (f) = Z t 0 Z (0;1) Z Rd rf(x) ^N K(dt;dr;dx) = Z t 0 Z (0;K) Z Rd rf(x) ^N K(dt;dr;dx) N K(dt;dr;dx) = X (t;r;x): Kt =r x (t;r;x): ^N K(dt;dr;dx) = (dt)1(0;K)(r) (dr) Kt (dx): (5.6) In [38], Mytnik and Villa introduced a truncation method for ( ; )-processes with < 1, which can be used to study ( ; )-processes with < 1, especially to extend results of ( ;1)-processes to ( ; )-processes with < 1. Speci cally, for the ( ; )-process with < 1, we de ne the stopping time K = infft > 0 : k tk> Kg for any constant K > 0. Clearly K is the time when has the rst jump greater than K. For any nite initial measure , they proved that one can de ne and a weakly rcll, measure-valued Markov process K on a common probability space such that t = Kt for t < K. Intuitively, K euqals minus all masses produced by jumps greater than K along with the future evolution 76 of those masses. In this paper, we call K the truncated K-process of . Since all \big" jumps are omitted, Kt has nite variance. They also proved that Kt and t agree asymptotically as K!1. We give a di erent proof of this result, since similar ideas will also be used at several crucial stages later. We write P f 2 gfor the distribution of with initial measure . Using the same proof of Lemma 1 in [38], we can construct and K on a common probability space such that t(!) = Kt (!) for t < K(!). 
This con rms our intuition that K euqals minus all masses produced by jumps greater than K along with the future evolution of those masses. Lemma 5.1 We can de ne and K on a common probability space such that: (i) is an ( ; )-process with < 1 and a nite initial measure , and K is its truncated K-process, (ii) t(!) = Kt (!) for t< K(!). Now we can prove that Kt and t agree asymptotically as K!1. We choose to give a complete proof of this result, since similar ideas will also be used at several crucial stages later. We write P f 2 g for the distribution of with initial measure . Lemma 5.2 Fix any nite and t> 0. Then P f K >tg!1 as K!1. Proof: If K t, then has at least one jump greater than K before time t. Noting that N ([0;t];(K;1);Rd) is the number of jumps greater than K before time t, we get by Theorem 25.22 of [24] and (4.1), P f K tg E N [0;t];(K;1);Rd = E ^N [0;t];(K;1);Rd =_ [K;1)E Z t 0 k skds = tk k [K;1)!0 77 as K!1, where the last equation holds by E k sk=k k. Using the same proof of Lemma 2.2 in [22], we can prove that Kt (!) t(!) for any t and !. So indeed, K is a \truncation" of . Lemma 5.3 We can de ne and K on a common probability space such that: (i) is an ( ; )-process with < 1 and a nite initial measure , and K is its truncated K-process, (ii) t(!) Kt (!) for any t and !, (iii) t(!) = Kt (!) for t< K(!). 5.3 Hitting bounds First we specify the regularly varying branching mechanism we consider for the Lebesgue approximation. We consider the increasing function de ned on R+ by (v) = bv2 + Z (0;1) 2rv2 1 + 2rv 0(dr); where b 0 and 0 is a measure on (0;1) such that R(0;1)(1^r) 0(dr) <1. To avoid trivial cases, we assume either b > 0 or 0((0;1)) = 1. The function can be expressed in the usual form for branching mechanism functions, (v) = bv2 + Z (0;1) (e rv 1 +rv) (dr); where (dr) = [R(0;1)e r=(2u)=(4u2) 0(du)]dr satis es R(0;1)(r^r2) (dr) <1. Notice that if we take b = 0 and 0(dr) = c0r (1+ )dr then we get the stable case (v) = cv1+ . We consider the following two assumptions: 78 (A1) The function is regularly varying at 1 with index 1 + where 2(0;1]; that is to say, limu!1 (ru) (u) = r1+ for every r> 0 (A2) lim supr!0+r (1+ ) (r) <1. The stable case (v) = v1+ satis es all these assumptions. The Lebesgue approximation depends crucially on estimates of the hitting probability P f tB"0 > 0g. In this section, we rst estimate P f tB"0 > 0g and P f Kt B"0 > 0g. Then we use these estimates to study multiple hitting and neighborhood measures of the clusters Kh associated with the truncated K-process K. We begin with a well-known relationship between the hitting probabilities of superprocesses and their clusters, which can be proved as in Lemma 4.1 of [25]. Lemma 5.4 Let the ( ; )-process in Rd with associated clusters t be locally nite under P , let K be its truncated K-process with associated clusters Kt , and x any B2Bd. Then P f tB > 0g = at log (1 P f tB > 0g); P f tB > 0g = 1 exp a 1t P f tB > 0g ; P f Kt B > 0g = aKt log (1 P f Kt B > 0g); P f Kt B > 0g = 1 exp ( (aKt ) 1P f Kt B > 0g): In particular, P f tB > 0g a 1t P f tB > 0g and P f Kt B > 0g (aKt ) 1P f Kt B > 0g as either side tends to 0. Upper and lower bounds of P f tB"0 > 0g have been obtained by Delmas [9], using the Brownian snake. However, in this paper we need the following improved upper bound. Lemma 5.5 Let t be the clusters of a (2; )-process in Rd with < 1 and d > 2= , let Kt be the clusters of K, the truncated K-process of , and consider a - nite measure on Rd. 
Then for 0 <" pt, 79 (i) l2(") pt0 <_ "2= da 1t P f tB"0 > 0g<_ l1(") p2t; where t0 = t=(1 + ), (ii) "2= d(aKt ) 1P f Kt B"0 > 0g<_ l1(") p2t: Proof: (i) Follow the proof of Lemma 6.3(i) in [25], then use Lemma 5.5(ii). (ii) This is obvious from (i), Lemma 4.2, and Lemma 5.4. As in [25] we need to estimate the probability that a ball in Rd is hit by more than one subcluster of the truncated K-process K. This is where the truncation of is needed. Lemma 5.6 Fix any K > 0. Let K be the truncated K-process of a (2; )-process in Rd with < 1 and d> 2= . For any t h> 0 and "> 0, let K"h be the number of h-clusters of Kt hitting B"0 at time t. Then for "2 h t, E K"h ( K"h 1) <_ l21(")"2(d 2= ) h1 d=2 pt + ( p2t)2 : Proof: Follow Lemma 4.4 in [25], then use Lemma 3 of [38] and Lemma 5.5(ii). Now we consider the neighborhood measures of the clusters Kh associated with the trun- cated K-process K. For any measure on Rd and constant "> 0, we de ne the associated neighborhood measure " as the restriction of Lebesgue measure d to the "-neighborhood of supp , so that " has Lebesgue density 1f B"x > 0g. Let pK;"h (x) = Pxf Kh B"0 > 0g, where the Kh are clusters of K. Write pK;"h (x) = pK"h (x) and ( K;ih )" = Ki"h for convenience. Lemma 5.7 Let K be the truncated K-process of a (2; )-process in Rd with < 1 and d> 2= . Let the Kih be conditionally independent h-clusters of K, rooted at the points of a Poisson process with E = . Fix any measurable function f 0 on Rd. Then, (i) E Pi Ki"h = ( pK"h ) d , (ii) E Var Pi Ki"h fj <_ l1(")aKh "d 2= hd=2kfk2k k for "2 h. 80 Proof: (i) Follow the proof of Lemma 6.2(i) in [25]. (ii)Follow the proof of Lemma 4.4(ii) in [22]. We also need to estimate the overlap between subclusters. Lemma 5.8 Let K be the truncated K-process of a (2; )-process in Rd with < 1 and d > 2= . For any xed t > 0, let Kih denote the subclusters in K of age h > 0. Fix any 2 ^Md. Then as "2 h!0, E X i Ki" h K" t <_ l21(")"2(d 2= )h1 d=2: Proof: Follow the proof of Lemma 6.3(i) in [25], then use Lemma 5.5(ii). 5.4 Hitting asymptotics Write p"h(x) = Pxf hB"0 > 0g and pK;"h (x) = Pxf Kh B"0 > 0g, where h and Kh denote an h-cluster associated with the superprocess in Rd and its truncated K-process K re- spectively. Recall that dp"h = P df hB"0 > 0g. Write pK"h = pK;"h for convenience. For the functions p"h and pK"h , we have the following basic asymptotic property. Since we do not have a lower bound for Pxf Kh B"0 > 0g in Lemma, this asymptotic property is crucial to us by showing that essentially Pxf hB"0 > 0g and Pxf Kh B"0 > 0g share the same lower bound. Lemma 5.9 As 0 <"2 h!0 with "2= d b0h1+bd!0 for some b0> 0 and b2(0;1=2), a 1h dp"h (aKh ) 1 dpK"h : Proof: We just need to show that a 1h dp"h (aKh ) 1 dpK"h a 1h dp"h !0: 81 By Lemma 5.5(i), we get a 1h dp"h l2(") dph0"d 2= = l2(")"d 2= ; a 1h P1f(Bhb 0 )cg d f hB"0 > 0g l1(")1f(Bhb0 )cg dph"d 2= (5.7) = l1(")"d 2= Z jxj hb ph(x)dx (5.8) l1(")"d 2= hc; (5.9) for some c> 0. 
5.4 Hitting asymptotics

Write $p^\varepsilon_h(x) = P_x\{\eta_h B^\varepsilon_0 > 0\}$ and $p^{K,\varepsilon}_h(x) = P_x\{\eta^K_h B^\varepsilon_0 > 0\}$, where $\eta_h$ and $\eta^K_h$ denote an $h$-cluster associated with the superprocess $\xi$ in $\mathbb{R}^d$ and with its truncated $K$-process $\xi^K$, respectively. Recall that $\lambda^d p^\varepsilon_h = P_{\lambda^d}\{\eta_h B^\varepsilon_0 > 0\}$. Write $p^{K\varepsilon}_h = p^{K,\varepsilon}_h$ for convenience. For the functions $p^\varepsilon_h$ and $p^{K\varepsilon}_h$, we have the following basic asymptotic property. Since we do not have a lower bound for $P_x\{\eta^K_h B^\varepsilon_0 > 0\}$ in Lemma 5.5, this asymptotic property is crucial for us, since it shows that $P_x\{\eta_h B^\varepsilon_0 > 0\}$ and $P_x\{\eta^K_h B^\varepsilon_0 > 0\}$ essentially share the same lower bound.

Lemma 5.9 As $0 < \varepsilon^2 \le h \to 0$ with $\varepsilon^{2/\beta-d}\, b_0\, h^{1+bd} \to 0$ for some $b_0 > 0$ and $b \in (0,1/2)$,
$$a_h^{-1}\,\lambda^d p^\varepsilon_h \;\sim\; (a^K_h)^{-1}\,\lambda^d p^{K\varepsilon}_h.$$

Proof: We just need to show that
$$\frac{a_h^{-1}\,\lambda^d p^\varepsilon_h - (a^K_h)^{-1}\,\lambda^d p^{K\varepsilon}_h}{a_h^{-1}\,\lambda^d p^\varepsilon_h} \;\to\; 0.$$
By Lemma 5.5(i), we get
$$a_h^{-1}\,\lambda^d p^\varepsilon_h \;\gtrsim\; l_2(\varepsilon)\,\lambda^d p_{h'}\,\varepsilon^{d-2/\beta} \;=\; l_2(\varepsilon)\,\varepsilon^{d-2/\beta},$$
$$a_h^{-1}\,P_{1_{(B^{h^b}_0)^c}\lambda^d}\{\eta_h B^\varepsilon_0 > 0\} \;\lesssim\; l_1(\varepsilon)\,\big(1_{(B^{h^b}_0)^c}\lambda^d\big)\, p_h\;\varepsilon^{d-2/\beta} \tag{5.7}$$
$$=\; l_1(\varepsilon)\,\varepsilon^{d-2/\beta} \int_{|x| \ge h^b} p_h(x)\,dx \tag{5.8}$$
$$\lesssim\; l_1(\varepsilon)\,\varepsilon^{d-2/\beta}\,h^{c'} \tag{5.9}$$
for some $c' > 0$. Since $\varepsilon^{2/\beta-d}\, b_0\, h^{1+bd} \to 0$, we get
$$\frac{a_h^{-1}\,P_{1_{(B^{h^b}_0)^c}\lambda^d}\{\eta_h B^\varepsilon_0 > 0\}}{a_h^{-1}\,\lambda^d p^\varepsilon_h} \;\to\; 0.$$
Similarly, we get
$$\frac{(a^K_h)^{-1}\,P_{1_{(B^{h^b}_0)^c}\lambda^d}\{\eta^K_h B^\varepsilon_0 > 0\}}{a_h^{-1}\,\lambda^d p^\varepsilon_h} \;\to\; 0.$$
By Lemma 5.4, it finally suffices to show that
$$\frac{P_{1_{B^{h^b}_0}\lambda^d}\{\xi_h B^\varepsilon_0 > 0\} - P_{1_{B^{h^b}_0}\lambda^d}\{\xi^K_h B^\varepsilon_0 > 0\}}{a_h^{-1}\,\lambda^d p^\varepsilon_h} \;\to\; 0. \tag{5.10}$$
By Theorem 25.22 of [24] and (4.1),
$$\varepsilon^{2/\beta-d}\Big(P_{1_{B^{h^b}_0}\lambda^d}\{\xi_h B^\varepsilon_0 > 0\} - P_{1_{B^{h^b}_0}\lambda^d}\{\xi^K_h B^\varepsilon_0 > 0\}\Big) \;\le\; \varepsilon^{2/\beta-d}\,E_{1_{B^{h^b}_0}\lambda^d}\, N\big([0,h] \times (K,\infty) \times \mathbb{R}^d\big)$$
$$=\; \varepsilon^{2/\beta-d}\,E_{1_{B^{h^b}_0}\lambda^d}\, \hat N\big([0,h] \times (K,\infty) \times \mathbb{R}^d\big) \;\asymp\; \varepsilon^{2/\beta-d}\,E_{1_{B^{h^b}_0}\lambda^d} \int_0^h \|\xi_s\|\,ds \;\asymp\; \varepsilon^{2/\beta-d}\,h^{1+bd} \;\to\; 0.$$

Define normalizing functions $m(\varepsilon)$ by
$$m(\varepsilon) = a^{-1}_{\varepsilon^c}\,\lambda^d p^\varepsilon_{\varepsilon^c} \tag{5.11}$$
with a fixed $c$ satisfying
$$(d - 2/\beta) + (1/2 - d/2)\,c = 0. \tag{5.12}$$
Clearly $c \in (0,2)$.

Lemma 5.10 Fix any bounded, uniformly continuous function $f \ge 0$ on $\mathbb{R}^d$, and write $\tilde m(\varepsilon) = 1/m(\varepsilon)$. As $\varepsilon \to 0$,
$$\Big\|\tilde m(\varepsilon)\,(a^K_{\varepsilon^c})^{-1}\big(p^{K\varepsilon}_{\varepsilon^c} * f\big) - f\Big\| \;\to\; 0.$$
The result holds uniformly over any class of uniformly bounded and equicontinuous functions $f \ge 0$ on $\mathbb{R}^d$.

Proof: By Lemma 5.9, we get $\tilde m(\varepsilon)\,(a^K_{\varepsilon^c})^{-1}\,\lambda^d p^{K\varepsilon}_{\varepsilon^c} \to 1$. Defining $\hat p^{K\varepsilon}_h = p^{K\varepsilon}_h / \lambda^d p^{K\varepsilon}_h$, we then only need to show that $\|\hat p^{K\varepsilon}_h * f - f\| \to 0$. Now follow the proof of Lemma 4.4(i) in [22] and use Lemma 5.5(ii).
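The rest of the argument is largely bookkeeping with the exponents attached to $\varepsilon$ and $h = \varepsilon^c$. As a quick arithmetic check (illustrative only; the pairs $(d,\beta)$ below are arbitrary admissible values), the following Python sketch solves (5.12) for $c$ and verifies the exponent inequalities that are used with this choice of $c$ in the proofs below.

# Exponent bookkeeping for m(eps) = a_{eps^c}^{-1} lambda^d p^eps_{eps^c}:
# c solves (5.12), (d - 2/beta) + (1/2 - d/2) c = 0, and the proofs below
# use the listed inequalities.  The (d, beta) pairs are arbitrary admissible values.
from fractions import Fraction

def exponent_report(d, beta):
    d, beta = Fraction(d), Fraction(beta)
    assert d > 2 / beta, "the standing assumption d > 2/beta must hold"
    c = (d - 2 / beta) / (d / 2 - Fraction(1, 2))       # solves (5.12)
    checks = {
        "c in (0,2)": 0 < c < 2,
        "(d - 2/beta) + (1 - d/2) c > 0": (d - 2 / beta) + (1 - d / 2) * c > 0,
        "-(d - 2/beta)/2 + (d/4) c > 0": -(d - 2 / beta) / 2 + (d / 4) * c > 0,
        "(2/beta - d) + (d/2) c > 0": (2 / beta - d) + (d / 2) * c > 0,
    }
    return c, checks

for d, beta in [(3, 1), (5, Fraction(1, 2)), (7, Fraction(3, 4))]:
    c, checks = exponent_report(d, beta)
    print(f"d={d}, beta={beta}: c={c} ({float(c):.4f})", checks)

In each case $c \in (0,2)$ and all three inequalities hold, as used below.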
Theorem 5.11 Let $\xi$ be a superprocess in $\mathbb{R}^d$ as above. Then for any $t > 0$ and bounded $\mu$, we have as $\varepsilon \to 0$
$$\Big\|\tilde m(\varepsilon)\,P_\mu\{\xi^K_t B^\varepsilon_\cdot > 0\} - e^{-b_K t}\,\mu p_t\Big\| \;\to\; 0, \qquad \Big\|\tilde m(\varepsilon)\,P_\mu\{\xi_t B^\varepsilon_\cdot > 0\} - \mu p_t\Big\| \;\to\; 0.$$

Proof: Writing $s = t - h$ with $h = \varepsilon^c$, we get
$$P_\mu\{\xi^K_t B^\varepsilon_x > 0\} \;\le\; (a^K_h)^{-1}\,E_\mu\big(\xi^K_s\, p^{K\varepsilon}_h\big)(x) \;=\; e^{-b_K s}\,(a^K_h)^{-1}\,\big(\mu p_s\, p^{K\varepsilon}_h\big)(x) \;\lesssim\; e^{-b_K s}\, m(\varepsilon)\,\mu p_s(x) \;\lesssim\; e^{-b_K t}\, m(\varepsilon)\,\mu p_t(x).$$
Similarly, $P_\mu\{\xi_t B^\varepsilon_x > 0\} \le a_h^{-1}\,E_\mu\big(\xi_s\, p^\varepsilon_h\big)(x)$, and $\big\|\tilde m(\varepsilon)\,a_h^{-1}\,E_\mu(\xi_s\, p^\varepsilon_h) - \mu p_t\big\| \to 0$. Finally, the difference $P_\mu\{\xi_t B^\varepsilon_x > 0\} - P_\mu\{\xi^K_t B^\varepsilon_x > 0\}$ is controlled by
$$\big\|e^{-b_K t}\,\mu p_t - \mu p_t\big\| \;\to\; 0$$
as $K \to \infty$, since $b_K \to 0$.

5.5 Lebesgue approximations

As in Section 5 of [22], we begin here with the Lebesgue approximation for $\xi^K$, the truncated $K$-process of $\xi$. The Lebesgue approximation for $\xi$ then follows immediately by Lemma 5.2. Write $\tilde m(\varepsilon) = 1/m(\varepsilon)$ for convenience, where $m(\varepsilon)$ is defined in (5.11). Recall that $\xi^{K\varepsilon}_t = (\xi^K_t)^\varepsilon$, the $\varepsilon$-neighborhood measure of $\xi^K_t$. For random measures $\nu_n$ and $\nu$ on $\mathbb{R}^d$, $\nu_n \xrightarrow{w} \nu$ in $L^1$ means that $\nu_n f \to \nu f$ in $L^1$ for all $f$ in $C^d_b$.

Theorem 5.12 Let $\xi^K$ be the truncated $K$-process of a superprocess $\xi$ in $\mathbb{R}^d$ satisfying assumptions (A1) and (A2) with $\beta < 1$ and $d > 2/\beta$. Fix any $\mu \in \hat{\mathcal{M}}_d$ and $t > 0$. Then under $P_\mu$, we have as $\varepsilon \to 0$:
$$\tilde m(\varepsilon)\,\xi^{K\varepsilon}_t \;\xrightarrow{w}\; \xi^K_t \quad \text{a.s. and in } L^1.$$

Proof: Fix any $f \in C^d_K$. We first prove that $\tilde m(\varepsilon)\,\xi^{K\varepsilon}_t f \to \xi^K_t f$ a.s. as $\varepsilon \to 0$. For this, it is enough to show that every sequence $\varepsilon_n \to 0$ has a subsequence (still denoted by $\varepsilon_n$) such that $\tilde m(\varepsilon_n)\,\xi^{K\varepsilon_n}_t f \to \xi^K_t f$ a.s. To do this we fix an $r \in (0,1)$ and, given a sequence $\varepsilon_n \to 0$, pick a subsequence satisfying $\varepsilon_n \le r^n$.

We follow the proof of Lemma 5.1 in [22]. Write $\eta^{Ki}_h$ for the subclusters of $\xi^K_t$ of age $h$. Since the ancestors of $\xi^K_t$ at time $s = t - h$ form a Cox process directed by $\xi^K_s / a^K_h$, Lemma 5.7(i) yields
$$E\Big[\sum_i \eta^{Ki\varepsilon}_h f \,\Big|\, \xi^K_s\Big] = (a^K_h)^{-1}\,\xi^K_s\big(p^{K\varepsilon}_h * f\big),$$
and so by Lemma 5.7(ii),
$$E\Big|\sum_i \eta^{Ki\varepsilon}_h f - (a^K_h)^{-1}\,\xi^K_s\big(p^{K\varepsilon}_h * f\big)\Big|^2 = E\,\mathrm{Var}\Big[\sum_i \eta^{Ki\varepsilon}_h f \,\Big|\, \xi^K_s\Big] \;\lesssim\; l_1(\varepsilon)\,a^K_h\,\varepsilon^{d-2/\beta}\,h^{d/2}\,\|f\|^2\,E\big\|\xi^K_s / a^K_h\big\| \;\le\; l_1(\varepsilon)\,\varepsilon^{d-2/\beta}\,h^{d/2}\,\|f\|^2\,\|\mu\|,$$
where the last inequality follows from $E_\mu\|\xi^K_s\| \le \|\mu\|$. Combining this with Lemma 5.8 gives
$$E\Big|\xi^{K\varepsilon}_t f - (a^K_h)^{-1}\,\xi^K_s\big(p^{K\varepsilon}_h * f\big)\Big| \;\le\; E\Big|\xi^{K\varepsilon}_t f - \sum_i \eta^{Ki\varepsilon}_h f\Big| + E\Big|\sum_i \eta^{Ki\varepsilon}_h f - (a^K_h)^{-1}\,\xi^K_s\big(p^{K\varepsilon}_h * f\big)\Big|$$
$$\lesssim\; l_1^2(\varepsilon)\,\varepsilon^{2(d-2/\beta)}\,h^{1-d/2}\,\|f\| + l_1^{1/2}(\varepsilon)\,\varepsilon^{(d-2/\beta)/2}\,h^{d/4}\,\|f\| \;=\; \varepsilon^{d-2/\beta}\Big(l_1^2(\varepsilon)\,\varepsilon^{d-2/\beta}\,h^{1-d/2} + l_1^{1/2}(\varepsilon)\,\varepsilon^{-(d-2/\beta)/2}\,h^{d/4}\Big)\|f\|.$$
Taking $h_n = \varepsilon_n^c$, where $c$ is defined in (5.12), and writing $s_n = t - h_n = t - \varepsilon_n^c$, we obtain
$$E \sum_n \tilde m(\varepsilon_n)\Big|\xi^{K\varepsilon_n}_t f - (a^K_{h_n})^{-1}\,\xi^K_{s_n}\big(p^{K\varepsilon_n}_{h_n} * f\big)\Big| \;\lesssim\; \sum_n \Big(l_2(r^n)\,l_1^2(r^n)\,r^{[(d-2/\beta)+(1-d/2)c]\,n} + l_2(r^n)\,l_1^{1/2}(r^n)\,r^{[-(d-2/\beta)/2+(d/4)c]\,n}\Big)\|f\| < \infty,$$
since $(d-2/\beta) + (1-d/2)\,c > 0$ and $-(d-2/\beta)/2 + (d/4)\,c > 0$ by (5.12). Note that in the previous inequality we also used the fact that the chosen subsequence satisfies $\varepsilon_n \le r^n$. The inequality above for the expectations clearly implies
$$\tilde m(\varepsilon_n)\Big|\xi^{K\varepsilon_n}_t f - (a^K_{h_n})^{-1}\,\xi^K_{s_n}\big(p^{K\varepsilon_n}_{h_n} * f\big)\Big| \;\to\; 0 \quad \text{a.s. } P_\mu. \tag{5.13}$$
Now we write
$$\big|\tilde m(\varepsilon)\,\xi^{K\varepsilon}_t f - \xi^K_t f\big| \;\le\; \tilde m(\varepsilon)\Big|\xi^{K\varepsilon}_t f - (a^K_h)^{-1}\,\xi^K_s\big(p^{K\varepsilon}_h * f\big)\Big| + \big|\xi^K_s f - \xi^K_t f\big| + \big\|\xi^K_s\big\|\,\Big\|\tilde m(\varepsilon)\,(a^K_h)^{-1}\big(p^{K\varepsilon}_h * f\big) - f\Big\|.$$
For the last term, we first fix $b = 1/2 - 1/d$. Noting that by (5.12),
$$(2/\beta - d) + (1+bd)\,c = (2/\beta - d) + (d/2)\,c > 0,$$
we get by Lemma 5.10
$$\Big\|\tilde m(\varepsilon)\,(a^K_h)^{-1}\big(p^{K\varepsilon}_h * f\big) - f\Big\| \;\to\; 0$$
along the sequence $(r^n)$. Using (5.13) and the a.s. weak continuity of $\xi^K$ at the fixed time $t$, we see that the right-hand side tends to 0 a.s. as $n \to \infty$, which implies $\tilde m(\varepsilon)\,\xi^{K\varepsilon}_t f \to \xi^K_t f$ a.s. as $\varepsilon \to 0$ along the sequence $(r^n)$, for any fixed $r \in (0,1)$. Since this holds simultaneously, outside a fixed null set, for all rational $r \in (0,1)$, the a.s. convergence extends by Lemma 2.3 in [25] to the entire interval $(0,1)$. Applying this result to a countable, convergence-determining class of functions $f$ (cf. Lemma 3.2.1 in [5]), we obtain the required a.s. vague convergence. Since $\mu$ is finite, the $(2,\beta)$-process $\xi_t$ has a.s. compact support (cf. Theorem 9.3.2.2 of [5] and the proof of Theorem 1.2 in [6]). By Lemma 4.2, $\xi^K_t$ also has a.s. compact support, and so the a.s. convergence remains valid in the weak sense.

Now we may prove our main result, the Lebesgue approximation of superprocesses with a regularly varying branching mechanism. Again, we write $\tilde m(\varepsilon) = 1/m(\varepsilon)$ for convenience, where $m(\varepsilon)$ is defined in (5.11). Also recall that $\xi^\varepsilon_t = (\xi_t)^\varepsilon$, the $\varepsilon$-neighborhood measure of $\xi_t$. For random measures $\nu_n$ and $\nu$ on $\mathbb{R}^d$, $\nu_n \xrightarrow{w} \nu$ in $L^1$ means that $\nu_n f \to \nu f$ in $L^1$ for all $f$ in $C^d_b$.

Theorem 5.13 Let the superprocess $\xi$ in $\mathbb{R}^d$ satisfy assumptions (A1) and (A2) with $\beta < 1$ and $d > 2/\beta$. Fix any $\mu \in \hat{\mathcal{M}}_d$ and $t > 0$. Then under $P_\mu$, we have as $\varepsilon \to 0$:
$$\tilde m(\varepsilon)\,\xi^\varepsilon_t \;\xrightarrow{w}\; \xi_t \quad \text{a.s. and in } L^1.$$

Proof: By Theorem 5.12 and Lemma 5.2, we get as $\varepsilon \to 0$
$$\tilde m(\varepsilon)\,\xi^\varepsilon_t \;\xrightarrow{w}\; \xi_t \quad \text{a.s.}$$
To prove the convergence in $L^1$, we note that for any $f \in C^d_b$,
$$\tilde m(\varepsilon)\,E_\mu\,\xi^\varepsilon_t f = \tilde m(\varepsilon) \int P_\mu\{\xi_t B^\varepsilon_x > 0\}\,f(x)\,dx \;\to\; \int \mu p_t(x)\,f(x)\,dx = E_\mu\,\xi_t f \tag{5.14}$$
by Theorem 5.11. Combining this with the a.s. convergence under $P_\mu$ and using Proposition 4.12 in [24], we obtain $E_\mu\big|\tilde m(\varepsilon)\,\xi^\varepsilon_t f - \xi_t f\big| \to 0$.

If $\xi$ is a $(2,1)$-process in $\mathbb{R}^d$ with $d \ge 3$, then $a_t = t$. By (4) in [25], we get
$$m(\varepsilon) = \varepsilon^{-c}\,\lambda^d p^\varepsilon_{\varepsilon^c} \approx \varepsilon^{-c}\,c_d\,\varepsilon^{d-2}\,\varepsilon^c = c_d\,\varepsilon^{d-2}.$$
So we recover the Lebesgue approximation of $(2,1)$-processes, that is,
$$\tilde c_d\,\varepsilon^{2-d}\,\xi^\varepsilon_t \;\xrightarrow{w}\; \xi_t \quad \text{a.s. and in } L^1.$$
Similarly, if $\xi$ is a $(2,\beta)$-process in $\mathbb{R}^d$ with $\beta < 1$ and $d > 2/\beta$, then $a_t = (\beta t)^{1/\beta}$. By (9) in [22], we get
$$m(\varepsilon) = (\beta \varepsilon^c)^{-1/\beta}\,\lambda^d p^\varepsilon_{\varepsilon^c} \approx c_{\beta,d}\,\varepsilon^{d-2/\beta}.$$
Again, we recover the Lebesgue approximation of $(2,\beta)$-processes, that is,
$$\tilde c_{\beta,d}\,\varepsilon^{2/\beta-d}\,\xi^\varepsilon_t \;\xrightarrow{w}\; \xi_t \quad \text{a.s. and in } L^1.$$

Bibliography

[1] Bertoin, J., Le Gall, J.-F., Le Jan, Y. (1999). Spatial branching processes and subordination. Canad. J. Math. 49, 24–54.
[2] Bass, R.F., Levin, D.A. (2002). Transition probabilities for symmetric jump processes. Trans. Amer. Math. Soc. 354, 2933–2953.
[3] Dawson, D.A. (1975). Stochastic evolution equations and related measure processes. J. Multivariate Anal. 5, 1–52.
[4] Dawson, D.A. (1992). Infinitely divisible random measures and superprocesses. In: Stochastic Analysis and Related Topics. Progr. Probab. 31, 1–129. Birkhäuser, Boston, MA.
[5] Dawson, D.A. (1993). Measure-valued Markov processes. In: École d'Été de Probabilités de Saint-Flour XXI–1991. Lect. Notes in Math. 1541, 1–260. Springer, Berlin.
[6] Dawson, D.A., Iscoe, I., Perkins, E.A. (1989). Super-Brownian motion: path properties and hitting probabilities. Probab. Th. Rel. Fields 83, 135–205.
[7] Dawson, D.A., Perkins, E.A. (1991). Historical processes. Mem. Amer. Math. Soc. 93, #454.
[8] Dawson, D.
A., Vinogradov, V. (1994). Almost-sure path properties of (2, d, β)-superprocesses. Stochastic Process. Appl. 51, 221–258.
[9] Delmas, J.-F. (1999). Path properties of superprocesses with a general branching mechanism. Ann. Probab. 27, 1099–1134.
[10] Delmas, J.-F. (1999). Some properties of the range of super-Brownian motion. Probab. Th. Rel. Fields 114, 505–547.
[11] Duquesne, T. (2009). The packing measure of the range of super-Brownian motion. Ann. Probab. 37, 2431–2458.
[12] Duquesne, T., Le Gall, J.-F. (2002). Random trees, Lévy processes and spatial branching processes. Astérisque 281, vi+147.
[13] Dynkin, E.B. (1991). Branching particle systems and superprocesses. Ann. Probab. 19, 1157–1194.
[14] Dynkin, E.B. (1991). Path processes and historical superprocesses. Probab. Th. Rel. Fields 90, 1–36.
[15] Dynkin, E.B. (1994). An Introduction to Branching Measure-Valued Processes. CRM Monograph Series 6. AMS, Providence, RI.
[16] Dynkin, E.B. (2002). Diffusions, Superdiffusions and Partial Differential Equations. Colloquium Publications 50. AMS, Providence, RI.
[17] Etheridge, A.M. (2000). An Introduction to Superprocesses. University Lecture Series 20. AMS, Providence, RI.
[18] Evans, S.N., Perkins, E. (1991). Absolute continuity results for superprocesses with some applications. Trans. Amer. Math. Soc. 325, 661–681.
[19] Falconer, K. (2007). Fractal Geometry: Mathematical Foundations and Applications. Wiley.
[20] Folland, G.B. (1999). Real Analysis: Modern Techniques and Their Applications. John Wiley & Sons, New York.
[21] Gmira, A., Veron, L. (1984). Large time behaviour of the solutions of a semilinear parabolic equation in R^N. J. Differ. Equations 53, 258–276.
[22] He, X. (2013). Lebesgue approximation of (2, β)-superprocesses. Stochastic Process. Appl. 123, 1802–1819.
[23] He, X. (2013). Lebesgue approximation of superprocesses with a regularly varying branching mechanism. Unpublished manuscript.
[24] Kallenberg, O. (2002). Foundations of Modern Probability, 2nd ed. Springer, New York.
[25] Kallenberg, O. (2008). Some local approximations of Dawson–Watanabe superprocesses. Ann. Probab. 36, 2176–2214.
[26] Kallenberg, O. (2011). Iterated Palm conditioning and some Slivnyak-type theorems for Cox and cluster processes. J. Theor. Probab. 24, 875–893.
[27] Kallenberg, O. (2013). Local conditioning in Dawson–Watanabe superprocesses. Ann. Probab. 41, 385–443.
[28] Kingman, J.F.C. (1973). An intrinsic description of local time. J. London Math. Soc. (2) 6, 725–731.
[29] Klenke, A. (1998). Clustering and invariant measures for spatial branching models with infinite variance. Ann. Probab. 26, 1057–1087.
[30] Le Gall, J.-F. (1991). Brownian excursions, trees and measure-valued branching processes. Ann. Probab. 19, 1399–1439.
[31] Le Gall, J.-F. (1994). A lemma on super-Brownian motion with some applications. In: The Dynkin Festschrift. Progr. Probab. 24, 237–251. Birkhäuser, Boston, MA.
[32] Le Gall, J.-F. (1999). Spatial Branching Processes, Random Snakes and Partial Differential Equations. Lectures in Mathematics, ETH Zürich. Birkhäuser, Basel.
[33] Le Gall, J.-F., Perkins, E. (1995). The Hausdorff measure of the support of two-dimensional super-Brownian motion. Ann. Probab. 23, 1719–1747.
[34] Le Gall, J.-F., Perkins, E.A., Taylor, S.J. (1995). The packing measure of the support of super-Brownian motion. Stochastic Process. Appl. 59, 1–20.
[35] Li, Z. (2011). Measure-Valued Branching Markov Processes. Springer, New York.
[36] Mörters, P., Peres, Y. (2010). Brownian Motion.
Cambridge University Press.
[37] Mytnik, L., Perkins, E. (2003). Regularity and irregularity of (1+β)-stable super-Brownian motion. Ann. Probab. 31, 1413–1440.
[38] Mytnik, L., Villa, J. (2007). Self-intersection local time of (α, d, β)-superprocess. Ann. Inst. Henri Poincaré Probab. Stat. 43, 481–507.
[39] Pazy, A. (1983). Semigroups of Linear Operators and Applications to Partial Differential Equations. Springer, New York.
[40] Perkins, E. (1990). Polar sets and multiple points for super-Brownian motion. Ann. Probab. 18, 453–491.
[41] Perkins, E. (1994). The strong Markov property of the support of super-Brownian motion. In: The Dynkin Festschrift. Progr. Probab. 24, 307–326. Birkhäuser, Boston, MA.
[42] Perkins, E. (2002). Dawson–Watanabe superprocesses and measure-valued diffusions. In: École d'Été de Probabilités de Saint-Flour XXIX–1999. Lect. Notes in Math. 1781, 125–329. Springer, Berlin.
[43] Perkins, E. (2004). Super-Brownian motion and critical spatial stochastic systems. Bull. Can. Math. Soc. 47, 280–297.
[44] Revuz, D., Yor, M. (1999). Continuous Martingales and Brownian Motion. Springer, Berlin.
[45] Sato, K.-I. (1999). Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press.
[46] Slade, G. (2002). Scaling limits and super-Brownian motion. Notices Amer. Math. Soc. 49, 1056–1067.
[47] Stein, E.M., Shakarchi, R. (2005). Real Analysis: Measure Theory, Integration, and Hilbert Spaces. Princeton Lectures in Analysis 3. Princeton University Press.
[48] Tribe, R. (1994). A representation for super Brownian motion. Stochastic Process. Appl. 51, 207–219.
[49] Watanabe, S. (1968). A limit theorem of branching processes and continuous state branching processes. J. Math. Kyoto Univ. 8, 141–167.