inordinatum

Physics and Mathematics of Disordered Systems


Fokker-Planck equation for a jump diffusion process


One of the simplest stochastic processes is a Brownian motion with drift, or a diffusion process:

\displaystyle   \begin{array}{rl}  \dot{x}(t) =& \mu + \sigma\,\xi(t), \\  x(t) =& \mu t + \sigma \,W(t).  \end{array}   (1)

Here, \xi(t) is Gaussian white noise with mean zero and variance \left\langle \xi(t_1)\xi(t_2)\right\rangle = \delta(t_1-t_2). Its integral W(t) = \int_0^t \mathrm{d} t'\,\xi(t') is a Brownian motion.
Continuous Itô stochastic processes such as eq. (1) are insufficient for applications where the random variable x may jump suddenly (such as in avalanches). A natural extension of (1) for modelling this is a so-called jump-diffusion process. Let us suppose that our jump sizes s are positive, independent and identically distributed with density Q(s). Then, the jump diffusion process y(t) is

\displaystyle   y(t) = x(t) + \sum_{i=1}^{N(t)} s_i = \mu t + \sigma\, W(t) + \sum_{i=1}^{N(t)} s_i,   (2)

where s_i, i\in \mathbb{N} are i.i.d. jump sizes as above, and N(t) is the number of jumps encountered up to time t. For simplicity, let us assume that jumps occur independently with rate \lambda, i.e. that the probability of a jump occurring in a time interval \mathrm{d}t is \lambda\, \mathrm{d}t. Then, N(t) is a Poisson process with rate \lambda.
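As a quick illustration, here is a minimal Python sketch (all parameter values are just examples) that simulates one path of the jump-diffusion process (2) with an Euler scheme, approximating the Poisson jumps by a Bernoulli trial with probability \lambda\,\mathrm{d}t per step:

```python
import numpy as np

def simulate_jump_diffusion(T, dt, mu, sigma, lam, jump_sampler, rng):
    """Euler scheme for eq. (2): y(t) = mu*t + sigma*W(t) + sum of jumps.

    jump_sampler(rng) draws one positive jump size s with density Q(s)."""
    n = int(round(T / dt))
    y = np.empty(n + 1)
    y[0] = 0.0
    for i in range(n):
        dy = mu * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        # a jump occurs with probability lam*dt during this step
        if rng.random() < lam * dt:
            dy += jump_sampler(rng)
        y[i + 1] = y[i] + dy
    return y

# example: exponential jumps with mean 0.5 (illustrative parameters)
rng = np.random.default_rng(0)
path = simulate_jump_diffusion(T=1.0, dt=0.01, mu=0.2, sigma=0.1, lam=1.0,
                               jump_sampler=lambda r: r.exponential(0.5), rng=rng)
```

Averaged over many such paths, the endpoint mean should approach \left(\mu + \lambda \left\langle s \right\rangle\right) t.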

It is well-known that the diffusion process x(t) in (1) is equivalently described by a partial differential equation for the distribution P(x) of x, the Fokker-Planck equation (FPE)

\displaystyle \partial_t P_t(x) = \frac{\sigma^2}{2}\partial_x^2 P_t(x) - \mu \partial_x P_t(x).   (3)

This representation is useful e.g. for first-passage problems: they correspond to various boundary conditions imposed on the PDE (3). So how does one generalise the Fokker-Planck equation (3) to the case of the jump-diffusion process y(t) in (2)? I will explain in the following that the answer is

\displaystyle \partial_t P_t(y) = \frac{\sigma^2}{2}\partial_y^2 P_t(y) - \mu \partial_y P_t(y)- \lambda P_t(y) + \lambda\,\int_0^\infty \mathrm{d}s \,Q(s)P_t(y - s) ,   (4)

and then discuss a specific example.

1. Deriving the jump-diffusion FPE

Let us consider a time step from t to t + \mathrm{d}t. The probability for a jump to occur during this interval is \lambda\,\mathrm{d}t, so

\displaystyle P_{t+\mathrm{d}t}(y) = (1-\lambda \mathrm{d}t)\left\langle P_t\left(y-\mu \,\mathrm{d}t - \sigma\left[W(t+\mathrm{d}t)-W(t)\right] \right) \right\rangle_W + \lambda \mathrm{d}t\,\left\langle P_t(y-s) \right\rangle_s,   (5)

where \left\langle \cdot \right\rangle_W denotes averaging over all realizations of the Brownian motion W, and \left\langle \cdot \right\rangle_s denotes averaging over the distribution Q(s) of the jump size s. Since the jump term is already multiplied by the jump probability \propto \mathrm{d}t, the drift and noise contributions there are of higher order in \mathrm{d}t and were dropped.

The averaging over the jump size in (5) yields a convolution with the jump size distribution Q(s):

\displaystyle \left\langle P_t(y-s) \right\rangle_s = \int_0^\infty \mathrm{d}s \,Q(s)P_t(y - s).

The average over the noise W in (5) is the same as for standard diffusion. During the interval \mathrm{d}t, the increment of the noise term in (2) is

\displaystyle W(t+\mathrm{d}t)-W(t) = \int_t^{t+\mathrm{d}t}\mathrm{d}t'\,\xi(t') =: V\sqrt{\mathrm{d}t},

where the last equality is a definition for V.
Since W is a Brownian motion, V is normally distributed:

\displaystyle \mathcal{N}(V) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{V^2}{2}\right).

Thus, the average over W in (5) is

\displaystyle  \begin{array}{rl}  \displaystyle \left\langle P_t\left(y-\mu \,\mathrm{d}t - \sigma\left[W(t+\mathrm{d}t)-W(t)\right] \right) \right\rangle_W = & \displaystyle \int_{-\infty}^\infty \mathrm{d}V\,\mathcal{N}(V)\, P_t\left(y-\mu \,\mathrm{d}t - \sigma V \sqrt{\mathrm{d}t} \right) \\   = & \displaystyle P_t(y) - \mu\, \mathrm{d}t\, \partial_y P_t(y)- \sigma\sqrt{\mathrm{d}t}\, \partial_y P_t(y) \int_{-\infty}^\infty \mathrm{d}V\, \mathcal{N}(V)\, V  \\  & \displaystyle + \frac{\sigma^2}{2}\mathrm{d}t\, \partial_y^2 P_t(y)\, \int_{-\infty}^\infty \mathrm{d}V\, \mathcal{N}(V)\, V^2 + \mathcal{O}(\mathrm{d}t)^2 \\  =& \displaystyle P_t(y) - \mu\, \mathrm{d}t\, \partial_y P_t(y) + \frac{\sigma^2}{2}\mathrm{d}t\,\partial_y^2 P_t(y)  \end{array} .

In the last line we used the fact that the normal distribution has mean zero and variance 1, and dropped all terms of higher order than \mathrm{d}t. Inserting all this into (5), we obtain the jump-diffusion FPE (4).
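To see the FPE (4) in action, here is a rough explicit finite-difference sketch in Python. All parameters, the grid, and the boundary treatment are illustrative choices, and I pick the exponential jump kernel Q(s) = \alpha e^{-\alpha s} discussed in the next section; for that kernel the mean of the numerical solution should drift at rate \mu + \lambda/\alpha:

```python
import numpy as np

# Explicit finite-difference sketch of the jump-diffusion FPE (4).
mu, sigma, lam, alpha = 0.3, 0.5, 1.0, 2.0
dy, dt, T = 0.05, 0.002, 0.5
y = np.arange(-4.0, 8.0, dy)

# narrow Gaussian initial condition approximating P_{t=0}(y) = delta(y)
P = np.exp(-y**2 / (2 * 0.01))
P /= P.sum() * dy

# discretized exponential jump kernel Q(s) = alpha*exp(-alpha*s), s >= 0
s = np.arange(0.0, 6.0, dy)
Q = alpha * np.exp(-alpha * s)
Q[0] *= 0.5                                   # trapezoid weight at s = 0

for _ in range(int(T / dt)):
    d1 = np.gradient(P, dy)                                  # drift term
    d2 = (np.roll(P, -1) - 2 * P + np.roll(P, 1)) / dy**2    # diffusion term
    conv = np.convolve(Q, P)[:len(P)] * dy                   # int Q(s) P(y-s) ds
    P = P + dt * (0.5 * sigma**2 * d2 - mu * d1 - lam * P + lam * conv)
    P[0] = P[-1] = 0.0                                       # crude edges

mass = P.sum() * dy
mean = (y * P).sum() * dy / mass
# the mean should drift at rate mu + lam/alpha = 0.8, i.e. mean ≈ 0.4 at T = 0.5
```

The explicit scheme is only stable for \frac{\sigma^2}{2}\mathrm{d}t/\mathrm{d}y^2 sufficiently small; a production code would rather use an implicit or spectral method.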

2. Example: Exponential jumps

Now let us consider a simple example, where the jump sizes are distributed exponentially:

\displaystyle  Q(s) = \alpha\,\exp(-\alpha s),\quad\quad \alpha > 0  .

We will compute the distribution of y(t), starting from y(0)=0.
For this, we solve the jump-diffusion FPE (4) with the initial condition P_{t=0}(y) = \delta(y). Let us take a Fourier transform of (4) from y to k:

\begin{array}{rl}  \displaystyle  \tilde{P}_t(k) := & \displaystyle \int_{-\infty}^\infty \mathrm{d}y\,e^{i k y}\,P_t(y), \\  \displaystyle  \partial_t \tilde{P}_t(k) = & \displaystyle -\frac{\sigma^2}{2}k^2\,\tilde{P}_t(k) + i\mu k\, \tilde{P}_t(k) - \lambda \tilde{P}_t(k) + \lambda\tilde{Q}(k)\tilde{P}_t(k)  \end{array}  .

The different spatial modes decouple, and instead of a partial differential equation we obtain an ordinary differential equation for each value of the Fourier variable k.
For the above exponential form of the jump size distribution Q,

\displaystyle  \tilde{Q}(k) = \frac{\alpha}{\alpha - i k}  .
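If you want to double-check this Fourier transform numerically, a few lines of Python with scipy's quad suffice (the values of \alpha and k below are arbitrary test values):

```python
import numpy as np
from scipy.integrate import quad

def Q_tilde_numeric(alpha, k):
    """Fourier transform of Q(s) = alpha*exp(-alpha*s), evaluated numerically."""
    re, _ = quad(lambda s: alpha * np.exp(-alpha * s) * np.cos(k * s), 0, np.inf)
    im, _ = quad(lambda s: alpha * np.exp(-alpha * s) * np.sin(k * s), 0, np.inf)
    return re + 1j * im

alpha, k = 2.0, 1.5                       # arbitrary test values
closed_form = alpha / (alpha - 1j * k)
# Q_tilde_numeric(alpha, k) matches closed_form to quadrature accuracy
```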

Furthermore, the initial condition P_{t=0}(y) = \delta(y) gives \tilde{P}_{t=0}(k) = 1. Hence, the solution \tilde{P}_t(k) reads

\displaystyle  \tilde{P}_t(k) = \exp\left[ \left( -\frac{\sigma^2}{2}k^2+  i\mu k + i\frac{\lambda k}{\alpha - i k} \right) t\right]  .

While it does not seem feasible to invert this Fourier transform to obtain a closed expression for P_t(y) (but if anyone has an idea, let me know!), \tilde{P}_t(k) is already enough to determine the moments of y(t). Taking derivatives, we obtain for example

\displaystyle  \begin{array}{rl}  \left\langle y(t) \right\rangle = & -i\partial_k \big|_{k=0} \tilde{P}_t(k) = \left(\frac{\lambda}{\alpha}+\mu\right)t \\  \left\langle y(t)^2 \right\rangle^c = & -\partial_k^2 \big|_{k=0} \tilde{P}_t(k) + \left[\partial_k \big|_{k=0} \tilde{P}_t(k)\right]^2  \\  = & \left(\frac{2\lambda}{\alpha^2}+\sigma^2\right)t, \\  ... &  \end{array}  .
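These cumulants are easy to verify by direct Monte Carlo sampling of the process (2) at a fixed time t. A small Python sketch (parameter values arbitrary):

```python
import numpy as np

# Monte Carlo check of the first two cumulants of y(t) for exponential jumps
rng = np.random.default_rng(42)
mu, sigma, lam, alpha, t, n = 0.3, 0.5, 2.0, 2.0, 1.0, 200_000

N = rng.poisson(lam * t, size=n)            # number of jumps in [0, t] per sample
jumps = np.array([rng.exponential(1 / alpha, size=k).sum() for k in N])
y = mu * t + sigma * np.sqrt(t) * rng.standard_normal(n) + jumps

print(y.mean())   # should be close to (lam/alpha + mu) t = 1.3
print(y.var())    # should be close to (2 lam/alpha^2 + sigma^2) t = 1.25
```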

Similarly, solving the FPE (4) with an absorbing boundary allows computing first-passage times (or at least their moments) for our jump-diffusion process.

Have fun, and do let me know if you have any comments!


Written by inordinatum

November 20, 2013 at 8:04 pm

How does an absorbing boundary modify the propagator of a stochastic process?


A recurring theme throughout my blog is stochastic processes, widely used to model quantities under the influence of both deterministic evolution laws and random fluctuations. A complete solution of a stochastic process is its propagator: the probability density for the final state at some time t_f, given an initial condition at time t_i. The propagator depends on the boundary conditions imposed on the stochastic process. For example, an absorbing boundary is useful for describing extinction in a stochastic model of population dynamics: when the population of a given species reaches zero, it cannot be re-introduced.
Typically, however, the “free” propagator for a stochastic process — i.e. the propagator without any boundaries — is much easier to determine than the propagator with any given boundary condition.

However, common sense tells us that the free propagator for a stochastic process already contains all there is to know about it. So I have wondered for quite a while if and how one can recover e.g. the propagator with an absorbing boundary knowing only the free propagator. I found the solution only quite recently in this preprint on arXiv, and think that it is interesting enough to explain it here briefly.

The result

Consider a one-dimensional, stationary, Markovian stochastic process. Let us denote its free propagator by

\displaystyle G(x_i,x_f|t_f-t_i).

This is the density for the final value x_f at time t_f, given an initial value x_i at time t_i. Stationarity of the process implies that G is a function of the time difference \tau := t_f-t_i only. Define the Laplace transform of the free propagator by

\displaystyle \hat{G}(x_i,x_f|\lambda) := \int_{0}^{\infty} \mathrm{d}\tau\, e^{\lambda \tau}G(x_i,x_f|\tau).

Now let us add an absorbing boundary at x=a. Then, the propagator

\displaystyle G_a(x_i,x_f|t_f-t_i)

is the density for the final value x_f at time t_f, given an initial value x_i at time t_i, without touching the line x=a between t_i and t_f. The upshot of this post is the following simple relationship between the Laplace transforms of the free propagator G and of the propagator with absorbing boundary G_a :

\displaystyle \hat{G}_a(x_i,x_f|\lambda) = \hat{G}(x_i,x_f|\lambda)-\frac{\hat{G}(x_i,a|\lambda)\hat{G}(a,x_f|\lambda)}{\hat{G}(a,a|\lambda)}.      (1)

In case the denominator on the right-hand side is zero, one may have to perform some limiting procedure in order to make sense of this formula. We can now check how nicely it works for simple cases, like Brownian motion with drift, where the result is known by other means.

Example: Brownian motion with drift

Let us consider an example, the Brownian motion with drift \mu. Its stochastic differential equation is

\displaystyle  \partial_t X(t) = \mu + \xi(t),

where \xi is white noise with \overline{\xi(t)\xi(t')} =2\sigma\delta(t-t').

The free propagator is a simple Gaussian

\displaystyle  G(x_i,x_f|\tau) = \frac{1}{\sqrt{4\pi\sigma\tau}}\exp\left[-\frac{(x_f-x_i-\mu \tau)^2}{4\sigma \tau}\right].

Its Laplace transform is given by:

\displaystyle  \begin{array}{rl}  \hat{G}(x_i,x_f|\lambda) =& \int_{0}^{\infty} \mathrm{d}\tau\, e^{\lambda \tau}G(x_i,x_f|\tau) \\  =&\frac{1}{\sqrt{\mu^2-4\lambda\sigma}}\exp\left[\frac{\mu}{2\sigma}(x_f-x_i) -\frac{1}{2\sigma}\sqrt{\mu^2-4\lambda\sigma}|x_f-x_i|\right].  \end{array}

To compute G_a, let us assume x_i > a, x_f > a. Note this is without loss of generality, since if x_i and x_f are on opposite sides of the absorbing boundary the propagator G_a is zero. Applying the result (1), we have

\displaystyle   \begin{array}{rl}  \hat{G}_a(x_i,x_f|\lambda) =& \hat{G}(x_i,x_f|\lambda)-\frac{\hat{G}(x_i,a|\lambda)\hat{G}(a,x_f|\lambda)}{\hat{G}(a,a|\lambda)}  \\  \quad=& \frac{1}{\sqrt{\mu^2-4\lambda\sigma}}\exp\left[\frac{\mu}{2\sigma}(x_f-x_i) -\frac{1}{2\sigma}\sqrt{\mu^2-4\lambda\sigma}|x_f-x_i|\right] - \\   & \frac{1}{\sqrt{\mu^2-4\lambda\sigma}}\exp\left[\frac{\mu}{2\sigma}(a-x_i) -\frac{1}{2\sigma}\sqrt{\mu^2-4\lambda\sigma}|a-x_i| \right. \\  &\quad \left.+\frac{\mu}{2\sigma}(x_f-a) -\frac{1}{2\sigma}\sqrt{\mu^2-4\lambda\sigma}|x_f-a|\right]  \\  \quad=& \frac{1}{\sqrt{\mu^2-4\lambda\sigma}}e^{\frac{\mu}{2\sigma}(x_f-x_i)}\left[e^{-\frac{1}{2\sigma}\sqrt{\mu^2-4\lambda\sigma}|x_f-x_i|} - e^{-\frac{1}{2\sigma}\sqrt{\mu^2-4\lambda\sigma}(x_i+x_f-2a)}\right].    \end{array}

The inverse Laplace transform of this expression is:

\displaystyle   \begin{array}{rl}  G_a(x_i,x_f|\tau) =& G(x_i,x_f|\tau)-e^{\frac{\mu}{\sigma}(a-x_i)}G(2a-x_i,x_f|\tau) \\  =& \frac{1}{\sqrt{4\pi\sigma\tau}}e^{-\frac{\mu^2 \tau}{4\sigma} + \mu\frac{x_f-x_i}{2\sigma}}\left[e^{-\frac{(x_f-x_i)^2}{4\sigma\tau}}-e^{-\frac{(2a-x_f+x_i)^2}{4\sigma\tau}}\right].  \end{array}

This is, of course, exactly the same result (the so-called Bachelier-Lévy formula) which I had already discussed earlier in a guest blog post at Ricardo’s blog, using slightly different methods (a variant of the Cameron-Martin-Girsanov theorem). Note that some signs which were wrong in the original post are corrected here.
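One can also verify formula (1) for this example directly in Python: Laplace-transform the image-method propagator numerically, and compare with the right-hand side of (1) built from the closed-form \hat{G}. A sketch with illustrative parameters; note that with the e^{\lambda\tau} convention used here, convergence requires \lambda < \mu^2/(4\sigma), so I take \lambda negative:

```python
import numpy as np
from scipy.integrate import quad

# Check formula (1) for Brownian motion with drift, conventions as in this post
# (<xi(t) xi(t')> = 2*sigma*delta(t-t'), Laplace kernel e^{lambda*tau}).
mu, sigma, a = 0.5, 0.5, 0.0
lmbda = -1.0                # need mu^2 - 4*lambda*sigma > 0 for convergence

def G(xi, xf, tau):         # free propagator
    return np.exp(-(xf - xi - mu * tau) ** 2 / (4 * sigma * tau)) \
        / np.sqrt(4 * np.pi * sigma * tau)

def G_hat(xi, xf):          # its Laplace transform, closed form
    rt = np.sqrt(mu ** 2 - 4 * lmbda * sigma)
    return np.exp(mu * (xf - xi) / (2 * sigma)
                  - rt * abs(xf - xi) / (2 * sigma)) / rt

def G_abs(xi, xf, tau):     # image-method propagator, absorbing wall at x = a
    return G(xi, xf, tau) - np.exp(mu * (a - xi) / sigma) * G(2 * a - xi, xf, tau)

xi, xf = 1.0, 0.5           # both above the wall
lhs, _ = quad(lambda tau: np.exp(lmbda * tau) * G_abs(xi, xf, tau), 0, np.inf)
rhs = G_hat(xi, xf) - G_hat(xi, a) * G_hat(a, xf) / G_hat(a, a)
# lhs and rhs agree to quadrature accuracy
```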

The proof

Now let us go back to the case of a general stochastic process, and discuss how formula (1) can be proven. Let us assume that the stochastic process in question has a path-integral representation, i.e. that

\displaystyle G(x_i,x_f|\tau) = \int_{x(t_i)=x_i}^{x(t_f)=x_f} \mathcal{D}[x(t)]\,e^{-S[x(t)]}

for some action S.

Now, to add an absorbing boundary at x=a, we weigh each crossing of x=a by a factor e^{-\gamma} in the path integral, and then let \gamma \to \infty. This effectively retains only those trajectories in the path integral which never touch x=a. We thus have:

\displaystyle G_a(x_i,x_f|\tau) = \lim_{\gamma \to \infty}\int_{x(t_i)=x_i}^{x(t_f)=x_f} \mathcal{D}[x(t)]\,e^{-S[x(t)]-\gamma\delta(x(t)-a)}.

Now, to evaluate the path integral, we expand it in a series in \gamma (or, in other words, in the number k of crossings of x=a). Denoting by G_a^{(\gamma)} the path integral at finite \gamma, we get:

\displaystyle   \begin{array}{rl}  \lefteqn{G_a^{(\gamma)}(x_i,x_f|\tau) = G(x_i,x_f|\tau) + }&\\  & + \sum_{k=1}^{\infty} (-\gamma)^k \int_{t_i}^{t_f}\mathrm{d}t_1\, G(x_i,x_1|t_1-t_i) \int_{t_1}^{t_f}\mathrm{d}t_2\, G(x_1,x_2|t_2-t_1) \cdots \\  &  \cdots \int_{t_{k-1}}^{t_f}\mathrm{d}t_k\,G(x_{k-1},x_k|t_k-t_{k-1})\,G(x_k,x_f|t_f-t_k)  \end{array}  .

Note that here I cancelled the factor \frac{1}{k!}, coming from the Taylor expansion of e^{-\gamma \delta(x(t)-a)}, against the k! possibilities of ordering the times t_1\cdots t_k. This is why I can assume them to be in ascending order above. Also, the \delta functions actually set all x_1=x_2=\cdots=x_k=a.
Taking Laplace transforms, the time integrals (which are a k-fold convolution) simplify to a simple product:

\displaystyle  \begin{array}{rl}  \hat{G}_a^{(\gamma)}(x_i,x_f|\lambda) =& \hat{G}(x_i,x_f|\lambda) + \\  & + \sum_{k=1}^{\infty} (-\gamma)^k \hat{G}(x_i,a|\lambda)\hat{G}(a,a|\lambda)^{k-1} \hat{G}(a,x_f|\lambda) \\  =& \hat{G}(x_i,x_f|\lambda) - \frac{\gamma \hat{G}(x_i,a|\lambda)\hat{G}(a,x_f|\lambda)}{1+\gamma \hat{G}(a,a|\lambda)}\\  =& \hat{G}(x_i,x_f|\lambda) - \frac{\hat{G}(x_i,a|\lambda)\hat{G}(a,x_f|\lambda)}{\gamma^{-1}+ \hat{G}(a,a|\lambda)}  \end{array}.

Now, one easily obtains the claimed result:

\displaystyle  \begin{array}{rl}  \hat{G}_a(x_i,x_f|\lambda) =& \lim_{\gamma \to \infty}\hat{G}_a^{(\gamma)}(x_i,x_f|\lambda) \\  =& \hat{G}(x_i,x_f|\lambda) - \frac{\hat{G}(x_i,a|\lambda)\hat{G}(a,x_f|\lambda)}{\hat{G}(a,a|\lambda)}  \end{array}  .
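The geometric resummation used above is easy to sanity-check numerically. In the following sketch, A, B, C stand for \hat{G}(x_i,a|\lambda), \hat{G}(a,a|\lambda), \hat{G}(a,x_f|\lambda) and g for \gamma (arbitrary values with |gB| < 1 so that the series converges):

```python
# numerical sanity check of the geometric resummation (arbitrary values)
g, A, B, C = 0.5, 1.3, 0.8, 0.7   # g plays the role of gamma; A, B, C of the G-hats
partial = sum((-g) ** k * A * B ** (k - 1) * C for k in range(1, 200))
closed = -g * A * C / (1 + g * B)
# partial and closed agree to machine precision
```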

This computation can, of course, be generalized to more complicated situations like several boundaries, higher dimensions, etc. For details see again the original preprint by C. Grosche. Have fun and let me know if you have questions!

Written by inordinatum

November 24, 2012 at 11:50 pm