inordinatum

Physics and Mathematics of Disordered Systems

Archive for the ‘Stochastic Processes’ Category

Stochastic Logistic Growth


A classic model for the growth of a population with a finite carrying capacity is logistic growth. It is usually formulated as a differential equation,

\displaystyle \partial_t n(t)=r\, n(t)\left[1-\frac{n(t)}{K}\right]. (1)

Here n(t) is the size of the population at time t, r is the growth rate and K is the carrying capacity.

The dynamics of eq. (1) is deterministic: The initial population n(0)=n_0 > 0 grows (or decays) towards the constant carrying capacity n(t\to\infty) = K, which is a fixed point of eq. (1). This is seen in the solid trajectories in the figure below:


Deterministic vs. stochastic logistic growth. Solid lines: deterministic model in eq. (1). Transparent lines: Realizations of stochastic logistic growth model in eq. (2) below. The colors blue, orange and green correspond to K=50, 30, 10. In all cases r=1.

To make this model more realistic, let’s see how we can extend it to include stochastic fluctuations (semi-transparent trajectories in the figure above). In the following, I’ll look at a simple stochastic logistic growth model (motivated by some discussions with a friend) where the steady state can be calculated exactly. The effect of stochasticity on the steady state is twofold:

  • The population size for long times is not fixed, but fluctuates on a scale \sigma \approx \sqrt{K} around the carrying capacity.
  • The average population size \left\langle n \right\rangle is increased above the carrying capacity K, but the shift goes to 0 as K increases (i.e. the deterministic model is recovered for large K).

Now let’s look at the calculation in detail…

A stochastic, individual-based logistic growth model

In eq. (1), the population is described by a real-valued function n(t). In reality, populations consist of discrete individuals and the population size doesn’t change continuously. So, a more realistic approach is to describe the population dynamics as a birth-death process with the following reactions:

\begin{array}{rl} A & \stackrel{r}{\longrightarrow} A+A \\ A+A & \stackrel{d}{\longrightarrow} A\end{array} (2)

In other words, we assume that during a time interval dt two events may happen:

  • Birth: With a probability r \, dt, any individual may give birth to offspring, thus increasing the population size by 1.
  • Death due to competition: With a probability d\, dt, out of any two individuals one may die due to the competition for common resources. Thus the population size decreases by 1. Note that d is related to the carrying capacity K in eq. (1) by d = r/K.

Note that the stochasticity in this model is not due to random external influences but due to the discreteness of the population (demographic noise).
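
Before turning to the analytical solution, here is a minimal sketch (in Python) of how trajectories like the semi-transparent ones in the figure above can be generated with the Gillespie algorithm for the reactions in eq. (2). The function name gillespie_logistic and all parameter values are my own choices for illustration, not part of the original post.

import numpy as np

def gillespie_logistic(n0, r, K, t_max, seed=0):
    """Sample one trajectory of the birth-death process in eq. (2)."""
    rng = np.random.default_rng(seed)
    d = r / K                    # death rate constant, d = r/K
    t, n = 0.0, n0
    times, sizes = [t], [n]
    while t < t_max and n > 0:
        birth = r * n            # total birth rate in state n
        death = d * n * (n - 1)  # total death rate in state n
        total = birth + death
        t += rng.exponential(1.0 / total)               # waiting time to the next reaction
        n += 1 if rng.random() < birth / total else -1  # choose birth or death
        times.append(t)
        sizes.append(n)
    return np.array(times), np.array(sizes)

# one realization with r=1, K=50, as for the blue curves in the figure
times, sizes = gillespie_logistic(n0=5, r=1.0, K=50.0, t_max=20.0)
print(sizes[-1])  # fluctuates around the carrying capacity K at late times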

Solution of the stochastic model

The system of reaction equations (2) translates into the following master equation for the probabilities p_n(t) of the population size at time t being equal to n:

\begin{array}{rl} \displaystyle \partial_t p_n(t) = & \displaystyle n(n+1)d\,p_{n+1}(t) - n(n-1)d\, p_n(t) \\ & \displaystyle + (n-1)r\,p_{n-1}(t) - nr\, p_n(t)\end{array} (3)

This looks daunting. However, it can be simplified a lot by introducing the generating function

\displaystyle G(\lambda,t) := \sum_{n=0}^{\infty} \lambda^n p_n(t)

After some algebra we obtain

\displaystyle \partial_t G(\lambda,t) = \lambda(1-\lambda) \left[d\, \partial_\lambda^2 G(\lambda,t) - r \,\partial_\lambda G(\lambda,t)\right].

This looks simpler than eq. (3), but finding the full time-dependent solution G(\lambda,t) still does not seem feasible. Let’s focus on the steady state where \partial_t G(\lambda,t)=0. Together with the boundary conditions G(\lambda=1,t)=1 and G(\lambda=0,t)=0, we obtain the steady-state solution G_s(\lambda)

\displaystyle G_s(\lambda) = \frac{e^{K\lambda}-1}{e^{K}-1}. (4)

Here we set d=r/K to connect to the notation of eq. (1). Correspondingly, the steady-state probabilities for population size n \geq 1 are (p_0 = 0, since the death reaction A+A \to A cannot empty the population)

\displaystyle p_n = \frac{K^n}{n!}\frac{1}{e^K-1}. (5)

Of course, these can equivalently also be obtained by solving the recursion relation (3) with \partial_t p_n(t) = 0. This result can be easily checked against simulations, see figure below.


Stationary probability distribution for the stochastic logistic growth model in eq. (2), for r=1, K=5, d=r/K=0.2. Blue line: exact solution in eq. (5). Yellow bars: Histogram over 10^6 samples.

Steady state with stochasticity

Let’s try to understand what the steady state of our stochastic model looks like. From eq. (4) we easily obtain the first moments of the distribution of the population size. The mean population size is:

\displaystyle \left\langle n \right\rangle = \partial_\lambda\big|_{\lambda=1}G_s(\lambda) = \frac{K}{1-e^{-K}}

So for large carrying capacities K \gg 1, we recover the same result as in the deterministic model (which is reasonable since for large population sizes demographic noise is less pronounced!). However, for smaller K fluctuations increase the average population size. E.g. for K=2, the average population size in our stochastic model is 2/\left(1-e^{-2}\right) \approx 2.31.

Now let us look at the variance,

\begin{array}{rl} \sigma^2 := &\displaystyle \left\langle n^2\right\rangle - \left\langle n \right\rangle^2\\=&\displaystyle\partial_\lambda\big|_{\lambda=1}\lambda \partial_\lambda G_s(\lambda)-\left(\frac{K}{1-e^{-K}}\right)^2\\=& \displaystyle\frac{K\left(1-e^{-K}-K\,e^{-K}\right)}{\left(1-e^{-K}\right)^2}\end{array}.

For large K \gg 1, we have \sigma^2 \approx K. So we see that stochastic fluctuations spread out the steady state to a width \sigma \approx \sqrt{K} around the carrying capacity.
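
As a quick numerical sanity check (my own addition), one can evaluate \left\langle n \right\rangle and \sigma^2 directly from the steady-state probabilities in eq. (5) and compare with the closed-form expressions above:

import numpy as np

K = 5.0
n = np.arange(1, 200)  # p_0 = 0; truncate the sum far beyond K
log_pn = n * np.log(K) - np.cumsum(np.log(n)) - np.log(np.expm1(K))
pn = np.exp(log_pn)    # eq. (5): K^n / n! / (e^K - 1)

mean_direct = np.sum(n * pn)
var_direct = np.sum(n**2 * pn) - mean_direct**2
mean_formula = K / (1 - np.exp(-K))
var_formula = K * (1 - np.exp(-K) - K * np.exp(-K)) / (1 - np.exp(-K))**2

print(mean_direct, mean_formula)  # both approximately 5.03
print(var_direct, var_formula)    # both approximately 4.86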

Written by inordinatum

January 1, 2016 at 4:11 pm

Estimating expected growth


Let’s take some fluctuating time series — say the value of some financial asset, like a stock price. What is its average growth rate? This seemingly trivial question came up in a recent discussion I had with a friend; obviously, it is relevant for quantitative finance and many other applications. Looking at it in more detail, it turns out that a precise answer is actually not that simple, and depends on the time range for which one would like to estimate the expected growth. So I’d like to share here some insights on this problem and its interesting connections to stochastic processes. Surprisingly, this was only studied in the literature quite recently!

The setup

Let’s consider the price s_t of an asset at N+1 discrete times t=1...N+1. The corresponding growth rates r_t are defined by

\displaystyle 1+r_t := \frac{s_{t+1}}{s_t}.    (1)

I.e. if s_{t+1}=1.5 s_t, the growth rate is r_t=0.5=50\%. Let’s also assume that our growth process r_t is stationary, i.e. that the distribution of the growth rates does not change in time. Say for concreteness that the growth rates r_t are i.i.d. (independent identically distributed) random variables.

The arithmetic average

The most immediate idea for computing the average growth rate \overline{r} is just to take the arithmetic average, i.e.

\displaystyle \overline{r}^{AM} := \frac{1}{N}\sum_{t=1}^N r_t.    (2)

What does this give us? Obviously, by the assumption of stationarity made above, with increasing sample size N the expectation value for the growth rate at the next time step r_{N+1} (or any other future time step) is approximated better and better by \overline{r}^{AM}:

\displaystyle \lim_{N\to\infty} E[r_{N+1}]-\frac{1}{N}\sum_{t=1}^N r_t =0.

This seems to be what we’d expect for an average growth rate, so what’s missing? Let’s go back to our sample of N asset prices and take the total growth

\displaystyle \frac{s_{N+1}}{s_1} = \prod_{i=1}^N (1+r_i).

By the relationship between the arithmetic and the geometric mean, we know that

\displaystyle \frac{s_{N+1}}{s_1} = \prod_{i=1}^N (1+r_i) \leq \left[\frac{1}{N}\sum_{i=1}^N (1+r_i)\right]^N = \left(1+\overline{r}^{AM}\right)^N. (3)

The left-hand side is the total growth after N time periods and the right-hand side is its estimate from the arithmetic mean in eq. (2). Equality holds in eq. (3) only when all the r_t are equal. When the r_t fluctuate, the total growth will always be strictly less than the estimate from the arithmetic mean, even as N \to \infty. So, by using the arithmetic mean to estimate the total growth in the price of our asset at the end of the observation period, we systematically overestimate it. The cause for this is the compounding performed to obtain the total growth over more than one time step. This makes our observable a nonlinear function of the growth rate in a single time step. Thus, fluctuations in the growth rate don’t average out, and yield a net shift in its expectation value.

The first explicit observation of this effect I’ve found is in a 1974 paper by M. E. Blume, Unbiased Estimators of Long-Run Expected Rates of Return. Quantitative estimates from that paper and from a later study by Jacquier, Kane and Marcus show that in typical situations this overestimation may easily be as large as 25%-100%.

The geometric average

So we see that the arithmetic mean does not provide an unbiased estimate of the compounded growth over a longer time period. Another natural way to obtain the average growth, which intuitively seems more adapted to capture that effect, is to take the geometric average of the growth rates:

\displaystyle 1+\overline{r}^{GM} := \left(\prod_{t=1}^N (1+r_t)\right)^{1/N}.    (4)

Now, by construction, the total growth at the end of our observation period, i.e. after N time periods, is correctly captured:

\displaystyle \frac{s_{N+1}}{s_1} = \left(1+\overline{r}^{GM}\right)^N

But this only solves the problem which we observed after eq. (3) for this specific case, when estimating growth for a time period whose length is exactly the same as the observation time span N from which we obtain the average. For shorter time periods (in particular, when estimating the growth rate for a single time step r_t), the geometric mean will now underestimate the growth rate. On the other hand, for time periods even longer than the observation period, it will still overestimate it like the arithmetic mean (see again the paper by Blume for a more detailed discussion).

Unbiased estimators

Considering the results above, the issue becomes clearer: A reasonable (i.e. unbiased) estimate for the compounded growth over a time period requires a formula that takes into account both the number of observed time steps N, and the number of time steps over which we’d like to estimate the compounded growth T. For T=1 we can use the arithmetic mean, for T=N we can use the geometric mean, and for general T we need another estimator altogether. For the general case, Blume proposes the following (approximately) unbiased estimator \left(\overline{r}^{UB}\right)^T:

\displaystyle \left(\overline{r}^{UB}\right)^T = \frac{N-T}{N-1}\left(1+\overline{r}^{AM}\right)^T+\frac{T-1}{N-1}\left(1+\overline{r}^{GM}\right)^T.    (5)

Eq. (5) is a reasonable approximation for the compounded growth, given that we have no further information on the form of the distribution, correlations, etc.
For T=1, i.e. estimating growth in a single time step, this gives just the arithmetic mean \overline{r}^{AM}, which is fine as we saw above. For T=N, it gives the geometric mean, which is also correct. For other values of T, eq. (5) is a linear combination of the arithmetic and the geometric mean.
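
The bias of the different estimators is easy to see in a small Monte Carlo experiment. The following sketch (my own, with arbitrary parameter values and Gaussian i.i.d. growth rates) compares the arithmetic mean, the geometric mean and Blume’s combination in eq. (5) as estimators of the true compounded growth (1+\mu)^T:

import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.05, 0.15  # true mean growth rate and its fluctuation per time step
N, T = 20, 10           # number of observed time steps, horizon of the compounded growth
n_samples = 200_000

r = mu + sigma * rng.standard_normal((n_samples, N))  # i.i.d. growth rates
am = np.mean(r, axis=1)                               # arithmetic mean, eq. (2)
gm = np.prod(1 + r, axis=1) ** (1.0 / N) - 1          # geometric mean, eq. (4)
ub = ((N - T) / (N - 1) * (1 + am) ** T
      + (T - 1) / (N - 1) * (1 + gm) ** T)            # Blume's estimator, eq. (5)

print("true compounded growth:", (1 + mu) ** T)
print("arithmetic mean:", np.mean((1 + am) ** T))  # biased upward
print("geometric mean: ", np.mean((1 + gm) ** T))  # biased downward
print("Blume estimator:", np.mean(ub))             # approximately unbiased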

To see how the coefficients in eq. (5) arise, let us start with an ansatz of the form

\displaystyle \left(\overline{r}^{UB}\right)^T = \alpha\left(1+\overline{r}^{AM}\right)^T+\beta\left(1+\overline{r}^{GM}\right)^T.    (6)

Let us further split up the growth rates as r_t = \mu + \epsilon_t, where \mu is the “true” average growth rate and \epsilon_t are the fluctuations. Inserting this as well as the definitions of \overline{r}^{AM}, \overline{r}^{GM} into eq. (6) we get

\displaystyle \left(\overline{r}^{UB}\right)^T = \alpha\left(1+\mu +\frac{1}{N}\sum_{i=1}^N\epsilon_i\right)^T+\beta\left[\prod_{i=1}^N (1+\mu+\epsilon_i)\right]^{T/N}.

Now let us assume that the fluctuations \epsilon_t are small, have zero mean, and satisfy E[\epsilon_i\epsilon_j]=\sigma^2 \delta_{ij}. Expanding to second order in \epsilon and taking the expectation value (the first-order terms average to zero), we obtain

\begin{array}{rl}   \displaystyle E\left[\left(\overline{r}^{UB}\right)^T\right] = & \displaystyle \alpha\left(1+\mu\right)^T\left[1+\frac{T(T-1)\sigma^2}{2(1+\mu)^2N}\right]  \\ & \displaystyle +\beta\left(1+\mu\right)^T\left[1+\frac{T(T-N)\sigma^2}{2(1+\mu)^2N}\right]  \\  & \displaystyle + \mathcal{O}(\sigma^4).  \end{array}

To obtain an estimator that is unbiased (to second order in \sigma), we now choose \alpha and \beta such that the term of order \sigma^0 is just the true growth rate (1+\mu)^T, and the term of order \sigma^2 vanishes. This gives the system

\displaystyle   \begin{array}{rl}  \displaystyle \alpha + \beta & =  1,\\  \displaystyle \frac{T-1}{2}\alpha+\frac{T-N}{2}\beta & = 0.  \end{array}

The solution of this linear system, \alpha = \frac{N-T}{N-1} and \beta = \frac{T-1}{N-1}, yields exactly the coefficients in eq. (5).
Of course, here we make the assumption of small fluctuations and also a specific ansatz for the estimator. If one has more information on the distribution of the growth rates r this may not be the most adequate one, but with what we know there’s not much more we can do!

Outlook

As you can see from the above discussion, estimating the expected (compounded) growth over a series of time steps is more complex than it appears at first sight. I’ve shown some basic results, but didn’t touch on many other important aspects:

  • In addition to the growth rate r, it is also interesting to consider the discount factor (1+r)^{-T}. Blume’s approach is extended to this observable in this paper by Ian Cooper.
  • If one assumes the growth factors 1+r in eq. (1) to be log-normally distributed, the problem can be treated analytically. Jacquier, Kane and Marcus discuss this case in detail in this paper, and also provide an exact result for the unbiased estimator.
  • The assumption of independent, identically distributed growth rates is not very realistic. On the one hand, we expect the distribution from which the annual returns are drawn to vary in time (i.e. due to underlying macroeconomic conditions). This is discussed briefly in Cooper’s paper. On the other hand, we also expect some amount of correlation between subsequent time steps, even if the underlying distribution does not change. It would be interesting to see how this modifies the results above.

Let me know if you find these interesting — I’ll be glad to expand on that in a future post!

Written by inordinatum

January 18, 2015 at 4:18 pm

Extreme avalanches in the ABBM model


Extreme value statistics is becoming a popular and well-studied field. Just like the sums of random variables exhibit universality described by the central limit theorem, maxima of random variables also obey a universal distribution, the generalized extreme-value distribution. This is interesting for studying e.g. extreme events in finance (see this paper by McNeil and Frey) and climate statistics (see e.g. this physics paper in Science and this climatology paper).

Having refereed a related paper recently, I’d like to share some insights on the statistics of extreme avalanches in disordered systems. An example is a particularly large jump of the fracture front when slowly breaking a disordered solid. A simple but reasonable model for such avalanches is the Alessandro-Beatrice-Bertotti-Montorsi (ABBM) model, originally invented for describing Barkhausen noise. I already touched upon it in a previous blog post, and will focus on it again in the following.
I’ll show an exact formula for the distribution of the maximum avalanche size in an interval, and connect this result to the universal extreme value distribution when considering a large number of avalanches.

Brief review: The ABBM model

To briefly recap (see my previous post for details), the ABBM model consists of a particle at position u(t) pulled across a random landscape by a spring. A key assumption of the model is that the force resulting from the disordered potential is a Brownian Motion in u. This allows computing many observables exactly.
For example, when the force due to the spring increases by f, it is well-known (see e.g. arxiv:0808.3217 and the citations therein) that the resulting displacement S of the particle follows the distribution

\displaystyle P_f(S) = \frac{f}{2\sqrt{\pi} S^{3/2}}\exp\left[-\frac{(kS-f)^2}{4S}\right].    (1)

Here, k is the spring constant. For simplicity I took a Brownian random force landscape with variance \sigma=1 here, but the results are straightforward to generalize. This result is basically the distribution of the first-passage time of a Brownian motion with drift k at a given level f. In this context it is also known as the Bachelier-Levy formula (see also my post on first-passage times).
For small forces, f \to 0, and weak springs, k \to 0, (1) becomes a power-law distribution P(S) \sim S^{-3/2}.

The largest avalanche in the ABBM model

Now let us consider particularly large avalanches. When applying a force f, the probability to have a total displacement S \leq S_0 is

\displaystyle F^{tot}_f(S_0) := \int_0^{S_0}\mathrm{d}S\,P_f(S).    (2)

Note, however, that the total displacement in this case is the sum of many avalanches triggered by infinitesimal steps in f. So how do we obtain the probability F_f(S_0) that all avalanches triggered during this ramp-up of the force are smaller than S_0? We decompose the force increase into n intervals of size f/n, and let n \to \infty:

\displaystyle F_f(S_0) = \lim_{n\to\infty} \left[F^{tot}_{f/n}(S_0)\right]^n.    (3)

Note that we use here the Markovian property of the Brownian Motion, which ensures that avalanches on disjoint intervals are independent. Only this property permits us to take the n-th power to obtain the cumulative distribution for the n intervals; on any non-Markovian landscape the avalanches in these intervals would be correlated and things would be much more complicated.

Combining equations (1), (2) and (3) we can compute F_f(S_0) explicitly:

\begin{array}{rl}  F_f(S_0) =& \lim_{n\to\infty} \left[F^{tot}_{f/n}(S_0)\right]^n = \exp\left[-f\partial_f\big|_{f=0}\int_{S_0}^{\infty}\mathrm{d}S\,P_f(S)\right] \\  =& \displaystyle \exp\left[-f\left(\frac{e^{-k^2 S_0/4}}{\sqrt{\pi S_0}} - \frac{k}{2}\text{erfc} \frac{k\sqrt{S_0}}{2}\right)\right].  \end{array}    (4)

This satisfies the normalization expected of a cumulative distribution function: For S_0 \to 0, F_f(S_0)\to 0 and for S_0 \to \infty, F_f(S_0)\to 1.
The probability distribution of the maximal avalanche size S_{max} is correspondingly

\displaystyle P_f(S_{max}) = \partial_{S_0}\big|_{S_0=S_{max}}F_f(S_0).    (5)

Eqs. (4) and (5) are a nice closed-form expression for the size distribution of the largest avalanche during the force increase by f!
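
Eq. (4) can also be checked numerically: integrate eq. (1) for a small force step f/n, raise the cumulative distribution to the n-th power as in eq. (3), and compare with the closed form. A minimal sketch (parameter values are arbitrary; scipy provides the numerical integration and the complementary error function):

import numpy as np
from scipy.integrate import quad
from scipy.special import erfc

k, f, S0 = 1.0, 2.0, 3.0

def P(S, f, k):
    """Avalanche size density of eq. (1)."""
    return f / (2 * np.sqrt(np.pi) * S**1.5) * np.exp(-(k * S - f)**2 / (4 * S))

# eq. (3): n small force steps of size f/n
n = 10_000
tail, _ = quad(P, S0, np.inf, args=(f / n, k))  # probability that one sub-step exceeds S0
F_limit = (1.0 - tail) ** n

# eq. (4): closed form of the n -> infinity limit
F_closed = np.exp(-f * (np.exp(-k**2 * S0 / 4) / np.sqrt(np.pi * S0)
                        - k / 2 * erfc(k * np.sqrt(S0) / 2)))

print(F_limit, F_closed)  # agree up to corrections of order f^2/n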

From few to many avalanches

As one goes to infinitesimal force steps f \to 0, only a single avalanche is triggered. Then it is clear from our construction and eqs. (4), (5) that P_f(S_{max}) \to P_f(S) as defined in (1). So, as expected, when considering a single avalanche the maximal avalanche size and the total avalanche size coincide.

On the other hand, for large force steps f \to \infty, the ramp-up of the force triggers many independent avalanches. The total displacement S is then the sum of many independent avalanche sizes S_1...S_n. Thus, by the central limit theorem, one expects to find a Gaussian distribution for S. We can see this explicitly from (1): The expectation value is \left\langle S \right\rangle = f/k, and the fluctuations around it are \delta S := S-\left\langle S \right\rangle. From (1) one finds that they scale like \delta S \sim \sqrt{f/k^3}. The normalized fluctuations d := \delta S \sqrt{k^3/f} have the distribution

\displaystyle P(d) = \frac{1}{2\sqrt{\pi}}\exp\left(-\frac{d^2}{4}\right) + \mathcal{O}(f^{-1/2}).     (6)

For large f, we are indeed left with a Gaussian distribution for the normalized fluctuations d. This is easily checked numerically, see figure 1 below.


Figure 1: Distribution of the total displacement S for increasing force steps f, as given in eq. (1). Observe how the distribution converges to the Gaussian limit eq. (6) indicated by black dotted lines.

So what happens with the maximal avalanche size S_{max} for large steps f \to \infty? S_{max} is now the maximum of many independent avalanche sizes S_1...S_n, and as mentioned in the introduction we expect its distribution to be a universal extreme value distribution.
Since only large S_0 are relevant in (4), the exponent can be approximated by

\displaystyle \frac{e^{-k^2 S_0/4}}{\sqrt{\pi S_0}} - \frac{k}{2}\text{erfc} \frac{k\sqrt{S_0}}{2} \approx \frac{2}{k^2 \sqrt{\pi} S_0^{3/2}}e^{-k^2 S_0/4}.    (7)

Inserting this back into (4), we see that for large f the distribution P_f(S_{max}) is centered around S_f := \frac{4}{k^2} \log f. The cumulative distribution function (4) is well approximated by

\displaystyle F_f(S_{max}) \approx \exp\left[- \frac{2}{k^2 \sqrt{\pi} S_f^{3/2}}e^{-k^2 (S_{max}-S_f)/4} \right].    (8)

This is, up to rescaling, the cumulative distribution function of the Gumbel extreme-value distribution e^{-e^{-x}}. It is easy to check this universal asymptotic form numerically, see figure 2 below. Note that the convergence here is much slower than for the total displacement shown in figure 1, since the typical scale for S_{max} only grows logarithmically with f.


Figure 2: Distribution of the size of the largest avalanche S_{max} for force steps of increasing size f, as given by eq. (5). Observe how the distribution converges to the universal Gumbel limit in eq. (8) indicated by black dotted lines.

For some applications, the limit of a very soft spring, k \to 0, is important. I leave the details to the reader but the main picture is that the exponential decay for large S_0 in eq. (7) is replaced by a power law S_0^{-1/2}. Correspondingly, the universal extreme-value distribution observed for large force steps f is no longer the Gumbel distribution (8) but instead a Fréchet distribution.

Side note: The minimal avalanche size

One may be tempted to approach similarly the problem of the minimal avalanche size for a slow ramp-up of the applied force. However, this is not well-defined: Due to the roughness of the Brownian force landscape, as we increase the force more and more slowly, the size of the smallest avalanche decreases more and more. Hence, its distribution will always be discretization-dependent and will not yield a finite result such as eq. (4) in the continuum limit.

All this gives a consistent picture of the maximal avalanche in the ABBM model. I find it really nice that it is so simple, knowing the avalanche size distribution (1), to express the distribution of the size of the largest avalanche in closed form and understand how it behaves!

Written by inordinatum

August 19, 2014 at 9:53 pm

Fokker-Planck equation for a jump diffusion process


One of the simplest stochastic processes is a Brownian motion with drift, or a diffusion process:

\displaystyle   \begin{array}{rl}  \dot{x}(t) =& \mu + \sigma\,\xi(t), \\  x(t) =& \mu t + \sigma \,W(t).  \end{array}   (1)

Here, \xi(t) is Gaussian white noise with mean zero and correlations \left\langle \xi(t_1)\xi(t_2)\right\rangle = \delta(t_1-t_2). Its integral W(t) = \int_0^t \mathrm{d} t'\,\xi(t') is a Brownian motion.
Continuous Itô stochastic processes such as eq. (1) are insufficient for applications where the random variable x may jump suddenly (such as in avalanches). A natural extension of (1) for modelling this is a so-called jump-diffusion process. Let us suppose that our jump sizes s are positive, independent and identically distributed with density Q(s). Then, the jump diffusion process y(t) is

\displaystyle   y(t) = x(t) + \sum_{i=1}^{N(t)} s_i = \mu t + \sigma\, W(t) + \sum_{i=1}^{N(t)} s_i,   (2)

where s_i, i\in \mathbb{N} are i.i.d. jump sizes as above, and N(t) is the number of jumps encountered up to time t. For simplicity, let us assume that jumps occur independently with rate \lambda, i.e. that the probability to have a jump in a time interval \mathrm{d}t is \lambda\, \mathrm{d}t. Then, N(t) is a Poisson process with rate \lambda.

It is well-known that the diffusion process x(t) in (1) is equivalently described by a partial differential equation for the distribution P(x) of x, the Fokker-Planck equation (FPE)

\displaystyle \partial_t P_t(x) = \frac{\sigma^2}{2}\partial_x^2 P_t(x) - \mu \partial_x P_t(x).   (3)

This representation is useful e.g. for first-passage problems: they correspond to various boundaries introduced in the PDE (3). So how does one generalise the Fokker-Planck equation (3) to the case of the jump-diffusion process y(t) in (2)? I will explain in the following that the answer is

\displaystyle \partial_t P_t(y) = \frac{\sigma^2}{2}\partial_y^2 P_t(y) - \mu \partial_y P_t(y)- \lambda P_t(y) + \lambda\,\int_0^\infty \mathrm{d}s \,Q(s)P_t(y - s) ,   (4)

and then discuss a specific example.

1. Deriving the jump-diffusion FPE

Let us consider a time step from t to t + \mathrm{d}t. The probability for a jump to occur during this interval is \lambda\,\mathrm{d}t, so

\displaystyle P_{t+\mathrm{d}t}(y) = (1-\lambda \mathrm{d}t)\left\langle P_t\left(y-\mu \,\mathrm{d}t - \sigma\left[W(t+\mathrm{d}t)-W(t)\right] \right) \right\rangle_W + \lambda \mathrm{d}t\,\left\langle P_t(y-s) \right\rangle_s,   (5)

where \left\langle \cdot \right\rangle_W denotes averaging over all realizations of the Brownian motion W, and \left\langle \cdot \right\rangle_s denotes averaging over the distribution Q(s) of the jump size s. Since the jump term is already multiplied by the jump probability \propto \mathrm{d}t, the drift and noise contributions there are of higher order in \mathrm{d}t and were dropped.

The averaging over the jump size in (5) yields a convolution with the jump size distribution Q(s):

\displaystyle \left\langle P_t(y-s) \right\rangle_s = \int_0^\infty \mathrm{d}s \,Q(s)P_t(y - s).

The average over the noise W in (5) is the same as for standard diffusion. During the interval \mathrm{d}t, the increment of the Brownian motion W in (2) is

\displaystyle W(t+\mathrm{d}t)-W(t) = \int_t^{t+\mathrm{d}t}\mathrm{d}t'\,\xi(t') =: V\sqrt{\mathrm{d}t},

where the last equality is a definition for V.
Since W is a Brownian motion, V is normally distributed:

\displaystyle \mathcal{N}(V) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{V^2}{2}\right).

Thus, the average over W in (5) is

\displaystyle  \begin{array}{rl}  \displaystyle \left\langle P_t\left(y-\mu \,\mathrm{d}t - \sigma\left[W(t+\mathrm{d}t)-W(t)\right] \right) \right\rangle_W = & \displaystyle \int_{-\infty}^\infty \mathrm{d}V\,\mathcal{N}(V)\, P_t\left(y-\mu \,\mathrm{d}t - \sigma V \sqrt{\mathrm{d}t} \right) \\   = & \displaystyle P_t(y) - \mu\, \mathrm{d}t\, \partial_y P_t(y)- \sigma\sqrt{\mathrm{d}t}\, \partial_y P_t(y) \int_{-\infty}^\infty \mathrm{d}V\, \mathcal{N}(V)\, V  \\  & \displaystyle + \frac{\sigma^2}{2}\mathrm{d}t\, \partial_y^2 P_t(y)\, \int_{-\infty}^\infty \mathrm{d}V\, \mathcal{N}(V)\, V^2 + \mathcal{O}(\mathrm{d}t^2) \\  =& \displaystyle P_t(y) - \mu\, \mathrm{d}t\, \partial_y P_t(y) + \frac{\sigma^2}{2}\mathrm{d}t\,\partial_y^2 P_t(y)  \end{array} .

In the last line we used the fact that the normal distribution has mean zero and variance 1, and dropped all terms of higher order than \mathrm{d}t. Inserting all this into (5), we obtain the jump-diffusion FPE (4).

2. Example: Exponential jumps

Now let us consider a simple example, where the jump sizes are distributed exponentially:

\displaystyle  Q(s) = \alpha\,\exp(-\alpha s),\quad\quad \alpha > 0  .

We will compute the distribution of y(t), starting from y(0)=0.
For this, we solve the jump-diffusion FPE (4) with the initial condition P_{t=0}(y) = \delta(y). Let us take a Fourier transform of (4) from y to k:

\begin{array}{rl}  \displaystyle  \tilde{P}_t(k) := & \displaystyle \int_{-\infty}^\infty \mathrm{d}y\,e^{i k y}\,P_t(y), \\  \displaystyle  \partial_t \tilde{P}_t(k) = & \displaystyle -\frac{\sigma^2}{2}k^2\,\tilde{P}_t(k) + i\mu k\, \tilde{P}_t(k) - \lambda \tilde{P}_t(k) + \lambda\tilde{Q}(k)\tilde{P}_t(k)  \end{array}  .

The different spatial modes decouple, and instead of a partial differential equation we obtain an ordinary differential equation for each value of the Fourier variable k.
For the above exponential form of the jump size distribution Q,

\displaystyle  \tilde{Q}(k) = \frac{\alpha}{\alpha - i k}  .

Furthermore, the initial condition P_{t=0}(y) = \delta(y) gives \tilde{P}_{t=0}(k) = 1. Hence, the solution \tilde{P}_t(k) reads

\displaystyle  \tilde{P}_t(k) = \exp\left[ \left( -\frac{\sigma^2}{2}k^2+  i\mu k + i\frac{\lambda k}{\alpha - i k} \right) t\right]  .

While it does not seem feasible to invert this Fourier transform to obtain a closed expression for P_t(y) (but if anyone has an idea, let me know!), \tilde{P}_t(k) is already enough to determine the moments of y(t). Taking derivatives, we obtain for example

\displaystyle  \begin{array}{rl}  \left\langle y(t) \right\rangle = & -i\partial_k \big|_{k=0} \tilde{P}_t(k) = \left(\frac{\lambda}{\alpha}+\mu\right)t \\  \left\langle y(t)^2 \right\rangle^c = & -\partial_k^2 \big|_{k=0} \tilde{P}_t(k) + \left[\partial_k \big|_{k=0} \tilde{P}_t(k)\right]^2  \\  = & \left(\frac{2\lambda}{\alpha^2}+\sigma^2\right)t, \\  ... &  \end{array}  .
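
These moments are easy to check by simulating the process (2) directly. For the distribution of y(t) at a fixed time the jump times do not matter, so one only needs the Gaussian part plus a Poisson number of exponentially distributed jump sizes. A minimal sketch (parameter values are my own choices):

import numpy as np

rng = np.random.default_rng(1)
mu, sigma, lam, alpha = 0.5, 1.0, 2.0, 3.0  # drift, noise strength, jump rate, jump-size parameter
t, n_paths = 2.0, 1_000_000

gaussian_part = mu * t + sigma * np.sqrt(t) * rng.standard_normal(n_paths)
n_jumps = rng.poisson(lam * t, size=n_paths)             # N(t) is Poisson with mean lambda*t
jump_part = rng.gamma(shape=n_jumps, scale=1.0 / alpha)  # sum of N(t) Exp(alpha) jumps (0 if no jumps)
y = gaussian_part + jump_part

print(np.mean(y), (lam / alpha + mu) * t)              # both approximately 2.33
print(np.var(y), (2 * lam / alpha**2 + sigma**2) * t)  # both approximately 2.89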

Similarly, solving the FPE (4) with an absorbing boundary allows computing first-passage times (or at least their moments) for our jump-diffusion process.

Have fun, and do let me know if you have any comments!

Written by inordinatum

November 20, 2013 at 8:04 pm

How does an absorbing boundary modify the propagator of a stochastic process?


A recurring theme throughout my blog are stochastic processes, widely used to model quantities under the influence of both deterministic evolution laws and random fluctuations. A complete solution for a stochastic process is its propagator: The probability density for the final state at some time t_f given an initial condition at time t_i. The propagator depends on the boundary conditions imposed on the stochastic process. For example, an absorbing boundary is useful for describing extinction in a stochastic model of population dynamics: When the population of a given species reaches zero, it cannot be re-introduced.
Typically, however, the “free” propagator for a stochastic process — i.e. the propagator without any boundaries — is much easier to determine than the propagator with any given boundary condition.

However, common sense tells us that the free propagator for a stochastic process already contains all there is to know about it. So I have wondered for quite a while if and how one can recover e.g. the propagator with an absorbing boundary, knowing only the free propagator. I found the solution only quite recently in this preprint on arXiv, and think that it is interesting enough to explain it here briefly.

The result

Consider a one-dimensional, stationary, Markovian stochastic process. Let us denote its free propagator by

\displaystyle G(x_i,x_f|t_f-t_i).

This is the density for the final value x_f at time t_f, given an initial value x_i at time t_i. Stationarity of the process implies that G is a function of the time difference \tau := t_f-t_i only. Define the Laplace transform of the free propagator by

\displaystyle \hat{G}(x_i,x_f|\lambda) := \int_{0}^{\infty} \mathrm{d}\tau\, e^{\lambda \tau}G(x_i,x_f|\tau).

Now let us add an absorbing boundary at x=a. Then, the propagator

\displaystyle G_a(x_i,x_f|t_f-t_i)

is the density for the final value x_f at time t_f, given an initial value x_i at time t_i, without touching the line x=a between t_i and t_f. The upshot of this post is the following simple relationship between the Laplace transforms of the free propagator G and of the propagator with absorbing boundary G_a :

\displaystyle \hat{G}_a(x_i,x_f|\lambda) = \hat{G}(x_i,x_f|\lambda)-\frac{\hat{G}(x_i,a|\lambda)\hat{G}(a,x_f|\lambda)}{\hat{G}(a,a|\lambda)}.      (1)

In case the denominator on the right-hand side is zero, one may have to perform some limiting procedure in order to make sense of this formula. We can now check how nicely it works for simple cases, like Brownian motion with drift, where the result is known by other means.

Example: Brownian motion with drift

Let us consider an example, the Brownian motion with drift \mu. Its stochastic differential equation is

\displaystyle  \partial_t X(t) = \mu + \xi(t),

where \xi is white noise with \overline{\xi(t)\xi(t')} =2\sigma\delta(t-t').

The free propagator is a simple Gaussian

\displaystyle  G(x_i,x_f|\tau) = \frac{1}{\sqrt{4\pi\sigma\tau}}\exp\left[-\frac{(x_f-x_i-\mu \tau)^2}{4\sigma \tau}\right].

Its Laplace transform is given by:

\displaystyle  \begin{array}{rl}  \hat{G}(x_i,x_f|\lambda) =& \int_{0}^{\infty} \mathrm{d}\tau\, e^{\lambda \tau}G(x_i,x_f|\tau) \\  =&\frac{1}{\sqrt{\mu^2-4\lambda\sigma}}\exp\left[\frac{\mu}{2\sigma}(x_f-x_i) -\frac{1}{2\sigma}\sqrt{\mu^2-4\lambda\sigma}|x_f-x_i|\right].  \end{array}

To compute G_a, let us assume x_i > a, x_f > a. Note this is without loss of generality, since if x_i and x_f are on opposite sides of the absorbing boundary the propagator G_a is zero. Applying the result (1), we have

\displaystyle   \begin{array}{rl}  \hat{G}_a(x_i,x_f|\lambda) =& \hat{G}(x_i,x_f|\lambda)-\frac{\hat{G}(x_i,a|\lambda)\hat{G}(a,x_f|\lambda)}{\hat{G}(a,a|\lambda)}  \\  \quad=& \frac{1}{\sqrt{\mu^2-4\lambda\sigma}}\exp\left[\frac{\mu}{2\sigma}(x_f-x_i) -\frac{1}{2\sigma}\sqrt{\mu^2-4\lambda\sigma}|x_f-x_i|\right] - \\   & \frac{1}{\sqrt{\mu^2-4\lambda\sigma}}\exp\left[\frac{\mu}{2\sigma}(a-x_i) -\frac{1}{2\sigma}\sqrt{\mu^2-4\lambda\sigma}|a-x_i| \right. \\  &\quad \left.+\frac{\mu}{2\sigma}(x_f-a) -\frac{1}{2\sigma}\sqrt{\mu^2-4\lambda\sigma}|x_f-a|\right]  \\  \quad=& \frac{1}{\sqrt{\mu^2-4\lambda\sigma}}e^{\frac{\mu}{2\sigma}(x_f-x_i)}\left[e^{-\frac{1}{2\sigma}\sqrt{\mu^2-4\lambda\sigma}|x_f-x_i|} - e^{-\frac{1}{2\sigma}\sqrt{\mu^2-4\lambda\sigma}(x_i+x_f-2a)}\right].    \end{array}

The inverse Laplace transform of this expression is:

\displaystyle   \begin{array}{rl}  G_a(x_i,x_f|\tau) =& G(x_i,x_f|\tau)-e^{\frac{\mu}{\sigma}(a-x_i)}G(2a-x_i,x_f|\tau) \\  =& \frac{1}{\sqrt{4\pi\sigma\tau}}e^{-\frac{\mu^2 \tau}{4\sigma} + \mu\frac{x_f-x_i}{2\sigma}}\left[e^{-\frac{(x_f-x_i)^2}{4\sigma\tau}}-e^{-\frac{(x_i+x_f-2a)^2}{4\sigma\tau}}\right].  \end{array}

This is, of course, exactly the same result (the so-called Bachelier-Levy formula) which I had already discussed earlier in a guest blog post at Ricardo’s blog, using slightly different methods (a variant of the Cameron-Martin-Girsanov theorem). Note that some signs which were wrong in the original post are corrected here.
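
The result can also be verified by a brute-force Monte Carlo simulation: propagate discretized trajectories of the SDE, discard every path that touches the boundary, and compare a histogram of the surviving end points with the formula above. A minimal sketch (parameter values are my own; note the noise convention \overline{\xi(t)\xi(t')} = 2\sigma\delta(t-t'), so the Euler step uses \sqrt{2\sigma\,dt}):

import numpy as np

rng = np.random.default_rng(2)
mu, sigma, a = 0.3, 0.5, 0.0  # drift, noise strength, position of the absorbing boundary
x_i, tau, dt, n_paths = 1.0, 1.0, 1e-3, 200_000

x = np.full(n_paths, x_i)
alive = np.ones(n_paths, dtype=bool)
for _ in range(int(tau / dt)):
    x[alive] += mu * dt + np.sqrt(2 * sigma * dt) * rng.standard_normal(alive.sum())
    alive &= x > a            # absorb paths that have crossed x = a

def G(xi, xf, t):             # free propagator
    return np.exp(-(xf - xi - mu * t)**2 / (4 * sigma * t)) / np.sqrt(4 * np.pi * sigma * t)

def G_abs(xi, xf, t):         # propagator with absorbing boundary at a, formula above
    return G(xi, xf, t) - np.exp(mu * (a - xi) / sigma) * G(2 * a - xi, xf, t)

hist, edges = np.histogram(x[alive], bins=40, range=(a, 4.0))
centers = 0.5 * (edges[1:] + edges[:-1])
density_mc = hist / (n_paths * (edges[1] - edges[0]))
print(np.max(np.abs(density_mc - G_abs(x_i, centers, tau))))  # small, up to time-discretization bias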

The proof

Now let us go back to the case of a general stochastic process, and discuss how formula (1) can be proven. Let us assume that the stochastic process in question has a path-integral representation, i.e. that

\displaystyle G(x_i,x_f|\tau) = \int_{x(t_i)=x_i}^{x(t_f)=x_f} \mathcal{D}[x(t)]\,e^{-S[x(t)]}

for some action S.

Now, to add an absorbing boundary at x=a we weigh each crossing of x=a by a factor e^{-\gamma} in the path integral, and then let \gamma \to \infty. This effectively retains only those trajectories in the path integral, which never touch x=a. We thus have:

\displaystyle G_a(x_i,x_f|\tau) = \lim_{\gamma \to \infty}\int_{x(t_i)=x_i}^{x(t_f)=x_f} \mathcal{D}[x(t)]\,e^{-S[x(t)]-\gamma\int_{t_i}^{t_f}\mathrm{d}t\,\delta(x(t)-a)}.

Now, to evaluate the path integral, we expand it in a series in \gamma (or, in other words, in the number k of crossings of x=a). We get:

\displaystyle   \begin{array}{rl}  \lefteqn{G_a^{(\gamma)}(x_i,x_f|\tau) = G(x_i,x_f|\tau) + }&\\  & + \sum_{k=1}^{\infty} (-\gamma)^k \int_{t_i}^{t_f}\mathrm{d}t_1\, G(x_i,x_1|t_1-t_i) \int_{t_1}^{t_f}\mathrm{d}t_2\, G(x_1,x_2|t_2-t_1) \cdots \\  &  \cdots \int_{t_{k-1}}^{t_f}\mathrm{d}t_k\,G(x_{k-1},x_k|t_k-t_{k-1})\,G(x_k,x_f|t_f-t_k)  \end{array}  .

Note that here I cancelled the factor \frac{1}{k!}, coming from the Taylor expansion of e^{-\gamma \int \mathrm{d}t\,\delta(x(t)-a)}, against the k! possibilities of ordering the times t_1\cdots t_k. This is why I can assume them to be in ascending order above. Also, the \delta functions actually set all x_1=x_2=\cdots=x_k=a.
Taking Laplace transforms, the time integrals (which are a k-fold convolution) simplify to a simple product:

\displaystyle  \begin{array}{rl}  \hat{G}_a^{(\gamma)}(x_i,x_f|\lambda) =& \hat{G}(x_i,x_f|\lambda) + \\  & + \sum_{k=1}^{\infty} (-\gamma)^k \hat{G}(x_i,a|\lambda)\hat{G}(a,a|\lambda)^{k-1} \hat{G}(a,x_f|\lambda) \\  =& \hat{G}(x_i,x_f|\lambda) - \frac{\gamma \hat{G}(x_i,a|\lambda)\hat{G}(a,x_f|\lambda)}{1+\gamma \hat{G}(a,a|\lambda)}\\  =& \hat{G}(x_i,x_f|\lambda) - \frac{\hat{G}(x_i,a|\lambda)\hat{G}(a,x_f|\lambda)}{\gamma^{-1}+ \hat{G}(a,a|\lambda)}  \end{array}.

Now, one easily obtains the claimed result:

\displaystyle  \begin{array}{rl}  \hat{G}_a(x_i,x_f|\lambda) =& \lim_{\gamma \to \infty}\hat{G}_a^{(\gamma)}(x_i,x_f|\lambda) \\  =& \hat{G}(x_i,x_f|\lambda) - \frac{\hat{G}(x_i,a|\lambda)\hat{G}(a,x_f|\lambda)}{\hat{G}(a,a|\lambda)}  \end{array}  .

This computation can, of course, be generalized to more complicated situations like several boundaries, higher dimensions, etc. For details see again the original preprint by C. Grosche. Have fun and let me know if you have questions!

Written by inordinatum

November 24, 2012 at 11:50 pm

The Alessandro-Beatrice-Bertotti-Montorsi model


When a magnet is submitted to a slowly varying external magnetic field, its magnetization changes not smoothly, but in discrete jumps. These avalanches can be made audible using an induction coil. The resulting crackling signal is called Barkhausen noise. By analysing various features of this signal one can deduce information on material properties, for example residual stress or defect sizes, which is important for applications such as non-destructive testing. In this post, I will discuss a simple model describing the physics of Barkhausen noise. I will explain some of its predictions, including the stationary signal distribution, and sizes and durations of the avalanches.

Stochastic differential equation of the ABBM model

Example of signal generated by ABBM model

As you probably know, a ferromagnetic material with zero net magnetization consists of many magnetic domains. Inside one domain, the spins are oriented in parallel (thus each domain has a non-vanishing magnetization); however, the magnetizations of different domains are randomly aligned and cancel out on average.
We will be interested in so-called soft magnets. In these materials domain walls can move quite freely, until they encounter a defect. This means they have a narrow hysteresis loop. The dominant mechanism for magnetization is the motion of domain walls (and not the changing of magnetization inside one domain, as for hard magnets).

Alessandro, Beatrice, Bertotti and Montorsi (for details, see ref. [1]) model the change in magnetization under an external field through the motion of a single domain wall transverse to the magnetic field. They propose the following stochastic differential equation for the domain wall position u(t):

\displaystyle \Gamma \partial_t u(t) = H(t) -k u(t) + F(u(t))

Here, H(t) is the external applied magnetic field, and \Gamma is the domain wall relaxation time. The term -ku(t) is the so-called demagnetizing field, which keeps the domain wall from moving indefinitely. F(u) is a random pinning force, which depends on the position of the particle u(t). In the ABBM model, it is assumed to be a realization of Brownian motion. For more details on the motivation of the individual contributions, see e.g. the review in ref. [2].

Note that the random pinning force is quenched, i.e. depends on the particle position u(t) and not directly on the time t. A time-dependent random force would be a model for thermal noise (instead of localized defects).

Simulating the stochastic differential equation above yields a trajectory (see the figure above) which is very similar to the results of Barkhausen noise measurements. Due to the specific properties of Brownian motion, the ABBM model is easy to treat analytically. I will now discuss several observables which can be computed analytically: The stationary distribution of domain wall velocities, and the distributions of avalanche sizes and durations.
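
For readers who want to reproduce such a trajectory, here is a minimal direct simulation of the SDE above (my own sketch, not code from ref. [1]), in units \Gamma = k = 1. The Brownian pinning force F(u) is generated on a grid in u as the wall advances, and the velocity is clamped at zero, reflecting the fact that for a monotonously increasing field the wall never moves backwards. With a slow linear ramp H(t) = c\,t, the resulting velocity signal is intermittent, as in the figure.

import numpy as np

rng = np.random.default_rng(3)
k, Gamma, c = 1.0, 1.0, 0.3   # demagnetizing constant, relaxation constant, ramp rate of H(t) = c*t
dt, du, t_max = 1e-2, 1e-3, 400.0

F_grid = [0.0]                # quenched Brownian pinning force F(u), built up as the wall advances

def F(u):
    i = int(u / du)
    while i + 1 >= len(F_grid):  # extend the random force landscape ahead of the wall
        F_grid.append(F_grid[-1] + np.sqrt(du) * rng.standard_normal())
    frac = u / du - i
    return (1 - frac) * F_grid[i] + frac * F_grid[i + 1]

u, velocities = 0.0, []
for step in range(int(t_max / dt)):
    H = c * step * dt
    v = max((H - k * u + F(u)) / Gamma, 0.0)  # forward-only motion for increasing H
    u += v * dt
    velocities.append(v)

v = np.array(velocities)
print(np.mean(v[len(v) // 2:]))  # long-time mean velocity, approximately c/k = 0.3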

Stationary domain wall velocity distribution

To obtain a stationary state of domain wall motion, one ramps up the external field H linearly:
\displaystyle H(t) = c t.
Then, the instantaneous domain wall velocity \dot{u} has a stationary distribution given by
\displaystyle P(\dot{u}) = \frac{e^{-\dot{u}}\dot{u}^{-1+c}}{\Gamma(c)}.
Here I use dimensionless units k = \Gamma = 1. One way to derive this result is by solving the Fokker-Planck equation associated to the SDE above (as was done by ABBM in ref. [1]).
This distribution is interesting since it exhibits two different kinds of behaviour: For c < 1, P(\dot{u}=0)=\infty, meaning that the domain wall is pinned at near zero velocity most of the time. On the other hand, for c > 1, P(\dot{u}=0)=0 and the motion is smooth.
In the following I will focus on the case c<1, where we have intermittent avalanches.

Avalanche statistics

One way to obtain information on avalanches in the ABBM model is mapping the SDE above to a path integral (the Martin-Siggia-Rose formalism). This is done e.g. in references [3] and [4] below, where the resulting path integral is solved for any external field H(t). Probably the simplest way to define an avalanche is to apply a step-like field, H(t) = w \theta(t). The instantaneous increase at t=0 from H=0 to H=w triggers precisely one avalanche. Its size S and duration T are distributed according to (again in dimensionless units):
\displaystyle P(S) = \frac{w}{\sqrt{4\pi}S^{3/2}}e^{-\frac{(w-S)^2}{4S}},
\displaystyle P(T) = \frac{w}{(2\sinh T/2)^2}e^{-\frac{w}{e^T-1}}.
For small w and small avalanches, these distributions display a large power-law regime where one has
P(S) \sim S^{-3/2}, \quad P(T) \sim T^{-2}.
These power laws indicate that avalanches in the ABBM model are scale-free: There are both extremely small and extremely large ones (between the microscopic cutoff scale given by w and the macroscopic cutoff scale given by k=1).

Universality (or the lack of it)

The exponents of the power-law regimes in P(S) and P(T) above are universal for mean-field models of elastic interfaces. They do not depend on material properties or on details of the dynamics, but only on the fact that one has sufficiently long range interactions between different parts of the interface. These exponents are well-verified experimentally for magnets falling into the mean-field universality class.

Being universal, they also apply to other elastic interfaces with long-range interactions: Some even argue that the P(S) \sim S^{-3/2} behaviour is related to the Gutenberg-Richter distribution of earthquake moments. The avalanche size S would correspond to the earthquake moment, related to its magnitude M (the number reported in newspapers) via M \propto \log_{10} S. The exponent 3/2 for P(S) would give a Gutenberg-Richter b-value of b\approx 0.75, which is not too far off from the observed one.

On the other hand, I find it a little overly simplistic to try and find universal aspects of completely disparate physical systems. We know, after all, that the motion of magnetic domain walls and of earthquakes is not the same thing — so maybe the more interesting physics is in their differences, rather than their similarities.
A more detailed analysis of avalanches thus requires going beyond just power-law exponents. Several more refined observables — like mean shapes of avalanches — have been proposed to that end. It has been shown (see ref. [5]) that they are sensitive to the details of the dynamics. In my view, the interesting question (which is still not completely answered) is: What features should one look at, in order to determine if a signal is Barkhausen noise or something else? What can one learn from it about the microscopic disorder in one particular sample by listening to the Barkhausen noise it emits?

Outlook and References

If there is interest, in the future I may extend this blog post to a Wikipedia article, since I believe the model is simple but frequently used. It is still a field of active research, thus the list of references is certainly incomplete. Let me know if you want to add anything!

[1] B. Alessandro, C. Beatrice, G. Bertotti, and A. Montorsi, “Domain-wall dynamics and Barkhausen effect in metallic ferromagnetic materials. I. Theory,” J. Appl. Phys., vol. 68, no. 6, p. 2901, 1990.

[2] F. Colaiori, “Exactly solvable model of avalanches dynamics for Barkhausen crackling noise,” Adv. Phys., vol. 57, no. 4, p. 287, Jul. 2008.

[3] P. Le Doussal and K. J. Wiese, “Distribution of velocities in an avalanche,” Europhys. Lett., vol. 97, no. 4, p. 46004, Apr. 2012.

[4] A. Dobrinevski, P. Le Doussal, and K. J. Wiese, “Nonstationary dynamics of the Alessandro-Beatrice-Bertotti-Montorsi model,” Phys. Rev. E, vol. 85, no. 3, p. 18, Mar. 2012.

[5] S. Zapperi, C. Castellano, F. Colaiori, and G. Durin, “Signature of effective mass in crackling-noise asymmetry,” Nature Physics, vol. 1, no. 1, p. 46, Oct. 2005.

Written by inordinatum

October 11, 2012 at 8:37 pm

A quick introduction to the Martin-Siggia-Rose formalism


The Martin-Siggia-Rose (MSR) formalism is a method to write stochastic differential equations (SDEs) as a field theory formulated using path integrals. These are familiar from many branches of physics, and are simple to treat perturbatively (or, in some cases, even to solve exactly).

We start from an SDE of the form:

\displaystyle \partial_t x(t) = F(x(t),t) + \xi(x(t),t),

where F is a deterministic function and \xi(x(t),t) is Gaussian noise with an arbitrary correlation function G:

\displaystyle \left\langle \xi(x,t) \xi(x',t') \right\rangle = G(x,t,x',t').

MSR recognized that observables, averaged over solutions of this SDE, can be written as a path integral with a certain action:

\left\langle \mathcal{O}[x(t)] \right\rangle = \int \mathcal{D}[x,\tilde{x}]\,\mathcal{O}[x(t)]\,e^{-S[x,\tilde{x}]},
where
\displaystyle S[x,\tilde{x}] = \int_t i\tilde{x}(t)\left[\partial_t x(t) - F(x(t),t)\right] + \frac{1}{2}\int_{t,t'} G(x(t),t,x(t'),t')\tilde{x}(t)\tilde{x}(t').

A useful special case is an autonomous (time-independent) SDE with multiplicative noise:

\displaystyle \partial_t x(t) = F(x) + H(x)\xi(t),

where \overline{\xi(t)\xi(t')} = 2\sigma \delta(t-t') is Gaussian white noise. The resulting action for the path integral is local in time:

\displaystyle S[x,\tilde{x}] = \int_t i\tilde{x}(t)\left[\partial_t x(t) - F(x(t))\right] + \sigma\, H(x(t))^2\tilde{x}(t)^2.

I will now show a few concrete examples, and then explain how this mapping is derived.

Example: Ornstein-Uhlenbeck process

The Ornstein-Uhlenbeck process is one of the few stochastic processes treatable analytically. Its SDE is:
\displaystyle \partial_t x(t) = -\alpha x(t) + \xi(t),
where \alpha > 0 (else the process blows up) and \xi(t) is white noise:
\left\langle \xi(t)\xi(t') \right\rangle = 2\sigma \delta(t-t').

In this section, I will show that the generating functional for this Ornstein-Uhlenbeck process is given by:
\displaystyle \left\langle e^{\int_t \lambda(t) x(t)} \right\rangle = \exp \left[ \frac{\sigma}{2\alpha}\int_{-\infty}^\infty \mathrm{d} s_1 \int_{-\infty}^\infty \mathrm{d} s_2 \,e^{-\alpha|s_1-s_2|}\, \lambda(s_1) \lambda(s_2) \right] .

To show this, we apply the MSR formula to express the generating functional as a path integral:

\displaystyle \left\langle e^{\int_t \lambda(t) x(t)} \right\rangle = \int \mathcal{D}[x,\tilde{x}] \exp\left\{-\int_t i\tilde{x}(t)\left[\partial_t x(t) +\alpha x(t)\right] - \sigma\int_{t} \tilde{x}(t)^2 + \int_t \lambda(t) x(t)\right\}.

This path integral can be solved exactly. The key observation is that the exponent is linear in x(t), and hence one can integrate over this field. Using \int_x e^{i x q} \propto \delta(q), this gives (up to a normalization factor \mathcal{N}) a \delta-functional:

\displaystyle \left\langle e^{\int_t \lambda(t) x(t)} \right\rangle = \mathcal{N}\int \mathcal{D}[\tilde{x}] e^{ -\sigma\int_{t} \tilde{x}(t)^2 } \delta\left[i\left(\partial_t - \alpha \right) \tilde{x}(t)+\lambda(t)\right].
The e^{ -\sigma\int_{t} \tilde{x}(t)^2 } factor remains since it is the only term in the action independent of x(t).

The \delta-functional fixes \tilde{x}(t) in terms of \lambda:
\displaystyle \tilde{x}(t) = -i \int_{t}^\infty \mathrm{d} s \,e^{\alpha(t-s)}\, \lambda(s) .

From this we get
\displaystyle \int_t \tilde{x}(t)^2 = - \int_t \int_{t}^\infty \mathrm{d} s_1 \,e^{\alpha(t-s_1)}\, \lambda(s_1) \int_{t}^\infty \mathrm{d} s_2 \,e^{\alpha(t-s_2)}\, \lambda(s_2) \\  =  - \frac{1}{2\alpha}\int_{-\infty}^\infty \mathrm{d} s_1 \int_{-\infty}^\infty \mathrm{d} s_2 \,e^{-\alpha|s_1-s_2|}\, \lambda(s_1) \lambda(s_2)

The normalization factor \mathcal{N} is seen to be 1 by imposing that for \lambda=0 we must have \left\langle e^{\int_t \lambda(t) x(t)} \right\rangle = 1.
Plugging this in gives the generating functional, as claimed above:
\displaystyle \left\langle e^{\int_t \lambda(t) x(t)} \right\rangle = \exp\left[ \frac{\sigma}{2\alpha}\int_{-\infty}^\infty \mathrm{d} s_1 \int_{-\infty}^\infty \mathrm{d} s_2 \,e^{-\alpha|s_1-s_2|}\, \lambda(s_1) \lambda(s_2) \right] .

One simple consequence is the exponential decay of the connected correlation function:
\displaystyle \left\langle x(t)x(t')\right\rangle^c = \frac{\delta^2}{\delta \lambda(t) \delta \lambda(t')}\big|_{\lambda=0} \ln \left\langle e^{\int_t \lambda(t) x(t)} \right\rangle = \frac{\sigma}{\alpha}e^{-\alpha|t-t'|}.

But of course, the full generating functional tells us much more – in some sense, it is the exact solution of the SDE!

Note added 01/12/2012: Of course, the generating functional can be written down immediately once one knows the two-point correlation function and the fact that the Ornstein-Uhlenbeck process is Gaussian. However, the latter fact is not immediately obvious (and was not used in our derivation), in fact one can see our MSR calculation as one way to prove Gaussianity for the OU process.
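
As a numerical cross-check of this result (my own addition, not part of the original post), one can integrate the Ornstein-Uhlenbeck SDE with a simple Euler-Maruyama scheme and compare the stationary correlation function with \frac{\sigma}{\alpha}e^{-\alpha|t-t'|}:

import numpy as np

rng = np.random.default_rng(4)
alpha, sigma = 1.0, 0.5  # SDE parameters; the noise correlator is 2*sigma*delta(t-t')
dt, n_relax, n_paths = 1e-3, 20_000, 5_000

x = np.zeros(n_paths)
for _ in range(n_relax):  # relax to the stationary state
    x += -alpha * x * dt + np.sqrt(2 * sigma * dt) * rng.standard_normal(n_paths)

lag = 0.5                 # time separation |t - t'|
x0 = x.copy()
for _ in range(int(lag / dt)):
    x += -alpha * x * dt + np.sqrt(2 * sigma * dt) * rng.standard_normal(n_paths)

print(np.mean(x0 * x))                       # empirical <x(t) x(t + lag)>
print(sigma / alpha * np.exp(-alpha * lag))  # analytic result, approximately 0.30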

Derivation of the Martin-Siggia-Rose formula

Now that we’ve seen what the MSR path integral formulation is useful for, let us see how it can be derived. The average over an observable can be written as a path integral with the SDE enforced through a \delta-functional:
\displaystyle \left\langle \mathcal{O}[x(t)] \right\rangle = \left\langle\int \mathcal{D}[x]\,\mathcal{O}[x(t)]\,\delta\left[\partial_t x(t) - F(x(t),t) - \xi(x(t),t)\right]\right\rangle.
Now, we rewrite the \delta-functional using the formula \delta(q) \propto \int_{\tilde{x}} e^{i \tilde{x} q}. Since the \delta-functional is a product of \delta-functions, one at each point in time, effectively we are introducing an auxiliary field \tilde{x}(t):
\displaystyle \left\langle \mathcal{O}[x(t)] \right\rangle = \left\langle\int \mathcal{D}[x,\tilde{x}]\,\mathcal{O}[x(t)]\,\exp\left\{-\int_t i\tilde{x}(t)\left[\partial_t x(t) - F(x(t),t) - \xi(x(t),t)\right]\right\}\right\rangle.
Now, the only thing that depends on the noise \xi(x(t),t) is the factor e^{\int_t i\tilde{x}(t)\xi(x(t),t)}. Since \xi is Gaussian, the average of this factor can be simply expressed as
\displaystyle \left\langle e^{\int_t i\tilde{x}(t)\xi(x(t),t)} \right\rangle = \exp\left[-\frac{1}{2}\int_{t,t'}\tilde{x}(t)\tilde{x}(t')\left\langle\xi(x(t),t)\xi(x(t'),t')\right\rangle\right] =  \exp\left[-\frac{1}{2}\int_{t,t'}\tilde{x}(t)\tilde{x}(t')G(x(t),t,x(t'),t') \right]
Altogether we obtain
\displaystyle \left\langle \mathcal{O}[x(t)] \right\rangle = \int \mathcal{D}[x,\tilde{x}]\,\mathcal{O}[x(t)]\,\exp\left\{-\int_t i\tilde{x}(t)\left[\partial_t x(t) - F(x(t),t)\right]-\frac{1}{2}\int_{t,t'}\tilde{x}(t)\tilde{x}(t')G(x(t),t,x(t'),t') \right\}
which is just the MSR formula claimed above.

Amazingly, I did not yet find a clear derivation of this path integral representation on Wikipedia or anywhere else on the web… So I hope this will be useful for someone starting to work with the formalism!

If there is interest, I may write another blog post in the future with more examples and explanations e.g. on the physical role of the auxiliary field \tilde{x}.

Written by inordinatum

September 27, 2012 at 11:03 pm