inordinatum

Physics and Mathematics of Disordered Systems


Fluctuations of the time-weighted average price


Motivated by an interview question posed to a friend of mine recently, today I’d like to talk about the time-weighted average price (TWAP) of a financial asset (e.g. a stock).

Let’s consider the price of our stock on a short time scale (e.g. intraday) corresponding to the interval t\in [0,1]. Its price will start at P_0 and evolve through P_t to P_1; the time-weighted average price is

\displaystyle T:=\int_0^1 P_t \, \mathrm{d}t

T defined in this way is interesting, since it is the effective price obtained when executing a large order in the market by splitting it into smaller chunks and executing them throughout the day at a constant rate. This TWAP algorithm is one of the basic algorithmic execution tools used e.g. by asset managers to minimize market impact costs. More sophisticated ones include the volume-weighted average price (VWAP), where one adjusts the volume executed at each time during the day in proportion to the typical trading volume at that time.
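To make the difference between the two schemes concrete, here is a minimal sketch. The function names `twap_fill_price` and `vwap_fill_price`, the sample prices and the volume profile are all made up for illustration; a real execution algorithm would also have to deal with market impact, which is ignored here.

```python
def twap_fill_price(prices):
    """Average price paid when equal quantities are executed at each sampled time."""
    return sum(prices) / len(prices)


def vwap_fill_price(prices, volume_profile):
    """Average price paid when the executed quantity is proportional to a volume profile."""
    total_volume = sum(volume_profile)
    return sum(p * v for p, v in zip(prices, volume_profile)) / total_volume


# Hypothetical intraday prices, sampled at equally spaced times.
prices = [100.0, 101.0, 99.0, 102.0]
# Hypothetical typical trading volumes at those times (more activity at open and close).
volume_profile = [4.0, 1.0, 1.0, 4.0]

twap = twap_fill_price(prices)
vwap = vwap_fill_price(prices, volume_profile)
print("TWAP fill price:", twap)
print("VWAP fill price:", vwap)
```

With equally spaced samples the TWAP is just the plain average of the prices, which is the discrete analogue of the integral defining T.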

With this motivation, let us look at the statistics of the TWAP T, and especially its fluctuations. Usually, one approximates the logarithm of the stock price by a Brownian motion (corresponding to the assumption that relative price changes are independent and identically distributed). However, on small time scales (when fluctuations are small), the compounding effect is negligible. For simplicity, I will hence assume that S_t := P_t - P_0 is a Brownian motion with drift \mu and variance \sigma per unit time. This means that absolute price changes are i.i.d.

Moments of the TWAP

The Gaussian process S_t is fully determined by its first two moments,

\displaystyle \left\langle S_t \right\rangle = \mu t, \quad \left\langle (S_{t_1}-\mu t_1)(S_{t_2}-\mu t_2) \right\rangle = \sigma \, \min(t_1, t_2)

From this, we obtain the first moments of the TWAP T (for simplicity I subtract the trivial P_0 component, so that from now on T = \int_0^1 S_t \, \mathrm{d}t):

\displaystyle \left\langle T \right\rangle = \int_0^1 \left\langle S_t\right\rangle\, \mathrm{d}t = \int_0^1 \mu\,t\, \mathrm{d}t = \frac{1}{2}\mu

\displaystyle \left\langle T^2 \right\rangle = \int_0^1\mathrm{d}t_1 \int_0^1\mathrm{d}t_2 \left\langle S_{t_1} S_{t_2}\right\rangle = \int_0^1  \mathrm{d}t_1 \int_0^1 \mathrm{d}t_2 \, \left[\mu^2 t_1 t_2 + \sigma\, \min(t_1, t_2)\right]= \frac{\mu^2}{4} +\frac{\sigma}{3}

In other words, the variance of the TWAP relates to the variance of the intraday stock price change as

\displaystyle \frac{ \left\langle T^2 \right\rangle^c}{ \left\langle S_1^2 \right\rangle^c} = \frac{\sigma/3}{\sigma} = \frac{1}{3}

(\left\langle\,\right\rangle^c denotes connected expectation values here).

This procedure can easily be continued to calculate higher moments of the TWAP T in terms of \mu and \sigma. In fact, given that a linear combination of jointly Gaussian variables is again Gaussian, we can directly infer that the TWAP has a Gaussian distribution with mean \mu/2 and variance \sigma/3.
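The mean \mu/2 and variance \sigma/3 are easy to check with a small Monte Carlo simulation. The sketch below is only an illustration: the parameter values, the number of paths, the time discretization and the helper name `simulate_twap` are my own choices, and the sample estimates agree with the exact values only up to statistical and discretization errors.

```python
import math
import random


def simulate_twap(mu, sigma, n_steps, rng):
    """One sample of T = int_0^1 S_t dt for a Brownian motion S_t with drift mu
    and variance sigma per unit time, on a grid of n_steps time steps."""
    dt = 1.0 / n_steps
    s = 0.0      # S_0 = 0, since P_0 has been subtracted
    total = 0.0  # running value of the time integral
    for _ in range(n_steps):
        s_next = s + mu * dt + math.sqrt(sigma * dt) * rng.gauss(0.0, 1.0)
        total += 0.5 * (s + s_next) * dt  # trapezoidal rule on this step
        s = s_next
    return total


mu, sigma = 0.2, 1.0         # illustrative parameter values
rng = random.Random(12345)   # fixed seed for reproducibility
samples = [simulate_twap(mu, sigma, 200, rng) for _ in range(4000)]

mean_T = sum(samples) / len(samples)
var_T = sum((x - mean_T) ** 2 for x in samples) / (len(samples) - 1)

print("sample mean    ", mean_T, " expected", mu / 2)
print("sample variance", var_T, " expected", sigma / 3)
```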

Distribution of the TWAP via the MSR formalism

There is another way to obtain the full distribution of the TWAP directly, via the Martin-Siggia-Rose (MSR) formalism. Using the MSR approach, we know that the generating function of any linear functional of the Brownian motion S_t is

\displaystyle \left\langle \exp\left[\int_0^{t_0}\,\lambda_t S_t\,\mathrm{d}t\right]\right\rangle = \exp\left[\int_0^{t_0}\left(\mu\tilde{S}_t + \frac{\sigma}{2} \tilde{S}_t^2 \right)\mathrm{d}t \right]

where \tilde{S}_t is the solution of

\displaystyle \partial_t\tilde{S}_t = -\lambda_t, \quad \tilde{S}_{t>t_0} = 0

(the exponent is the mean plus half the variance of \int_0^{t_0}\tilde{S}_t\,\mathrm{d}S_t, which equals \int_0^{t_0}\lambda_t S_t\,\mathrm{d}t after integrating by parts). Applying this to a constant \lambda_t = \lambda / t_0, i.e. to the time average T = \frac{1}{t_0}\int_0^{t_0} S_t\,\mathrm{d}t, we get \tilde{S}_t = \lambda(1-t/t_0)\theta(t_0-t). From this follows the generating function of the TWAP T

\displaystyle \left\langle e^{\lambda T} \right\rangle = \exp \left(\frac{\mu}{2}t_0\lambda + \frac{\sigma}{6}t_0\lambda^2 \right)

This is the generating function of a Gaussian distribution with mean \mu t_0/2 and variance \sigma t_0/3; for t_0=1 these are the same moments as computed above.
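As a sanity check, the exponent of the generating function can be evaluated numerically. The sketch below relies on two facts. First, \int_0^{t_0}\lambda_t S_t\,\mathrm{d}t = \int_0^{t_0}\tilde{S}_t\,\mathrm{d}S_t is Gaussian with mean \mu\int\tilde{S}_t\,\mathrm{d}t and variance \sigma\int\tilde{S}_t^2\,\mathrm{d}t. Second, a Gaussian with mean m and variance V has \log\left\langle e^{X}\right\rangle = m + V/2. The function name, the midpoint quadrature and the parameter values are illustrative choices of mine.

```python
def log_generating_function(lam, mu, sigma, t0, n=10000):
    """log <exp(lam * T)> for T = (1/t0) * int_0^t0 S_t dt, computed by
    midpoint quadrature of the mean and variance of int S~_t dS_t,
    with S~_t = lam * (1 - t / t0)."""
    dt = t0 / n
    mean = 0.0
    var = 0.0
    for k in range(n):
        t = (k + 0.5) * dt                 # midpoint of the k-th sub-interval
        s_tilde = lam * (1.0 - t / t0)
        mean += mu * s_tilde * dt          # drift contribution
        var += sigma * s_tilde ** 2 * dt   # noise contribution
    return mean + 0.5 * var                # Gaussian: log <e^X> = mean + variance / 2


lam, mu, sigma, t0 = 0.7, 0.2, 1.0, 1.0    # illustrative values
numeric = log_generating_function(lam, mu, sigma, t0)

# Closed form for a Gaussian T with mean mu*t0/2 and variance sigma*t0/3.
closed_form = 0.5 * mu * t0 * lam + sigma * t0 * lam ** 2 / 6.0

print(numeric, closed_form)
```

For t_0=1 the closed form corresponds to the mean \mu/2 and the variance \sigma/3 found from the moments.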


Written by inordinatum

June 2, 2018 at 10:50 am

Estimating expected growth


Let’s take some fluctuating time series, say the value of some financial asset, like a stock price. What is its average growth rate? This seemingly trivial question came up in a recent discussion I had with a friend; obviously, it is relevant for quantitative finance and many other applications. Looking at it in more detail, it turns out that a precise answer is actually not that simple, and depends on the time range for which one would like to estimate the expected growth. So I’d like to share here some insights on this problem and its interesting connections to stochastic processes. Surprisingly, this was only studied in the literature quite recently!

The setup

Let’s consider the price s_t of an asset at N+1 discrete times t=1,\ldots,N+1. The corresponding growth rates r_t are defined by

\displaystyle 1+r_t := \frac{s_{t+1}}{s_t}.    (1)

I.e. if s_{t+1}=1.5 s_t, the growth rate is r_t=0.5=50\%. Let’s also assume that our growth process r_t is stationary, i.e. that the distribution of the growth rates does not change in time. Say for concreteness that the growth rates r_t are i.i.d. (independent identically distributed) random variables.

The arithmetic average

The most immediate idea for computing the average growth rate \overline{r} is just to take the arithmetic average, i.e.

\displaystyle \overline{r}^{AM} := \frac{1}{N}\sum_{t=1}^N r_t.    (2)

What does this give us? Obviously, by the assumption of stationarity made above, with increasing sample size N the expectation value for the growth rate at the next time step r_{N+1} (or any other future time step) is approximated better and better by \overline{r}^{AM}:

\displaystyle \lim_{N\to\infty} \left( E[r_{N+1}]-\frac{1}{N}\sum_{t=1}^N r_t \right) =0.

This seems to be what we’d expect for an average growth rate, so what’s missing? Let’s go back to our sample of N asset prices and take the total growth

\displaystyle \frac{s_{N+1}}{s_1} = \prod_{i=1}^N (1+r_i).

By the relationship between the arithmetic and the geometric mean, we know that

\displaystyle \frac{s_{N+1}}{s_1} = \prod_{i=1}^N (1+r_i) \leq \left[\frac{1}{N}\sum_{i=1}^N (1+r_i)\right]^N = \left(1+\overline{r}^{AM}\right)^N. (3)

The left-hand side is the total growth after N time periods and the right-hand side is its estimate from the arithmetic mean in eq. (2). Equality only holds in eq. (3) when all the r_t are equal. When the r_t fluctuate, the total growth will always be strictly less than the estimate from the arithmetic mean, even as N \to \infty. So, by taking the arithmetic mean to obtain the total growth in the price of our asset at the end of the observation period, we systematically overestimate it. The cause of this is the compounding performed to obtain the total growth over more than one time step. This makes our observable a nonlinear function of the growth rate in a single time step. Thus, fluctuations in the growth rate don’t average out, and yield a net shift in its expectation value.
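A tiny numerical example makes the overestimate in eq. (3) tangible. The price series below is made up for illustration (the growth rates alternate between gains and losses), and the helper name `growth_rates` is my own.

```python
def growth_rates(prices):
    """Growth rates r_t = s_{t+1} / s_t - 1 from a list of N + 1 prices, as in eq. (1)."""
    return [later / earlier - 1.0 for earlier, later in zip(prices, prices[1:])]


# Hypothetical prices with growth rates of about +10%, -10%, +20%, -20%.
prices = [100.0, 110.0, 99.0, 118.8, 95.04]
rates = growth_rates(prices)
N = len(rates)

total_growth = prices[-1] / prices[0]   # left-hand side of eq. (3)
r_am = sum(rates) / N                   # arithmetic mean, eq. (2)
am_estimate = (1.0 + r_am) ** N         # right-hand side of eq. (3)

print("actual total growth:     ", total_growth)
print("arithmetic-mean estimate:", am_estimate)
```

Here the arithmetic mean of the growth rates is essentially zero, so it predicts no change in price, while the asset has actually lost about 5% of its value.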

The first explicit observation of this effect I’ve found is in a 1974 paper by M. E. Blume, Unbiased Estimators of Long-Run Expected Rates of Return. Quantitative estimates from that paper and from a later study by Jacquier, Kane and Marcus show that in typical situations this overestimation may easily be as large as 25%-100%.

The geometric average

So we see that the arithmetic mean does not provide an unbiased estimate of the compounded growth over a longer time period. Another natural way to obtain the average growth, which intuitively seems more adapted to capture that effect, is to take the geometric average of the growth rates:

\displaystyle 1+\overline{r}^{GM} := \left[\prod_{t=1}^N (1+r_t)\right]^{1/N}.    (4)

Now, by construction, the total growth at the end of our observation period, i.e. after N time periods, is correctly captured:

\displaystyle \frac{s_{N+1}}{s_1} = \left(1+\overline{r}^{GM}\right)^N

But this only solves the problem which we observed after eq. (3) for this specific case, when estimating growth for a time period whose length is exactly the same as the observation time span N from which we obtain the average. For shorter time periods (in particular, when estimating the growth rate for a single time step r_t), the geometric mean will now underestimate the growth rate. On the other hand, for time periods even longer than the observation period, it will still overestimate it like the arithmetic mean (see again the paper by Blume for a more detailed discussion).
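The same kind of toy data shows both properties of the geometric mean. The growth rates below are made up for illustration.

```python
import math

# Hypothetical growth rates over N = 4 time steps.
rates = [0.10, -0.10, 0.20, -0.20]
N = len(rates)

total_growth = math.prod(1.0 + r for r in rates)   # s_{N+1} / s_1
r_gm = total_growth ** (1.0 / N) - 1.0             # geometric mean, eq. (4)
r_am = sum(rates) / N                              # arithmetic mean, eq. (2)

# By construction, compounding the geometric mean over N steps gives the total growth.
print("total growth:  ", total_growth)
print("(1 + r_gm)^N:  ", (1.0 + r_gm) ** N)

# For a single time step, the geometric mean lies below the arithmetic mean.
print("r_gm:", r_gm, " r_am:", r_am)
```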

Unbiased estimators

Considering the results above, the issue becomes clearer: A reasonable (i.e. unbiased) estimate for the compounded growth over a time period requires a formula that takes into account both the number of observed time steps N, and the number of time steps over which we’d like to estimate the compounded growth T. For T=1 we can use the arithmetic mean, for T=N we can use the geometric mean, and for general T we need another estimator altogether. For the general case, Blume proposes the following (approximately) unbiased estimator \left(\overline{r}^{UB}\right)^T:

\displaystyle \left(\overline{r}^{UB}\right)^T = \frac{N-T}{N-1}\left(1+\overline{r}^{AM}\right)^T+\frac{T-1}{N-1}\left(1+\overline{r}^{GM}\right)^T.    (5)

Absent any further information on the form of the distribution, correlations, etc., eq. (5) is a reasonable approximation for the compounded growth.
For T=1, i.e. estimating growth in a single time step, this gives just the arithmetic-mean estimate 1+\overline{r}^{AM}, which is fine as we saw above. For T=N, this gives the geometric-mean estimate \left(1+\overline{r}^{GM}\right)^N, which is also correct. For other values of T, eq. (5) is a linear combination of the arithmetic-mean and geometric-mean estimates.
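Here is a minimal sketch of the weighted combination in eq. (5). The function name `blume_growth_estimate` and the sample growth rates are hypothetical, and the function returns the estimated compounded growth factor over T steps.

```python
import math


def blume_growth_estimate(rates, T):
    """Estimate of the compounded growth factor over T steps from N observed
    growth rates, using the weights of eq. (5)."""
    N = len(rates)
    if N < 2:
        raise ValueError("need at least two observed growth rates")
    r_am = sum(rates) / N                                        # eq. (2)
    r_gm = math.prod(1.0 + r for r in rates) ** (1.0 / N) - 1.0  # eq. (4)
    weight_am = (N - T) / (N - 1)
    weight_gm = (T - 1) / (N - 1)
    return weight_am * (1.0 + r_am) ** T + weight_gm * (1.0 + r_gm) ** T


# Hypothetical growth rates over N = 4 time steps.
rates = [0.10, -0.10, 0.20, -0.20]
N = len(rates)
am_factor = 1.0 + sum(rates) / N
total_growth = math.prod(1.0 + r for r in rates)

est_1 = blume_growth_estimate(rates, 1)   # reduces to the arithmetic-mean factor
est_2 = blume_growth_estimate(rates, 2)   # interpolates between the two
est_N = blume_growth_estimate(rates, N)   # reduces to the observed total growth

print(est_1, est_2, est_N)
```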

To see how the coefficients in eq. (5) arise, let us start with an ansatz of the form

\displaystyle \left(\overline{r}^{UB}\right)^T = \alpha\left(1+\overline{r}^{AM}\right)^T+\beta\left(1+\overline{r}^{GM}\right)^T.    (6)

Let us further split up the growth rates as r_t = \mu + \epsilon_t, where \mu is the “true” average growth rate and the \epsilon_t are zero-mean fluctuations. Inserting this as well as the definitions of \overline{r}^{AM}, \overline{r}^{GM} into eq. (6) we get

\displaystyle \left(\overline{r}^{UB}\right)^T = \alpha\left(1+\mu +\frac{1}{N}\sum_{i=1}^N\epsilon_i\right)^T+\beta\left[\prod_{i=1}^N (1+\mu+\epsilon_i)\right]^{T/N}.

Now let us assume that the fluctuations \epsilon_i are small, and satisfy E[\epsilon_i\epsilon_j]=\sigma^2 \delta_{ij}. Expanding to second order in \epsilon (the first-order term vanishes on taking the expectation value), we obtain

\begin{array}{rl}   \displaystyle E\left[\left(\overline{r}^{UB}\right)^T\right] = & \displaystyle \alpha\left(1+\mu\right)^T\left[1+\frac{T(T-1)\sigma^2}{2(1+\mu)^2N}\right]  \\ & \displaystyle +\beta\left(1+\mu\right)^T\left[1+\frac{T(T-N)\sigma^2}{2(1+\mu)^2N}\right]  \\  & \displaystyle + \mathcal{O}(\sigma^4).  \end{array}

To obtain an estimator that is unbiased (to second order in \sigma), we now choose \alpha and \beta such that the term of order \sigma^0 is just the true compounded growth (1+\mu)^T, and the term of order \sigma^2 vanishes. This gives the system

\displaystyle   \begin{array}{rl}  \displaystyle \alpha + \beta & =  1,\\  \displaystyle \frac{T-1}{2}\alpha+\frac{T-N}{2}\beta & = 0.  \end{array}

The solution of this linear system for \alpha and \beta yields exactly the coefficients in eq. (5).
Of course, here we make the assumption of small fluctuations and also a specific ansatz for the estimator. If one has more information on the distribution of the growth rates r this may not be the most adequate one, but with what we know there’s not much more we can do!
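The solution of the linear system can be checked with exact rational arithmetic. The sketch below uses Cramer's rule; the function name `solve_weights` and the sample values of N and T are arbitrary choices for illustration.

```python
from fractions import Fraction


def solve_weights(N, T):
    """Solve  alpha + beta = 1,  (T-1)/2 * alpha + (T-N)/2 * beta = 0
    exactly, using Cramer's rule."""
    a11, a12, b1 = Fraction(1), Fraction(1), Fraction(1)
    a21, a22, b2 = Fraction(T - 1, 2), Fraction(T - N, 2), Fraction(0)
    det = a11 * a22 - a12 * a21
    alpha = (b1 * a22 - a12 * b2) / det
    beta = (a11 * b2 - a21 * b1) / det
    return alpha, beta


N, T = 10, 3   # illustrative values
alpha, beta = solve_weights(N, T)

print("alpha =", alpha, " expected", Fraction(N - T, N - 1))
print("beta  =", beta, " expected", Fraction(T - 1, N - 1))
```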

Outlook

As you can see from the above discussion, estimating the expected (compounded) growth over a series of time steps is more complex than it appears at first sight. I’ve shown some basic results, but didn’t touch on many other important aspects:

  • In addition to the growth rate r, it is also interesting to consider the discount factor r^{-T}. Blume’s approach is extended to this observable in this paper by Ian Cooper.
  • If one assumes the growth factors 1+r in eq. (1) to be log-normally distributed, the problem can be treated analytically. Jacquier, Kane and Marcus discuss this case in detail in this paper, and also provide an exact result for the unbiased estimator.
  • The assumption of independent, identically distributed growth rates is not very realistic. On the one hand, we expect the distribution from which the annual returns are drawn to vary in time (e.g. due to underlying macroeconomic conditions). This is discussed briefly in Cooper’s paper. On the other hand, we also expect some amount of correlation between subsequent time steps, even if the underlying distribution does not change. It would be interesting to see how this modifies the results above.

Let me know if you find these interesting; I’ll be glad to expand on them in a future post!

Written by inordinatum

January 18, 2015 at 4:18 pm