Returns and Stylized Facts

Before modeling anything, we need to understand what financial return data actually looks like. This chapter introduces log-returns, then walks through the canonical stylized facts — empirical regularities observed consistently across assets, markets, and time periods.

Log-Returns

Let $P_t$ be the price of an asset at time $t$. The simple return and log-return are defined as:

$$
R_t = \frac{P_t - P_{t-1}}{P_{t-1}}, \qquad r_t = \ln\left(\frac{P_t}{P_{t-1}}\right) = \ln P_t - \ln P_{t-1} \tag{1}
$$

For small price changes, $r_t \approx R_t$, because $r_t = \ln(1 + R_t) = R_t - R_t^2/2 + \dots$. We prefer log-returns because:

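They are time-additive: the $k$-period log-return is the sum of the one-period log-returns, $r_{t,t+k} = \sum_{i=1}^{k} r_{t+i}$
They take values on the whole real line $(-\infty, +\infty)$, which is convenient for modeling (simple returns are bounded below by $-1$)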
They relate directly to the continuously compounded rate of return
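The properties listed above can be checked numerically. The following is a minimal sketch on a short, made-up price path (the values in `prices_demo` are illustrative assumptions, not data used elsewhere in this chapter). It checks that one-period log-returns sum to the multi-period log-return, that simple returns compound multiplicatively instead, and that the two definitions nearly coincide when moves are small.

```python
import numpy as np

# Hypothetical price path, chosen only for illustration
prices_demo = np.array([100.0, 101.0, 99.5, 100.2, 102.0])

simple_ret = prices_demo[1:] / prices_demo[:-1] - 1   # R_t
log_ret = np.diff(np.log(prices_demo))                # r_t

# Time-additivity: one-period log-returns sum to the k-period log-return
k_period_log_ret = np.log(prices_demo[-1] / prices_demo[0])
print("log-returns additive:", np.isclose(log_ret.sum(), k_period_log_ret))

# Simple returns have to be compounded, not summed
k_period_simple_ret = prices_demo[-1] / prices_demo[0] - 1
print("simple returns compound:", np.isclose(np.prod(1 + simple_ret) - 1, k_period_simple_ret))

# For small moves the gap between r_t and R_t is of order R_t^2 / 2
max_gap = np.max(np.abs(log_ret - simple_ret))
print("largest |r_t - R_t|:", max_gap)
```

Because $\ln(1 + x) \le x$ for every $x > -1$, the log-return never exceeds the simple return; the gap only becomes material for large moves.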

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
from scipy import stats
import warnings
warnings.filterwarnings('ignore')

# Reproducibility
np.random.seed(42)

# Simulate a GBM price series for illustration
n = 1000
mu, sigma = 0.0005, 0.015
eps = np.random.normal(0, 1, n)
log_returns = mu + sigma * eps
prices = 100 * np.exp(np.cumsum(log_returns))

# Also simulate a series with GARCH-like volatility clustering
h = np.zeros(n)
r_garch = np.zeros(n)
omega, alpha, beta = 0.00001, 0.10, 0.85
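h[0] = omega / (1 - alpha - beta)  # start at the unconditional variance
r_garch[0] = np.sqrt(h[0]) * np.random.normal()  # draw the first return instead of leaving it at 0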
for t in range(1, n):
    h[t] = omega + alpha * r_garch[t-1]**2 + beta * h[t-1]
    r_garch[t] = np.sqrt(h[t]) * np.random.normal()

dates = pd.date_range('2018-01-01', periods=n, freq='B')
df = pd.DataFrame({'price': prices, 'log_return': log_returns, 'r_garch': r_garch}, index=dates)

print("Dataset shape:", df.shape)
print(df['log_return'].describe().round(6))

The Stylized Facts

Cont (2001) systematically documented statistical properties common to financial return series. We examine the most important ones.

1. Fat Tails (Excess Kurtosis)

Financial returns have heavier tails than a normal distribution. The kurtosis of a normal distribution is 3 (excess kurtosis = 0). Empirical return series consistently show excess kurtosis well above zero, implying large moves occur far more often than Gaussian models predict.

This has critical implications: Value-at-Risk calculated under normality systematically underestimates tail risk.
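To make the Value-at-Risk point concrete, here is a rough sketch that compares quantiles of a standard normal with those of a Student-t rescaled to unit variance. The choice of 5 degrees of freedom and the confidence levels are assumptions made for illustration; they are not estimated from any data in this chapter. Both distributions have mean 0 and variance 1, so any difference in VaR comes from tail shape alone.

```python
import numpy as np
from scipy import stats

nu = 5                                # assumed degrees of freedom for the fat-tailed model
t_scale = np.sqrt((nu - 2) / nu)      # rescales the t distribution to variance 1

var_by_level = {}
for level in (0.99, 0.999):
    # VaR reported as a positive loss: minus the lower-tail quantile
    var_normal = -stats.norm.ppf(1 - level)
    var_t = -stats.t.ppf(1 - level, df=nu, scale=t_scale)
    var_by_level[level] = (var_normal, var_t)
    print(f"{level:.1%} VaR  normal: {var_normal:.3f}  "
          f"t({nu}): {var_t:.3f}  ratio: {var_t / var_normal:.2f}")
```

The gap widens as the confidence level moves further into the tail. At moderate levels such as 95% the two unit-variance quantiles are close, so the underestimation is mainly a far-tail problem.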

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

# Normal vs empirical histogram
ax = axes[0]
r = df['r_garch']
r_std = (r - r.mean()) / r.std()
x = np.linspace(-5, 5, 300)

ax.hist(r_std, bins=60, density=True, alpha=0.6, color='steelblue', label='Simulated returns')
ax.plot(x, stats.norm.pdf(x), 'r-', lw=2, label='Normal distribution')
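# Rescale the t(5) density to unit variance so it is comparable with the standardized returns
ax.plot(x, stats.t.pdf(x, df=5, scale=np.sqrt(3 / 5)), 'g--', lw=2, label='t-distribution (df=5, unit variance)')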
ax.set_xlim(-5, 5)
ax.set_xlabel('Standardized return')
ax.set_ylabel('Density')
ax.set_title('Fat Tails: Return Distribution vs Normal')
ax.legend()

# QQ plot
ax = axes[1]
stats.probplot(r_std, dist='norm', plot=ax)
ax.set_title('Normal Q-Q Plot (deviations in tails = fat tails)')
ax.get_lines()[1].set_color('red')

plt.tight_layout()
plt.show()

print(f"Excess kurtosis (GARCH series): {stats.kurtosis(r_garch):.3f}")
print(f"Excess kurtosis (normal benchmark): {stats.kurtosis(np.random.normal(size=10000)):.3f}")

2. Volatility Clustering

Returns themselves are nearly unpredictable, but their absolute values or squares are autocorrelated. Large moves tend to be followed by large moves (of either sign), and calm periods cluster together. This was first noted by Mandelbrot (1963):

“Large changes tend to be followed by large changes — of either sign — and small changes tend to be followed by small changes.”

Formally, $\text{Corr}(|r_t|, |r_{t-k}|) > 0$ for many lags $k$, even though $\text{Corr}(r_t, r_{t-k}) \approx 0$.
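The two correlation statements can be checked directly with sample autocorrelations. The sketch below is self-contained and rests on assumptions: the helper `sample_acf`, the seed, and the sample size `n_obs` are hypothetical choices for illustration, and the data are simulated from a GARCH(1,1) process with the same parameter values used earlier in the chapter rather than taken from a real market.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation of x at lags 1..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[k:], x[:-k]) / denom for k in range(1, max_lag + 1)])

# Illustrative GARCH(1,1) simulation; seed and sample size are assumptions
rng = np.random.default_rng(0)
n_obs = 20_000
omega, alpha, beta = 1e-5, 0.10, 0.85
h_prev = omega / (1 - alpha - beta)          # start at the unconditional variance
r_sim = np.empty(n_obs)
r_sim[0] = np.sqrt(h_prev) * rng.standard_normal()
for t in range(1, n_obs):
    # Conditional variance depends on yesterday's squared return and variance
    h_prev = omega + alpha * r_sim[t - 1] ** 2 + beta * h_prev
    r_sim[t] = np.sqrt(h_prev) * rng.standard_normal()

acf_returns = sample_acf(r_sim, 10)
acf_abs = sample_acf(np.abs(r_sim), 10)
print("ACF of r_t   (lags 1-10):", acf_returns.round(3))
print("ACF of |r_t| (lags 1-10):", acf_abs.round(3))
```

In a run like this, the autocorrelations of $r_t$ should hover around zero while those of $|r_t|$ stay positive across the first lags, which is the signature of volatility clustering.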

from statsmodels.graphics.tsaplots import plot_acf

fig, axes = plt.subplots(2, 2, figsize=(13, 7))

# Returns time series
axes[0, 0].plot(df.index, df['r_garch'], linewidth=0.6, color='steelblue')
axes[0, 0].set_title('Return Series')
axes[0, 0].set_ylabel('$r_t$')

# Squared returns time series
axes[0, 1].plot(df.index, df['r_garch']**2, linewidth=0.6, color='coral')
axes[0, 1].set_title('Squared Returns (Proxy for Volatility)')
axes[0, 1].set_ylabel('$r_t^2$')

# ACF of returns
plot_acf(df['r_garch'], lags=40, ax=axes[1, 0], title='ACF of Returns')

# ACF of squared returns
plot_acf(df['r_garch']**2, lags=40, ax=axes[1, 1], title='ACF of Squared Returns')

plt.tight_layout()
plt.show()

Summary Table

Stylized Fact	Description	Implication
Fat tails	Excess kurtosis > 0	Normal VaR underestimates risk
Volatility clustering	$\text{Corr}(r_t^2, r_{t-k}^2) > 0$	GARCH-type models needed
Absence of autocorrelation	$\text{Corr}(r_t, r_{t-k}) \approx 0$	Consistent with weak-form efficiency
Leverage effect	Negative shock $\Rightarrow$ higher volatility	Asymmetric GARCH (GJR, EGARCH)
Long memory in volatility	Slow ACF decay in $\lvert r_t \rvert$

References

Cont, R. (2001). Empirical properties of asset returns: stylized facts and statistical issues. Quantitative Finance, 1(2), 223–236.
Mandelbrot, B. (1963). The variation of certain speculative prices. Journal of Business, 36(4), 394–419.
Campbell, J. Y., Lo, A. W., & MacKinlay, A. C. (1997). The Econometrics of Financial Markets. Princeton University Press.