How does Extreme Value Theory (EVT) improve tail risk estimation, and what is the Peaks-over-Threshold approach?
I keep hearing that EVT is the 'gold standard' for modeling tail risk. For FRM Part II, how does it work, why is it better than fitting a normal or Student-t distribution to the whole return series, and what is the POT method?
Extreme Value Theory (EVT) is a branch of statistics that focuses exclusively on the behavior of extreme values. Instead of fitting a distribution to ALL returns, EVT models only the tails — where risk management needs the most accuracy.
The Key Insight:
For a very wide class of return distributions, the suitably normalized maxima of large blocks converge to one of three distributions (Gumbel, Fréchet, or Weibull), which together form the Generalized Extreme Value family. This is the Fisher-Tippett theorem, the tail's counterpart to the central limit theorem.
Two Main EVT Approaches:
1. Block Maxima (BMM):
Divide data into blocks (e.g., monthly), take the maximum loss from each block, and fit a Generalized Extreme Value (GEV) distribution. Problem: wasteful — ignores other large losses within each block.
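A minimal sketch of the block-maxima step, to show where the data loss comes from. The loss figures and the block size of 4 are made-up placeholders, and the GEV fit itself is left out.

```python
def block_maxima(losses, block_size):
    """Return the largest loss in each full block of `block_size` observations."""
    n_blocks = len(losses) // block_size  # a trailing partial block is dropped
    return [
        max(losses[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]

# Hypothetical daily losses in percent (positive = loss): 3 blocks of 4 days.
losses = [0.3, 2.9, 3.4, 0.1,
          0.2, 0.5, 0.4, 0.1,
          1.0, 4.2, 0.6, 3.8]

maxima = block_maxima(losses, block_size=4)
print(maxima)  # [3.4, 0.5, 4.2]
```

In this toy data the 2.9 and 3.8 losses are thrown away even though both are far larger than the quiet block's maximum of 0.5, which is the waste described above.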
2. Peaks-over-Threshold (POT) — Preferred:
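Select a high threshold u and model ALL exceedances above u using the Generalized Pareto Distribution (GPD). The Pickands-Balkema-de Haan theorem justifies this: for a sufficiently high u, the distribution of exceedances is approximately GPD.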
For losses x > u:
P(X > x | X > u) = [1 + xi(x-u)/beta]^(-1/xi)
Where:
- xi (shape): Controls tail heaviness. xi > 0 = heavy tail (fat), xi = 0 = exponential tail, xi < 0 = bounded tail
- beta (scale): Controls the spread of exceedances
- u (threshold): Must be high enough for the GPD to be valid, but low enough to have sufficient exceedances
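The conditional tail probability above can be sketched in a few lines of Python. The function name and the test values (u = 2, beta = 1, x = 5) are illustrative assumptions; the xi = 0 branch uses the exponential limit of the formula.

```python
import math

def gpd_survival(x, u, xi, beta):
    """P(X > x | X > u) under a GPD with shape xi and scale beta, for x >= u."""
    y = x - u
    if y < 0:
        raise ValueError("x must be at least the threshold u")
    if abs(xi) < 1e-12:
        # Limit of the formula as xi -> 0: exponential tail.
        return math.exp(-y / beta)
    base = 1.0 + xi * y / beta
    if base <= 0.0:
        # Only possible for xi < 0: x lies beyond the bounded tail's endpoint.
        return 0.0
    return base ** (-1.0 / xi)

# Same threshold and scale, three tail shapes (illustrative values only).
for xi in (0.3, 0.0, -0.3):
    print(xi, gpd_survival(5.0, 2.0, xi, 1.0))
```

With everything else held fixed, the probability of a loss beyond x is largest for xi > 0 and smallest for xi < 0, which is what "tail heaviness" means in practice.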
Example — Stonebridge Capital, 2,500 daily equity returns:
Step 1: Choose threshold u at the 95th percentile of losses = -2.1%
Step 2: Extract exceedances: 125 losses worse than -2.1%
Step 3: Fit GPD to exceedances: xi = 0.28, beta = 0.85%
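Steps 1 to 3 might look roughly like the sketch below. Two assumptions to flag: the data are simulated heavy-tailed placeholders (not Stonebridge's returns), and the GPD is fitted by the method of moments rather than the maximum likelihood fit typically used in practice, purely to keep the example dependency-free. The moment estimator always returns xi below 0.5, so the output will not reproduce xi = 0.28 or beta = 0.85%.

```python
import random
import statistics

def fit_gpd_moments(losses, tail_fraction=0.05):
    """Threshold at an empirical quantile, then a method-of-moments GPD fit.

    Losses are positive numbers (positive = loss).
    Returns (u, n_u, xi, beta).
    """
    ordered = sorted(losses)
    n_u = round(tail_fraction * len(ordered))
    if n_u < 2:
        raise ValueError("need at least two exceedances")
    u = ordered[-(n_u + 1)]                      # Step 1: threshold
    excesses = [x - u for x in ordered[-n_u:]]   # Step 2: exceedances over u
    m = statistics.mean(excesses)
    v = statistics.variance(excesses)
    # Step 3: match the GPD mean beta/(1-xi) and variance
    # beta^2 / ((1-xi)^2 (1-2xi)) to the sample moments.
    xi = 0.5 * (1.0 - m * m / v)
    beta = 0.5 * m * (1.0 + m * m / v)
    return u, n_u, xi, beta

# Simulated placeholder losses: 2,500 draws with a Pareto-type tail.
rng = random.Random(42)
losses = [rng.paretovariate(4.0) - 1.0 for _ in range(2500)]

u, n_u, xi, beta = fit_gpd_moments(losses)
print(f"u={u:.3f}, exceedances={n_u}, xi={xi:.3f}, beta={beta:.3f}")
```

A 5% tail fraction on 2,500 observations leaves 125 exceedances, matching the count in Step 2.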
Now estimate the 99.9% VaR, working with loss magnitudes (u = 2.1%) and converting back to a return at the end:
VaR(99.9%) = u + (beta/xi) x [((n/n_u) x (1-0.999))^(-xi) - 1]
= 2.1% + (0.85%/0.28) x [((2500/125) x 0.001)^(-0.28) - 1]
= 2.1% + 3.036% x [0.02^(-0.28) - 1]
= 2.1% + 3.036% x [2.99 - 1] = 2.1% + 6.04% = 8.14% loss, i.e. a return of -8.14%
Compare: Normal VaR(99.9%) would give approximately -4.8%. EVT gives -8.14%, roughly 1.7 times as large.
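The quantile formula is easy to check numerically. The sketch below plugs in the example's inputs in loss terms (positive = loss, in percent); the function name is an assumption for illustration.

```python
def pot_var(u, xi, beta, n, n_u, q):
    """POT quantile estimate in loss terms (positive = loss); requires xi != 0.

    u: threshold, xi/beta: fitted GPD shape and scale,
    n: total observations, n_u: exceedances over u, q: confidence level.
    """
    tail_ratio = (n / n_u) * (1.0 - q)
    return u + (beta / xi) * (tail_ratio ** (-xi) - 1.0)

var_999 = pot_var(u=2.1, xi=0.28, beta=0.85, n=2500, n_u=125, q=0.999)
print(round(var_999, 2))  # 8.14
```

Evaluated this way, 0.02^(-0.28) is about 2.99, so the 99.9% VaR is a loss of about 8.14%.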
Why EVT Beats Whole-Distribution Fitting:
| Approach | Tail Accuracy | Data Usage | Extrapolation |
|---|---|---|---|
| Normal | Poor (thin tails) | All data, tail diluted | Unreliable beyond 99% |
| Student-t | Better | All data, single df for tail | Moderate |
| EVT (POT) | Excellent | Focuses on tail data only | Theoretically justified |
FRM Key Points:
- xi > 0 for most financial return series (fat tails confirmed)
- The threshold choice is critical — too low and GPD doesn't hold; too high and you have too few observations
- Mean excess plot helps choose the threshold: if it's roughly linear above u, the GPD is appropriate
- EVT can estimate quantiles BEYOND the observed data (e.g., 99.99% VaR from 2,500 observations)
- Basel FRTB uses Expected Shortfall (at 97.5% confidence), where EVT is particularly useful because a fitted GPD gives ES in closed form
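Two of the points above lend themselves to a short sketch: the empirical mean excess (the quantity plotted against u in a mean excess plot) and the Expected Shortfall implied by a GPD tail, ES = VaR/(1-xi) + (beta - xi x u)/(1-xi), valid for xi < 1. Function names, the sample losses and the 5.0% VaR input are hypothetical; u, xi and beta are taken from the example.

```python
def mean_excess(losses, u):
    """Average amount by which losses above u exceed u (positive = loss)."""
    excesses = [x - u for x in losses if x > u]
    if not excesses:
        raise ValueError("no losses above the threshold")
    return sum(excesses) / len(excesses)

def gpd_expected_shortfall(var_q, u, xi, beta):
    """Expected Shortfall implied by a GPD tail at the quantile var_q."""
    if xi >= 1.0:
        raise ValueError("the tail mean is infinite for xi >= 1")
    return var_q / (1.0 - xi) + (beta - xi * u) / (1.0 - xi)

# Hypothetical losses in percent: three of them exceed u = 2.0.
sample = [0.5, 1.0, 2.5, 3.0, 4.5]
print(mean_excess(sample, 2.0))  # (0.5 + 1.0 + 2.5) / 3

# ES for an assumed VaR of 5.0% with the example's u, xi and beta.
es = gpd_expected_shortfall(var_q=5.0, u=2.1, xi=0.28, beta=0.85)
print(round(es, 2))  # 7.31
```

Computing mean_excess over a grid of candidate thresholds and looking for the point where it becomes roughly linear in u is the usual way to apply the mean excess plot.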