How do AIC and BIC work for comparing risk models, and when would they give different recommendations?
I'm learning about model selection criteria for FRM Part I. Both AIC and BIC penalize model complexity, but I've seen cases where AIC prefers a more complex model while BIC prefers a simpler one. What's the intuition behind each, and which should I trust for risk management applications?
AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) are both tools for comparing models that balance goodness-of-fit against complexity, but they differ in how aggressively they penalize additional parameters.
The Formulas
> AIC = -2 ln(L) + 2k
> BIC = -2 ln(L) + k ln(n)
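Here L is the maximized likelihood (so ln(L) is the maximized log-likelihood reported by the estimation routine), k is the number of estimated parameters, and n is the number of observations. Lower values indicate a better model, and only differences between candidates fitted to the same data are meaningful.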
The key difference is the penalty term:
- AIC penalty: 2k (constant per parameter)
- BIC penalty: k ln(n) (grows with sample size)
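BIC's per-parameter penalty exceeds AIC's whenever ln(n) > 2, that is, for n ≥ 8 (since e² ≈ 7.39). With typical risk management datasets the gap is substantial: at n = 1,000, ln(n) ≈ 6.9, so each extra parameter costs about 3.5 times as much under BIC as under AIC, and the gap keeps widening as the sample grows.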
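To make the penalty comparison concrete, here is a minimal Python sketch. The function names `aic_penalty` and `bic_penalty` are illustrative helpers, not part of any library; they simply evaluate the two penalty terms from the formulas above.

```python
import math

def aic_penalty(k: int) -> float:
    """AIC penalty term: 2k, independent of sample size."""
    return 2.0 * k

def bic_penalty(k: int, n: int) -> float:
    """BIC penalty term: k * ln(n), growing with sample size n."""
    return k * math.log(n)

# Compare the cost of one extra parameter at several sample sizes.
for n in (7, 8, 250, 1000, 10000):
    ratio = bic_penalty(1, n) / aic_penalty(1)
    print(f"n={n:>6}  BIC penalty per parameter={bic_penalty(1, n):.3f}  "
          f"ratio to AIC={ratio:.2f}")
```

The ratio first exceeds 1 at n = 8 and is roughly 3.45 at n = 1,000, which is why the two criteria can diverge on large samples.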
Example: Hawthorne Risk Group's VaR Model Selection
Hawthorne Risk Group is choosing among three GARCH variants for its equity portfolio VaR:
| Model | Parameters (k) | Log-Likelihood | AIC | BIC (n=1000) |
|---|---|---|---|---|
| GARCH(1,1) | 3 | -1285.4 | 2576.8 | 2591.5 |
| GJR-GARCH(1,1) | 4 | -1282.1 | 2572.2 | 2591.8 |
| EGARCH(2,1) | 5 | -1280.8 | 2571.6 | 2596.1 |
- AIC selects EGARCH(2,1): lowest at 2571.6, narrowly ahead of GJR-GARCH(1,1) at 2572.2. Under AIC's lighter penalty, the improved fit justifies the extra parameters.
- BIC selects GARCH(1,1): lowest at 2591.5, narrowly ahead of GJR-GARCH(1,1) at 2591.8. BIC's heavier penalty outweighs the modest fit improvement, so the simplest model wins.
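The table values can be reproduced with a short Python sketch. The helper functions `aic` and `bic` and the `candidates` list are illustrative names; the parameter counts, log-likelihoods, and n = 1000 come from the Hawthorne example above.

```python
import math

def aic(log_lik: float, k: int) -> float:
    """AIC = -2 ln(L) + 2k."""
    return -2.0 * log_lik + 2.0 * k

def bic(log_lik: float, k: int, n: int) -> float:
    """BIC = -2 ln(L) + k ln(n)."""
    return -2.0 * log_lik + k * math.log(n)

# (model name, number of parameters k, maximized log-likelihood)
candidates = [
    ("GARCH(1,1)", 3, -1285.4),
    ("GJR-GARCH(1,1)", 4, -1282.1),
    ("EGARCH(2,1)", 5, -1280.8),
]
n = 1000  # sample size used for the BIC column

# Map each model to its (AIC, BIC) pair.
scores = {name: (aic(ll, k), bic(ll, k, n)) for name, k, ll in candidates}

# Lower is better for both criteria.
best_aic = min(scores, key=lambda name: scores[name][0])
best_bic = min(scores, key=lambda name: scores[name][1])

for name, (a, b) in scores.items():
    print(f"{name:<16} AIC={a:.1f}  BIC={b:.1f}")
print("AIC selects:", best_aic)
print("BIC selects:", best_bic)
```

Running this should confirm that AIC picks EGARCH(2,1) and BIC picks GARCH(1,1).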
When They Disagree — What to Do
This disagreement reflects a fundamental tradeoff:
- AIC is asymptotically efficient: in large samples it tends to pick the candidate with the lowest expected prediction error, even when the true model is not among the candidates. It tends to select slightly more complex models that capture real but subtle features.
- BIC is consistent: as n grows, the probability that it selects the true model approaches 1, provided the true model is among the candidates. It prefers parsimony and is less prone to overfitting.
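One way to see where the two criteria part ways is to rearrange the formulas. Adding Δk parameters lowers AIC only if the log-likelihood improves by more than Δk, and lowers BIC only if it improves by more than Δk · ln(n) / 2. The sketch below applies this to the step from GARCH(1,1) to GJR-GARCH(1,1) in the Hawthorne example; `min_gain_aic` and `min_gain_bic` are hypothetical helper names.

```python
import math

def min_gain_aic(extra_params: int) -> float:
    """Log-likelihood gain above which AIC prefers the larger model."""
    return float(extra_params)

def min_gain_bic(extra_params: int, n: int) -> float:
    """Log-likelihood gain above which BIC prefers the larger model."""
    return extra_params * math.log(n) / 2.0

n = 1000
# GARCH(1,1) -> GJR-GARCH(1,1): one extra parameter,
# log-likelihood rises from -1285.4 to -1282.1.
gain = -1282.1 - (-1285.4)

print(f"gain in log-likelihood: {gain:.2f}")
print(f"AIC threshold: {min_gain_aic(1):.2f}")
print(f"BIC threshold: {min_gain_bic(1, n):.2f}")
print("AIC prefers GJR-GARCH:", gain > min_gain_aic(1))
print("BIC prefers GJR-GARCH:", gain > min_gain_bic(1, n))
```

The gain of about 3.3 clears AIC's threshold of 1 but falls just short of BIC's threshold of about 3.45, which matches the table: GJR-GARCH beats GARCH(1,1) on AIC but not on BIC.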
For risk management:
- Use BIC when robustness matters most (regulatory capital models, long-horizon forecasts). Overfitting a VaR model to recent data can produce dangerously optimistic risk estimates.
- Use AIC when predictive accuracy on new data is the priority (short-term trading risk, dynamic hedging). The extra complexity may capture real dynamics.
- Best practice: Report both. If they agree, you have strong evidence. If they disagree, investigate whether the extra parameters capture economically meaningful features (like asymmetric volatility in GJR-GARCH) or just noise.
FRM exam tip: Be able to calculate both AIC and BIC given log-likelihood, k, and n. Know that BIC penalizes more heavily for large samples and understand the consistency vs. efficiency distinction. A common exam question presents two or three models and asks which criterion selects which model.