What is maximum likelihood estimation (MLE) and how is it used in risk modeling?

AcadiFi · Accepted Answer

Maximum Likelihood Estimation (MLE) is a parameter estimation method that finds the parameter values under which the observed data are most probable. Unlike OLS (which minimizes the sum of squared errors), MLE maximizes the likelihood of the observed data under an assumed distribution.

**The Core Intuition:**

Imagine you flip a coin 100 times and get 65 heads. What's the most likely probability of heads? MLE says: try every possible p from 0 to 1, calculate the probability of getting exactly 65 heads in 100 flips for each p, and pick the p that gives the highest probability. The answer is p = 0.65.
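The "try every possible p" idea can be sketched as a grid search. This is a minimal illustration, assuming a grid step of 0.001 and a helper name (`binomial_log_likelihood`) chosen here for readability; neither comes from the answer above.

```python
import math

def binomial_log_likelihood(p, heads, flips):
    """Log-probability of exactly `heads` heads in `flips` flips when P(heads) = p."""
    log_comb = math.log(math.comb(flips, heads))
    return log_comb + heads * math.log(p) + (flips - heads) * math.log(1 - p)

heads, flips = 65, 100

# Candidate values 0.001, 0.002, ..., 0.999 (assumed grid; endpoints
# are excluded so that log(0) never occurs).
grid = [k / 1000 for k in range(1, 1000)]

# Pick the candidate with the highest log-likelihood.
p_hat = max(grid, key=lambda p: binomial_log_likelihood(p, heads, flips))

print(p_hat)  # 0.65
```

In practice nobody searches a grid for this problem, since setting the derivative to zero gives p = heads / flips directly, but the grid makes the "pick the p with the highest probability" step concrete.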

**Formal Setup:**

Given independent observations x1, x2, ..., xn from a distribution f(x|theta), the likelihood function is:

L(theta) = Product of f(xi|theta) for all i

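We typically maximize the **log-likelihood** instead. The logarithm is strictly increasing, so the maximizing theta is unchanged, and products become sums that are easier to differentiate: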

ln L(theta) = Sum of ln f(xi|theta)
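Beyond easier calculus, there is a numerical reason to work with the log-likelihood. The sketch below assumes a standard Normal model and 2,000 simulated observations (both illustrative choices, not from the answer) and compares the raw product with the sum of logs.

```python
import math
import random

def normal_log_pdf(x, mu, sigma):
    """ln f(x | mu, sigma) for a Normal density."""
    z = (x - mu) / sigma
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - 0.5 * z * z

def likelihood(data, mu, sigma):
    """L(theta): product of the individual densities."""
    return math.prod(math.exp(normal_log_pdf(x, mu, sigma)) for x in data)

def log_likelihood(data, mu, sigma):
    """ln L(theta): sum of the individual log-densities."""
    return sum(normal_log_pdf(x, mu, sigma) for x in data)

random.seed(0)  # fixed seed so the illustration is repeatable
data = [random.gauss(0.0, 1.0) for _ in range(2000)]  # simulated, not real data

raw = likelihood(data, 0.0, 1.0)
log_lik = log_likelihood(data, 0.0, 1.0)

print(raw)      # underflows to 0.0: every factor is below 0.4
print(log_lik)  # a finite negative number that an optimizer can work with
```

Each standard Normal density value is at most about 0.399, so a product of 2,000 of them falls below the smallest positive double and is stored as zero, while the sum of logs stays well within range.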

**Risk Modeling Example:**
Suppose Ashford Risk Analytics models daily portfolio losses as Normal(mu, sigma^2). With 250 observations:

ln L(mu, sigma) = -(250/2) * ln(2*pi) - (250/2) * ln(sigma^2) - Sum of (xi - mu)^2 / (2*sigma^2)

Taking derivatives and setting to zero:
- MLE of mu = sample mean
- MLE of sigma^2 = (1/n) * Sum of (xi - mu-hat)^2, where mu-hat is the sample mean (note: this divides by n, not n-1, so it is biased slightly downward in small samples)
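The closed-form results above can be checked numerically. The following is a minimal sketch on simulated data: the loss series, its mean of 0.1 and standard deviation of 1.5, and the function names are all assumptions made for illustration, not Ashford Risk Analytics' actual data or code.

```python
import math
import random

def normal_mle(losses):
    """Closed-form MLE for Normal(mu, sigma^2): sample mean and 1/n variance."""
    n = len(losses)
    mu_hat = sum(losses) / n
    var_hat = sum((x - mu_hat) ** 2 for x in losses) / n  # divides by n, not n-1
    return mu_hat, var_hat

def normal_log_likelihood(losses, mu, var):
    """ln L(mu, sigma) with var = sigma^2, matching the formula above."""
    n = len(losses)
    sse = sum((x - mu) ** 2 for x in losses)
    return -n / 2 * math.log(2 * math.pi) - n / 2 * math.log(var) - sse / (2 * var)

random.seed(42)
# Hypothetical daily losses: 250 simulated draws, not real portfolio data.
losses = [random.gauss(0.1, 1.5) for _ in range(250)]

mu_hat, var_hat = normal_mle(losses)
var_unbiased = var_hat * len(losses) / (len(losses) - 1)  # the n-1 version

best = normal_log_likelihood(losses, mu_hat, var_hat)
print(mu_hat, var_hat, var_unbiased)
# Moving either parameter away from the MLE lowers the log-likelihood.
print(best - normal_log_likelihood(losses, mu_hat + 0.05, var_hat))
print(best - normal_log_likelihood(losses, mu_hat, var_unbiased))
```

Both printed differences are positive: the n-1 variance is the unbiased estimator, but it is the 1/n version that sits at the peak of the likelihood.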

```mermaid
flowchart TD
    A[Observed Data] --> B[Assume Distribution Family]
    B --> C[Write Likelihood Function]
    C --> D[Take Log-Likelihood]
    D --> E[Differentiate w.r.t. Parameters]
    E --> F[Set Derivatives = 0]
    F --> G[Solve for MLE Estimates]
    G --> H[Use for Risk Models]
```

**When MLE Beats OLS:**
- **Non-normal distributions**: For fat-tailed models (Student-t, GEV), MLE naturally handles the distributional shape
- **Binary outcomes**: Logistic regression (used in credit scoring) uses MLE because the outcome is 0/1
- **GARCH models**: Volatility clustering models are typically estimated by MLE, since the conditional variance is not directly observed and there is no closed-form OLS solution
- **Censored/truncated data**: MLE can properly handle incomplete observations
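As one concrete case from the list, censored data can be sketched with an exponential time-to-default model. Everything specific here is a labeled assumption: the exponential model, the default intensity of 0.2 per year, the 5-year observation window, and the simulated sample. Observed defaults contribute the density to the likelihood, while loans still alive at the end of the window contribute only the survival probability, which gives the closed-form estimate used below.

```python
import math
import random

def censored_exp_log_likelihood(rate, times, observed):
    """Log-likelihood for exponential times with right-censoring.

    observed[i] is True if a default was seen at times[i], and False if
    the loan was still alive when observation stopped at times[i].
    Events contribute ln(rate) - rate*t; censored points contribute -rate*t.
    """
    events = sum(observed)
    return events * math.log(rate) - rate * sum(times)

def censored_exp_mle(times, observed):
    """Maximizer of the log-likelihood above: events / total exposure time."""
    return sum(observed) / sum(times)

random.seed(7)
true_rate = 0.2  # hypothetical default intensity per year
window = 5.0     # hypothetical observation window in years

latent = [random.expovariate(true_rate) for _ in range(5000)]  # simulated
times = [min(t, window) for t in latent]     # what the analyst records
observed = [t <= window for t in latent]     # False means censored

rate_mle = censored_exp_mle(times, observed)
# Naive alternative: pretend every recorded time is an actual default.
rate_naive = len(times) / sum(times)

print(rate_mle, rate_naive)
```

The naive estimate counts censored loans as defaults and therefore overstates the intensity, whereas the likelihood-based estimate lands close to the simulated value of 0.2.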

**Key MLE Properties for FRM:**
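1. **Consistent**: Converges in probability to the true parameter as the sample size grows (under standard regularity conditions)
2. **Asymptotically efficient**: In large samples its variance attains the Cramér-Rao lower bound, the lowest variance achievable by an unbiased estimator
3. **Asymptotically normal**: The sampling distribution of the MLE approaches a Normal distribution in large samples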
4. **Invariant**: If theta-hat is MLE of theta, then g(theta-hat) is MLE of g(theta)
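The invariance property is what lets a risk model plug MLE parameters straight into a derived quantity. The sketch below assumes Normal losses, simulated data, and a 99% Value at Risk defined as mu + z * sigma; these are illustrative choices, not taken from the answer. It also re-maximizes the likelihood directly over sigma on a small grid to show that the result agrees with the square root of the variance MLE.

```python
import math
import random
from statistics import NormalDist

random.seed(1)
# Hypothetical daily losses (simulated, not real data).
losses = [random.gauss(0.0, 2.0) for _ in range(250)]
n = len(losses)

mu_hat = sum(losses) / n
sse = sum((x - mu_hat) ** 2 for x in losses)
var_hat = sse / n               # MLE of sigma^2
sigma_hat = math.sqrt(var_hat)  # by invariance, the MLE of sigma

# By invariance again, the MLE of the 99% VaR is g(mu_hat, sigma_hat).
z99 = NormalDist().inv_cdf(0.99)
var99_hat = mu_hat + z99 * sigma_hat

def log_lik_sigma(sigma):
    """Log-likelihood as a function of sigma at mu = mu_hat (constants dropped)."""
    return -n * math.log(sigma) - sse / (2 * sigma ** 2)

# Direct check: search sigma on a grid within +/-10% of sigma_hat.
sigma_grid = [sigma_hat * (0.90 + 0.001 * k) for k in range(201)]
sigma_direct = max(sigma_grid, key=log_lik_sigma)

print(sigma_hat, sigma_direct, var99_hat)
```

The grid maximizer matches sigma_hat up to the grid spacing, so there is no need to re-estimate after changing the parameterization.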
