Spring 2021

Introduction

How are stock returns distributed? According to [1], stock returns were traditionally assumed to be normally distributed. Real-life data, however, tend to have thicker tails. This means that if we use a financial model based on normally distributed returns, extreme events will occur more often than the model predicts, which can lead to large, unexpected losses.

A quick search shows that people have been exploring different distribution functions, trying to find a better fit. One attempt [2] argues that returns are NOT normally distributed and proposes the Laplace distribution instead. Hmm... not a bad idea? Unfortunately, the authors only reject the normal distribution hypothesis using the Shapiro-Wilk test, and they assess the Laplace distribution simply by plotting the cumulative distribution functions (CDFs).
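
To make the limitation concrete, here is a minimal sketch of a Shapiro-Wilk test using scipy.stats.shapiro. The two samples are synthetic stand-ins generated with NumPy (an assumption for illustration, not the data from [2] or from this post), and the sample size of 250 is just a rough guess at one year of trading days.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic stand-ins for daily returns (NOT real stock data).
samples = {
    "normal": rng.normal(loc=0.0, scale=1.0, size=250),
    "laplace": rng.laplace(loc=0.0, scale=1.0, size=250),
}

# Shapiro-Wilk: the null hypothesis is that the sample comes from a normal distribution.
shapiro_results = {}
for name, sample in samples.items():
    statistic, p_value = stats.shapiro(sample)
    shapiro_results[name] = (statistic, p_value)
    print(f"{name:8s} W = {statistic:.4f}, p = {p_value:.4g}")
```

A small p-value only tells us that the normal hypothesis is rejected. It says nothing about whether the Laplace distribution (or any other candidate) fits, which is why a separate goodness-of-fit test is needed.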

Hence, I took the liberty of taking the paper one step further and trying a goodness-of-fit test for the Laplace distribution!

Data

For this simple test, I chose four big stocks from the S&P: Apple, Microsoft, GM and IBM. Not much reason behind this choice, really, other than that they are probably more prone to anomalies...? (which is also a clueless guess, by the way). I took the daily change in closing price of each stock from the first trading day of 2018 to the last. Data was gathered using the Yahoo Finance API.
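
As a rough sketch of the preprocessing step, the snippet below turns a series of closing prices into daily changes. The function names and the example prices are hypothetical, and the download from Yahoo Finance is left out. Since "daily change" could mean either the absolute difference or the percentage return, both are shown.

```python
import numpy as np

def daily_changes(closing_prices):
    """Absolute day-over-day change: close[t] - close[t-1]."""
    prices = np.asarray(closing_prices, dtype=float)
    return np.diff(prices)

def daily_returns(closing_prices):
    """Simple percentage return: (close[t] - close[t-1]) / close[t-1]."""
    prices = np.asarray(closing_prices, dtype=float)
    return np.diff(prices) / prices[:-1]

# Made-up closing prices, for illustration only.
example_closes = [100.0, 102.0, 101.0, 105.0]
print(daily_changes(example_closes))
print(daily_returns(example_closes))
```

Either series has one element fewer than the price series, because the first day has no previous close to compare against.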

Plotting the Laplace curve onto the histograms does look somewhat promising, at least more so than the normal curve. Hopefully, a statistical test will support this. So, my next step is to choose a suitable goodness-of-fit test!
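
For reference, one way to obtain the curves to overlay is a maximum-likelihood fit with scipy.stats. This is a minimal sketch under stated assumptions: the `changes` array is synthetic (drawn from a Laplace distribution with made-up parameters), and comparing log-likelihoods is my own addition for illustration, not a formal test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Synthetic daily changes (NOT real stock data); parameters are arbitrary.
changes = rng.laplace(loc=0.1, scale=2.0, size=250)

# Maximum-likelihood estimates: (location, scale) for each candidate.
lap_loc, lap_scale = stats.laplace.fit(changes)
norm_loc, norm_scale = stats.norm.fit(changes)

# Total log-likelihood of the data under each fitted distribution.
lap_loglik = stats.laplace.logpdf(changes, loc=lap_loc, scale=lap_scale).sum()
norm_loglik = stats.norm.logpdf(changes, loc=norm_loc, scale=norm_scale).sum()

print(f"Laplace: loc = {lap_loc:.4f}, scale = {lap_scale:.4f}, log-likelihood = {lap_loglik:.2f}")
print(f"Normal:  loc = {norm_loc:.4f}, scale = {norm_scale:.4f}, log-likelihood = {norm_loglik:.2f}")
```

For the Laplace distribution, the maximum-likelihood location is the sample median and the scale is the mean absolute deviation from it, so the fit is far less sensitive to outliers than the normal fit, which uses the mean and standard deviation.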

Kolmogorov-Smirnov Test...?

After many tabs and windows of pages, I noticed that most articles use the Kolmogorov-Smirnov test (K-S test) when proposing a new distribution hypothesis. But upon closer inspection, the K-S test does not seem suitable for this purpose.

Roughly speaking, the K-S test measures how well a distribution function describes a set of data using the maximum difference between two cumulative distribution functions: the empirical CDF of the actual data, and a hypothesized CDF.

Comparing two CDFs, with the black arrow representing the maximum difference.
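
The statistic described above can be sketched in a few lines. The helper name `ks_statistic` and the synthetic sample are assumptions for illustration. The result is checked against scipy.stats.kstest, which computes the same quantity.

```python
import numpy as np
from scipy import stats

def ks_statistic(sample, cdf):
    """Largest vertical gap between the empirical CDF and a hypothesized CDF."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    cdf_values = cdf(x)
    # The empirical CDF jumps at each data point, so check the gap
    # just after the jump (d_plus) and just before it (d_minus).
    d_plus = np.max(np.arange(1, n + 1) / n - cdf_values)
    d_minus = np.max(cdf_values - np.arange(0, n) / n)
    return max(d_plus, d_minus)

rng = np.random.default_rng(2)
sample = rng.laplace(loc=0.0, scale=1.0, size=250)  # synthetic, NOT real stock data

manual_d = ks_statistic(sample, stats.laplace.cdf)
scipy_result = stats.kstest(sample, "laplace")

print(f"manual D = {manual_d:.6f}")
print(f"scipy  D = {scipy_result.statistic:.6f}, p = {scipy_result.pvalue:.4f}")
```

One caveat: the p-value from the standard K-S test assumes the hypothesized distribution is fully specified in advance. If its parameters are estimated from the same data, that p-value is no longer valid as is.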

However, the maximum difference usually occurs somewhere near the peak of the distribution (mean/median), or at least not at the tails: far out in the tails both CDFs are already close to 0 or 1, so the gap between them is necessarily small. But as mentioned previously, one of the biggest reasons we consider stock returns to be not normally distributed is the thicker tails. Consequently, the K-S test is poorly suited to our purpose.
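
A small numerical check illustrates this. The sketch below compares a standard normal CDF with a Laplace CDF of the same mean and variance. These theoretical curves and the choice of x = 3 as "the tail" are assumptions for illustration, not the stock data.

```python
import numpy as np
from scipy import stats

# Laplace variance is 2 * scale**2, so this scale matches a unit-variance normal.
laplace_scale = 1.0 / np.sqrt(2.0)

x = np.linspace(-6.0, 6.0, 2401)
normal_cdf = stats.norm.cdf(x, loc=0.0, scale=1.0)
laplace_cdf = stats.laplace.cdf(x, loc=0.0, scale=laplace_scale)

# Vertical gap between the two CDFs: the quantity the K-S test looks at.
gap = np.abs(laplace_cdf - normal_cdf)
x_at_max = x[np.argmax(gap)]
max_gap = gap.max()
gap_at_3 = gap[np.argmin(np.abs(x - 3.0))]

# Ratio of tail probabilities P(X > 3): what matters for extreme events.
tail_ratio = stats.laplace.sf(3.0, scale=laplace_scale) / stats.norm.sf(3.0)

print(f"largest CDF gap {max_gap:.4f} at x = {x_at_max:.3f}")
print(f"CDF gap at x = 3: {gap_at_3:.4f}")
print(f"P(X > 3), Laplace relative to normal: {tail_ratio:.2f}x")
```

The largest gap sits within one standard deviation of the center, while the gap at three standard deviations is much smaller in absolute terms, even though the Laplace tail probability there is several times the normal one. A statistic based on the maximum absolute gap therefore pays little attention to exactly the region we care about.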