Sunday, 14 November 2010

Yo Momma is a Data Miner

Having a lot of data makes research easier. We now have more data in easily readable formats than ever before, and an amazing amount of computing power on our desktops (I have far more horsepower on my desk than NASA had in total in the 1980s).

Unfortunately, there's a flip side to that coin - we can easily find variables (or specifications) that seem to "predict" returns (or just about anything). In reality, we're often just overfitting the data.
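To see how easy it is to "find" a predictor, here is a minimal sketch using nothing but random numbers. Everything in it is made up for illustration: the "returns" are Gaussian noise, the 200 candidate variables are unrelated noise, and the function name `best_spurious_predictor` is my own. It screens all the candidates and keeps whichever one correlates best with the returns.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def best_spurious_predictor(n_obs=20, n_candidates=200, seed=0):
    """Screen pure-noise candidates and return the best-looking one."""
    rng = random.Random(seed)
    # Hypothetical "returns": pure noise, so nothing can truly predict them.
    returns = [rng.gauss(0, 1) for _ in range(n_obs)]
    best_idx, best_corr = -1, 0.0
    for i in range(n_candidates):
        # Each candidate is independent noise, unrelated to the returns.
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        c = pearson(candidate, returns)
        if abs(c) > abs(best_corr):
            best_idx, best_corr = i, c
    return best_idx, best_corr


idx, corr = best_spurious_predictor()
print(f"best of 200 random series: #{idx}, correlation {corr:+.2f}")
```

With only 20 observations and 200 tries, the winner will usually show a correlation that looks impressive on its own, even though by construction it has no relationship with the returns at all.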

Here's a pretty good piece on the topic titled "Yo Momma is a Data Miner", by David Leinweber, in which he fits a polynomial time-series regression to the S&P 500 with surprisingly good results (surprising, that is, if you don't follow what he's doing), particularly since he's using things like the sheep population and Bangladeshi butter production as regressors.
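The polynomial part of the trick can be sketched in a few lines. This is not Leinweber's regression or his data: as a simpler stand-in, I fit a degree-9 polynomial in time exactly through ten points of a made-up random walk (Lagrange interpolation, with a helper I've called `lagrange_eval`), then ask it to "forecast" the eleventh point.

```python
import random


def lagrange_eval(xs, ys, x):
    """Evaluate the unique degree len(xs)-1 polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


rng = random.Random(1)
# Hypothetical "index": a random walk, so there is nothing real to predict.
walk = [100.0]
for _ in range(10):
    walk.append(walk[-1] + rng.gauss(0, 1))

train_x = list(range(10))   # time steps 0..9 used for fitting
train_y = walk[:10]

# In sample, the polynomial passes through every training point.
in_sample_err = max(
    abs(lagrange_eval(train_x, train_y, x) - y)
    for x, y in zip(train_x, train_y)
)
# Out of sample, it has to extrapolate one step ahead.
out_of_sample_err = abs(lagrange_eval(train_x, train_y, 10) - walk[10])

print(f"worst in-sample error:      {in_sample_err:.6f}")
print(f"one-step-ahead error:       {out_of_sample_err:.2f}")
```

The in-sample fit is perfect because ten free coefficients can always match ten points. That says nothing about the next point, which is the whole problem with judging a model by how well it fits the data it was tuned on.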
