Jan 6, 2010
Some thoughts on Back Testing
Back testing is fairly common when analyzing the profitability of a strategy, but there are many other things to be considered besides returns. Much of this list came from perusing Nuclear Phynance (particularly, FDAXHunter’s input).
- Length of the Period Tested: Over what time-frame did you run the test? How long was that time-frame? What market conditions persisted over it? The goal should be to run the strategy on as diverse a time-frame as possible, to help you discover what market factors play a critical role in the success or failure of your method.
- Out of Sample Tests & Locations: Much like above, you want to have a fairly significant and diverse set of out of sample data to test on.
- Average Trade / Win / Loss: What does the average trade of the system look like? Are you normally profitable, or do you usually lose money, with a couple of fat-tail trades supplying all the profit?
- Volatility & Skewness of P&L Stream: Is your profitability stable? Do you have a fat loss tail? How skewed positive are you?
- Maximum Consecutive Losers / Winners: This is very important for when the system goes live. Are five bad trades in a row a reason to pull the system? Ten? What is normal for the system? When should we start getting concerned?
- Maximum Draw Down & Time: If we implemented the system, what sort of draw-downs would we have to stomach, and over how long would we have to stomach them? I don’t care much about a 1300% return over 5 years if for 4 of them, I faced an 80% draw-down. You would probably pull the plug long before the fifth year came around.
- Average Draw Down & Time: What does the average draw down look like? Is it stable?
- Percent of Winners Removed Until Neutral: What percentage of our best trades do we have to remove before the system breaks even? Is it only a few? Does our success rest on a handful of large winners, or on a broad, stable base of winning trades?
- Histogram of P&L: What does the P&L look like, historically? This is a visualization of the skew and volatility from above.
- Shape of Equity Curve: Are we talking about a long, smooth curve? A curve with lots of jumps? How much interim volatility between new highs?
- Optimal Parameter Location: Are the parameters we used in a stable location, or are they at a pin-point? If they are at a pin-point, the success of the model is most likely the result of data-mining rather than a true edge. We would like to see our parameters sitting on a plateau, where the model remains stable under moderate changes in the parameter values.
- Performance in Other Markets: How does the model perform on securities it wasn’t designed to trade? Does it succeed on similar securities? Does it fail on securities where the edge shouldn’t exist?
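Several of the trade-level checks in the list are easy to compute once you have the back-test's per-trade P&L. Here is a minimal sketch in Python, assuming the P&L arrives as a plain list of numbers, one per closed trade. The function names and the input format are my own illustration, not taken from any particular back-testing package, and the skewness is the simple population version.

```python
from statistics import mean, pstdev


def longest_run(pnl, predicate):
    """Length of the longest unbroken run of trades satisfying predicate."""
    best = current = 0
    for p in pnl:
        current = current + 1 if predicate(p) else 0
        best = max(best, current)
    return best


def trade_summary(pnl):
    """Average trade / win / loss and streak lengths for a list of trade P&Ls."""
    wins = [p for p in pnl if p > 0]
    losses = [p for p in pnl if p < 0]
    return {
        "avg_trade": mean(pnl),
        "avg_win": mean(wins) if wins else 0.0,
        "avg_loss": mean(losses) if losses else 0.0,
        "win_rate": len(wins) / len(pnl),
        "max_consecutive_losers": longest_run(pnl, lambda p: p < 0),
        "max_consecutive_winners": longest_run(pnl, lambda p: p > 0),
    }


def skewness(pnl):
    """Population skewness of the P&L stream (assumes the P&L is not constant)."""
    m = mean(pnl)
    sd = pstdev(pnl)
    return mean([(p - m) ** 3 for p in pnl]) / sd ** 3


def pct_winners_removed_until_neutral(pnl):
    """Fraction of winning trades, removed largest first, needed to push
    total P&L to zero or below. Returns 0.0 if the system is not profitable."""
    total = sum(pnl)
    wins = sorted((p for p in pnl if p > 0), reverse=True)
    if total <= 0 or not wins:
        return 0.0
    removed = 0
    for w in wins:
        total -= w
        removed += 1
        if total <= 0:
            break
    return removed / len(wins)


# Made-up example: one fat winner carries the whole system.
example_pnl = [100, -20, -30, 5, -10, -15, 40, -5]
summary = trade_summary(example_pnl)
fragility = pct_winners_removed_until_neutral(example_pnl)
```

In the made-up example, removing the single largest winner is enough to turn the total negative, which is exactly the kind of fragility the "winners removed until neutral" check is meant to expose.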
All of these things should be considered alongside a simple profitability analysis, or else you will end up with a ‘successful back-test of three years, blow up in three days’ scenario.
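The draw-down questions can be checked the same way. The sketch below assumes the equity curve is a list of strictly positive account values sampled at regular intervals; `drawdown_stats` is a hypothetical helper of my own. It measures depth as a fraction of the prior peak and measures time as the number of periods spent below that peak.

```python
def drawdown_stats(equity):
    """Return (max draw-down as a fraction of the prior peak,
    longest number of consecutive periods spent below a prior peak)."""
    peak = equity[0]
    max_dd = 0.0
    longest = current = 0
    for value in equity[1:]:
        if value >= peak:
            # New high (or back to the old one): the draw-down is over.
            peak = value
            current = 0
        else:
            current += 1
            max_dd = max(max_dd, (peak - value) / peak)
            longest = max(longest, current)
    return max_dd, longest


# Made-up equity curve: peaks at 120, dips to 90, recovers to new highs.
example_equity = [100, 120, 90, 95, 130, 125, 140]
max_dd, longest_underwater = drawdown_stats(example_equity)
```

Depth and duration are reported separately because they answer different questions: the first is how much pain you would have had to stomach, the second is for how long.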