Jul 22 2009

Some thoughts on developing quantitative models…

When developing a quantitative model, I typically break my process down into seven steps. I follow the standard outline that econometricians use when developing models.

Theory
The very first step to take in developing a quantitative system is defining the underlying theory behind the model. If we have recognized a phenomenon in the markets that we feel we can exploit, we want to ask ourselves why the phenomenon exists, and more importantly, will it continue to exist? Effectively, what we are asking is, “if this idea works in practice, does it work in theory?” In this stage we develop a fundamental understanding of what we will be modeling and why we are modeling it. By taking the time to define the underlying theory, we can avoid ‘data-mining’ strategies that have no defendable basis.

Data Collection
In the second stage, we collect data that is relevant to our model. Availability of data has a large influence on how the model is designed, so we must keep in mind not only what information we have available now, but what we will have available going forward.

Specifying the Model
In model specification, we begin outlining how the data we have collected will influence the results of our model. In econometrics, this may be determining whether certain data elements should have a positive or negative effect, or what form the data should be in (normal, logarithmic, exponential). More broadly, we are trying to determine how to translate the data we have collected into signals we can use.

This is the step on which the most time will be spent. Modelers should do their best to keep their model as simple as possible, using Occam’s razor at each turn, to prevent over-fitting. We want to ensure that the model is as robust as possible while using as few inputs as possible. John von Neumann once said, “with four parameters I can fit an elephant, and with five I can make him wiggle his trunk.” The goal here should be to create the simplest model possible that still explains the underlying theory.

To ensure that models will work over a large input range, as many parameters as possible should be made adaptive. This means that instead of changing them by hand, the model should change them itself based on the data it is receiving. This helps the model adapt to new conditions.
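As a minimal sketch of what an adaptive parameter might look like, the function below replaces a hand-tuned signal threshold with one that scales with recent volatility. The function name, the `window` and `k` values, and the threshold rule itself are illustrative assumptions, not a method prescribed here.

```python
from statistics import pstdev


def adaptive_signals(returns, window=20, k=1.0):
    """Emit 1, 0, or -1 per period using a threshold that tracks recent volatility.

    `window` and `k` are hypothetical settings chosen for illustration.
    """
    signals = []
    for t in range(len(returns)):
        history = returns[max(0, t - window):t]  # strictly past observations
        if len(history) < 2:
            signals.append(0)  # not enough history to estimate volatility
            continue
        threshold = k * pstdev(history)  # the model re-estimates this itself
        if returns[t] > threshold:
            signals.append(1)
        elif returns[t] < -threshold:
            signals.append(-1)
        else:
            signals.append(0)
    return signals
```

The same 1% move triggers a signal in a quiet market but is ignored in a volatile one, without anyone changing a constant by hand.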

Estimating Results
After the model has been specified, tests should be run on ‘out-of-sample’ data. Ideally, we would like to perform tests on as much data as possible. What is important is that the data we test on should not be the same data we used to help specify and determine our model. We want to see how well our model works on data it has never seen before. If it does extremely poorly, we should attempt to re-specify our model.
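One simple way to keep test data separate, sketched here under the assumption that observations are already in time order, is to cut the series once and never shuffle it. The function name and the 70% default are hypothetical choices.

```python
def chronological_split(observations, in_sample_fraction=0.7):
    """Split time-ordered data into in-sample and out-of-sample parts.

    No shuffling: the out-of-sample part is always later than the in-sample part.
    """
    if not 0.0 < in_sample_fraction < 1.0:
        raise ValueError("in_sample_fraction must be strictly between 0 and 1")
    cut = int(len(observations) * in_sample_fraction)
    return observations[:cut], observations[cut:]
```

The model is specified using only the first part; the second part is touched only when estimating results.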

Running Diagnostic Tests
Diagnostic tests in econometrics often deal with determining whether certain features of the model are statistically significant, or whether we are allowed to make certain assumptions. For example, is our error term white noise, or is it auto-regressive? Does our data have any breaks? Are we using too many parameters to fit our data? What is our trade-off between bias and goodness of fit? While many of the same questions can be asked in developing quantitative investment methods, I choose to use this section of the development cycle to run general tests that quantify the robustness and performance of our model.

Just because a method is mathematical does not mean it is precise. Applying a numerical method to a set of data will result in numbers, just not necessarily meaningful ones. It is important, then, to run tests to ensure that our model is robust and statistically significant. We should ensure that our method holds up over a large range of parameters, and that the model succeeds in different time-frames and markets. Sometimes, we would even like to test that our model does not work on different markets and time-frames, if our theory calls for it. If we have written a model to take advantage of certain liquidity occurrences in forex markets based on time of day, we would not expect it to work on U.S. equities, because that would not follow from our underlying thesis. If it does, then we should seriously rethink our theory or our model. On the other hand, if we are running a breakout strategy on U.S. futures, we should expect it to work over several time-frames and futures markets, unless we are specifically taking advantage of certain qualities of a specific instrument, which would then be defined in our theory.

If we find that our model fails our diagnostic tests, we would want to return to specifying our model.
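A robustness check over a range of parameters could be sketched as a simple sweep. The toy momentum rule below is a stand-in assumption used only to have something to sweep; the function names and the idea of reporting the share of profitable settings are likewise illustrative.

```python
def momentum_return(prices, lookback):
    """Summed per-period return of a toy rule (hypothetical, for illustration):
    long the next period if price rose over `lookback` periods, else short."""
    total = 0.0
    for t in range(lookback, len(prices) - 1):
        position = 1 if prices[t] > prices[t - lookback] else -1
        total += position * (prices[t + 1] / prices[t] - 1.0)
    return total


def parameter_sweep(prices, lookbacks):
    """Map each lookback to its return, to see whether results hinge on one setting."""
    return {lb: momentum_return(prices, lb) for lb in lookbacks}


def fraction_profitable(sweep):
    """Share of swept parameter values with a positive return."""
    return sum(1 for r in sweep.values() if r > 0) / len(sweep)
```

If only one or two lookbacks out of many are profitable, that is a hint the model has been fitted to the data rather than to the theory.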

Running Hypothesis Tests
After developing a model and running diagnostic tests, we want to determine whether or not our hypothesis is true. Ultimately, we want to ask, “if it works in theory, does it work in practice?” Sometimes a theory may not take into account realities such as commission and slippage, which may reduce the edge of a certain strategy. Our hypothesis, more often than not, is quite simply that the model will provide a methodology for creating alpha. Ultimately, this test comes in two parts. First, does it generate the returns we are looking for? Second, is it statistically significant?
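To make the effect of commission and slippage concrete, here is a rough sketch that charges a flat cost whenever the position changes. The flat-cost model, the `cost_per_trade` value, and the function name are all simplifying assumptions; real costs depend on the instrument and broker.

```python
def net_returns(signals, asset_returns, cost_per_trade=0.001):
    """Per-period strategy returns after trading costs.

    signals[t] is the position (1, 0, or -1) held during period t.
    A hypothetical flat cost is charged per unit of position change.
    """
    net = []
    previous = 0  # assume we start flat
    for s, r in zip(signals, asset_returns):
        turnover = abs(s - previous)  # 1 to enter or exit, 2 to flip sides
        net.append(s * r - cost_per_trade * turnover)
        previous = s
    return net
```

A strategy that flips position often can show a healthy gross return and still lose its edge once costs are subtracted.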

The first is fairly easy to perform. The second one takes a bit of creativity. One of the best tests I have found is to simply determine whether our model is significantly better than ‘random.’ Let’s assume that your model outputs 1, 0, or -1 for buy, neutral, or short signals. When you run your model over the data, determine the percentage of 1s, 0s, and -1s that are in the output. Then, generate several thousand random outputs with the same distribution of 1s, 0s, and -1s. Determine the theoretical returns of these systems. If your system does not decisively outperform the vast majority of these systems, it is more than likely that your model simply stumbled upon a method that gives the right distribution of 1s, 0s, and -1s instead of actually modeling the underlying data.

For example, let’s say I have data on the S&P 500 from 2003 to 2007, and I am attempting to generate monthly buy or sell signals. Let’s say I start with a simple system that always generates a buy signal. What we would find is that even though our model performs extremely well, it does not beat random. Every random model would also have a distribution of 100% buy signals, and give an identical return. What we have discovered, then, is not that our model appropriately describes the data, but rather that it chose a distribution of signals that happened to maximize return.
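The comparison against random described above can be sketched in a few lines. Shuffling the model's own signals is one way (an assumption on my part) to generate random outputs with exactly the same mix of 1s, 0s, and -1s; the function names, the trial count, and the fixed seed are illustrative.

```python
import random


def strategy_return(signals, asset_returns):
    """Summed return from holding signals[t] during period t (costs ignored)."""
    return sum(s * r for s, r in zip(signals, asset_returns))


def random_benchmark(signals, asset_returns, trials=5000, seed=0):
    """Share of shuffled signal sequences that do at least as well as the model.

    A small value suggests the model's timing matters, not just its signal mix.
    """
    rng = random.Random(seed)  # fixed seed so the check is repeatable
    actual = strategy_return(signals, asset_returns)
    shuffled = list(signals)
    at_least_as_good = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # keeps the exact counts of 1s, 0s, and -1s
        if strategy_return(shuffled, asset_returns) >= actual:
            at_least_as_good += 1
    return at_least_as_good / trials
```

For the always-buy system, every shuffle is identical to the original, so the function returns 1.0: the model is indistinguishable from random no matter how good its return looks.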

Conclusion
Now that our thesis has been defined and our model has been specified, tested for robustness, and shown to generate positive returns in a statistically significant manner, we are ready to run it in the wild, and with confidence: we have successfully developed a model that is defendable both in theory and in practice!
