How many years in back testing?

Discussions about the testing and simulation of mechanical trading systems using historical data and other methods. Trading Blox Customers should post Trading Blox specific questions in the Customer Support forum.
marriot
Roundtable Knight
Posts: 347
Joined: Thu Nov 20, 2008 3:02 am

How many years in back testing?

Post by marriot » Wed May 30, 2012 6:50 am

Is it true that it is better to test using all available data?
Is it true that having more data would be better?
Is it true that markets have not changed?
For how long will a test based on the last 10, 20, or 40 years of data stay in sync with the future?

:D

sluggo
Roundtable Knight
Posts: 2986
Joined: Fri Jun 11, 2004 2:50 pm

Post by sluggo » Wed May 30, 2012 8:51 am


BuyHigh SellLow
Roundtable Fellow
Posts: 50
Joined: Wed Apr 27, 2011 12:46 pm
Location: U.S.

Post by BuyHigh SellLow » Wed May 30, 2012 11:41 am

sluggo, what can I say, you are pretty awesome. Easily half of my favorite TBlox Forum posts are ones you've written.

squaredQ
Full Member
Posts: 23
Joined: Wed Feb 01, 2012 5:27 pm

Post by squaredQ » Sat Jun 02, 2012 5:51 am

I like this question a lot, because it's something I've given quite a lot of thought to. Without divulging my own conclusions in full, I'll make a few comments (also with respect to the other great link posted):

The market is an open system, so we can never really be certain of the true population parameters and behaviours; we can only make estimates about these variables.

There is a great deal of difference between measuring robustness and optimising parameters. On the one hand, more data means we get to see more of the unexpected, non-parametric shocks that were absent from earlier data (the 1987 crash could not reasonably have been anticipated by any foresight, statistical or otherwise), and we can better anticipate likely boundaries. Other measures, such as correlations, also change over time; many of them can and do change drastically rather than remain stable. As the market's complexity changes and hitherto unseen behaviours are observed, more about its adaptive changes and responses can be known, studied, and prepared for. I would therefore conclude that the most robust system is the one built with access to the most historical data, because it lets us better model, anticipate, and design around realistic anomalies, behaviours, and new boundaries of stress going forward.

On the other hand, optimisation and tweaking do not necessarily benefit from LOTS of data. If you accept that, amongst the infinite possible trajectories, there is some kind of short-lived persistence over small time scales, then the most recent data should be the closest to optimal. In this sense the latest arrival has the best of both worlds, which makes a lot of sense from the perspective of evolution as well: he gets to look back and adapt to a larger body of prior evidence, converging closer towards the optimal region, if you will.
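The "tune on recent data, trade it forward" idea above is essentially walk-forward optimisation. A minimal sketch in Python (this is not Trading Blox code; the toy trend rule, the parameter grid, and the function names are all my own illustrative inventions):

```python
import numpy as np

def trend_pnl(returns, lookback):
    """P&L of a toy rule: hold the sign of the trailing `lookback`-bar mean return."""
    pnl = 0.0
    for t in range(lookback, len(returns) - 1):
        signal = np.sign(np.mean(returns[t - lookback:t]))
        pnl += signal * returns[t + 1]
    return pnl

def walk_forward(returns, lookbacks, train=250, test=50):
    """Repeatedly pick the lookback that scored best on the most recent
    `train` bars, then apply it out of sample over the next `test` bars."""
    total, t = 0.0, train
    while t + test <= len(returns):
        best = max(lookbacks, key=lambda lb: trend_pnl(returns[t - train:t], lb))
        # Include `best` warm-up bars so the first out-of-sample signal is valid.
        total += trend_pnl(returns[t - best:t + test], best)
        t += test
    return total

rng = np.random.default_rng(42)
r = rng.normal(scale=0.01, size=1500)   # synthetic daily returns
print(walk_forward(r, lookbacks=[10, 50, 200]))
```

On pure noise like this the out-of-sample P&L should hover around zero; the point of the sketch is only the mechanics of re-optimising on a trailing window rather than on all history.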

I just read something pretty profound in the latest Schwager book that echoes some of my own thoughts on the lookback-duration question; coming from one of my heroes, it gives me a lot of comfort. It relates directly to the choice of correlation lookback period as an input:

"Since correlations between markets change so radically over time, even changing sign, how long of a lookback period do you use?" — Schwager to Thorp

"We found that 60 days was about best. If you use too short of a window, you get a lot of noise; if you use too long of a window, you get a lot of old information that isn't relevant." — Ed Thorp

So there you have one concrete view on the length of an input parameter's data window, from someone who is certainly an exemplar worth borrowing experience from.
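The noise-versus-staleness tradeoff Thorp describes is easy to reproduce in miniature. A hedged sketch on synthetic data (the regime-flip setup and all names are illustrative, not anything Thorp published): a short correlation window reacts to a sign change within a few bars but wobbles a lot, while a long window is smooth but takes far longer to notice the flip.

```python
import numpy as np

def rolling_corr(x, y, window):
    """Trailing-window Pearson correlation; NaN until a full window exists."""
    out = np.full(len(x), np.nan)
    for i in range(window, len(x) + 1):
        out[i - 1] = np.corrcoef(x[i - window:i], y[i - window:i])[0, 1]
    return out

# Two synthetic return series whose true correlation flips from +0.5 to -0.5
# halfway through -- a caricature of the sign changes Schwager mentions.
rng = np.random.default_rng(0)
n = 1000
common = rng.normal(size=n)
flip = np.where(np.arange(n) < n // 2, 1.0, -1.0)
a = common + rng.normal(size=n)
b = flip * common + rng.normal(size=n)

for w in (20, 60, 250):
    c = rolling_corr(a, b, w)
    wobble = np.nanstd(np.diff(c))  # bar-to-bar noise in the estimate
    # Bars after the regime change until the estimate first turns negative.
    lag = next((i - n // 2 for i in range(n // 2, n) if c[i] < 0), None)
    print(f"window={w:3d}  noise={wobble:.3f}  bars-to-flip={lag}")
```

Running this shows the short window flipping sign quickly but with a noisy estimate, and the long window doing the opposite; an intermediate window like Thorp's 60 days sits between the two extremes.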
