Guidelines for system rules & degrees of freedom
I am looking for some general guidelines for mechanical system design to help avoid overfitting to past data. As a general rule it would make sense to minimise the number of trading rules and to use only rules which come into play frequently (i.e. avoid rules which merely eliminate a small number of big losers or capture a small number of big winners).
When I started with my first mechanical trend following system I had 10 rules:
Entry Trend Filter: 3 rules
Entry volatility filter: 1 rule
Entry timing trigger: 2 rules
Liquidity filter: 2 rules
Initial stop: 1 rule
Trailing stop: 1 rule
Total: 10 rules
This system has worked fine in real time, but I have been concerned about the number of degrees of freedom and the potential for overfitting during my optimisation. Note that during optimisation I am not searching for the 'best' parameter combinations but rather for broad hills of good performance (in line with the discussions by Pardo, Stridsman, etc.).
I have been working on a modified (simplified) system, built on a similar philosophy, that has the following:
Entry Trend Filter: 1 rule
Entry volatility filter: 1 rule
Entry timing trigger: 1 rule
Liquidity filter: 0 rules
Initial stop: 1 rule
Trailing stop: 1 rule
Total: 5 rules
This works equally well (if not better) in historical testing, and generates many more trades.
What guidelines are other mechanical traders using to ensure their systems do not use up too many degrees of freedom? How many rules is too many, and how many is too few to generate strong performance?
Adrian77,
You already understand the concept: a simpler (but not too simple) system works more robustly than a more complex one. Unfortunately, beyond that I don't know of any numerical rule that tells you exactly how many rules are too many.
However, Walk Forward testing will likely tell you whether or not you have over-optimized. I would suggest using 20 years' worth of data: use the first 10 years for system development, then use the last 10 years for Walk Forward testing (10 tests, 1 year at a time) to prove the robustness of the system.
Unfortunately one can only do that manually in TB (I wish it could be done automatically over any time period, even bar by bar), but it can be done.
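A minimal Python sketch of this manual walk-forward split, for anyone who wants to script it outside TB. `develop_fn` and `backtest_fn` are hypothetical stand-ins for whatever your platform does (the toy versions here just return fixed values so the sketch runs):

```python
# Walk-forward sketch: optimise on the first 10 years only, then test the
# frozen parameters on each of the remaining years, one year at a time.
def walk_forward(years, develop_fn, backtest_fn, in_sample=10):
    dev_period = years[:in_sample]
    params = develop_fn(dev_period)          # optimise only on in-sample data
    results = []
    for test_year in years[in_sample:]:      # parameters stay frozen here
        results.append((test_year, backtest_fn(params, test_year)))
    return params, results

# Toy stand-ins so the sketch is runnable; replace with real logic.
years = list(range(1988, 2008))              # 20 years of data
develop = lambda period: {"ma_len": 50}      # pretend optimiser output
backtest = lambda params, year: 0.1          # pretend 10% return each year

params, oos = walk_forward(years, develop, backtest)
```

Consistent out-of-sample results across the 10 test years are the evidence of robustness; a sharp drop-off after the in-sample period suggests over-optimization.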
Maybe others have more concrete ideas for the maximum allowable degrees of freedom, but I don't. I just know how I prefer to test for robustness.
Just my 2c
explorer
Adrian77,
I ran across some answers to your question while I was searching for something else.
According to this book:
Beyond Technical Analysis,
How to Develop and Implement a Winning Trading System
Tushar S. Chande, PhD
"The statistical theory of design of experiments says that even complex processes are controllable using five to seven "main" variables. It is rare for a process to depend on more than ten main variables, and it is quite difficult to reliably control a process that depends on 20 or more variables. It is also rare to find processes that depend on the interactions of four or more variables. Thus, the effect of higher-order interactions is usually insignificant. The goal is to keep the overall number of rules and variables as small as possible."
There are another dozen pages in this book that give more details, formulas, and charts related to the number of rules, but I can't put it all here. You may find a used copy of the book if more detail is needed.
explorer
I've read about two thirds of the book now and I want to congratulate you on a job very well done! I write software and systems for myself and have studied and practiced trading for about 10 years now. I've learned most of the things mentioned in your book the hard way. But when I started reading your book it was almost as if I was reading my own book, meaning that it gave me great joy knowing that my way of handling and seeing things is probably a good one. It was very recognizable, especially the parts referring to game theory. Most people focus way too much on the percentage of winners (and hence losers) in the expectancy formula, neglecting what are perhaps the two most important variables: the average size of winners and losers. I wrote my own software because I wanted to research adaptive and synthetic systems myself (even if it meant just enjoying the beauty of it all while feeding my passion), also incorporating fuzzy logic to teach the software subjective terms such as 'a big candle' and 'probable crossover'. Anyway, I had a lot of fun doing all of this, while making some decent money (nothing exuberant), not surprisingly with the most simple of systems. And, to wrap this story up: in trying to maximize gains and minimize losses, I see a lot of people using stop losses and options to minimize losses, but maximizing gains always seems to be done merely through leverage. I focussed a lot on 'reverse pyramiding' so I could test my systems when I told them to load up the more it becomes clear they have a winner, and with very nice results I might add. That's what I had already been doing for several years in live trading, and I couldn't find software where it was easy to test this kind of stuff. In the meantime a lot has changed in 'trading software land' for the better.
I was very pleased when I began to read that you had about the same background as I did when you started. (For that matter, I myself hold a master's in software and electronics and currently teach statistics, software design and analysis, design patterns, and the like.) Am I correct in stating that you don't trade anymore yourself?
I have a few questions regarding your book at this point (I arrived at page 163 this morning):
1) If it is so, as you state and I'm willing to believe, that one has to look out for overfitting/over-optimizing or curve fitting by looking at, among other things, rules that are used infrequently, then can the number of times a rule is involved in the decision making in a test run, relative to the number of trades registered, function as a robust measure of system quality and of how subject a system is to over-optimization?
2) My second question concerns your text on the optimization paradox on page 171, where you try to prove that optimizing is the way to go but almost guarantees lesser results in the future. In fact, as the future is no exact replica of the past, figure 115 would almost certainly be a different graph, if not a totally different one. So value A could in fact be the optimum value for that parameter, while value B might become the worst; the graph might be vertically mirrored, for that matter. So am I looking at this totally wrong, or is there any basis for assuming that the optimization graph will likely be the same in the future as in the past, to the extent that values A and B will still be in the neighbourhood of the optimum?
3) One of the things I also wonder about is why we couldn't use trend analysis on the equity curve just as we do on prices. What I mean is this: periods of good system performance seem to alternate with periods of lesser performance (drawdowns). If we see the equity curve breaking its trend (measured however one wants: trendlines, averages being crossed, you name it), we could stop trading the system (or even go the other way and start trading its opposite). When the equity curve resumes a positive trend we could start trading it again.
4) One thing I disagree about, at least a little, is your warning about not missing any trends. Isn't it basic statistics that one can expect the same results, on average, over a long time when picking random trades from a trading system? Say, for instance, we have a system performing nicely both in tests and in real trading, and we did 10000 trades. Couldn't one expect approximately the same performance if we did only 8000 trades picked randomly through its history?
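The equity-curve filter of question 3 can be sketched in a few lines; the moving-average length and the trade-level granularity here are my own assumptions, purely for illustration:

```python
# Trade the system only while its equity curve is at or above a simple
# moving average of itself; stand aside when the curve breaks below it.
def equity_filtered(trade_pnls, ma_len=5):
    equity, curve = 0.0, []
    filtered_equity = 0.0
    active = True
    for pnl in trade_pnls:
        if active:
            filtered_equity += pnl        # we only take this trade when "on"
        equity += pnl                     # underlying equity always updates
        curve.append(equity)
        if len(curve) >= ma_len:
            ma = sum(curve[-ma_len:]) / ma_len
            active = equity >= ma         # re-enable once the trend turns up
    return filtered_equity
```

With a losing streak like `[1, 1, -5, -5, 1, 1]` and `ma_len=2`, the filter sits out part of the drawdown and ends at -2 instead of the raw system's -6; whether this helps in general is exactly the open question being asked.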
Just my two cents.
Looking forward to a reaction!
Dirk
Hi Dirk, thought I'd take a stab at some of this, if you don't mind.
In expectancy, the win% and the win/loss size are both important, and they generally work against each other: it's hard to find a system that has both, and to the extent that I maximize one, I degrade the other. I keep in mind that it can be psychologically hard to trade a low win%, and a low win% will increase my risk of going out of business, given the same expectancy as a different system with a higher win%.
I remember to add "inventory turnover" to my evaluation. If choosing between two systems with equal expectancy, I would choose the one that gave me more trades, because more money risked at a given expectancy means more money earned.
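The expectancy and turnover arithmetic can be made concrete; all numbers below are illustrative, not from any system in this thread:

```python
# Per-trade expectancy in R (risk units): win% times average win, minus
# loss% times average loss. avg_loss is the positive size of a losing trade.
def expectancy(win_rate, avg_win, avg_loss):
    return win_rate * avg_win - (1 - win_rate) * avg_loss

sys_a = expectancy(0.40, 2.5, 1.0)   # 0.40*2.5 - 0.60*1.0 = 0.4 R per trade
sys_b = expectancy(0.60, 1.5, 1.0)   # 0.60*1.5 - 0.40*1.0 = 0.5 R per trade

# "Inventory turnover": annual expectation is expectancy times trade count,
# so the lower-expectancy system can still earn more per year.
annual_a = sys_a * 200               # 200 trades/year -> 80 R
annual_b = sys_b * 50                # 50 trades/year  -> 25 R
```

This is why two systems with equal per-trade expectancy are not equal in practice: the one that trades more often compounds the edge more times per year.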
Practically speaking, if I had a system that seemed to work but I wanted to see if the rules or filters are redundant or superfluous, I might try deleting the rules, one at a time, and rerunning the tests. I would keep in mind that substantially similar test results are probably the same thing as exactly equal test results (because backtesting isn't a 100% accurate picture of the future) and if I got the same results with fewer rules, I would go for fewer rules.
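That one-rule-at-a-time deletion test can be sketched as follows; `score` is a hypothetical stand-in for whatever backtest metric you use (MAR, CAGR, etc.), and the tolerance encodes "substantially similar counts as equal":

```python
# Rerun the backtest with each rule removed in turn; flag rules whose
# removal changes the result by less than a tolerance as redundant.
def ablate(rules, score, tolerance=0.05):
    base = score(rules)
    redundant = []
    for r in rules:
        without = [x for x in rules if x != r]
        if abs(score(without) - base) <= tolerance * abs(base):
            redundant.append(r)      # result barely moved: rule adds little
    return redundant

# Toy stand-in: only "trend" and "stop" actually move the score.
def score(rules):
    return (1.0 if "trend" in rules else 0.0) + (0.5 if "stop" in rules else 0.0)

print(ablate(["trend", "volatility", "timing", "stop"], score))
```

In the toy case the filter rules fall out as redundant and the two rules carrying the performance survive, which is the outcome the paragraph above argues for: same results with fewer rules means keep fewer rules.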
Google anti-martingale systems and Kelly betting with regard to question 3.
I think the warning about not missing any trends comes from the system they were trading, which had low win% and a high win/loss size. In that type of system, you don't want to miss the one trade that might make your year, since a very few trades contribute most of the profits. In a system with a high win% and lower win/loss, missing a few trades doesn't hurt.
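A toy illustration of that last point, with assumed numbers: in a skewed trend-following profile a handful of big winners carry the whole result, so missing those few trades is costly even though most trades are small.

```python
# 100 trades: low win%, fat right tail (the classic trend-following shape).
trades = [-1] * 70 + [0.5] * 25 + [30] * 5
total = sum(trades)                   # -70 + 12.5 + 150 = 92.5

# Miss the 5 big winners (just 5% of trades) vs. miss 5 small losers.
without_winners = total - 5 * 30      # -57.5: the system now loses money
without_losers  = total + 5 * 1       # 97.5: barely changes anything
```

Random subsampling (question 4) only preserves performance on average; in a distribution this skewed, the unlucky samples that happen to exclude the big winners are disastrous, which is why the "don't miss any trends" warning applies to this kind of system specifically.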
Here are three Blox systems with different numbers of user-adjustable parameters. One of them has 4 parameters, another has 5 parameters, and a third system has 6 parameters.
They embody the same basic trading idea and differ only in small details. The main idea is, construct a moving average line and then draw volatility bands above and below the moving average. When price breaks above the upper volatility band, get long. When price breaks below the lower volatility band, get short.
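This is not the actual Blox code, but the bands idea just described can be sketched roughly as follows; the lookback length and band multiplier are arbitrary placeholders (two of the user-adjustable parameters being discussed):

```python
# Moving average with volatility bands: long on a close above the upper
# band, short on a close below the lower band, else no new signal.
def band_signal(closes, ma_len=20, band_mult=2.0):
    signals = [0] * len(closes)
    for i in range(ma_len, len(closes)):
        window = closes[i - ma_len:i]
        ma = sum(window) / ma_len
        var = sum((c - ma) ** 2 for c in window) / ma_len
        band = band_mult * var ** 0.5    # stdev as the volatility unit
        if closes[i] > ma + band:
            signals[i] = 1               # breakout above upper band: long
        elif closes[i] < ma - band:
            signals[i] = -1              # breakout below lower band: short
    return signals
```

Each extra refinement (a different volatility measure, separate entry/exit bands, a confirmation delay) is exactly one more user-adjustable parameter, which is how the 4-, 5-, and 6-parameter variants differ.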
When I backtest these systems on several different portfolios of futures markets, over five, ten, twenty, and thirty year periods, with tens of thousands of different parameter settings, I find that in the end I prefer the 6-parameter system the most, the 5-parameter system second best, and the 4-parameter system least. Please note that I'm not trying to proclaim which system is "best" of the three; I'm only giving my own personal opinion based upon my own subjective judgement. Just because I like the 6-param system doesn't mean you will like it too. (Incidentally, the parameter value settings I prefer are not necessarily the ones shown in the attached images.)
But sometimes people want to know "what should I do?" Should I choose the 4-parameter system because it eats up the fewest degrees of freedom, because it has the smallest number of rules? Should I reject the 5- and 6-parameter systems because the 4-parameter system is more compact? In my opinion, there is no unique "right answer" to this question that applies to all traders, all account sizes, and all human personalities. In my opinion, Rabelais had the right idea: I go to seek a vast perhaps.
*the 3rd system is a free download in the customers section of this website: viewtopic.php?p=25698&highlight=therm%2A#25698
[Attachment bands_systems.png: volatility bands systems with differing numbers of user-adjustable parameters]
Curve fitting impossible with TBB
If my dated mathematics is correct, for a graph with N turning points, you need an equation with N+2 variables.
E.g. a quadratic with 1 turning point can be perfectly described by:
y = ax^2 + bx + c
For every extra turning point, you need an additional variable to capture it.
Given a market of 40 or so futures over 10 years, the number of turning points becomes astronomical.
In such circumstances, using simple systems such as Turtle, Donchian & Triple MA, any serious curve-fitting is impossible, and you are truly testing your statistical probability of making money. I think you would be hard pushed to prove even partial curve fitting simply by optimizing your parameters to get your chosen MAR/CAGR.
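The N+2 figure can be checked by counting polynomial coefficients:

```latex
\[
y = a_d x^d + a_{d-1} x^{d-1} + \dots + a_1 x + a_0
\qquad \text{($d+1$ coefficients)}
\]
\[
y' = d\,a_d x^{d-1} + \dots + a_1
\quad \text{has degree $d-1$, hence at most $d-1$ real roots,}
\]
so the curve has at most $d-1$ turning points. To allow $N$ turning points we need
\[
d - 1 \ge N \;\Longrightarrow\; d \ge N+1 \;\Longrightarrow\; N+2 \text{ coefficients,}
\]
matching the $N+2$ variables claimed above.
```

With only 3 to 6 parameters against the astronomical number of turning points in 40 markets over 10 years, the fit is enormously underdetermined, which is the argument being made.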
Spot on, dmford.
I was trying to answer the 12/19/2007 questions specifically with my earlier comment, but I'll take a stab at this, too.
If I were analyzing a stock market index, I might try to fit various technical, valuation, and econometric data to the index. Looking at 20+ years of data with points at every week, I wouldn't feel averse to a dozen or more variables in a multivariate setting; with 1000 data points, I think there's little danger of overfitting. Ditto with maybe six to ten variables for selecting stocks to trade, or four to eight technical rules for commodities if applied to a batch of them, etc.
To me, a more important worry than overfitting is the inclusion of variables with spurious or insignificant correlation, which adds noise to the data and complication to the process.
In a regression equation, one gets "Student's t" and "p-value" output by which to evaluate the individual variables, as well as an "F-score" for the equation as a whole. In a system like Blox, perhaps a reverse stepwise approach, where variables are pruned off one at a time until test results degrade significantly, is an option.
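That reverse stepwise idea can be sketched as follows; `score` is a hypothetical validation metric, and the loop repeatedly drops whichever variable hurts it least, stopping once every removal would degrade the result by more than a tolerance:

```python
# Backward elimination: at each step, remove the variable whose removal
# leaves the best score, until any removal costs more than max_drop.
def backward_stepwise(variables, score, max_drop=0.05):
    current = list(variables)
    base = score(current)
    while len(current) > 1:
        candidates = [(score([v for v in current if v != x]), x) for x in current]
        best_score, drop_var = max(candidates)   # least-damaging removal
        if base - best_score > max_drop * abs(base):
            break                                # everything left earns its keep
        current.remove(drop_var)
        base = best_score
    return current

# Toy stand-in: two variables matter, the rest are noise.
def score(vs):
    return (2.0 if "trend" in vs else 0.0) + (1.0 if "vol" in vs else 0.0)

print(backward_stepwise(["trend", "vol", "noise1", "noise2"], score))
```

In the toy run the two noise variables are pruned and the two that carry the score survive, which is the spurious-correlation cleanup the paragraph above is after.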