Parameter Stepping Methods and Implications
A couple of the choices that need to be made in optimization testing of a trading system are which parameters to step, the min-to-max range, and how much to step within the range. With regard to step size, most discussion I've seen revolves around the absolute size of the step. For example, if you wanted to step long moving average days from 50 to 250 days, you might consider stepping by 10 days (21 steps) or 20 days (11 steps).
However, another approach would be to step by equal percentage increments instead of equal absolute size increments. For instance, in the example above one might optimize over approximately the same range by starting at 50 and doing 10 tests with 20% increments. (50, 60, 72, 86, 104, 124, 149, 179, 215, 258)
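To make the two schemes concrete, here is a minimal Python sketch. The function names are mine, not anything built into Trading Blox, and rounding to whole days is an assumption.

```python
def linear_steps(start, stop, step):
    """Equal absolute increments, including stop when it lands on a step."""
    return list(range(start, stop + 1, step))


def percentage_steps(start, pct, count):
    """Equal percentage increments: value k is start * (1 + pct) ** k,
    rounded to a whole number of days (rounding is an assumption)."""
    return [round(start * (1 + pct) ** k) for k in range(count)]


print(len(linear_steps(50, 250, 10)))  # number of tests when stepping by 10
print(len(linear_steps(50, 250, 20)))  # number of tests when stepping by 20
print(percentage_steps(50, 0.20, 10))  # the 20% sequence quoted above
```

The percentage version spends more of its tests at the short end of the range and fewer at the long end, which is the whole point of the approach.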
According to Perry Kaufman (refer to his book, "New Trading Systems and Methods"
An even more "unbiased" way to run simulations is: don't allow humans to control which parameter values are (or are not) included in the set of tests. Instead, choose the parameter values for each test by random chance; a method sometimes called "Monte Carlo". Now you don't care whether opinion 1 is right or wrong; nor do you care whether opinion 2 is right or wrong; you're not relying upon either one of them. You're picking parameter values AT RANDOM! If you are truly paranoid / cautious, repeat the exercise with several different random number generators having several different probability distributions.
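A rough Python sketch of the random approach, for what it's worth. The 50 to 250 range, the seed, and the two distributions (uniform and log-uniform) are just illustrative assumptions; any distribution you like could be swapped in.

```python
import math
import random


def random_uniform_lengths(lo, hi, count, seed=0):
    """Draw `count` integer parameter values uniformly from [lo, hi]."""
    rng = random.Random(seed)
    return sorted(rng.randint(lo, hi) for _ in range(count))


def random_log_uniform_lengths(lo, hi, count, seed=0):
    """Draw values uniformly in log space, so short lengths are sampled
    about as densely (in percentage terms) as long ones."""
    rng = random.Random(seed)
    draws = (rng.uniform(math.log(lo), math.log(hi)) for _ in range(count))
    return sorted(round(math.exp(x)) for x in draws)


print(random_uniform_lengths(50, 250, 10, seed=1))
print(random_log_uniform_lengths(50, 250, 10, seed=1))
```

Fixing the seed makes a run repeatable; changing the seed or the distribution gives the "repeat the exercise" check described above.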
Opinion 1: Stepping parameters linearly (and with high granularity!) from Small to Big, is perfectly adequate and negligibly biased
Opinion 2: Opinion 1 is wrong! Stepping parameters geometrically (i.e. stepping the percentages) is the only legitimate, righteous, pure, and holy method.
Another way to avoid "biased" optimization is: stop looking for the best parameter set. Quit looking for the optimum.
Instead of doggedly searching for one "best" set of system parameters, you could content yourself to find (just to name a number) six "reasonably good" sets of parameters, and then trade this combination of six systems. You could trade them simultaneously in a "suite", if you have enough (6X) capital. Or you could aggregate their signals using some procedure or another (HERE is one procedure that I happen to like) so you only need 1X the capital.
Now that you're not looking for the "best" set of parameter values, you don't care too much whether the spacing between the dots (which is set by the parameter stepping scheme you select) has the kind of gradient that you feel is "unbiased". Instead you tell yourself, "Hey I've got six pretty good systems that are all fairly different from each other so I feel good about the diversification I'm getting."
[Attachment: Suite_of_Many.png, from thread "One system, 24000 parameter settings"]
sluggo's points aside,
bobsyd, i think you are correct. I have for many years tried to structure my parameter selection around the idea of percentage stepping.
given that a dedicated %step feature has not been available (in this or many other backtest packages), there is not a huge penalty to just "fudging it" by splitting your total range into 2-3 baskets, each with progressively larger steps (though the same number of steps in each).
It is an acceptable workaround.
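One way the basket workaround could be sketched in Python. The basket edges here are placed geometrically and the 50 to 400 range is made up so the numbers come out round; both are my assumptions, since in practice you would just type the ranges into the stepping dialog by hand.

```python
def basket_steps(lo, hi, n_baskets, steps_per_basket):
    """Split [lo, hi] into baskets whose edges grow geometrically, then
    step linearly inside each basket with the same number of steps."""
    ratio = hi / lo
    edges = [round(lo * ratio ** (i / n_baskets)) for i in range(n_baskets + 1)]
    values = []
    for a, b in zip(edges, edges[1:]):
        width = (b - a) / steps_per_basket  # fixed step inside this basket
        values.extend(round(a + k * width) for k in range(steps_per_basket))
    values.append(edges[-1])  # include the top of the range once
    return values


# Three baskets (50-100, 100-200, 200-400) with five steps in each
print(basket_steps(50, 400, 3, 5))
```

Each basket uses a fixed step, but the step doubles from one basket to the next, so the whole run approximates percentage stepping.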
I wouldn't favour either fixed-step or fixed-percentage parameter selection if your intention is to pick more than one point on the Sharpe frontier (or whatever you wish to call it). Instead I would go for "fixed" correlation. Obviously this means that we have to do some backward engineering, but that is a fairly simple thing. And once we do, we will find that we can write a very simple function that returns which parameters to use in order to get the most even correlation between system returns (it will not be perfect at all times, or for any given universe, but pretty close). What I'm saying is that we don't want the correlation to be 0.99 between three settings and 0.8 between the other three (given six in total); instead we want all six to show 0.95 (for example). If you just want to pick the best setting, it doesn't really matter how you do it, since what you pick will be extremely close to the maximum no matter what. In addition, whatever you pick will not represent the best performing parameter setting in the future, so why pay so much attention to the exact steps involved?
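A sketch of how the "fixed correlation" idea might look in Python. This is not the poster's actual function, which is not shown in the thread. The greedy walk, the toy momentum system, the random price series, and the 0.8 target are all assumptions for illustration only.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length return series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def pick_evenly_correlated(returns_by_param, target_corr):
    """Walk the parameters from small to large and keep a setting once its
    correlation with the last kept setting has fallen to target_corr."""
    params = sorted(returns_by_param)
    chosen = [params[0]]
    for p in params[1:]:
        if pearson(returns_by_param[chosen[-1]], returns_by_param[p]) <= target_corr:
            chosen.append(p)
    return chosen


def momentum_returns(daily, lookback, warmup):
    """Toy system: long if price is above its level `lookback` days ago,
    otherwise short. Signals use only prices known before each return."""
    prices = [100.0]
    for r in daily:
        prices.append(prices[-1] * (1 + r))
    out = []
    for t in range(warmup, len(daily)):
        signal = 1 if prices[t] > prices[t - lookback] else -1
        out.append(signal * daily[t])
    return out


rng = random.Random(42)
daily = [rng.gauss(0.0003, 0.01) for _ in range(1500)]  # synthetic returns
returns_by_param = {n: momentum_returns(daily, n, 100) for n in range(10, 101, 5)}
chosen = pick_evenly_correlated(returns_by_param, 0.8)
print(chosen)
```

The selected lookbacks come out of the correlation structure of the return series rather than from any fixed or percentage step. On real data you would feed in the backtested return series for each setting instead of the toy system.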

I am curious as to the logic Kaufman is using to make the claim that searching the parameter space using equal steps will result in a slower trading system than searching the space using percentage steps.
Consider a parameter that could take any value from 0 to infinity. This is your parameter space. In Bobsyd's original example we see a proposed search over a subset of the parameter space using percentage-based values; let's call this the Percentage Search Space, PSS: {50, 60, 72, 86, 104, 124, 149, 179, 215, 258}. Consider another search space using equally spaced values; we will call it ESS: {50, 51, ..., 258}. Clearly PSS is a subset of ESS. It follows that IF the optimum is a member of PSS it must also be a member of ESS: both searches will recommend the same parameter setting.
So unless the issue is simply one of practicality (researchers don't search the small numbers in enough detail) I can't see how one method or the other results in a different system. I mean, if you discover using steps of 50 that the optimum appears to be around 100, wouldn't you then search around 100 using finer granularity?
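The subset argument can be checked in a few lines of Python. The score function below is completely made up (a single smooth peak) and only stands in for whatever performance measure the backtest reports.

```python
pss = [round(50 * 1.2 ** k) for k in range(10)]  # percentage search space
ess = list(range(50, 259))                        # equal-step search space

print(set(pss) <= set(ess))  # is PSS a subset of ESS?


def best(params, score):
    """Return the parameter value with the highest score."""
    return max(params, key=score)


def peak_at(center):
    """Hypothetical score surface with a single peak at `center`."""
    return lambda n: -(n - center) ** 2


# Optimum inside PSS: both searches agree
print(best(pss, peak_at(104)), best(ess, peak_at(104)))

# Optimum outside PSS: the coarse search lands on its nearest tested value
print(best(pss, peak_at(130)), best(ess, peak_at(130)))
```

So the two searches only disagree when the coarser grid simply does not test the value in question, which is the practicality point raised above.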

My first post, so call me a Newbie to the forum. I have taken some time to read through many of the threads and posts to gain a feel for the personalities and atmosphere. I find many knowledgeable and help-minded traders here and that excites me (unlike many other forums full of strife and turmoil). Like many new arrivals, I want to be cautious with any comments or remarks so as not to get off on the wrong footing with the Leader Board. I am a long-time trader (both Swing and Trend) who has finally moved over to Trading Blox from a very basic and human-intensive system in hopes of becoming more efficient and freeing up more time. I am by no means an expert on trading systems. I am here to learn and contribute where possible. That said, here are my thoughts on this subject:
I have seen where there are varying thoughts regarding Robert Pardo and his writings. During my ongoing education I have attempted to maintain an open mind to testing which hopefully in the end produces viable Optimization and Parameters that give the best possible chances of a stable trading system that produces reliable results. In the end my trading system must be simple and reliable.
What I took away from Pardo regarding optimization and parameter stepping were his statements in his first book (Design, Testing, and Optimization of Trading Systems): "Therefore to lend any creditability to the top models found, the mean of all tests must be profitable and one standard deviation below the mean must be profitable. The larger the percentage of very profitable models found in a test batch, the greater the likelihood that this is a sound trading model. A sound trading model has many profitable parameter combinations. The second way to evaluate an optimization run is by "shape" of the results space. A top model must be rejected if it is a profit spike which is another type of anomaly. The performance of such a model turns from profit to loss at the smallest shift in its parameters...... The optimization process must select a top model that is sitting on top of a gently sloping profit "hill." The performance of such a model will only show a small reduction of profit in the face of small-to-medium shifts in parameters. In very robust models, even the most dramatic shift may only lead to a large decrease in profit instead of a loss."
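As a rough reading of that passage in Python: the two checks below are my own interpretation, not code from Pardo's book. Using the sample standard deviation and looking only at the immediately adjacent parameter values are both assumptions.

```python
from statistics import mean, stdev


def batch_is_credible(profits):
    """The mean of all tests must be profitable, and so must the value
    one standard deviation below the mean."""
    m = mean(profits)
    return m > 0 and m - stdev(profits) > 0


def is_profit_spike(profits, index):
    """Flag a profitable result whose adjacent parameter settings lose
    money, i.e. profit turns to loss at the smallest parameter shift."""
    neighbours = [profits[i] for i in (index - 1, index + 1) if 0 <= i < len(profits)]
    return profits[index] > 0 and any(p <= 0 for p in neighbours)


hill = [9.0, 11.0, 13.0, 12.0, 10.0]    # made-up profits along one parameter
spiky = [-5.0, 40.0, -3.0, -20.0, -10.0]
print(batch_is_credible(hill), is_profit_spike(hill, 2))
print(batch_is_credible(spiky), is_profit_spike(spiky, 1))
```

The profit figures are invented for the example; in practice each list would hold the results of one optimization run ordered by parameter value.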
What this means to me is that no matter the parameter step spacing (fixed increments or percentages), in the end the results must produce a model that is reliable and valid for the markets it is intended for. I also feel that the only way to get there is by conducting in-depth testing across a wide spectrum if you are to identify what Pardo is suggesting and avoid zeroing in on profit spikes as your selected model. I also feel that most if not all trading models must be tested across "multi-markets and multi-periods" if you are to gain more accurate results and better understand your model(s).
I look forward to developing solid relationships with many of you whose comments I have read here on the forum, and to exchanging and learning as much as possible. As my father used to tell me: "Son, learn something new each and every day, for a day without learning is a day wasted. The day you stop learning is the day you die."
Please note that I am a "Keep It Simple Stupid" (KISS) practitioner and a follower of Occam's Razor principles (it is usually the simplest idea/solution that is the best one). I attempt not to overcomplicate life with things that often can become "noise or clutter."
Kind regards,
Trend Rider
Last edited by Trend Rider on Tue Aug 30, 2011 11:33 am, edited 2 times in total.
Welcome and excellent first post Trend Rider!
I'll probably have some comments on your post and others in this thread by the end of the week. In the meantime I've emailed Perry Kaufman in the hope that we can get some comment from him and will try to code up a blox based on Event Horizon's latest post in the Feature Request section.
I think the issue is not about bias towards certain lengths when searching for an "optimal" set per se;
it is one of variable precision, which gets more distorted at the extremes.
We live in a noisy world. When dealing with a parameter that cannot be negative (e.g. length), using fixed stepping over too wide a range will give under-precision at the bottom end, and way too much precision near the top end.
It is rare that your out-sample results and your in-sample results ever match up in their choice of optimal set to within closer than 10% (at a guess). Therefore discriminating in length by say 1-2% is a waste of resources. This doesn't matter too much with the brute force available to us nowadays, though it still can be misleading mentally i suppose.
Likewise, you wouldn't be doing your search justice if you stepped at 50-100% intervals, but this is what you do at the bottom end, if you say run a length test from 5-100 with steps of 5.
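To put numbers on that, a tiny Python sketch (the helper name is mine) that expresses each fixed step of the 5-100 test as a percentage of the value it starts from:

```python
def relative_steps(values):
    """Size of each step as a fraction of the value it starts from."""
    return [(b - a) / a for a, b in zip(values, values[1:])]


lengths = list(range(5, 101, 5))
rel = relative_steps(lengths)
print(f"{rel[0]:.0%}, {rel[1]:.0%}, {rel[-1]:.1%}")  # 100%, 50%, 5.3%
```

The same absolute step of 5 is a 100% jump from 5 to 10, a 50% jump from 10 to 15, and only about 5% from 95 to 100.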
I could write more on this, but i haven't time right now. My basic opinion though is that this is just common sense, and judging by some feedback in the thread so far, some people may overthink this.
at the top of page 852 in the mentioned kaufman book,
"Using equal increments will weight this set of tests heavily towards the long end;"
a few more comments but no explanation of why results are biased towards the long end.
bobsyd, have you got a page number where there is more explanation?
my guess is that a step test using equal increments of 1 showing two peaks on a graph would indicate the short term peak, say at 20 days, is a spike, compared to a second peak at 90 days which would look rounded.
using equal percentage steps (19,20,21 and 85,90,95) those two peaks might look the same.
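A quick Python check of that guess. The 10% neighbourhood around each peak and the 5% percentage grid are arbitrary choices of mine, picked only to show the effect.

```python
def points_near(peak, grid, tolerance_pct=10):
    """Grid values lying within tolerance_pct percent of the peak."""
    return [v for v in grid if abs(v - peak) * 100 <= peak * tolerance_pct]


linear_grid = range(1, 201)                          # equal increments of 1
pct_grid = [10 * 1.05 ** k for k in range(61)]       # equal 5% increments

for peak in (20, 90):
    print(peak, len(points_near(peak, linear_grid)), len(points_near(peak, pct_grid)))
```

With steps of 1, the peak at 20 is described by 5 test points and the peak at 90 by 19, so the short-term peak looks like a spike and the long-term one looks rounded. With percentage steps both peaks get roughly the same number of points.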
My belief is that investors in systematic managers are not buying into the models themselves, but into the model-building skills of the manager. If the manager is unable to tell the difference between sitting atop a mound of robustness vs. a pinnacle of optimization, then the investors are in for great disappointment.

Thanks 7432, that was exactly the clarification I was looking for.
Kaufman is taking into account the terrain around the optimum as well as the optimum itself. When doing so, one needs to work in percentage steps to avoid a bias against the lower values in the range.
Sorry if I am overthinking it Ric, I just like to know what I am "missing" when a statement is made (especially by someone with Kaufman's expertise) that doesn't immediately make sense to me!
Eventhorizon, to be exact, fixed percentage steps will not entirely make up for the bias you are talking about either. In fact, you will have to use the largest percentage step in the beginning, and then successively reduce it (fast in the beginning, slow in the end). As someone alluded to earlier, this is total overkill if you're only looking at picking one or two points at the peak of the distribution. However, if you intend to pick 20 or more points on the frontier, as we do, it's fairly important to adjust for the bias we're talking about. I mean, what's the point of picking 20 if five of them are more or less the same?! Hence, what we are looking for is even correlation between settings (when comparing the resulting return series), and one way of getting there is as explained above.
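A shrinking percentage step could be sketched like this in Python. The post gives no formula, so the exponential decay towards a floor, and every number used below, are purely my assumptions about one shape that falls fast at first and slowly later.

```python
def shrinking_pct_steps(start, first_pct, final_pct, decay, count):
    """Generate `count` values where step k uses the percentage
    final_pct + (first_pct - final_pct) * decay ** k, with 0 < decay < 1,
    so the percentage step drops quickly at first and then levels off."""
    values = [float(start)]
    for k in range(count - 1):
        pct = final_pct + (first_pct - final_pct) * decay ** k
        values.append(values[-1] * (1 + pct))
    return values


steps = shrinking_pct_steps(50, 0.30, 0.05, 0.7, 10)
print([round(v) for v in steps])
print([round(b / a - 1, 3) for a, b in zip(steps, steps[1:])])
```

In practice the decay would have to be fitted so that the correlations between neighbouring settings come out even, which is the backward engineering step mentioned earlier in the thread.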