Drawing Conclusions from Tests with Too Little Data

Forum Mgmnt
Roundtable Knight
Posts: 1842
Joined: Tue Apr 15, 2003 11:02 am

Post by Forum Mgmnt » Mon Sep 29, 2003 10:19 am

In another topic, TK wrote: After running a couple of tests myself, I came to the sad conclusion that no matter how prudent you are about your entry/exit rules, position sizing models, optimization process and Monte Carlo simulation, in the end it is all about... market selection. If you haven't done it yourself yet, I suggest you do the following:

1. Open the file Demo.set in the Futures Sets folder.
2. Replace the symbol PA (palladium) with SI (silver).
3. Save the file and run the test again on the Demo portfolio.

What you have just done is replace only one (out of 15) market in the portfolio with another market from the same sector (Metals), so you wouldn't expect a major difference in the results, would you? Well, run the test and see the CAGR drop from 169.2% to 52.7%, the MaxDD rise from 39.1% to 51.0%, and MAR fall from 4.44 to 1.03...

All said and done, it seems market selection is the key.
TK,

I think you've hit upon the limitation of testing with too little data, as much or more than the effect of changing one market.

With five years of data, you simply don't have enough information to draw valid trading conclusions. Even 10 years of data is too little in my book (notwithstanding Richard Dennis' comments otherwise).

People often talk about there being too few trades; I look at it more from the perspective of too few trends. It's really only the bad periods and the good periods that count. A single good trend at the right time can make a huge difference in a five-year test. Likewise, a single series of losses at the wrong time can make an equally large difference in the other direction.

On average, my guess is that in any given market you might get one good trend or less in five years, and perhaps a couple smaller winners.

So even across a basket of 30 markets, the data that matters most is in very short supply: perhaps 20 to 30 good trades in a sample of five years. This is simply too little data to support statistically significant judgments.

All those short periods of little losses or smaller wins don't really matter that much, because even if you have hundreds of data points from the perspective of trades, there are very few of those that make much difference.

The other problem is the effect of a single year on the results. Over five years, on a compounded basis, a single year affects the overall results considerably. CAGR% can move around quite a bit based on the outcomes of only a few trades. The solution, again, is to get more data.
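A minimal sketch of the single-year effect described above. The yearly returns are hypothetical and are not taken from any test in this thread; the point is only to compare how far one exceptional year moves CAGR% in a 5-year sample versus a 20-year sample.

```python
def cagr(annual_returns):
    """Compound annual growth rate from yearly returns (0.20 means 20%)."""
    growth = 1.0
    for r in annual_returns:
        growth *= 1.0 + r
    return growth ** (1.0 / len(annual_returns)) - 1.0


# Hypothetical yearly returns, chosen only for illustration.
steady_5 = [0.20] * 5
lucky_5 = [0.20] * 4 + [1.00]     # one exceptional year out of five
steady_20 = [0.20] * 20
lucky_20 = [0.20] * 19 + [1.00]   # the same exceptional year out of twenty

print(f"5 years:  {cagr(steady_5):.1%} -> {cagr(lucky_5):.1%}")
print(f"20 years: {cagr(steady_20):.1%} -> {cagr(lucky_20):.1%}")
```

With these made-up numbers, the one exceptional year shifts the 5-year CAGR% by several times as much as it shifts the 20-year figure.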

Your example is illustrative because Palladium happened to have six good trades over the test period:

Feb 2000 - 54.7%
Oct 2000 - 29.6%
Jul 2001 - 25.7%
Dec 2001 - 16.9%
Nov 2002 - 22.9%
Mar 2003 - 35.5%

whereas Silver probably lost money during the same period.

Over a 20-year period, the difference in results would not have been so dramatic.

MAR is particularly vulnerable to short test periods because a drawdown impacts MAR calculations in both the numerator and denominator, since a large drawdown reduces CAGR% as well. This is especially true in a short test where you might only have one period (or none) of large drawdowns. The presence or absence of the large drawdown will change the MAR more than any other result.
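To make the numerator-and-denominator point concrete, here is a small sketch that takes MAR as CAGR% divided by maximum drawdown. The two year-end equity curves are hypothetical; they are identical except that the second has one large losing year.

```python
def cagr(equity, years):
    """Compound annual growth rate from first to last equity value."""
    return (equity[-1] / equity[0]) ** (1.0 / years) - 1.0


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def mar(equity, years):
    """MAR ratio, taken here as CAGR divided by maximum drawdown."""
    return cagr(equity, years) / max_drawdown(equity)


# Hypothetical year-end equity over five years (starting value first).
base = [100, 130, 120, 160, 200, 250]
with_dd = [100, 130, 120, 160, 110, 137.5]  # year 4 loses instead of gains

for name, curve in (("base", base), ("with drawdown", with_dd)):
    print(name, f"CAGR {cagr(curve, 5):.1%}",
          f"MaxDD {max_drawdown(curve):.1%}", f"MAR {mar(curve, 5):.2f}")
```

The single bad year both lowers CAGR% and raises the maximum drawdown, so the ratio falls much further than either figure does on its own.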

Now, that having been said, I do believe that market selection is very important, though not as easy as it might appear. That is a subject for another topic.

- Forum Mgmnt

biggdonn
Full Member
Posts: 11
Joined: Thu Aug 07, 2003 2:24 pm

Post by biggdonn » Thu Nov 13, 2003 2:31 pm

Forum Mgmnt,

Out of the data sets on the demo version (demo, trending, turtle futures, and Currencies and financials), which set performed the best over the long term?

Thanks

Don

Hiramhon
Roundtable Fellow
Posts: 98
Joined: Fri May 09, 2003 12:45 am

Post by Hiramhon » Thu Nov 13, 2003 4:39 pm

Don, think about what you're asking. c.f. is selling software called Veritrader that helps to answer "what-if" questions like
  • What if I tested these portfolios over the long term? Which ones would show the most impressive results?
It occurs to me that one simple answer is "buy Veritrader, run it, and find out." Once you have the software you'll be able to ask and answer huge numbers of other questions too. If you want, you can even take requests from people on this PHP-board and run a test when somebody asks.

Forum Mgmnt
Roundtable Knight
Posts: 1842
Joined: Tue Apr 15, 2003 11:02 am

Post by Forum Mgmnt » Thu Nov 13, 2003 7:40 pm

Don,

Hiramhon's response might seem a bit harsh, and might not even apply in your particular case, but one could learn a lot by following his advice.

You may already be doing what I am about to suggest; if so, I apologize. The following advice is intended for a general group of people that I see frequenting forums like this one.

The issue is not VeriTrader, or Trading Recipes, or whatever. The issue is one of deciding whether you will participate in the process of trading research or just take an observer's seat.

The path to success in trading is one of getting your own answers. Not because the answers others might provide aren't valid, but because if you are going to trade, you are the one who needs to know the answers.

Getting an answer from someone else, even someone you respect, is not enough if you are the one who has to pull the trigger. An answer might be a valuable step in the process of your own learning, but if you take it as the "truth" and don't put the time into discovering why for yourself, you won't be building the kind of foundation you need for successful trading.

When the 8-month or 1-year drawdown comes, you need to have enough faith in your own trading to keep going. You will have this faith if you have confidence in your own analysis, if you have looked for the reasons behind the answers you have received from others, and if you use these answers as a starting point, the seed of your further investigation.

- Forum Mgmnt

P.S. I haven't personally compared the demo portfolio against the others. The trending portfolio worked the best in my testing, which did not include the demo portfolio. The interesting and instructive thing might be to determine for yourself why this might be the case.

biggdonn
Full Member
Posts: 11
Joined: Thu Aug 07, 2003 2:24 pm

Post by biggdonn » Fri Nov 14, 2003 1:18 am

Obviously I need to elaborate on why I asked that question. :shock:

I am not some mindless leech looking for a handout, although after reading my post and the replies to it, I do appear that way. I do believe in the old saying, “Give a man a fish… Teach a man to fish…”

Erwin Dicker
Senior Member
Posts: 25
Joined: Mon Dec 22, 2003 3:11 pm
Location: Moelingen - Belgium

Results in Veritrader OK? TRADELOG.TXT

Post by Erwin Dicker » Mon Jan 12, 2004 2:24 pm

I have, of course, tested the Veritrader Demo again and again. It is very useful, although it is of course a black box to me.

Can someone tell me whether I am drawing the right conclusions?

When you check the trade log you see negative prices for some futures (is it because of back-adjusting?).

As the equity grows I get, for example, positions like these:
U P Entry Time Entry Stop Risk Quantity In Fill

1 ED 14 S 10/17/02 Open 98.5475 98.6227 0.49 712 98.5445 10/25/02
1 ED 13 S 10/17/02 Open 98.5475 98.6227 0.49 712 98.5445 10/25/02
1 ED 12 S 10/17/02 Open 98.5475 98.6227 0.49 712 98.5445 10/25/02
1 ED 11 S 10/17/02 Open 98.5475 98.6227 0.49 712 98.5445 10/25/02
1 ED 10 S 10/17/02 Open 98.5475 98.6227 0.49 712 98.5445 10/25/02
1 ED 9 S 10/17/02 Open 98.5475 98.6227 0.49 712 98.5445 10/25/02
1 ED 8 S 10/17/02 Open 98.5475 98.6227 0.49 712 98.5445 10/25/02
1 ED 7 S 10/17/02 Open 98.5475 98.6227 0.49 712 98.5445 10/25/02
1 ED 6 S 10/17/02 Open 98.5475 98.6227 0.49 712 98.5445 10/25/02
1 ED 5 S 10/17/02 Open 98.5475 98.6227 0.49 712 98.5445 10/25/02
1 ED 4 S 10/17/02 Open 98.5475 98.6227 0.49 712 98.5445 10/25/02
1 ED 3 S 10/17/02 Open 98.5475 98.6227 0.49 712 98.5445 10/25/02
1 ED 2 S 10/17/02 Open 98.5475 98.6227 0.49 712 98.5445 10/25/02
1 ED 1 S 10/17/02 Open 98.5475 98.6227 0.49 712 98.5445 10/25/02
1 ED 0 S 10/17/02 Open 98.5475 98.6227 0.49 712 98.5445 10/25/02

Question: does this mean 15 units of 712 futures?
I think this is too much for this market.
I have traded 100 ED contracts, and even that size already influences the market.

.....who wants to react?

Erwin

Forum Mgmnt
Roundtable Knight
Posts: 1842
Joined: Tue Apr 15, 2003 11:02 am

Post by Forum Mgmnt » Mon Jan 12, 2004 2:50 pm

Yes, negative prices are because of back-adjusting.

I used to regularly trade 1,000 to 1,200 contracts of Eurodollar over 15 years ago.

Yes, you moved the market a bit but not too much. It was pretty liquid even then. It's much better now.

You've got to keep one thing in mind. Your position sizes will always grow in historical testing if you have results that make money.

I don't worry too much about the exact liquidity constraints because I am not asking the question: "Can I make $50 billion from $1 million in 20 years?".

Rather, I am trying to figure out what my returns are likely to be over the next few years. If you were trying to figure out what your returns might be over the next 20 years, you might have to worry about liquidity constraints caused by having a $400 million plus account size but that's not what most people are trying to determine. They want to know what they can expect to make with the size they are likely to trade, and for that purpose, you can ignore liquidity a bit if you have a portfolio that includes only markets that are liquid enough to handle your trading account for the next several years.

If you start trading huge account sizes then you'll have to worry about this, but that's a problem most people would be happy to have.

Erwin Dicker
Senior Member
Posts: 25
Joined: Mon Dec 22, 2003 3:11 pm
Location: Moelingen - Belgium

DATA check

Post by Erwin Dicker » Mon Jan 12, 2004 3:27 pm

Hello Forum Mgmnt,

Perhaps you can help me.

When I check my CSI data, the prices of the ED future on 17-10-2002 were:

Open: 0.9498
High: 0.96040
Low: 0.9438
Close: 0.9458

The system traded on different values ( 0.98...) as you can see in my previous post.
Is this because of adjusting data to continuous futures.....?

I also saw negative values (HU) in the system test.

Erwin

Maxwell Cintes
Full Member
Full Member
Posts: 14
Joined: Wed Apr 16, 2003 10:08 pm
Location: Hartford, CT

Post by Maxwell Cintes » Mon Jan 12, 2004 3:44 pm

Do some more research on back-adjusting; you will find it interesting.

Yes, negatives and different prices occur because of back-adjusting. Commodities with a larger spread between the delivery months have a higher chance of going negative using back-adjusting.

CSI has a lot of information on this on their web site.

d-g
Roundtable Knight
Posts: 290
Joined: Tue Apr 15, 2003 11:39 am

Post by d-g » Mon Jan 12, 2004 3:59 pm

Back-adjusted futures may show different specific prices from one vendor to another because of the triggers used to roll from one contract to the next.
The OOWDG data used in the Demo rolls 5 days prior to contract expiration. Your CSI data may roll on volume, open interest, or date-related triggers; the UA application allows you to specify your roll.

Any back-adjustment software takes today's price and rolls backwards. So today the price for a day in October 2002 could show up as 100, while tomorrow, IF THE CONTRACT CHANGES, the back-adjusted price for the same day could have a different numerical value.

The value of testing with back-adjusted data lies in the relative price movement, not in the absolute values.

Back adjusting can also result in negative values for futures.
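A minimal sketch of how this can happen, assuming simple difference (additive) back-adjustment. It is not the method of any particular vendor; the function name `back_adjust` and the prices are hypothetical. Each contract's list ends on its roll day, and the next contract's list starts on that same day.

```python
def back_adjust(segments):
    """Stitch per-contract price lists (oldest first) into one series.

    At each roll, the gap between the new and old contract on the roll day
    is added to every earlier price, so the newest contract keeps its
    actual prices and day-to-day changes are preserved.
    """
    adjusted = list(segments[-1])  # newest contract is left untouched
    offset = 0.0
    pairs = zip(reversed(segments[:-1]), reversed(segments[1:]))
    for older, newer in pairs:
        # Roll-day gap: new contract's price minus old contract's price.
        offset += newer[0] - older[-1]
        # Shift the older prices by the accumulated gap; drop the older
        # contract's roll-day price, which the newer contract already covers.
        adjusted = [p + offset for p in older[:-1]] + adjusted
    return adjusted


# Hypothetical market where each new contract trades 3.0 below the old one.
contracts = [
    [5.0, 6.0, 7.0],  # oldest contract
    [4.0, 5.0, 6.0],
    [3.0, 4.0, 5.0],  # current contract
]
print(back_adjust(contracts))
```

Every actual price here is positive, yet the oldest adjusted values come out at or below zero because two roll gaps of -3.0 accumulate. The daily changes (+1.0 each day) are unchanged, which is the part that matters for testing.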

Here is a link to a nice paper by Bob Fulks on the topic:

http://www.traders2traders.com/papers/b ... tracts.htm

Erwin Dicker
Senior Member
Posts: 25
Joined: Mon Dec 22, 2003 3:11 pm
Location: Moelingen - Belgium

Thanks for the confirmation...

Post by Erwin Dicker » Mon Jan 12, 2004 5:08 pm

And the confidence!

Erwin

verec
Roundtable Knight
Posts: 162
Joined: Mon Jun 09, 2003 7:04 pm
Location: London, UK

Post by verec » Mon Jan 12, 2004 7:00 pm

Forum Mgmnt wrote: With five years of data, you simply don't have enough information to draw valid trading conclusions. Even 10 years of data is too little in my book (notwithstanding Richard Dennis' comments otherwise).
This started me wondering ...

In some other threads you have repeatedly emphasized that you wouldn't consider anything less than 20 years of data for back-testing... but if you follow that line of reasoning backwards, that is, once you have settled on a given portfolio that you have back-tested on 20 years, and assuming it is going to behave more or less the same way in the future, then you will need to trade for ... 20 years to achieve results that match your back-testing!

Because stopping trading that system after, say, only 5 years of bad results would mean that the 20 years of back-testing was useless! In other words, a good system that showed profits over the past 20 years may not be considered a bad system until you have traded it for at least 20 years!

What's the flaw in my reasoning? :shock:

blueberrycake
Roundtable Knight
Posts: 125
Joined: Mon Apr 21, 2003 11:04 pm
Location: California

Post by blueberrycake » Tue Jan 13, 2004 1:49 am

verec wrote: In some other threads you have repeatedly emphasized that you wouldn't consider anything less than 20 years of data for back-testing... but if you follow that line of reasoning backwards, that is, once you have settled on a given portfolio that you have back-tested on 20 years, and assuming it is going to behave more or less the same way in the future, then you will need to trade for ... 20 years to achieve results that match your back-testing!

Because stopping trading that system after, say, only 5 years of bad results would mean that the 20 years of back-testing was useless! In other words, a good system that showed profits over the past 20 years may not be considered a bad system until you have traded it for at least 20 years!

What's the flaw in my reasoning? :shock:
The flaw in your reasoning is that the two numbers are not in any way related.

When you are testing your system, you are assuming that your test period is simply a representative sample of some underlying distribution (which consists of past/future trading results).

Using your test sample you are trying to estimate the basic characteristics of the underlying population, such as the mean and the standard deviation. As you make the test sample larger, the confidence intervals narrow and the statistical significance of your test goes up.
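As a rough sketch of how the interval narrows, the snippet below uses the normal approximation for the confidence interval of a sample mean. It assumes independent trade results with a known standard deviation, which real trading results may well violate; the standard deviation of 2.0 (in whatever unit you measure trade results) and the sample sizes are hypothetical.

```python
import math


def ci_half_width(std_dev, n, z=1.96):
    """Approximate 95% confidence half-width for the mean of n samples."""
    return z * std_dev / math.sqrt(n)


# Hypothetical per-trade standard deviation of 2.0.
for n in (25, 100, 400):
    print(n, round(ci_half_width(2.0, n), 3))
```

Because the half-width shrinks with the square root of the sample size, quadrupling the number of samples only halves the uncertainty around the estimated mean.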

Once you make an estimate of the underlying distribution, you have an idea of what you should expect in the future. Once you start trading, you can do a test to estimate the likelihood that your trading results are consistent with your original assumptions about the underlying distribution. Obviously, the more time goes by, the more definitively you can answer the question of whether what you see is consistent with what you had expected.

However the length of the period necessary to make such an assertion has nothing to do with the length of time used in your original testing.

Stat 101.

-bbc

verec
Roundtable Knight
Posts: 162
Joined: Mon Jun 09, 2003 7:04 pm
Location: London, UK

Post by verec » Tue Jan 13, 2004 5:19 am

Thanks for the answer... but
blueberrycake wrote:However the length of the period necessary to make such an assertion has nothing to do with the length of time used in your original testing.

Stat 101.
Take a fair coin. How many draws before you can expect to reach the 50% head/tail distribution?

Obviously on 2 draws, your probability of getting exactly one head and one tail is 0.5. [HH, HT, TH, TT, only HT & TH qualify, so that's 2 out of 4]

On 4 draws, the probability of reaching the 50/50 distribution drops to 6/16. [From HHHH to TTTT, only keep the ones with an equal number of Ts and Hs]

How many draws to get p=0.75 ? p=0.80 ?
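One way to explore the question numerically is sketched below. It computes two different things for a fair coin: the probability of hitting exactly 50/50, and the probability that the share of heads lands within a band around 50%. The band of 45% to 55% and the function names are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb


def prob_exact_half(n):
    """Probability of exactly n/2 heads in n fair tosses (n must be even)."""
    return comb(n, n // 2) / 2 ** n


def prob_within(n, tolerance):
    """Probability that the share of heads is within `tolerance` of one half."""
    half = Fraction(1, 2)
    favourable = sum(
        comb(n, k)
        for k in range(n + 1)
        if abs(Fraction(k, n) - half) <= tolerance
    )
    return favourable / 2 ** n


band = Fraction(1, 20)  # hypothetical band: 45% to 55% heads
for n in (2, 4, 20, 100, 1000):
    print(n, round(prob_exact_half(n), 4), round(prob_within(n, band), 4))
```

The probability of an exact 50/50 split keeps falling as the number of tosses grows (0.5 at two tosses, 6/16 at four), so it never climbs to 0.75 or 0.80. The probability of landing inside the band rises toward 1, which is the sense in which more samples bring the observed result closer to the expected one.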

This seems to mean that, whatever the expectancy of your system, you will need "a certain number" of trades to reach the predicted payoff.

What you are saying is that the number of samples (length of time) needed to arrive at a certain expectancy has nothing to do with the average number of trades you will need to perform to reach that expected number.

What's the relation, then? Is there any? What am I confusing again? :oops:

yoyo2000
Roundtable Fellow
Posts: 58
Joined: Fri Jan 30, 2004 10:37 pm

Re: Drawing Conclusions from Tests with Too Little Data

Post by yoyo2000 » Fri Jul 08, 2005 1:42 am

Forum Mgmnt wrote:
in another topic TK wrote:.......
TK,

I think you've hit upon the limitation of testing with too little data, as much or more than the effect of changing one market.

With five years of data, you simply don't have enough information to draw valid trading conclusions. Even 10 years of data is too little in my book (notwithstanding Richard Dennis' comments otherwise).

- Forum Mgmnt
Hi Forum Mgmnt, I have just learned that you wrote a book. Could you tell me the name of the book? There must be much useful stuff in it.
