Erwin,
It is important to understand the rationale behind the concept of a statistically significant sample as it relates to trading.
In most contexts, statistical significance and confidence intervals are discussed with reference to samples taken from a population (polls, quality control samples from a production line, etc.), where the samples are measured in an attempt to draw conclusions about the entire population. In these instances, a sample is used because polling or testing the entire population is too expensive.
If the samples taken are representative of the overall population, then the size of the sample determines the confidence that the measurements of the sample are representative of the underlying population.
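To make the sample-size point concrete, here is a rough Python sketch. The population is made up (normally distributed per-trade results with an arbitrary mean and spread, not real trading data), and mean_confidence_interval is just a name chosen for this illustration. It draws random, and therefore representative, samples of increasing size and shows the approximate 95% interval around the sample mean narrowing as the sample grows.

```python
import math
import random
import statistics

def mean_confidence_interval(trades, z=1.96):
    """Return (mean, half_width) of an approximate 95% interval for the mean trade result."""
    mean = statistics.fmean(trades)
    std_err = statistics.stdev(trades) / math.sqrt(len(trades))
    return mean, z * std_err

rng = random.Random(42)

# Hypothetical population of per-trade results (arbitrary units, invented for illustration).
population = [rng.gauss(0.1, 1.0) for _ in range(100_000)]

half_widths = {}
for n in (100, 1_000, 10_000):
    sample = rng.sample(population, n)  # a random sample is representative by construction
    mean, half = mean_confidence_interval(sample)
    half_widths[n] = half
    print(f"n={n:>6}: mean={mean:+.3f} +/- {half:.3f}")
```

The interval shrinks roughly with the square root of the sample size, but only because each sample was drawn at random from the whole population.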
In the context of trading, the expression "If the samples taken are representative of the overall population" is especially important.
In trading system discussions where trades represent the samples, the population is all trades a given system would have taken in the past, and will take in the future.
If one looks only at the number of trades, and not at whether those trades are representative of the entire population (i.e., the past and future trades), then one is likely to end up drawing erroneous conclusions.
One example: there are plenty of stock trading systems with 10,000 trades or more over the years 1997 through 1999 that have excellent results over that period but that performed very poorly after the crash. Conclusions drawn from tests using only those years would have seemed to be very "statistically significant" based on the number of trades in the simulation.
However, when one asks the question: "How representative are the years 1997 to 1999 of the stock market behavior, in general?", one quickly gets to the crux of the problem with using number of trades as the only measure of confidence.
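A small simulation shows how this plays out. Everything in it is assumed for illustration: the two "regimes" are synthetic normal distributions with invented means, not actual market data, and interval is a helper written just for this sketch. The 10,000 trades taken from the favorable regime alone give a very tight interval around a positive average, yet that interval does not contain the average over both regimes.

```python
import math
import random
import statistics

rng = random.Random(7)

# Hypothetical per-trade results (arbitrary units) under two invented market regimes.
bull_trades = [rng.gauss(0.5, 2.0) for _ in range(10_000)]   # favorable period
bear_trades = [rng.gauss(-0.6, 2.0) for _ in range(10_000)]  # unfavorable period afterwards

def interval(trades, z=1.96):
    """Approximate 95% interval for the mean trade result."""
    mean = statistics.fmean(trades)
    half = z * statistics.stdev(trades) / math.sqrt(len(trades))
    return mean - half, mean + half

# The interval is computed from the favorable regime only, as in a test over too short a period.
low, high = interval(bull_trades)
all_mean = statistics.fmean(bull_trades + bear_trades)

print(f"favorable-regime 95% interval: [{low:+.3f}, {high:+.3f}]")
print(f"mean over both regimes:        {all_mean:+.3f}")
print("interval contains the broader mean:", low <= all_mean <= high)
```

The large trade count makes the interval narrow, but it says nothing about conditions that were never sampled.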
So the reason to use many years of data is not so much to get a certain number of trades, but because we want to have the best chance that our tests include a sufficient variety of market conditions that they are representative of what is likely to occur in the future.
In trading we don't have the luxury of being able to test the future. So our tests are, at best, a sample of the population of all trades. If we don't use enough data, we run the risk of testing over a period that is not typical.
Then when the future comes, it will be very different from our testing.
This phenomenon generally results in trading losses, either because we end up trading a system that only works under certain types of markets, or because we stop trading a successful system because our actual results are so much worse than our testing indicated.
(see also: viewtopic.php?t=559)