The figure of merit I propose is
- CPU clock ticks per bar
The fastest imaginable software would somehow manage to process all the rules of your system (Portfolio Manager, Entries, Exits, Money Manager, and statistics) in one single CPU instruction that took just 1 CPU clock tick to execute ("the ultimate CISC", a bit of an inside joke). So the theoretical limit, the best "CPU clocks per bar" imaginable, is 1.0 (or 1/N if you can bring yourself to imagine an N-core ultimate CISC CPU). Actual real-life CPUs running actual real-life software will have quite a few CPU clocks per bar, because it takes quite a lot of atomic CPU instructions to accomplish all the work of running a system.
I recommend running approximately the same trading system on all backtesters that you're measuring and comparing, so they are all doing approximately the same amount of "work". The triple EMA crossover system is a decent choice here.
I also recommend running an optimization that steps parameters through several values, so that the entire run takes at least ten minutes. (That way, if you make a one- or two-second error in timing the run, its effect will be negligible.)
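To put a rough number on that, here is a small Python sketch of the relative error a stopwatch slip introduces. The function name is my own, purely for illustration, and the 30-second run is a made-up comparison case, not a measurement.

```python
def timing_error_pct(error_seconds, run_seconds):
    """Relative error, in percent, caused by mis-timing a run by error_seconds."""
    return 100.0 * error_seconds / run_seconds


# A 2-second stopwatch slip on a 10-minute (600 s) run...
print(round(timing_error_pct(2, 600), 2))
# ...versus the same slip on a hypothetical 30-second run.
print(round(timing_error_pct(2, 30), 2))
```

On the ten-minute run the slip is about a third of one percent; on the short run it is several percent, which is why the longer run is worth the wait.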
I did the measurement just now on Blox Builder 2.1.21 on my Latitude D820 laptop, which is a dual core machine running at 1.95 GHz, with 2.0 gigabytes of RAM. Here are the data:
- portfolio of 30 markets
- 11.0 years tested
- 82636 daily bars (wrote a little Update Indicators blok that counted them)
- manual double-check: 30 mkts * 11 years * approx 250 days/year = 82500, which agrees with the actual count very well
- Triple MA system
- Optimization run that included 900 individual tests
- Stopwatch measured runtime of the optimization run: 17 minutes + 47 seconds
Total # seconds = 47 + (17*60) = 1067 seconds
Total # Clock ticks = (1067 seconds) * (1.95 billion clocks per sec) = 2081 billion ticks
Total # bars = (82636 bars/test) * (900 tests) = 74,372,400 bars
Clock ticks per bar = (2081 billion ticks) / (74.372 million bars) = 27,980 clock ticks per bar
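If you want to repeat the arithmetic for your own backtester, the calculation can be sketched in a few lines of Python. The function and variable names are mine, not anything from Blox; the inputs are the numbers listed above. Because the script does not round the tick total up to 2081 billion along the way, it lands a few ticks below 27,980, which is still 28K for all practical purposes.

```python
def clock_ticks_per_bar(run_seconds, clock_hz, bars_per_test, num_tests):
    """Figure of merit: CPU clock ticks spent per bar of data processed."""
    total_ticks = run_seconds * clock_hz      # clock ticks elapsed during the run
    total_bars = bars_per_test * num_tests    # bars processed across all tests
    return total_ticks / total_bars


# Sanity check on the bar count: 30 markets * 11 years * ~250 trading days/year.
estimated_bars = 30 * 11 * 250
print(f"estimated bars per test: {estimated_bars}")

# Inputs from the measurement above.
run_seconds = 17 * 60 + 47   # stopwatch time: 17 min 47 sec
clock_hz = 1.95e9            # 1.95 GHz
bars_per_test = 82636        # counted daily bars
num_tests = 900              # tests in the optimization run

ticks = clock_ticks_per_bar(run_seconds, clock_hz, bars_per_test, num_tests)
print(f"{ticks:,.0f} clock ticks per bar")
```

Swap in your own runtime, clock speed, bar count, and test count to get a comparable number for another backtester or another machine.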
In summary, Blox takes about twenty-eight thousand clock ticks to process each bar of data when running the Triple MA system. It's a simple way to measure software speed, giving results in intuitive units that are easy to remember: 28K clocks per bar, bada boom.