Main Question:
How can we statistically assess whether a change to a strategy results in a significant improvement and can/should be applied to any portfolio?
There are probably many ways to test this, but I would be interested if anyone with a bit more math (statistics) knowledge than me could tell me whether the following seems statistically correct or has obvious mistakes. More details are in the attached Excel file.
Background:
Stock trading system, simulated over 21 years. Portfolio of 1015 S&P stocks, including delisted ones.
We want to assess whether a change to the system results in a significant improvement to the MAR. What we change in the system doesn't really matter in this case; it is the methodology that counts. Note: in the example, the "change" is moving from a strategy with no rebalancing between entry and exit to a strategy with ongoing rebalancing based on position size and risk. We use the MAR ratio as the relevant output, but we could use any output.
Initial simulation:
Includes the entire portfolio (1015 stocks) over 21 years and results in a significant improvement in MAR from "before" (0.34) to "after" (0.51) the change.
Main Issue:
Maybe a small number of stocks drives the improvement? How can we statistically check whether the change to our strategy results in an improvement across a large set of portfolios? Again, I am sure there are many ways, but is the following a correct application of statistics?
Step 1.
We generate a number of random portfolios (20 in the example). Each portfolio consists of 200 stocks randomly selected from the entire "population" of 1015 stocks.
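To make Step 1 concrete, here is a minimal sketch in Python of how the random portfolios could be drawn. The function name, the fixed seed and the placeholder ticker names are my own assumptions for illustration; the real universe would be the 1015 S&P tickers.

```python
import random

def draw_random_portfolios(universe, n_portfolios=20, portfolio_size=200, seed=42):
    """Draw n_portfolios random portfolios from the universe.

    Sampling is without replacement inside a portfolio, but the same stock
    can appear in several portfolios, so the portfolios overlap.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) so the draw is repeatable
    return [rng.sample(universe, portfolio_size) for _ in range(n_portfolios)]

# Placeholder tickers standing in for the real 1015-stock universe.
universe = [f"STOCK_{i:04d}" for i in range(1015)]

portfolios = draw_random_portfolios(universe)
print(len(portfolios), "portfolios of", len(portfolios[0]), "stocks each")
```

Note that with 200 stocks drawn from 1015, any two portfolios will share a fair number of stocks, which matters for whether the 20 results can be treated as independent.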
Step 2.
We run "before the change" and "after the change" simulations on each of the 20 portfolios, i.e. 40 simulations in total (with 40 MAR observations).
Step 3.
We calculate the difference in MAR from "before" to "after" the change for each portfolio (i.e. 20 data points of difference in MAR). We then calculate the mean and standard deviation of this sample. In the example, the sample mean is +0.028 MAR (7.1%), i.e. MAR improves on average by 7.1%, with a standard deviation of 0.063 MAR (~15%).
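A minimal sketch of the Step 3 calculation in Python, using the sample standard deviation (n - 1 in the denominator). The three MAR pairs below are made-up placeholders to show the mechanics, not values from the spreadsheet, and the function name is hypothetical.

```python
import statistics

def summarize_paired_differences(mar_before, mar_after):
    """Return per-portfolio differences (after minus before), their mean and sample SD."""
    if len(mar_before) != len(mar_after):
        raise ValueError("need one 'before' and one 'after' MAR per portfolio")
    diffs = [after - before for before, after in zip(mar_before, mar_after)]
    # statistics.stdev is the sample standard deviation (divides by n - 1).
    return diffs, statistics.mean(diffs), statistics.stdev(diffs)

# Illustrative placeholder values only (NOT the real simulation results).
mar_before = [0.30, 0.40, 0.35]
mar_after = [0.33, 0.41, 0.40]

diffs, mean_diff, sd_diff = summarize_paired_differences(mar_before, mar_after)
print(f"mean difference: {mean_diff:+.3f} MAR, sample SD: {sd_diff:.3f} MAR")
```

Working on the paired differences, rather than on the 40 MAR values as two separate groups, keeps each portfolio's "before" and "after" results matched.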
Step 4.
Calculate the 95% confidence interval using the t-statistic (is this correctly applied?). In the example, I arrive at a margin of error of 7.2%, giving a confidence interval of -0.1% to +14.3%.
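For reference, this is how I understand the Step 4 calculation, sketched in Python from the summary statistics in MAR units. It assumes the 20 differences can be treated as independent, roughly normal observations, which is exactly the part I am unsure about. The critical value 2.093 is the two-sided 95% value of Student's t with 19 degrees of freedom; the function name is my own.

```python
import math

def t_confidence_interval(mean, sd, n, t_crit):
    """Two-sided confidence interval for the mean of n paired differences."""
    margin = t_crit * sd / math.sqrt(n)  # critical value times standard error
    return mean - margin, mean + margin

# Summary statistics from the example (difference in MAR, after minus before).
mean_diff = 0.028
sd_diff = 0.063
n_portfolios = 20

# Two-sided 95% critical value of Student's t, n - 1 = 19 degrees of freedom.
T_CRIT_95_DF19 = 2.093

low, high = t_confidence_interval(mean_diff, sd_diff, n_portfolios, T_CRIT_95_DF19)
print(f"95% CI for the mean MAR difference: {low:+.4f} to {high:+.4f}")
```

With these inputs the lower bound comes out just below zero, which matches the slightly negative lower end of the interval in percentage terms.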
Conclusion:
With 95% confidence, the change to the system results in an improvement in MAR of between -0.1% (no improvement!) and +14.3%. In other words, with 97.5% confidence the improvement is greater than -0.1%, and the change can (should) be applied to any portfolio, or at least to any random portfolio within the universe of 1015 S&P stocks over the 21-year period.
Views much appreciated!
Confidence Interval around a system change
 Attachments

 Confidence Interval around a change to a system.xlsx