Meaningful Correlation

Sean Travers
Contributing Member
Posts: 8
Joined: Tue Aug 05, 2003 3:36 pm
Location: United States

Meaningful Correlation

Post by Sean Travers »

Hi,

In an attempt to seek out meaningful correlations in a one market intraday futures breakout system, I set up a spreadsheet containing the following daily stats for a year of trading history: date, daily range, morning gap (+ or - and size), daily P&L, first trade of day P&L, and first trade time of day. Lately this system has been suffering particularly hard on gaps, and I was hoping to write in new entry filtering logic using possibly range, gap size/direction or both.

I ran correlation matrices to search out potentially exploitable relationships. While most of these fields had correlations very near zero, I did notice that daily range had a ~0.35 correlation to daily P&L, which unfortunately was the 'strongest' one in the lot.
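
For readers who want to reproduce this kind of matrix outside a spreadsheet, here is a minimal sketch in plain Python. The column names and the six rows of numbers are made up for illustration; they are not the actual trading data from this post.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict of name -> list of values."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names}
            for a in names}

# Hypothetical daily stats (illustrative values only).
stats = {
    "daily_range": [12.0, 8.5, 15.0, 9.0, 20.0, 7.5],
    "gap":         [1.5, -0.5, 2.0, -1.0, 0.5, -2.5],
    "daily_pnl":   [250.0, -100.0, 400.0, -50.0, 300.0, -200.0],
}

matrix = correlation_matrix(stats)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

Each column correlates perfectly with itself, so the diagonal is 1.0, and the matrix is symmetric.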

Here's my question: is a 0.35 correlation as negligible as I think it is, or in your opinion could this quantify a meaningful relationship within a mechanical strategy? What is considered a strong correlation between strategy characteristics?

Thanks
seneca_kw
Contributor
Posts: 1
Joined: Sat Feb 14, 2004 10:17 am
Location: Ohio

Correlation

Post by seneca_kw »

Sean,

I'm looking at correlations too and have similar questions. I have these notes from a statistics text:

To determine the strength of a relationship, square the correlation. The result is the proportion of the variation in the second variable that is accounted for by knowledge of the first.

In your example 0.35 x 0.35 = 0.1225. Thus about 12% of the variation in daily P&L is accounted for by daily range. I don't know if this is enough of an edge to trade on. I hope someone else can comment on this.
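
The arithmetic above can be checked in a few lines. This is only a sketch, and it assumes a simple linear regression of one variable on the other. Under that assumption r squared is the share of variance explained, while the drop in the typical prediction error (the residual standard deviation) is smaller, because that error scales with the square root of 1 - r^2.

```python
import math

r = 0.35                                   # correlation quoted in the thread

r_squared = r ** 2                         # share of variance explained
unexplained = 1.0 - r_squared              # share of variance left over
residual_ratio = math.sqrt(unexplained)    # residual std / original std
error_reduction = 1.0 - residual_ratio     # shrinkage in typical error

print(f"r^2 (variance explained): {r_squared:.4f}")
print(f"variance left unexplained: {unexplained:.4f}")
print(f"reduction in prediction error: {error_reduction:.1%}")
```

So a 0.35 correlation explains about 12% of the variance, but under this reading it narrows the typical prediction error by only roughly 6%.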

Here's an idea that I've been thinking about. When I look at correlations I'm usually trying to test what I think may be a market principle. In other words, I'm not randomly looking for correlations of data, but I have an idea of what might be correlated based on ideas of market logic.

Let's say I test a market principle and get a correlation result that is seemingly low but above 0. I then think: if this market principle is true, what filters or other conditions can I add on that should improve the result IF the principle is true.

I then start adding these conditions on one at a time, and if the correlation improves each time, I might be on to something. If, however, the correlation improves and worsens in a seemingly random pattern, then I conclude that it's likely that my results are caused by chance.

PS I'm no statistician (that might be evident!)

Wayne
Chris67
Roundtable Knight
Posts: 1052
Joined: Tue Dec 16, 2003 2:12 pm
Location: London

Post by Chris67 »

I don't know if it helps you guys, but when looking at different markets I treat > 0.66 as strongly correlated, 0.33-0.66 as loosely correlated, and < 0.33 as not statistically significant. Just my 2 cents, hope it helps.

regards

Chris
crazybuddha
Full Member
Posts: 10
Joined: Thu Sep 18, 2003 1:18 pm
Location: boston

Post by crazybuddha »

Here are some things that might interest you.

1) The assumption in correlation analysis is that the variables being compared vary linearly. That is, you expect a one-unit change in one variable to correspond to a constant change in the other. What you are ultimately trying to figure out is how closely this holds. Variables whose relationship is not linear would give you bogus results.

2) Another assumption is that the variables are selected randomly. In practice, this can be tricky to figure out (for me anyway). Any kind of "filter" on the selection process should raise a red flag.

3) Finally, correlation analysis (using the coefficient) assumes a "joint" normal distribution. That is, for an 'x', the 'y' is normally distributed. I gather that this is typically taken for granted, but I am suspicious of financial data fitting this criterion.

4) As far as interpreting the results go -- and what is significant -- it depends on the degrees of freedom, etc.
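
To make point 4 concrete, here is a rough sketch of how sample size changes the verdict on the same r. It uses the Fisher z-transformation with a normal approximation, which leans on the joint-normality and independent-sampling assumptions from points 2 and 3. The sample sizes of 20 and 250 are assumptions chosen for illustration; the thread only says "a year of trading history".

```python
import math

def fisher_z_p_value(r, n):
    """Approximate two-sided p-value for H0: true correlation is zero.

    Fisher z-transformation with a normal approximation; assumes roughly
    bivariate-normal, independent observations.
    """
    if n <= 3:
        raise ValueError("need more than 3 observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2.0))

def t_statistic(r, n):
    """t-statistic for r, to be compared against n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("need more than 2 observations")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

r = 0.35
for n in (20, 250):  # hypothetical sample sizes
    print(f"n={n:3d}  df={n - 2:3d}  t={t_statistic(r, n):.2f}  "
          f"approx p={fisher_z_p_value(r, n):.2g}")
```

With these assumed inputs, 0.35 from 20 observations does not clear the usual 5% threshold, while the same 0.35 from 250 observations does. That says the relationship is unlikely to be zero, not that it is large enough to trade on.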

Excuse the stats-babble, but I have given a fair amount of thought to this kind of thing, so I thought I would bring these points to your attention.

Eric
The Icelander
Contributor
Posts: 4
Joined: Fri Jan 30, 2004 5:01 pm
Location: US and Iceland

Meaningful Correlation (OLD Post)

Post by The Icelander »

You apparently have a 0.35 correlation in a model and want to know if it is "significant".

This is not such an easy question. As with other variables in science, we can test statistical hypotheses about correlation coefficients, and even between them. One tail or two? What type of data are we dealing with? Nominal, ordinal, interval or ratio? Oh yeah, is the data linear or curvilinear?

Bunches of questions....

Usually, a statistician will evaluate the correlation between the two variables by squaring the r. This gives him or her an idea of how much variance the two share, and it may allow some logical connection to come to mind.

There are tables for significance testing of r's all over, in books and on the internet.

If the model you had were, say, A+B^0.25 = C, then it gets easier to explain. In bivariate cases (2 variables needed to get the 0.35 correlation), you square the correlation to see what the strength of association is between them. In this case, you will get 12% of the variance. Thus, the relationship between A and B only has a 12% association. I would not bet my money on this system. Although, as I mention below, this variable could prove important.

If your model is more complex (multiple regression, for example), the first thing you have to do is make sure you have set up your model correctly. Now that computers are so fast, a number of stats programs can run forward stepwise, backward stepwise, and similar selection procedures under the regression umbrella. The model you use can have a great effect on the validity of your results. There are situations in multiple correlation where one or two of the variables have low correlation with the others, but when some group of them is included along with one of the low-correlation variables, the total model fit picks up (this is known as a suppressor variable case).
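
The suppressor case can be illustrated with the standard closed-form R^2 for a regression on two predictors, written in terms of the three pairwise correlations. In this sketch only the 0.35 comes from the thread. The second predictor, its zero correlation with the outcome, and its 0.6 correlation with the first predictor are hypothetical numbers chosen to show the effect.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R^2 of y regressed on x1 and x2, from pairwise correlations.

    r_y1, r_y2: correlation of each predictor with y
    r_12:       correlation between the two predictors
    """
    if abs(r_12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    return (r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)

r_y1 = 0.35   # e.g. daily range vs. daily P&L (from the thread)
r_y2 = 0.0    # hypothetical second variable, uncorrelated with P&L
r_12 = 0.6    # hypothetical correlation between the two predictors

alone = r_y1 ** 2
together = r_squared_two_predictors(r_y1, r_y2, r_12)

print(f"R^2 with first predictor only: {alone:.4f}")
print(f"R^2 with both predictors:      {together:.4f}")
```

Even though the second variable has no correlation with the outcome on its own, including it raises the explained variance from about 12% to about 19% in this made-up setup, because it soaks up part of the first predictor that is unrelated to the outcome.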


Overall, without knowing what sort of system you will be using the correlation for, it is really impossible to answer the question "is it meaningful". In fact, I have begun agonizing over the correlation of returns afforded by multiple SYSTEMS.

I am now past the point of conducting factor analyses on stock and commodity variables. I can easily see what patterns are in the data, rather than trying to just think of one correlation for 2 variables.

Lots of good books on factor analysis and matrix algebra. Some of the multidimensional scaling material is very interesting :)

Did you explain, or can you explain, what the correlation was for?



- Jon :roll: