CSI Data - it's a moving target and then some

General discussions about futures.
AFJ Garner
Roundtable Knight
Posts: 2040
Joined: Fri Apr 25, 2003 3:33 pm
Location: London

Post by AFJ Garner » Tue Mar 01, 2011 11:48 am

My colleague and I download the same symbols the same day, each day, using the same portfile.adm file.

Today we discover in the case of 5 symbols that I have data for 31st January 2011 whereas my colleague does not.

Not grumbling, not complaining. Just pointing it out for the unwary.

ratio
Roundtable Knight
Posts: 338
Joined: Sun Jan 15, 2006 11:07 pm
Location: Montreal, Canada

Post by ratio » Wed Mar 02, 2011 8:17 am

I believe it depends on what time you download.

We wait for the 8pm release before downloading precisely because we had that problem; by 8pm it seems that everything is ready.

Denis

sluggo
Roundtable Knight
Posts: 2986
Joined: Fri Jun 11, 2004 2:50 pm

Post by sluggo » Wed Mar 02, 2011 8:31 am

AFJ, thank you for this report!

I believe I may ask my team of programmers to create another data checker program, which would operate roughly as follows:
  1. Every day, after price data download, read the ASCII price data file of each market.
  2. For each market, write a file of dates-with-no-price-data. This will include weekends and holidays.
  3. Retain these daily files of dates-with-no-price-data. Keep them for the most recent 60 (approx) days.
  4. Compare today's files of dates-with-no-price-data against the files of the most recent 60 (approx) days. Issue terrifying warnings if they don't agree.
I think this will sound the alarm when, as in AFJ's case, a day's prices suddenly disappear. It will also ring the bell if a new day's prices suddenly appear. We'll have to avoid issuing stupid messages that Feb 20th files don't contain Feb 22nd prices, but that seems relatively simple.
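The four steps above can be sketched roughly as follows. This is only a minimal illustration, not the checker actually built: it assumes each ASCII price file is comma-separated with a YYYYMMDD date in the first column, and the function names (read_dates, dates_without_data, compare_snapshots) are made up for the example. Trimming the newer list at the older snapshot's end date is one way to avoid the "Feb 20th files don't contain Feb 22nd prices" messages.

```python
import csv
from datetime import date, datetime, timedelta


def read_dates(lines) -> set[date]:
    """Collect the dates found in the first column of one market's price file.

    `lines` is any iterable of text lines, e.g. an open file.
    Assumed layout (hypothetical): YYYYMMDD,open,high,low,close,volume
    """
    dates = set()
    for row in csv.reader(lines):
        if not row:
            continue
        try:
            dates.add(datetime.strptime(row[0].strip(), "%Y%m%d").date())
        except ValueError:
            continue  # header or malformed row: ignore it
    return dates


def dates_without_data(dates: set[date], start: date, end: date) -> set[date]:
    """Every calendar day in [start, end] with no price row.

    Weekends and holidays are deliberately included, as in step 2.
    """
    span = (end - start).days
    return {start + timedelta(days=i) for i in range(span + 1)} - dates


def compare_snapshots(old_missing: set[date], new_missing: set[date], old_end: date):
    """Compare two no-data lists, ignoring days after the older snapshot ended."""
    new_trimmed = {d for d in new_missing if d <= old_end}
    vanished = sorted(new_trimmed - old_missing)  # had prices before, none now
    appeared = sorted(old_missing - new_trimmed)  # no prices before, prices now
    return vanished, appeared
```

In daily use one would save each day's dates_without_data result per market, keep roughly the last 60, and warn whenever compare_snapshots returns a non-empty list against any retained snapshot.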

AFJ Garner
Roundtable Knight
Posts: 2040
Joined: Fri Apr 25, 2003 3:33 pm
Location: London

Post by AFJ Garner » Wed Mar 02, 2011 10:45 am

Thank you both, yes, helpful suggestions. Both my colleague and I download well after the 8pm CSI deadline (although at different times from each other), so no problem there. Yes, I believe it is important to run a daily "no data" checker along with the rest of the routine.

In a way, it is all so utterly arbitrary. It is not going to make a whole lot of difference to the long term success of your trading if CSI does or does not make a small correction here or there or does or does not include a day's data (perhaps different servers send different data out on occasion?).

As ever, there cannot really be said to be any ultimate "truth" in much of all this. I have absolutely no doubt that despite all of their efforts, there are still mistakes in the historic data. And what of the times assumed for "open" and "close" of the various markets, especially during the changeover from pit to electronic? And all the other assumptions one has to make? Assuming the mistakes and distortions are not too wild or too frequent, none of it will make a devastating difference to a LTTF system.

In some respects this is evidenced by fooling around with different rolling algorithms: LTTF still works, I suspect, whatever little boxes you tick or un-tick or fiddle with in UI.

But I guess one has to draw a line, set standards. This is OK but that simply will not do. One of us having data for a particular day and one of us not having the available data simply will not do. It will not "do" to have different data files from each other, even in the tiniest respect. The "butterfly" effect soon leads to wildly different test results over time and different order output in no time at all.

We can and should strive for perfection but must try not to mind too much that in practice the goal is unachievable. To do otherwise would be to risk madness and despair! Well, that's my take on it anyway.

AFJ Garner
Roundtable Knight
Posts: 2040
Joined: Fri Apr 25, 2003 3:33 pm
Location: London

Post by AFJ Garner » Tue Oct 25, 2011 8:28 am

These "problems" with CSI are so well known now that I hesitate to bring the subject up again. Yet again today, however, we noticed a big difference in our test results from last Friday once we had downloaded yesterday's data, and I thought it worth reiterating to the community that, along with all the other vagaries of back testing, data issues are one of the main things that can cause big effects. The flap of a butterfly's wings. We have four or five files which are affected.

To give an example: in FLG, the volume of the March 2010 contract for 24th December 2009 has been changed from 6354 to 6404. For 27th July 2011, the high of the September 2011 contract has been altered by 0.2.

For FFI, numerous changes have been made to the volume for the June 2011 contract for the dates 26th May, 6th June and 10th June 2011. Ditto for the September 2011 contract between the dates 22nd July and 3rd August 2011 inclusive. And quite a few other dates and contracts. Those who impose volume control may see differences in their test results.

In FEI, for the December 2012 contract for the date 27th July 2011, there is a difference of 0.005 in the opening price between my file on Friday and my download today. This represents a correction and is not an effect of roll/back adjusting.

And so it goes on. Perhaps CSI has just had a big clean-up. And no criticism intended: they aim to keep accurate and clean data and that is what these changes are aimed at.

Forgive me if I have mis-transcribed some of the dates or figures; the details are not really important. You will very probably have slightly different files than I do, depending on the dates and times you download.
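Changes of the kind listed above can be found mechanically by comparing an older copy of a contract file with a fresh download. The following is a rough sketch only, assuming a comma-separated file with the date first and then open, high, low, close and volume in that order; the layout and the names FIELDS, load_rows and diff_downloads are assumptions for illustration, not CSI's format or anyone's actual tool.

```python
import csv

# Assumed column order after the date column (hypothetical layout).
FIELDS = ("open", "high", "low", "close", "volume")


def load_rows(lines) -> dict[str, tuple[str, ...]]:
    """Map each date string to the remaining fields of its row."""
    rows = {}
    for row in csv.reader(lines):
        if row:
            rows[row[0].strip()] = tuple(value.strip() for value in row[1:])
    return rows


def diff_downloads(old_lines, new_lines) -> list[tuple[str, str, str, str]]:
    """List (date, field, old_value, new_value) for each changed field.

    Only dates present in both copies are compared, so newly added days
    are not reported as changes. Values are compared as text, which means
    a pure formatting change (1.0 vs 1.00) would also be flagged.
    """
    old, new = load_rows(old_lines), load_rows(new_lines)
    changes = []
    for day in sorted(old.keys() & new.keys()):
        for name, before, after in zip(FIELDS, old[day], new[day]):
            if before != after:
                changes.append((day, name, before, after))
    return changes
```

Run against Friday's file and today's download of the same contract, this would report something like a single volume change for one date, which makes it easy to see whether a shift in test results comes from corrections rather than from new data.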

The point is this: data accuracy is one thing, continuity and certainty in on-going trading is another. Positions and position sizes change as a result of these frequent changes. That is not great news for daily trading. Personally, I don’t really give a toss about small changes in price or volume and would rather they did not happen – I would prefer stability over super-accuracy.

I need a different approach for my daily trading. I am more than happy to run two sets of files. In one I would make no backward-looking changes, so that I am assured of stability going forward. In the other I could accept whatever corrections CSI make and use those for back testing.
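One way the "stable" set could be maintained is to treat it as append-only: rows already held are never overwritten, and only days later than the last held date are taken from each download. This is a sketch under assumptions, not a tested workflow: rows are tuples whose first element is a YYYYMMDD date string, the function name extend_frozen is invented here, and it is meant for individual contract files, since back-adjusted continuous series legitimately change their history at each roll.

```python
def extend_frozen(frozen_rows, downloaded_rows):
    """Return the frozen rows plus any downloaded rows dated after them.

    Each row is a tuple whose first element is a YYYYMMDD date string
    (assumed format), so plain string comparison orders the dates.
    Corrections to already-frozen dates are deliberately ignored.
    """
    frozen = list(frozen_rows)
    if not frozen:
        return sorted(downloaded_rows)
    last_frozen = max(row[0] for row in frozen)
    newer = sorted(row for row in downloaded_rows if row[0] > last_frozen)
    return frozen + newer
```

The second, "corrected" set would simply be the latest download as delivered, kept separately for back testing.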

No answers required – just tossing out some dross on an age-old subject.

marriot
Roundtable Knight
Posts: 347
Joined: Thu Nov 20, 2008 3:02 am

Post by marriot » Tue Oct 25, 2011 10:59 am

>No answers required
Oops :)

Some months ago I began refreshing all symbols and currencies every weekend.
That way I only have to make adjustments once a week, and I am ready for them.
