A New Consortium for a New Trading Data Standard

Forum Mgmnt
Roundtable Knight
Posts: 1842
Joined: Tue Apr 15, 2003 11:02 am
Contact:

A New Consortium for a New Trading Data Standard

Post by Forum Mgmnt »

In another topic, rwk wrote: TT data format was a huge breakthrough in the early days, but it has some serious limitations now. I have not found an open (or semi-open) format that is any better, including Computrac/Metastock IMHO. The proprietary formats, such as TC2000 and CSI, are only as good as the vendor and the interface, as c.f. pointed out. Chartbook is the main reason for staying with TT format, and I haven't found anything that comes within light-years of Chartbook for power and ease of use. It is a shame that it isn't more readily available.
A few weeks back I decided that I wanted to do something about the lack of a reasonable format for trading data. The problems include: no uniform way to record the tick value, big point value, underlying market currency, etc., which differ for each market; survivorship bias in indices like the S&P 500, whose membership changes over time and therefore needs a set of historical adds and deletes; and symbols that change meaning over time as companies get acquired and stock symbols are reused.
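
To make the point about historical adds and deletes concrete, here is a minimal sketch of a point-in-time index membership lookup. It is only an illustration: the `MembershipSpan` class, the `members_on` function, and the `INST-...` identifiers are hypothetical names, and the history is made-up data, not real constituents. It assumes each instrument carries a stable internal ID so that a reused ticker symbol does not get confused with its previous owner.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class MembershipSpan:
    """One stretch of time during which an instrument belonged to an index."""
    instrument_id: str       # stable internal ID, not the (reusable) ticker
    added: date              # first day of membership
    removed: Optional[date]  # first day it was no longer a member; None = still in


def members_on(spans, as_of):
    """Return the set of instrument IDs that were index members on `as_of`."""
    return {
        s.instrument_id
        for s in spans
        if s.added <= as_of and (s.removed is None or as_of < s.removed)
    }


# Made-up history for a hypothetical index; not real constituent data.
history = [
    MembershipSpan("INST-001", date(1990, 1, 2), None),
    MembershipSpan("INST-002", date(1990, 1, 2), date(2001, 6, 1)),
    MembershipSpan("INST-003", date(2001, 6, 1), None),
]

print(sorted(members_on(history, date(2000, 1, 3))))  # ['INST-001', 'INST-002']
print(sorted(members_on(history, date(2002, 1, 2))))  # ['INST-001', 'INST-003']
```

A backtest that asks for the membership as of each simulated date, rather than today's list, avoids picking only the survivors.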

We are going to form an industry consortium for an open standard for data. Then we will modify VeriTrader to support that new format. Further, we will promote the new format with small and large data providers all around the world.

We will also develop and release an open source multi-platform data library and sample application and give it to the consortium which will then manage continued development.

I've done this very successfully before. Check out www.hr-xml.org which is the consortium I started for defining standardized ways to exchange human-resources data like resumes, job requisitions, compensation and benefits changes, payroll updates, etc. You'll note that we were able to recruit all the major players.

The hodgepodge of data formats is our single biggest support issue. Solving this problem is worth quite a bit to us, which is why we are going to put significant resources into eliminating the older data formats.

The goal of the standard is that the data and associated dictionary information, including symbol, portfolio construction, index membership (and changes over time), fundamentals, etc., are standardized, so there is no need to do anything to install the data and use it in a variety of situations.

We aren't going to do much until 2.0 ships, but I thought I'd start the discussion here to solicit our members' advice about the format, since we are going to create the initial format standard ourselves.

If we do it right, this will solve a big problem for many people.

- Forum Mgmnt
Roscoe
Roundtable Knight
Posts: 250
Joined: Sat Jan 24, 2004 2:06 am
Location: Houston TX

Post by Roscoe »

Hi Forum Mgmnt,

Great idea and long overdue, thank you for taking the initiative.

My strong preference would be for what I would describe as a “open”
Dierk Droth
Full Member
Posts: 16
Joined: Tue Jul 13, 2004 4:56 pm
Contact:

Re: A New Consortium for a New Trading Data Standard

Post by Dierk Droth »

Forum Mgmnt,
Forum Mgmnt wrote: A few weeks back I decided that I wanted to do something about the lack of a reasonable format for trading data. The problems include: no uniform way to record the tick value, big point value, underlying market currency, etc., which differ for each market; survivorship bias in indices like the S&P 500, whose membership changes over time and therefore needs a set of historical adds and deletes; and symbols that change meaning over time as companies get acquired and stock symbols are reused.
You might consider taking a look at the TradeMagic symbol repository. TM provides a unified model for all supported symbol types (right now: stocks, futures, index, options). Some of the properties you are looking for are already there:
- point value
- tick size
- URL to contract spec
- contract periodicity
...

I'm open to suggestions for more.

Regards
jdfagan
Contributing Member
Posts: 5
Joined: Fri Nov 26, 2004 7:04 pm
Location: San Francisco, CA

wow

Post by jdfagan »

Interesting and courageous idea! :shock:

Here are some sites I found which may or may not be useful in this endeavor:

http://www.mddl.org/ (I saw a timeseries example of quotes in latest beta spec docs)

http://www.service-architecture.com/xml ... e_xml.html (links to other XML finance based languages)

JD
bmitchell
Full Member
Posts: 12
Joined: Wed Sep 17, 2003 9:33 am
Location: Atlanta, GA

Post by bmitchell »

While I'm ordinarily not a fan of it, I think an XML-derived standard would have much to offer here, at the obvious expense of increased storage size per record and slower processing speed (which is the bigger deal, as it is likely to be orders of magnitude slower). In return, you get ease of implementation and flexibility of format.

[quote="Roscoe"]Hi Forum Mgmnt,

Great idea and long overdue, thank you for taking the initiative.

My strong preference would be for what I would describe as a “open”[/quote]
tobbe
Senior Member
Posts: 41
Joined: Sat Feb 21, 2004 4:25 pm

Post by tobbe »

I would really like to see an XML-based standard, being a real fan of it :) .

On the other hand, it's not that obvious how to append data to an XML document in a clean way without parsing and rewriting the complete file (sure, it can be done in some not so clean ways). With thousands of files with price data to update every day (if you're an EOD user), it would likely be perceived to take "too long".

Perhaps a master XML document with instrument definitions and pointers to documents to be included that contain the actual price data would be a viable solution. Any thoughts on this?
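
A rough sketch of what such a master document might look like, and how a program could read it, is shown here. Everything in it is an assumption for illustration: the element and attribute names (`instrument`, `pointValue`, `tickSize`, `prices`, `href`), the symbols `FUT1` and `STK1`, the values, and the file paths are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical master document: instrument definitions plus a pointer
# to the separate file that holds each instrument's price data.
MASTER_XML = """\
<instruments>
  <instrument symbol="FUT1" type="future">
    <pointValue>50</pointValue>
    <tickSize>0.25</tickSize>
    <currency>USD</currency>
    <prices href="prices/FUT1.dat"/>
  </instrument>
  <instrument symbol="STK1" type="stock">
    <pointValue>1</pointValue>
    <tickSize>0.01</tickSize>
    <currency>EUR</currency>
    <prices href="prices/STK1.dat"/>
  </instrument>
</instruments>
"""


def load_master(xml_text):
    """Parse the master document into a dict keyed by symbol."""
    root = ET.fromstring(xml_text)
    table = {}
    for inst in root.findall("instrument"):
        table[inst.get("symbol")] = {
            "type": inst.get("type"),
            "point_value": float(inst.findtext("pointValue")),
            "tick_size": float(inst.findtext("tickSize")),
            "currency": inst.findtext("currency"),
            "prices_path": inst.find("prices").get("href"),
        }
    return table


table = load_master(MASTER_XML)
print(table["FUT1"]["prices_path"])  # prices/FUT1.dat
print(table["STK1"]["currency"])     # EUR
```

With this split, a daily update only appends to the referenced price files; the master document is rewritten only when an instrument definition changes.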

Loading data into memory would take slightly longer using XML (compared to a non-ASCII format), but I don't think that's an issue. Usually it's only done once and then cached in memory, even if you run multiple simulations (depending on your simulator, of course).

cheers,
tobbe
Forum Mgmnt
Roundtable Knight
Posts: 1842
Joined: Tue Apr 15, 2003 11:02 am
Contact:

Post by Forum Mgmnt »

I have been thinking of XML as the language for dictionary entries, etc. and a binary format for the data.

The XML dictionary would define the fields, their order, etc., while the binary data would be readable using standard binary file routines. The format would provide for both big-endian and little-endian byte ordering, for interoperability between chips such as Intel and PowerPC that use opposite ordering.
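
As a minimal sketch of how an XML dictionary could drive the reading of binary records, the example here turns a field list into a format string for Python's standard `struct` module. The XML vocabulary (`record`, `byteOrder`, `field`, and the type names) is made up for illustration and is not a proposed schema; the sample values are invented.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical dictionary entry: field names, types, and order, plus byte order.
DICTIONARY_XML = """\
<record byteOrder="little">
  <field name="date" type="uint32"/>
  <field name="open" type="float64"/>
  <field name="high" type="float64"/>
  <field name="low" type="float64"/>
  <field name="close" type="float64"/>
  <field name="volume" type="uint64"/>
</record>
"""

# Map the illustrative type names onto struct format characters.
TYPE_CODES = {"uint32": "I", "uint64": "Q", "float64": "d"}
# "<" and ">" select little- and big-endian with standard sizes and no padding.
ORDER_CODES = {"little": "<", "big": ">"}


def build_record_struct(xml_text):
    """Return (field names, struct.Struct) described by the XML dictionary."""
    root = ET.fromstring(xml_text)
    fields = root.findall("field")
    fmt = ORDER_CODES[root.get("byteOrder")] + "".join(
        TYPE_CODES[f.get("type")] for f in fields
    )
    return [f.get("name") for f in fields], struct.Struct(fmt)


names, rec = build_record_struct(DICTIONARY_XML)
raw = rec.pack(20050103, 10.0, 11.5, 9.75, 11.0, 123456)  # one made-up bar
row = dict(zip(names, rec.unpack(raw)))

print(rec.size)      # 44 bytes per record
print(row["close"])  # 11.0
```

Because the byte order is stated explicitly, the same file decodes identically on any machine; a reader never relies on the native ordering of the chip it runs on.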

When you are looking at intraday or tick-level data the size of the representation is actually very important to the speed of the simulation since you can't really hold everything in memory without many gigabytes of RAM and a 64-bit operating system to support that much RAM. This might not be true in 10 years but for now the size is a problem.

- Forum Mgmnt
tobbe
Senior Member
Posts: 41
Joined: Sat Feb 21, 2004 4:25 pm

Post by tobbe »

Forum Mgmnt wrote:When you are looking at intraday or tick-level data the size of the representation is actually very important to the speed of the simulation since you can't really hold everything in memory without many gigabytes of RAM and a 64-bit operating system to support that much RAM. This might not be true in 10 years but for now the size is a problem.
Living in the EOD data universe I didn't think that far :wink: . Anyway, how data is handled during simulation could still be the responsibility of the actual program using the data. If the XML price data converts to several GB of binary data, the simulator could convert the XML data and cache the binary image to disk (and also format the binary data to best suit the intended simulation).

Could the standard provide for an XML-only "transport format" and a binary "working format"? It would be easy to provide conversion tools or an API in various languages. But then there is again the problem of appending data to XML files.

Perhaps the mixed solution you describe is the best of both worlds.

In a binary format I think it would be good to store the definition of byte ordering and perhaps field layout in a header within the binary file, so that it would be possible to make sense of the binary data "standalone".

cheers,
tobbe
Forum Mgmnt
Roundtable Knight
Posts: 1842
Joined: Tue Apr 15, 2003 11:02 am
Contact:

Post by Forum Mgmnt »

Tobbe wrote:In a binary format I think it would be good to store the definition of byte ordering and perhaps field layout in a header within the binary file, so that it would be possible to make sense of the binary data "standalone".
I agree; a standalone format is very desirable, so the format for the individual files should either be fixed or contain enough information in a header that readers can interpret the contents.
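
To illustrate the header option, here is a minimal sketch of a self-describing binary file. The layout is purely an assumption for discussion: a hypothetical 4-byte magic value `TDS0`, a 1-byte byte-order marker, a 2-byte length, and an ASCII field list using `struct` format characters. It is not a proposed standard.

```python
import io
import struct

MAGIC = b"TDS0"  # hypothetical file signature


def write_file(stream, byte_order, fields, rows):
    """Write a header describing the layout, then the packed records.

    byte_order is "<" (little-endian) or ">" (big-endian);
    fields is a list of (name, struct format character) pairs.
    """
    layout = ",".join(f"{name}:{code}" for name, code in fields).encode("ascii")
    # The layout length is always stored big-endian so any reader can find it.
    stream.write(MAGIC + byte_order.encode("ascii"))
    stream.write(struct.pack(">H", len(layout)) + layout)
    rec = struct.Struct(byte_order + "".join(code for _, code in fields))
    for row in rows:
        stream.write(rec.pack(*row))


def read_file(stream):
    """Read the header, then decode every complete record into a dict."""
    if stream.read(4) != MAGIC:
        raise ValueError("not a recognised data file")
    byte_order = stream.read(1).decode("ascii")
    (layout_len,) = struct.unpack(">H", stream.read(2))
    layout = stream.read(layout_len).decode("ascii")
    pairs = [item.split(":") for item in layout.split(",")]
    names = [name for name, _ in pairs]
    rec = struct.Struct(byte_order + "".join(code for _, code in pairs))
    rows = []
    while True:
        chunk = stream.read(rec.size)
        if len(chunk) < rec.size:
            break  # end of file (or a trailing partial record)
        rows.append(dict(zip(names, rec.unpack(chunk))))
    return rows


buf = io.BytesIO()
write_file(buf, ">", [("date", "I"), ("close", "d")],
           [(20050103, 11.0), (20050104, 11.25)])  # made-up prices
buf.seek(0)
rows = read_file(buf)
print(rows[1])  # {'date': 20050104, 'close': 11.25}
```

The reader needs no outside dictionary to decode the records, and appending a new day's data is a plain write at the end of the file.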

- Forum Mgmnt
AMD
Full Member
Posts: 22
Joined: Fri Jan 21, 2005 8:19 am

Post by AMD »

Hello.

How's the progress with introducing the new data standard? It's been almost a year, anything new?
Bernd
Roundtable Knight
Posts: 126
Joined: Wed Apr 30, 2003 6:39 am

Post by Bernd »

:wink: