SQL db design recommendation

Discussions about custom-built testing platforms written in C, C++, or Java.
Post Reply
chs245
Contributor
Contributor
Posts: 1
Joined: Wed Mar 03, 2004 6:23 am
Location: Amsterdam, NL

SQL db design recommendation

Post by chs245 » Wed Nov 14, 2007 7:02 am

Hi
I'm planning to host my own MSSQL stock data db. I want to store 1min OHLCV for 2000 tickers every day (ca 7.5hours x 60min x 2000 tickers = 900k rows per day)
I want to use the database to store the data locally and to easily be able to retrieve it into other applicaions.
What I'm not sure of is how to design the database: store the data in tables by ticker ? by date ?
Can anyone offer advice as to this ?

Best regards,
Oliver

jungle
Full Member
Full Member
Posts: 20
Joined: Tue Mar 25, 2008 3:31 pm

Post by jungle » Wed Mar 26, 2008 11:15 am

using a relational database for tick data storage may be undesirable. i will try to post a link to a discussion of this elsewhere. in the mean time, you may wish to look at kdb or xenomorph.

mojojojo
Roundtable Knight
Roundtable Knight
Posts: 131
Joined: Mon Oct 31, 2005 2:07 pm

Post by mojojojo » Thu Mar 27, 2008 7:38 am

It really depends on how you plan on using the data and accessing that data. For a database that will grow by that amount you might want to pick up a book on database design. Little things like what to index can make a huge difference.

davidib
Contributing Member
Contributing Member
Posts: 9
Joined: Mon Jul 18, 2005 6:09 pm
Location: Chicago

Post by davidib » Fri Mar 28, 2008 11:35 am

Oliver,

like mojojojo said much depends on how you want to use the data. FWIW I have not used a relational db for price data for a long time, I've just used simple flat files to store this info.

Lots of reasons for this choice:

0.flat files (i.e. csv files) infinitely easier to manage/manipulate than having the data in a database.

1..99% of the time the historic data is just accessed chronologically (i.e. select * order by date) anyway, hence no need for fancy select stmts.

1a.Even if you plan to do lots of research besides backtesting, sql will only provide the barest of analysis tools (i.e. like std of your time series). You'll end up having to pump the data into something stronger (i.e. Excel, R, etc). All/most of these packages are designed to read csv/flat files

2.historic data is relatively static, no real need to do lots of edits once its in your data store

3.You don't have to deal with another piece of technology (i.e. rdb). You'll probably wont have time to spend futzing with a db anyway, your time will be spent doing research

4.with ODBC/JDBC, you can access the flat files like a db anyway, so you have access to sql like functionality if you need it.

If you still want to put the data into a relation db, my prof. opinion/experience is to do make a table for each symbol, i.e. you'll have an IBM table, an AAPL table, etc. This gives u the ability to easily access the data across dates vs having to do a subselect for each day if you broke the tables up by date.

FWIW, when I put together a tick database each day was saved as a compressed blob, (again organized by symbol). I mention this only as a space saving/time saving mechanism, unless your histories are very long this probably won't be an issue.

David

Post Reply