|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Table design questionThis is a typical historical data table for financial apps that store stock
prices. Columns are 'stock', 'date', 'open_price', 'close_price', etc... I know I definitely need a unique index on (stock, date) but I wonder if that is best done with a multi-column primary key constraint on those columns, or with a regular unique index. Also, should the index be clustered? Would it be useful to create another column that serves as a unique identifier, possibly making use of the system-assigned IDENTITY. Of course, these questions arise because of performance. This table can grow very large and will be queried extensively by stock, date ranges, etc... > This is a typical historical data table for financial apps that store First, every table should have a primary key unless there are exceptionalstock > prices. Columns are 'stock', 'date', 'open_price', 'close_price', etc... > > I know I definitely need a unique index on (stock, date) but I wonder if > that is best done with a multi-column primary key constraint on those > columns, or with a regular unique index. Also, should the index be clustered? > Would it be useful to create another column that serves as a unique > identifier, possibly making use of the system-assigned IDENTITY. circumstances and the implications of not having one are WELL understood. A unique index is not a valid substitute. Second, you apparently have not yet discovered what "stock" means - this is another entity which should exist in your model. Without this entity, it will soon become impossible to derive meaning from your pricing data. The members of the PK are, at a minimum, "stock" (for lack of a proper term) and date. Indexing strategy has been discussed many times in this NG - BOL also has some good information about indexing. The choice of a clustered index depends on many factors - you really need to know your system and how it is used to answer any indexing questions. As far as a surrogate key, I doubt that will provide much (if any advantage). Do you have any reasons for generating the surrogate? Thank you Scott. I agree with you: stock is an entity and a foreign key in
this table. The 'stock' (an int) is the primary key of another table. And yes, I think making a primary key with stock and date should work and I don't see the purpose of generating the unique key. Just wondered what the purpose of such a key (you called it surrogate) is when I saw it in another system that is identical to mine. As I said in my first post, queries are performed based on stock and date, like 'select max(high) from [prices] where stock = 'MSFT' and date between ...' etc... Show quote "Scott Morris" wrote: > > This is a typical historical data table for financial apps that store > stock > > prices. Columns are 'stock', 'date', 'open_price', 'close_price', etc... > > > > I know I definitely need a unique index on (stock, date) but I wonder if > > that is best done with a multi-column primary key constraint on those > > columns, or with a regular unique index. Also, should the index be > clustered? > > Would it be useful to create another column that serves as a unique > > identifier, possibly making use of the system-assigned IDENTITY. > > First, every table should have a primary key unless there are exceptional > circumstances and the implications of not having one are WELL understood. A > unique index is not a valid substitute. Second, you apparently have not yet > discovered what "stock" means - this is another entity which should exist in > your model. Without this entity, it will soon become impossible to derive > meaning from your pricing data. > > The members of the PK are, at a minimum, "stock" (for lack of a proper term) > and date. Indexing strategy has been discussed many times in this NG - BOL > also has some good information about indexing. The choice of a clustered > index depends on many factors - you really need to know your system and how > it is used to answer any indexing questions. > > As far as a surrogate key, I doubt that will provide much (if any > advantage). Do you have any reasons for generating the surrogate? > > > Surrogate keys are generally used for a few reasons:
* ease of programming - if every table has the same kind of primary key, then every table has a common pattern to program against. I don't have to worry if the key has six parts, or just one. It always has one so the user can expect that. * performance - if you have a lot of data, the clustered key is best set to be a small, unique value, and the primary key if it is an integer is perfect. It is just four bytes. This is the key used in all other indexes to get to the row value. If you cluster on something that is not very selective (few unique values) then SQL server adds four byte values to your index to uniquify the value in the index. >> First, every table should have a primary key unless there are exceptional Every table should have a primary key period. If you have to introduce some >> circumstances and the implications of not having one are WELL understood. sort of surrogate to enforce uniqueness due to weird conditions (I have seen this happen, but it is usually poor design in the first place) that is ok, but the surrogate makes sure that you can "by golly" tell two rows apart programattically, if not logically. Last nitpick, think about all of your attributes and think about what they describe. Is it the stock itself or the transaction, or any other objects that aren't clear from your email. It will pay for itself later to make individual things individual tables and use relationships to connect them! -- Show quote---------------------------------------------------------------------------- Louis Davidson - http://spaces.msn.com/members/drsql/ SQL Server MVP "Ken Abe" <Ken***@discussions.microsoft.com> wrote in message news:FDECF27A-377D-42CC-9435-6BF0A7524C35@microsoft.com... > Thank you Scott. I agree with you: stock is an entity and a foreign key in > this table. The 'stock' (an int) is the primary key of another table. And > yes, I think making a primary key with stock and date should work and I > don't > see the purpose of generating the unique key. Just wondered what the > purpose > of such a key (you called it surrogate) is when I saw it in another system > that is identical to mine. > > As I said in my first post, queries are performed based on stock and date, > like > 'select max(high) from [prices] where stock = 'MSFT' and date between ...' > etc... > > > "Scott Morris" wrote: > >> > This is a typical historical data table for financial apps that store >> stock >> > prices. Columns are 'stock', 'date', 'open_price', 'close_price', >> > etc... >> > >> > I know I definitely need a unique index on (stock, date) but I wonder >> > if >> > that is best done with a multi-column primary key constraint on those >> > columns, or with a regular unique index. Also, should the index be >> clustered? >> > Would it be useful to create another column that serves as a unique >> > identifier, possibly making use of the system-assigned IDENTITY. >> >> First, every table should have a primary key unless there are exceptional >> circumstances and the implications of not having one are WELL understood. >> A >> unique index is not a valid substitute. Second, you apparently have not >> yet >> discovered what "stock" means - this is another entity which should exist >> in >> your model. Without this entity, it will soon become impossible to >> derive >> meaning from your pricing data. >> >> The members of the PK are, at a minimum, "stock" (for lack of a proper >> term) >> and date. Indexing strategy has been discussed many times in this NG - >> BOL >> also has some good information about indexing. The choice of a clustered >> index depends on many factors - you really need to know your system and >> how >> it is used to answer any indexing questions. >> >> As far as a surrogate key, I doubt that will provide much (if any >> advantage). Do you have any reasons for generating the surrogate? >> >> >> |
|||||||||||||||||||||||