Home All Groups Group Topic Archive Search About
Author
8 Jul 2005 5:19 PM
Ken Abe
This is a typical historical data table for financial apps that store stock
prices. Columns are 'stock', 'date', 'open_price', 'close_price', etc...

I know I definitely need a unique index on (stock, date) but I wonder if
that is best done with a multi-column primary key constraint on those
columns, or with a regular unique index. Also, should the index be clustered?
Would it be useful to create another column that serves as a unique
identifier, possibly making use of the system-assigned IDENTITY.

Of course, these questions arise because of performance. This table can grow
very large and will be queried extensively by stock, date ranges, etc...

Author
8 Jul 2005 6:24 PM
Scott Morris
> This is a typical historical data table for financial apps that store
stock
> prices. Columns are 'stock', 'date', 'open_price', 'close_price', etc...
>
> I know I definitely need a unique index on (stock, date) but I wonder if
> that is best done with a multi-column primary key constraint on those
> columns, or with a regular unique index. Also, should the index be
clustered?
> Would it be useful to create another column that serves as a unique
> identifier, possibly making use of the system-assigned IDENTITY.

First, every table should have a primary key unless there are exceptional
circumstances and the implications of not having one are WELL understood.  A
unique index is not a valid substitute.  Second, you apparently have not yet
discovered what "stock" means - this is another entity which should exist in
your model.  Without this entity, it will soon become impossible to derive
meaning from your pricing data.

The members of the PK are, at a minimum, "stock" (for lack of a proper term)
and date.  Indexing strategy has been discussed many times in this NG - BOL
also has some good information about indexing.  The choice of a clustered
index depends on many factors - you really need to know your system and how
it is used to answer any indexing questions.

As far as a surrogate key, I doubt that will provide much (if any
advantage).  Do you have any reasons for generating the surrogate?
Author
8 Jul 2005 6:59 PM
Ken Abe
Thank you Scott. I agree with you: stock is an entity and a foreign key in
this table. The 'stock' (an int) is  the primary key of another table. And
yes, I think making a primary key with stock and date should work and I don't
see the purpose of generating the unique key. Just wondered what the purpose
of such a key (you called it surrogate) is when I saw it in another system
that is identical to mine.

As I said in my first post, queries are performed based on stock and date,
like
'select max(high) from [prices] where stock = 'MSFT' and date between ...'
etc...


Show quote
"Scott Morris" wrote:

> > This is a typical historical data table for financial apps that store
> stock
> > prices. Columns are 'stock', 'date', 'open_price', 'close_price', etc...
> >
> > I know I definitely need a unique index on (stock, date) but I wonder if
> > that is best done with a multi-column primary key constraint on those
> > columns, or with a regular unique index. Also, should the index be
> clustered?
> > Would it be useful to create another column that serves as a unique
> > identifier, possibly making use of the system-assigned IDENTITY.
>
> First, every table should have a primary key unless there are exceptional
> circumstances and the implications of not having one are WELL understood.  A
> unique index is not a valid substitute.  Second, you apparently have not yet
> discovered what "stock" means - this is another entity which should exist in
> your model.  Without this entity, it will soon become impossible to derive
> meaning from your pricing data.
>
> The members of the PK are, at a minimum, "stock" (for lack of a proper term)
> and date.  Indexing strategy has been discussed many times in this NG - BOL
> also has some good information about indexing.  The choice of a clustered
> index depends on many factors - you really need to know your system and how
> it is used to answer any indexing questions.
>
> As far as a surrogate key, I doubt that will provide much (if any
> advantage).  Do you have any reasons for generating the surrogate?
>
>
>
Author
9 Jul 2005 5:33 AM
Louis Davidson
Surrogate keys are generally used for a few reasons:
* ease of programming - if every table has the same kind of primary key,
then every table has a common pattern to program against.  I don't have to
worry if the key has six parts, or just one.  It always has one so the user
can expect that.
* performance - if you have a lot of data, the clustered key is best set to
be a small, unique value, and the primary key if it is an integer is
perfect.  It is just four bytes.  This is the key used in all other indexes
to get to the row value.   If you cluster on something that is not very
selective (few unique values) then SQL server adds four byte values to your
index to uniquify the value in the index.

>> First, every table should have a primary key unless there are exceptional
>> circumstances and the implications of not having one are WELL understood.
Every table should have a primary key period.  If you have to introduce some
sort of surrogate to enforce uniqueness due to weird conditions (I have seen
this happen, but it is usually poor design in the first place) that is ok,
but the surrogate makes sure that you can "by golly" tell two rows apart
programattically, if not logically.

Last nitpick, think about all of your attributes and think about what they
describe.  Is it the stock itself or the transaction, or any other objects
that aren't clear from your email.  It will pay for itself later to make
individual things individual tables and use relationships to connect them!

--
----------------------------------------------------------------------------
Louis Davidson - http://spaces.msn.com/members/drsql/
SQL Server MVP


Show quote
"Ken Abe" <Ken***@discussions.microsoft.com> wrote in message
news:FDECF27A-377D-42CC-9435-6BF0A7524C35@microsoft.com...
> Thank you Scott. I agree with you: stock is an entity and a foreign key in
> this table. The 'stock' (an int) is  the primary key of another table. And
> yes, I think making a primary key with stock and date should work and I
> don't
> see the purpose of generating the unique key. Just wondered what the
> purpose
> of such a key (you called it surrogate) is when I saw it in another system
> that is identical to mine.
>
> As I said in my first post, queries are performed based on stock and date,
> like
> 'select max(high) from [prices] where stock = 'MSFT' and date between ...'
> etc...
>
>
> "Scott Morris" wrote:
>
>> > This is a typical historical data table for financial apps that store
>> stock
>> > prices. Columns are 'stock', 'date', 'open_price', 'close_price',
>> > etc...
>> >
>> > I know I definitely need a unique index on (stock, date) but I wonder
>> > if
>> > that is best done with a multi-column primary key constraint on those
>> > columns, or with a regular unique index. Also, should the index be
>> clustered?
>> > Would it be useful to create another column that serves as a unique
>> > identifier, possibly making use of the system-assigned IDENTITY.
>>
>> First, every table should have a primary key unless there are exceptional
>> circumstances and the implications of not having one are WELL understood.
>> A
>> unique index is not a valid substitute.  Second, you apparently have not
>> yet
>> discovered what "stock" means - this is another entity which should exist
>> in
>> your model.  Without this entity, it will soon become impossible to
>> derive
>> meaning from your pricing data.
>>
>> The members of the PK are, at a minimum, "stock" (for lack of a proper
>> term)
>> and date.  Indexing strategy has been discussed many times in this NG -
>> BOL
>> also has some good information about indexing.  The choice of a clustered
>> index depends on many factors - you really need to know your system and
>> how
>> it is used to answer any indexing questions.
>>
>> As far as a surrogate key, I doubt that will provide much (if any
>> advantage).  Do you have any reasons for generating the surrogate?
>>
>>
>>

AddThis Social Bookmark Button