Home All Groups Group Topic Archive Search About

Performance Problems On Recursive Table

Author
11 Aug 2005 9:13 PM
Paul Tiseo
I have a table that has a parent-child relation to itself. (see SQL at end)
Basically, records of type 0 are related to type 1, and type 1 to type 2.
This table has about 2 million records right now. When I do this:

INNER JOIN
        ExpSrvLog I ON E.ExpSrvID = I.ExpSrvID AND I.ItemType = 2
LEFT OUTER JOIN
        ExpSrvLog B ON I.LogID = B.ParentID AND B.ItemType = 1

It seems to consume an inordinate amount of time. Any way to optimize this
behavior? Thanks.

CREATE TABLE [ExpSrvLog] (
    [LogID] [bigint] IDENTITY (1, 1) NOT NULL ,
    [ExpSrvID] [int] NOT NULL ,
    [CalendarDate] [smalldatetime] NOT NULL ,
    [ItemType] [tinyint] NOT NULL ,
    [VendorInvoiceID] [int] NULL ,
    [ClientInvoiceID] [int] NULL ,
    [PaymentID] [int] NULL ,
    [BillingAmount] [money] NULL ,
    [ActualQty] [decimal](8, 2) NULL ,
    [ActualRate] [money] NULL ,
    [ParentID] [bigint] NULL ,
    CONSTRAINT [PK_ExpSrvLog] PRIMARY KEY  CLUSTERED
    (
        [LogID] DESC
    ) WITH  FILLFACTOR = 90  ON [PRIMARY] ,
    CONSTRAINT [FK_ExpSrvLog_ClientInvoice] FOREIGN KEY
    (
        [ClientInvoiceID]
    ) REFERENCES [ClientInvoice] (
        [InvoiceID]
    ),
    CONSTRAINT [FK_ExpSrvLog_ExpSrvLog] FOREIGN KEY
    (
        [ParentID]
    ) REFERENCES [ExpSrvLog] (
        [LogID]
    ),
    CONSTRAINT [FK_ExpSrvLog_VendorInvoice] FOREIGN KEY
    (
        [VendorInvoiceID]
    ) REFERENCES [VendorInvoice] (
        [InvoiceID]
    )
) ON [PRIMARY]
END

Author
11 Aug 2005 9:29 PM
ML
Apart from the primary key, are there any other indexes on that table?

Foreign key columns should at least be indexed.


ML
Author
12 Aug 2005 11:06 AM
Paul Tiseo
"ML" wrote:
>
> Apart from the primary key, are there any other indexes on that table?

Yes. One over ExpSrvID, CalendarDate and Itemtype. One over ParentID. (The
VendorInvoiceID and CLientInvoiceID fields are also indexes as they are FKs.)
I've also tried an index over ParentID and LogID, to no help.

- alphadog
Author
12 Aug 2005 11:42 AM
frank chang
Paul, While recursive joins are elegant, they are notoriously slow on
recursive tables with > 1 million rows. You might want try to use mutiples
queries and(or) tables to accomplish your task instead of the recursive joins.

Show quote
"Paul Tiseo" wrote:

> "ML" wrote:
> >
> > Apart from the primary key, are there any other indexes on that table?
>
> Yes. One over ExpSrvID, CalendarDate and Itemtype. One over ParentID. (The
> VendorInvoiceID and CLientInvoiceID fields are also indexes as they are FKs.)
> I've also tried an index over ParentID and LogID, to no help.
>
> - alphadog
Author
12 Aug 2005 12:25 PM
Paul Tiseo
Really? Damn. It works so well that way. Oh well, thanks for the info, Frank.

So, what are all my options? Physically splitting the table is something I
have planned, but at this point in developement is not an option. I guess
it'll have to be temp tables.

Show quote
"frank chang" wrote:

> Paul, While recursive joins are elegant, they are notoriously slow on
> recursive tables with > 1 million rows. You might want try to use mutiples
> queries and(or) tables to accomplish your task instead of the recursive joins.
>
> "Paul Tiseo" wrote:
>
> > "ML" wrote:
> > >
> > > Apart from the primary key, are there any other indexes on that table?
> >
> > Yes. One over ExpSrvID, CalendarDate and Itemtype. One over ParentID. (The
> > VendorInvoiceID and CLientInvoiceID fields are also indexes as they are FKs.)
> > I've also tried an index over ParentID and LogID, to no help.
> >
> > - alphadog
Author
12 Aug 2005 12:49 PM
ML
Maybe this example can help you find a way to navigate the
ancestor/descendant axes in your hierarchy:
http://milambda.blogspot.com/2005/07/climbing-trees-is-for-monkeys.html


ML
Author
12 Aug 2005 1:43 PM
--CELKO--
>> I have a table that has a parent-child relation to itself. (see SQL at end) Basically, records [sic] of type 0 are related to type 1, and type 1 to type 2.  This table has about 2 million records [sic] right now. When I do this: <<

No relational key, IDENTITY, too many NULL-able columns, money and
bigint proprierary datatypes and you don't know that rows and records
are not the same.  .And there are not specs.

>> It seems to consume an inordinate amount of time. Any way to optimize this
behavior? <<

Change the DDL.  Does this hierarchy only go down three levels?  Can
level 0 have more than one level 1 subordinate?  Can a level 1 have
more than one level 2 subordinate?  If I assume not, then:

CREATE TABLE ExpsrvLog
(expsrv_id INTEGER NOT NULL,
hierarchy_level INTEGER DEFAULT 0 NOT NULL
       CHECK (hierarchy_level IN (0, 1, 2)),
payment_date DATETIME NOT NULL,
vendor_invoice_nbr INTEGER NOT NULL,
    REFERENCES VendorInvoices (invoice_nbr)
    ON UPDATE CASCADE,
client_invoice_nbr INTEGER NOT NULL
    REFERENCES ClientInvoices (invoice_nbr)
    ON UPDATE CASCADE,
payment_nbr INTEGER NOT NULL,
billing_amount DECIMAL(8,2) NOT NULL,
actual_qty DECIMAL(8,2) NOT NULL,
actual_rate DECIMAL(8,2) NULL,
PRIMARY KEY (expsrv_id, hierarchy_level));

If the asumption was wrong, we can move onto the nested sets model.
You might want to get a copy of TREES & HIERARCHIES IN SQL along with a
basic data modeling book.
Author
12 Aug 2005 2:47 PM
Paul Tiseo
"--CELKO--" wrote:
> No relational key,

Of what type? I have a primary key (see first post), an alternate key (not
in the original DDL, but basically CREATE UNIQUE INDEX AK_ExpSrvLog ON
dbo.ExpSrvLog([ExpSrvID], [CalendarDate] DESC , [ItemType]) ) and some
primary-foreign key relationships. What exactly is missing? Do you mean a
natural key? If so, it's the AK that I did not include.

> IDENTITY

Why? How does IDENTITY affect my performance question?

> too many NULL-able columns,

The Real World intrudes into the World of Relational Model Puritans
sometimes. <shrug> Again, how do the NULLable columns affect my performance
problem?

> money and bigint proprierary datatypes

Not a problem for this project.

> and you don't know that rows and records are not the same.

Do you know what they say about "ASS-U-ME"? I use common idioms. Forgive me
for not abiding by your stricter one.

> And there are not specs.

I'm sorry, but I don't understand this comment. What specs do you need?

> Change the DDL.  Does this hierarchy only go down three levels?  Can
> level 0 have more than one level 1 subordinate?  Can a level 1 have
> more than one level 2 subordinate?  If I assume not, then:

Yes (three and only three levels), yes and yes.

Show quote
> CREATE TABLE ExpsrvLog
> (expsrv_id INTEGER NOT NULL,
>  hierarchy_level INTEGER DEFAULT 0 NOT NULL
>        CHECK (hierarchy_level IN (0, 1, 2)),
>  payment_date DATETIME NOT NULL,
>  vendor_invoice_nbr INTEGER NOT NULL,
>     REFERENCES VendorInvoices (invoice_nbr)
>     ON UPDATE CASCADE,
>  client_invoice_nbr INTEGER NOT NULL
>     REFERENCES ClientInvoices (invoice_nbr)
>     ON UPDATE CASCADE,
>  payment_nbr INTEGER NOT NULL,
>  billing_amount DECIMAL(8,2) NOT NULL,
>  actual_qty DECIMAL(8,2) NOT NULL,
>  actual_rate DECIMAL(8,2) NULL,
>  PRIMARY KEY (expsrv_id, hierarchy_level));
>
> If the asumption was wrong, we can move onto the nested sets model.

So, you changed the structure of the table in a way that doesn't address my
original problem (in fact you seem to have removed the recursive
relationship) but makes it more purist-friendly in terms of datatypes and
nullability. Your "SQL standardization" and "Relational Model" efforts are
appreciated (although your zeal blinded you to my actual question) but it's
not a solution to my immediate problem.

I don't need to know *how* to model or query the hierarchy. I need to know
*why* it isn't performing properly when I use the self-join.

Thanks.

AddThis Social Bookmark Button