Home All Groups Group Topic Archive Search About

Solution and question about indexes in SQL 7 vs. SQL 2000

Author
19 Jul 2006 5:35 PM
Carl Imthurn
I solved the following problem posted a while back, but I now have big questions in my mind about indexes . . .

In a nutshell:
I am in the process of converting a database from 7 to 2000. Here's the steps I followed:

1) detached the db from the SQL 7 instance *with* update statistics
2) made a copy of the physical (.mdf) file -- I wanted SQL Server to create a new .ldf file
3) attached the copy to the SQL 2000 instance
4) ran an UPDATE STATISTICS {mytablename} WITH FULLSCAN on the SQL 2000 database for tbl_AHSTDN (see below).

After having done that, I started running the stored procedures that comprise a nightly download/import process.
They ran fine until I got to one which contained the following code (there's more to the sp, but this statement was the one I narrowed it down to):

UPDATE tblProcedureHistory
SET tblProcedureHistory.Status_A = [derived].MaxPostingDate
FROM
    (
    SELECT    T1.OFFICE_NUM,
        T1.PatientID,
        T1.PatientType,
        T1.StudentID,
        T1.ProcedureID,
        T1.ProcedureSuffix,
        T1.Tooth,
        T1.Surface,
        MAX(T1.PostingDate) AS MaxPostingDate

    FROM tbl_AHSTDN AS T1
    INNER JOIN tblProcedureHistory AS T2
    ON T1.OFFICE_NUM = T2.OFFICE_NUM
    AND T1.PatientID = T2.PatientID
    AND T1.PatientType = T2.PatientType
    AND T1.StudentID = T2.StudentID
    AND T1.ProcedureID = T2.ProcedureID
    AND T1.ProcedureSuffix = T2.ProcedureSuffix
    AND T1.Tooth = T2.Tooth
    AND T1.Surface = T2.Surface
    AND T1.Status = 'A'

    GROUP BY
    T1.OFFICE_NUM,
    T1.PatientID,
    T1.PatientType,
    T1.StudentID,
    T1.ProcedureID,
    T1.ProcedureSuffix,
    T1.Tooth,
    T1.Surface
    )
AS [derived]

WHERE
tblProcedureHistory.OFFICE_NUM = [derived].OFFICE_NUM
AND tblProcedureHistory.PatientID = [derived].PatientID
AND tblProcedureHistory.PatientType = [derived].PatientType
AND tblProcedureHistory.StudentID = [derived].StudentID
AND tblProcedureHistory.ProcedureID = [derived].ProcedureID
AND tblProcedureHistory.ProcedureSuffix = [derived].ProcedureSuffix
AND tblProcedureHistory.Tooth = [derived].Tooth
AND tblProcedureHistory.Surface = [derived].Surface

I know about the proprietary aspects of UPDATE...FROM and the possibility of unexpected update results.
*For purposes of this discussion only*, please assume that because of the existing data, that is not an issue. Thanks.

The preceding SQL statement ran in about 90 seconds in SQL 7. In the new database, it will run until I cancel execution.
I let it run once for about 2 hours just for the sheer thrill of it and had to finally cancel execution.

I tried:
1) deleting and re-creating the stored procedure
2) exec sp_updatestats on the database
3) deleting and re-creating the database and re-attaching the .mdf file

Here's what I finally tried that worked: I created a new composite index in tbl_AHSTDN on the following columns:
OFFICE_NUM, PatientID, PatientType, StudentID, ProcedureID, ProcedureSuffix, Tooth, Surface, Status

and the waters parted, and the angels sang, and the SQL statement above ran in 21 seconds.

This raised a number of questions in my mind:
First, why in the world would this work in SQL 7 *without* that index but SQL 2000 had to have that index?
Second, why would the sp not run to completion at all? I can understand the possibility of significantly decreased performance (taking 5 minutes to run as opposed to 21 seconds) but not running to completion? Makes no sense to me . . .
Third, most (but not all) of the 9 columns in the new composite index were already indexed individually in tbl_AHSTDN. Do I have to have a covering index for every possible combination of columns referenced in the WHERE clause and JOIN operations of every
stored procedure I run?

ANY insight into this issue will be incredibly helpful. This drove me crazy, and I was able to figure it out, but I have *absolutely* no idea why this happened, nor why the index fixed it, and in a way that drives me even crazier . . .
Thanks in advance --

Carl

Author
19 Jul 2006 8:01 PM
Roy Harvey
Either your query found a bug in SQL Server 2000, or SQL Server is
generating a simply terrible execution plan - until you added the
index, that is.

Looking at your query, I can not see any purpose for joining to
tblProcedureHistory in the derived table.  No columns are referenced
other than in the join process, and the outer WHERE clause joining
accomplishes the same thing.  I believe it could be simplified (below)
and just maybe the optimizer would approach it differently even
without the index.

UPDATE tblProcedureHistory
SET tblProcedureHistory.Status_A = [derived].MaxPostingDate
FROM
    (
    SELECT    T1.OFFICE_NUM,
        T1.PatientID,
        T1.PatientType,
        T1.StudentID,
        T1.ProcedureID,
        T1.ProcedureSuffix,
        T1.Tooth,
        T1.Surface,
        MAX(T1.PostingDate) AS MaxPostingDate

    FROM tbl_AHSTDN AS T1
    GROUP BY
    T1.OFFICE_NUM,
    T1.PatientID,
    T1.PatientType,
    T1.StudentID,
    T1.ProcedureID,
    T1.ProcedureSuffix,
    T1.Tooth,
    T1.Surface
    )
AS [derived]

WHERE
tblProcedureHistory.OFFICE_NUM = [derived].OFFICE_NUM
AND tblProcedureHistory.PatientID = [derived].PatientID
AND tblProcedureHistory.PatientType = [derived].PatientType
AND tblProcedureHistory.StudentID = [derived].StudentID
AND tblProcedureHistory.ProcedureID = [derived].ProcedureID
AND tblProcedureHistory.ProcedureSuffix = [derived].ProcedureSuffix
AND tblProcedureHistory.Tooth = [derived].Tooth
AND tblProcedureHistory.Surface = [derived].Surface

Roy Harvey
Beacon Falls, CT

On Wed, 19 Jul 2006 10:35:35 -0700, Carl Imthurn <nospam@all.thanks>
wrote:

Show quote
>I solved the following problem posted a while back, but I now have big questions in my mind about indexes . . .
>
>In a nutshell:
>I am in the process of converting a database from 7 to 2000. Here's the steps I followed:
>
>1) detached the db from the SQL 7 instance *with* update statistics
>2) made a copy of the physical (.mdf) file -- I wanted SQL Server to create a new .ldf file
>3) attached the copy to the SQL 2000 instance
>4) ran an UPDATE STATISTICS {mytablename} WITH FULLSCAN on the SQL 2000 database for tbl_AHSTDN (see below).
>
>After having done that, I started running the stored procedures that comprise a nightly download/import process.
>They ran fine until I got to one which contained the following code (there's more to the sp, but this statement was the one I narrowed it down to):
>
>UPDATE tblProcedureHistory
>SET tblProcedureHistory.Status_A = [derived].MaxPostingDate
>FROM
>    (
>    SELECT    T1.OFFICE_NUM,
>        T1.PatientID,
>        T1.PatientType,
>        T1.StudentID,
>        T1.ProcedureID,
>        T1.ProcedureSuffix,
>        T1.Tooth,
>        T1.Surface,
>        MAX(T1.PostingDate) AS MaxPostingDate
>
>    FROM tbl_AHSTDN AS T1
>    INNER JOIN tblProcedureHistory AS T2
>    ON T1.OFFICE_NUM = T2.OFFICE_NUM
>    AND T1.PatientID = T2.PatientID
>    AND T1.PatientType = T2.PatientType
>    AND T1.StudentID = T2.StudentID
>    AND T1.ProcedureID = T2.ProcedureID
>    AND T1.ProcedureSuffix = T2.ProcedureSuffix
>    AND T1.Tooth = T2.Tooth
>    AND T1.Surface = T2.Surface
>    AND T1.Status = 'A'
>
>    GROUP BY
>    T1.OFFICE_NUM,
>    T1.PatientID,
>    T1.PatientType,
>    T1.StudentID,
>    T1.ProcedureID,
>    T1.ProcedureSuffix,
>    T1.Tooth,
>    T1.Surface
>    )
>AS [derived]
>
>WHERE
>tblProcedureHistory.OFFICE_NUM = [derived].OFFICE_NUM
>AND tblProcedureHistory.PatientID = [derived].PatientID
>AND tblProcedureHistory.PatientType = [derived].PatientType
>AND tblProcedureHistory.StudentID = [derived].StudentID
>AND tblProcedureHistory.ProcedureID = [derived].ProcedureID
>AND tblProcedureHistory.ProcedureSuffix = [derived].ProcedureSuffix
>AND tblProcedureHistory.Tooth = [derived].Tooth
>AND tblProcedureHistory.Surface = [derived].Surface
>
>I know about the proprietary aspects of UPDATE...FROM and the possibility of unexpected update results.
>*For purposes of this discussion only*, please assume that because of the existing data, that is not an issue. Thanks.
>
>The preceding SQL statement ran in about 90 seconds in SQL 7. In the new database, it will run until I cancel execution.
>I let it run once for about 2 hours just for the sheer thrill of it and had to finally cancel execution.
>
>I tried:
>1) deleting and re-creating the stored procedure
>2) exec sp_updatestats on the database
>3) deleting and re-creating the database and re-attaching the .mdf file
>
>Here's what I finally tried that worked: I created a new composite index in tbl_AHSTDN on the following columns:
>OFFICE_NUM, PatientID, PatientType, StudentID, ProcedureID, ProcedureSuffix, Tooth, Surface, Status
>
>and the waters parted, and the angels sang, and the SQL statement above ran in 21 seconds.
>
>This raised a number of questions in my mind:
>First, why in the world would this work in SQL 7 *without* that index but SQL 2000 had to have that index?
>Second, why would the sp not run to completion at all? I can understand the possibility of significantly decreased performance (taking 5 minutes to run as opposed to 21 seconds) but not running to completion? Makes no sense to me . . .
>Third, most (but not all) of the 9 columns in the new composite index were already indexed individually in tbl_AHSTDN. Do I have to have a covering index for every possible combination of columns referenced in the WHERE clause and JOIN operations of every
>stored procedure I run?
>
>ANY insight into this issue will be incredibly helpful. This drove me crazy, and I was able to figure it out, but I have *absolutely* no idea why this happened, nor why the index fixed it, and in a way that drives me even crazier . . .
>Thanks in advance --
>
>Carl
Author
19 Jul 2006 8:14 PM
Carl Imthurn
Thanks for taking the time to reply Roy -- I see what you're saying about joining to tblProcedureHistory.
I will make some time to test that theory (delete the index, modify the stored procedure, see if it runs to completion).

Roy Harvey wrote:
Show quote
> Either your query found a bug in SQL Server 2000, or SQL Server is
> generating a simply terrible execution plan - until you added the
> index, that is.
>
> Looking at your query, I can not see any purpose for joining to
> tblProcedureHistory in the derived table.  No columns are referenced
> other than in the join process, and the outer WHERE clause joining
> accomplishes the same thing.  I believe it could be simplified (below)
> and just maybe the optimizer would approach it differently even
> without the index.
>
> UPDATE tblProcedureHistory
> SET tblProcedureHistory.Status_A = [derived].MaxPostingDate
> FROM
>     (
>     SELECT    T1.OFFICE_NUM,
>         T1.PatientID,
>         T1.PatientType,
>         T1.StudentID,
>         T1.ProcedureID,
>         T1.ProcedureSuffix,
>         T1.Tooth,
>         T1.Surface,
>         MAX(T1.PostingDate) AS MaxPostingDate
>
>     FROM tbl_AHSTDN AS T1
>     GROUP BY
>     T1.OFFICE_NUM,
>     T1.PatientID,
>     T1.PatientType,
>     T1.StudentID,
>     T1.ProcedureID,
>     T1.ProcedureSuffix,
>     T1.Tooth,
>     T1.Surface
>     )
> AS [derived]
>
> WHERE
> tblProcedureHistory.OFFICE_NUM = [derived].OFFICE_NUM
> AND tblProcedureHistory.PatientID = [derived].PatientID
> AND tblProcedureHistory.PatientType = [derived].PatientType
> AND tblProcedureHistory.StudentID = [derived].StudentID
> AND tblProcedureHistory.ProcedureID = [derived].ProcedureID
> AND tblProcedureHistory.ProcedureSuffix = [derived].ProcedureSuffix
> AND tblProcedureHistory.Tooth = [derived].Tooth
> AND tblProcedureHistory.Surface = [derived].Surface
>
> Roy Harvey
> Beacon Falls, CT
>
> On Wed, 19 Jul 2006 10:35:35 -0700, Carl Imthurn <nospam@all.thanks>
> wrote:
>
Author
19 Jul 2006 8:51 PM
Steve Kass
Roy,

Your query is logically different from the OP's, I think.  Your
query will update all rows of tblProcedureHistory for which
there is a matching row in tbl_AHSTDN, and update them
with the latest posting date among all the matches.  Carl's will
only update those rows of tblProcedureHistory for which there
is a matching row in tbl_AHSTDN having status = 'A', and it
will update them with the latest status_A posting date.  I think
he wants the latter, given that the column being updated is
called Status_A, but your query doesn't refer to the Status
column.

I can't guess why the query ran so slowly after the upgrade,
but if Carl posts the query plans, I could hazard a guess.

Steve Kass
Drew University
www.stevekass.com

Show quote
"Roy Harvey" <roy_har***@snet.net> wrote in message
news:mk3tb2pb88t9h6mcplh22tonl3je4osbg8@4ax.com...
> Either your query found a bug in SQL Server 2000, or SQL Server is
> generating a simply terrible execution plan - until you added the
> index, that is.
>
> Looking at your query, I can not see any purpose for joining to
> tblProcedureHistory in the derived table.  No columns are referenced
> other than in the join process, and the outer WHERE clause joining
> accomplishes the same thing.  I believe it could be simplified (below)
> and just maybe the optimizer would approach it differently even
> without the index.
>
> UPDATE tblProcedureHistory
> SET tblProcedureHistory.Status_A = [derived].MaxPostingDate
> FROM
> (
> SELECT T1.OFFICE_NUM,
> T1.PatientID,
> T1.PatientType,
> T1.StudentID,
> T1.ProcedureID,
> T1.ProcedureSuffix,
> T1.Tooth,
> T1.Surface,
> MAX(T1.PostingDate) AS MaxPostingDate
>
> FROM tbl_AHSTDN AS T1
> GROUP BY
> T1.OFFICE_NUM,
> T1.PatientID,
> T1.PatientType,
> T1.StudentID,
> T1.ProcedureID,
> T1.ProcedureSuffix,
> T1.Tooth,
> T1.Surface
> )
> AS [derived]
>
> WHERE
> tblProcedureHistory.OFFICE_NUM = [derived].OFFICE_NUM
> AND tblProcedureHistory.PatientID = [derived].PatientID
> AND tblProcedureHistory.PatientType = [derived].PatientType
> AND tblProcedureHistory.StudentID = [derived].StudentID
> AND tblProcedureHistory.ProcedureID = [derived].ProcedureID
> AND tblProcedureHistory.ProcedureSuffix = [derived].ProcedureSuffix
> AND tblProcedureHistory.Tooth = [derived].Tooth
> AND tblProcedureHistory.Surface = [derived].Surface
>
> Roy Harvey
> Beacon Falls, CT
>
> On Wed, 19 Jul 2006 10:35:35 -0700, Carl Imthurn <nospam@all.thanks>
> wrote:
>
>>I solved the following problem posted a while back, but I now have big
>>questions in my mind about indexes . . .
>>
>>In a nutshell:
>>I am in the process of converting a database from 7 to 2000. Here's the
>>steps I followed:
>>
>>1) detached the db from the SQL 7 instance *with* update statistics
>>2) made a copy of the physical (.mdf) file -- I wanted SQL Server to
>>create a new .ldf file
>>3) attached the copy to the SQL 2000 instance
>>4) ran an UPDATE STATISTICS {mytablename} WITH FULLSCAN on the SQL 2000
>>database for tbl_AHSTDN (see below).
>>
>>After having done that, I started running the stored procedures that
>>comprise a nightly download/import process.
>>They ran fine until I got to one which contained the following code
>>(there's more to the sp, but this statement was the one I narrowed it down
>>to):
>>
>>UPDATE tblProcedureHistory
>>SET tblProcedureHistory.Status_A = [derived].MaxPostingDate
>>FROM
>> (
>> SELECT T1.OFFICE_NUM,
>> T1.PatientID,
>> T1.PatientType,
>> T1.StudentID,
>> T1.ProcedureID,
>> T1.ProcedureSuffix,
>> T1.Tooth,
>> T1.Surface,
>> MAX(T1.PostingDate) AS MaxPostingDate
>>
>> FROM tbl_AHSTDN AS T1
>> INNER JOIN tblProcedureHistory AS T2
>> ON T1.OFFICE_NUM = T2.OFFICE_NUM
>> AND T1.PatientID = T2.PatientID
>> AND T1.PatientType = T2.PatientType
>> AND T1.StudentID = T2.StudentID
>> AND T1.ProcedureID = T2.ProcedureID
>> AND T1.ProcedureSuffix = T2.ProcedureSuffix
>> AND T1.Tooth = T2.Tooth
>> AND T1.Surface = T2.Surface
>> AND T1.Status = 'A'
>>
>> GROUP BY
>> T1.OFFICE_NUM,
>> T1.PatientID,
>> T1.PatientType,
>> T1.StudentID,
>> T1.ProcedureID,
>> T1.ProcedureSuffix,
>> T1.Tooth,
>> T1.Surface
>> )
>>AS [derived]
>>
>>WHERE
>>tblProcedureHistory.OFFICE_NUM = [derived].OFFICE_NUM
>>AND tblProcedureHistory.PatientID = [derived].PatientID
>>AND tblProcedureHistory.PatientType = [derived].PatientType
>>AND tblProcedureHistory.StudentID = [derived].StudentID
>>AND tblProcedureHistory.ProcedureID = [derived].ProcedureID
>>AND tblProcedureHistory.ProcedureSuffix = [derived].ProcedureSuffix
>>AND tblProcedureHistory.Tooth = [derived].Tooth
>>AND tblProcedureHistory.Surface = [derived].Surface
>>
>>I know about the proprietary aspects of UPDATE...FROM and the possibility
>>of unexpected update results.
>>*For purposes of this discussion only*, please assume that because of the
>>existing data, that is not an issue. Thanks.
>>
>>The preceding SQL statement ran in about 90 seconds in SQL 7. In the new
>>database, it will run until I cancel execution.
>>I let it run once for about 2 hours just for the sheer thrill of it and
>>had to finally cancel execution.
>>
>>I tried:
>>1) deleting and re-creating the stored procedure
>>2) exec sp_updatestats on the database
>>3) deleting and re-creating the database and re-attaching the .mdf file
>>
>>Here's what I finally tried that worked: I created a new composite index
>>in tbl_AHSTDN on the following columns:
>>OFFICE_NUM, PatientID, PatientType, StudentID, ProcedureID,
>>ProcedureSuffix, Tooth, Surface, Status
>>
>>and the waters parted, and the angels sang, and the SQL statement above
>>ran in 21 seconds.
>>
>>This raised a number of questions in my mind:
>>First, why in the world would this work in SQL 7 *without* that index but
>>SQL 2000 had to have that index?
>>Second, why would the sp not run to completion at all? I can understand
>>the possibility of significantly decreased performance (taking 5 minutes
>>to run as opposed to 21 seconds) but not running to completion? Makes no
>>sense to me . . .
>>Third, most (but not all) of the 9 columns in the new composite index were
>>already indexed individually in tbl_AHSTDN. Do I have to have a covering
>>index for every possible combination of columns referenced in the WHERE
>>clause and JOIN operations of every
>>stored procedure I run?
>>
>>ANY insight into this issue will be incredibly helpful. This drove me
>>crazy, and I was able to figure it out, but I have *absolutely* no idea
>>why this happened, nor why the index fixed it, and in a way that drives me
>>even crazier . . .
>>Thanks in advance --
>>
>>Carl
Author
19 Jul 2006 9:36 PM
Roy Harvey
Show quote
On Wed, 19 Jul 2006 16:51:46 -0400, "Steve Kass" <sk***@drew.edu>
wrote:

>Roy,
>
>Your query is logically different from the OP's, I think.  Your
>query will update all rows of tblProcedureHistory for which
>there is a matching row in tbl_AHSTDN, and update them
>with the latest posting date among all the matches.  Carl's will
>only update those rows of tblProcedureHistory for which there
>is a matching row in tbl_AHSTDN having status = 'A', and it
>will update them with the latest status_A posting date.  I think
>he wants the latter, given that the column being updated is
>called Status_A, but your query doesn't refer to the Status
>column.

Correct, thanks for catching that.  I needed to preserve the test
T1.Status = 'A'.

Revised, hopefully correct, version:

UPDATE tblProcedureHistory
SET tblProcedureHistory.Status_A = [derived].MaxPostingDate
FROM
    (
    SELECT    T1.OFFICE_NUM,
        T1.PatientID,
        T1.PatientType,
        T1.StudentID,
        T1.ProcedureID,
        T1.ProcedureSuffix,
        T1.Tooth,
        T1.Surface,
        MAX(T1.PostingDate) AS MaxPostingDate

    FROM tbl_AHSTDN AS T1
    WHERE T1.Status = 'A' --Was missing
    GROUP BY
    T1.OFFICE_NUM,
    T1.PatientID,
    T1.PatientType,
    T1.StudentID,
    T1.ProcedureID,
    T1.ProcedureSuffix,
    T1.Tooth,
    T1.Surface
    )
AS [derived]

WHERE
tblProcedureHistory.OFFICE_NUM = [derived].OFFICE_NUM
AND tblProcedureHistory.PatientID = [derived].PatientID
AND tblProcedureHistory.PatientType = [derived].PatientType
AND tblProcedureHistory.StudentID = [derived].StudentID
AND tblProcedureHistory.ProcedureID = [derived].ProcedureID
AND tblProcedureHistory.ProcedureSuffix = [derived].ProcedureSuffix
AND tblProcedureHistory.Tooth = [derived].Tooth
AND tblProcedureHistory.Surface = [derived].Surface

Roy
Author
19 Jul 2006 10:12 PM
Carl Imthurn
Thanks Roy and Steve for your time and assistance. Very much appreciated. Here's the latest:

I removed the 9-column composite index that I had created earlier and ran the SQL statement again. As expected, it did not run to completion. I cancelled it after about 12 minutes. Then, I removed the inner join in the subquery as you suggested, Roy. I
also added back in the WHERE T1.Status = 'A' clause so that the SQL statement was identical to Roy's corrected version below.
And wouldn't you know it -- it ran to completion! And this was *without* the additional 9-column index.
Now -- I don't even pretend to understand indexing or optimizing, so I don't know what's going on, but it's working now.
Thanks to both of you for your help.

I'll try to post the query plans this afternoon or tomorrow, Steve. I think I remember trying to do this in the past and having a real tough time because the plan came out in a graphical mode which didn't translate well to a text-based newsgroup, so any
assistance you can render about posting the query plans will be appreciated.

Thanks again -- you guys are great.

Carl

Roy Harvey wrote:
Show quote
> On Wed, 19 Jul 2006 16:51:46 -0400, "Steve Kass" <sk***@drew.edu>
> wrote:
>
>
>>Roy,
>>
>>Your query is logically different from the OP's, I think.  Your
>>query will update all rows of tblProcedureHistory for which
>>there is a matching row in tbl_AHSTDN, and update them
>>with the latest posting date among all the matches.  Carl's will
>>only update those rows of tblProcedureHistory for which there
>>is a matching row in tbl_AHSTDN having status = 'A', and it
>>will update them with the latest status_A posting date.  I think
>>he wants the latter, given that the column being updated is
>>called Status_A, but your query doesn't refer to the Status
>>column.
>
>
> Correct, thanks for catching that.  I needed to preserve the test
> T1.Status = 'A'.
>
> Revised, hopefully correct, version:
>
> UPDATE tblProcedureHistory
> SET tblProcedureHistory.Status_A = [derived].MaxPostingDate
> FROM
>     (
>     SELECT    T1.OFFICE_NUM,
>         T1.PatientID,
>         T1.PatientType,
>         T1.StudentID,
>         T1.ProcedureID,
>         T1.ProcedureSuffix,
>         T1.Tooth,
>         T1.Surface,
>         MAX(T1.PostingDate) AS MaxPostingDate
>
>     FROM tbl_AHSTDN AS T1
>     WHERE T1.Status = 'A' --Was missing
>     GROUP BY
>     T1.OFFICE_NUM,
>     T1.PatientID,
>     T1.PatientType,
>     T1.StudentID,
>     T1.ProcedureID,
>     T1.ProcedureSuffix,
>     T1.Tooth,
>     T1.Surface
>     )
> AS [derived]
>
> WHERE
> tblProcedureHistory.OFFICE_NUM = [derived].OFFICE_NUM
> AND tblProcedureHistory.PatientID = [derived].PatientID
> AND tblProcedureHistory.PatientType = [derived].PatientType
> AND tblProcedureHistory.StudentID = [derived].StudentID
> AND tblProcedureHistory.ProcedureID = [derived].ProcedureID
> AND tblProcedureHistory.ProcedureSuffix = [derived].ProcedureSuffix
> AND tblProcedureHistory.Tooth = [derived].Tooth
> AND tblProcedureHistory.Surface = [derived].Surface
>
> Roy
Author
20 Jul 2006 12:05 AM
Steve Kass
Show quote
"Carl Imthurn" <nospam@all.thanks> wrote in message
news:uIxEXA4qGHA.3380@TK2MSFTNGP04.phx.gbl...
> Thanks Roy and Steve for your time and assistance. Very much appreciated.
> Here's the latest:
>
> I removed the 9-column composite index that I had created earlier and ran
> the SQL statement again. As expected, it did not run to completion. I
> cancelled it after about 12 minutes. Then, I removed the inner join in the
> subquery as you suggested, Roy. I also added back in the WHERE T1.Status =
> 'A' clause so that the SQL statement was identical to Roy's corrected
> version below.
> And wouldn't you know it -- it ran to completion! And this was *without*
> the additional 9-column index.
> Now -- I don't even pretend to understand indexing or optimizing, so I
> don't know what's going on, but it's working now.
> Thanks to both of you for your help.
>
> I'll try to post the query plans this afternoon or tomorrow, Steve. I
> think I remember trying to do this in the past and having a real tough
> time because the plan came out in a graphical mode which didn't translate
> well to a text-based newsgroup, so any assistance you can render about
> posting the query plans will be appreciated.

In query analyzer, surround the query with

go
set showplan_text on
go

<the query>

go
set showplan_text off
go

You'll get a text version of the plan in the results/messages pane.

SK

Show quote
>
> Thanks again -- you guys are great.
>
> Carl
>
> Roy Harvey wrote:
>> On Wed, 19 Jul 2006 16:51:46 -0400, "Steve Kass" <sk***@drew.edu>
>> wrote:
>>
>>
>>>Roy,
>>>
>>>Your query is logically different from the OP's, I think.  Your
>>>query will update all rows of tblProcedureHistory for which
>>>there is a matching row in tbl_AHSTDN, and update them
>>>with the latest posting date among all the matches.  Carl's will
>>>only update those rows of tblProcedureHistory for which there
>>>is a matching row in tbl_AHSTDN having status = 'A', and it
>>>will update them with the latest status_A posting date.  I think
>>>he wants the latter, given that the column being updated is
>>>called Status_A, but your query doesn't refer to the Status
>>>column.
>>
>>
>> Correct, thanks for catching that.  I needed to preserve the test
>> T1.Status = 'A'.
>>
>> Revised, hopefully correct, version:
>>
>> UPDATE tblProcedureHistory
>> SET tblProcedureHistory.Status_A = [derived].MaxPostingDate
>> FROM
>> (
>> SELECT T1.OFFICE_NUM,
>> T1.PatientID,
>> T1.PatientType,
>> T1.StudentID,
>> T1.ProcedureID,
>> T1.ProcedureSuffix,
>> T1.Tooth,
>> T1.Surface,
>> MAX(T1.PostingDate) AS MaxPostingDate
>>
>> FROM tbl_AHSTDN AS T1
>> WHERE T1.Status = 'A' --Was missing
>> GROUP BY
>> T1.OFFICE_NUM,
>> T1.PatientID,
>> T1.PatientType,
>> T1.StudentID,
>> T1.ProcedureID,
>> T1.ProcedureSuffix,
>> T1.Tooth,
>> T1.Surface
>> )
>> AS [derived]
>>
>> WHERE
>> tblProcedureHistory.OFFICE_NUM = [derived].OFFICE_NUM
>> AND tblProcedureHistory.PatientID = [derived].PatientID
>> AND tblProcedureHistory.PatientType = [derived].PatientType
>> AND tblProcedureHistory.StudentID = [derived].StudentID
>> AND tblProcedureHistory.ProcedureID = [derived].ProcedureID
>> AND tblProcedureHistory.ProcedureSuffix = [derived].ProcedureSuffix
>> AND tblProcedureHistory.Tooth = [derived].Tooth
>> AND tblProcedureHistory.Surface = [derived].Surface
>>
>> Roy
Author
20 Jul 2006 12:54 PM
Carl Imthurn
Here are the two execution plans for the SQL Server 7 database and the SQL Server 2000 database.
Sorry about the line wrap. And, it appears that the longer lines have been truncated.
How do I keep them from being truncated in Query Analyzer?
As always, thanks in advance for your help.

Carl

-- SQL Server 7 database (runs to completion):

   |--Table Update(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory]), SET:([tblProcedureHistory].[Status_A]=[Expr1002]))
        |--Top(ROWCOUNT est 0)
             |--Parallelism(Gather Streams)
                  |--Hash Match Root(Inner Join, HASH:([T1].[OFFICE_NUM], [T1].[Surface], [T1].[Tooth], [T1].[ProcedureSuffix], [T1].[ProcedureID], [T1].[StudentID], [T1].[PatientType], [T1].[PatientID])=([tblProcedureHistory].[OFFICE_NUM], [tblProcedureHis
                       |--Hash Match Team(Inner Join, HASH:([T1].[OFFICE_NUM], [T1].[Surface], [T1].[Tooth], [T1].[ProcedureSuffix], [T1].[ProcedureID], [T1].[StudentID], [T1].[PatientType], [T1].[PatientID])=([T2].[OFFICE_NUM], [T2].[Surface], [T2].[Tooth]
                       |    |--Parallelism(Repartition Streams, PARTITION COLUMNS:([T1].[OFFICE_NUM], [T1].[Surface], [T1].[Tooth], [T1].[ProcedureSuffix], [T1].[ProcedureID], [T1].[StudentID], [T1].[PatientType], [T1].[PatientID]))
                       |    |    |--Table Scan(OBJECT:([QuickRecovery].[dbo].[tbl_AHSTDN] AS [T1]), WHERE:([T1].[Status]='A'))
                       |    |--Parallelism(Repartition Streams, PARTITION COLUMNS:([T2].[OFFICE_NUM], [T2].[Surface], [T2].[Tooth], [T2].[ProcedureSuffix], [T2].[ProcedureID], [T2].[StudentID], [T2].[PatientType], [T2].[PatientID]))
                       |         |--Table Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory] AS [T2]))
                       |--Parallelism(Distribute Streams, PARTITION COLUMNS:([tblProcedureHistory].[OFFICE_NUM], [tblProcedureHistory].[Surface], [tblProcedureHistory].[Tooth], [tblProcedureHistory].[ProcedureSuffix], [tblProcedureHistory].[ProcedureID], [t
                            |--Table Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory]))
--------------------------------------------------------------------------------------------------------------------------------------------------------------

--SQL Server 2000 database (does not run to completion):

Show quote
   |--Table Update(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory]), SET:([tblProcedureHistory].[Status_A]=[Expr1002]))
        |--Top(ROWCOUNT est 0)
             |--Stream Aggregate(GROUP BY:([Bmk1003]) DEFINE:([Expr1002]=MAX([T1].[PostingDate])))
                  |--Sort(ORDER BY:([Bmk1003] ASC))
                       |--Filter(WHERE:((((([T1].[PatientID]=[tblProcedureHistory].[PatientID] AND [T1].[PatientType]=[tblProcedureHistory].[PatientType]) AND [T1].[ProcedureSuffix]=[tblProcedureHistory].[ProcedureSuffix]) AND [T1].[Tooth]=[tblProcedureHist
                            |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([QuickRecovery].[dbo].[tbl_AHSTDN] AS [T1]) WITH PREFETCH)
                                 |--Nested Loops(Inner Join, OUTER REFERENCES:([tblProcedureHistory].[ProcedureID], [tblProcedureHistory].[StudentID], [tblProcedureHistory].[OFFICE_NUM]) WITH PREFETCH)
                                      |--Hash Match(Inner Join, HASH:([T2].[OFFICE_NUM], [T2].[PatientID], [T2].[PatientType], [T2].[StudentID], [T2].[ProcedureID], [T2].[ProcedureSuffix], [T2].[Tooth], [T2].[Surface])=([tblProcedureHistory].[OFFICE_NUM], [
                                      |    |--Table Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory] AS [T2]))
                                      |    |--Table Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory]))
                                      |--Index Seek(OBJECT:([QuickRecovery].[dbo].[tbl_AHSTDN].[ATTENDANCE_REPORT_INDEX] AS [T1]), SEEK:([T1].[ProcedureID]=[tblProcedureHistory].[ProcedureID]),  WHERE:([T1].[OFFICE_NUM]=[tblProcedureHistory].[OFFICE_NUM] AN
Author
20 Jul 2006 3:37 PM
Steve Kass
There is a Tools|Options setting for the number of characters to display per
column.  You can set this to 8000 to avoid the truncation.

These plans are quite different, and I would guess that the Bookmark Lookup
is what's killing you. When a query plan needs to go access another table to
find a column value, and it does this one row at a time (as it identifies
those rows elsewhere in the plan), it typically causes very scattered access
to the data pages of the lookup table.  If the rowcount estimates are not
very good (and there are many situations where they will not be, because
statistics just can't provide good estimates, not because there is a flaw in
the product), the optimizer will think only a few lookups will be needed,
but perhaps thousands of times as many will be needed.  This can quite
easily result in a 1000's-fold increase in the time the query takes to
complete.  I can't quite picture the whole query plan from what's below and
without all the CREATE statements for tables and indexes, but typically you
can avoid the bookmark plan with better indexing (a bookmark often means
there is no suitable covering index) or an index hint.  If you are
interested in more detail about the problems that can come up with bookmark
plans, see
http://www.users.drew.edu/skass/SQL/ConnectionsFall2003/Kass_SDB418_Statistics_SQLConnectionsFall2003.zip.

SK



Show quote
"Carl Imthurn" <nospam@all.thanks> wrote in message
news:eioOHt$qGHA.3908@TK2MSFTNGP05.phx.gbl...
> Here are the two execution plans for the SQL Server 7 database and the SQL
> Server 2000 database.
> Sorry about the line wrap. And, it appears that the longer lines have been
> truncated.
> How do I keep them from being truncated in Query Analyzer?
> As always, thanks in advance for your help.
>
> Carl
>
> -- SQL Server 7 database (runs to completion):
>
>   |--Table Update(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory]),
> SET:([tblProcedureHistory].[Status_A]=[Expr1002]))
>        |--Top(ROWCOUNT est 0)
>             |--Parallelism(Gather Streams)
>                  |--Hash Match Root(Inner Join, HASH:([T1].[OFFICE_NUM],
> [T1].[Surface], [T1].[Tooth], [T1].[ProcedureSuffix], [T1].[ProcedureID],
> [T1].[StudentID], [T1].[PatientType],
> [T1].[PatientID])=([tblProcedureHistory].[OFFICE_NUM], [tblProcedureHis
>                       |--Hash Match Team(Inner Join,
> HASH:([T1].[OFFICE_NUM], [T1].[Surface], [T1].[Tooth],
> [T1].[ProcedureSuffix], [T1].[ProcedureID], [T1].[StudentID],
> [T1].[PatientType], [T1].[PatientID])=([T2].[OFFICE_NUM], [T2].[Surface],
> [T2].[Tooth]
>                       |    |--Parallelism(Repartition Streams, PARTITION
> COLUMNS:([T1].[OFFICE_NUM], [T1].[Surface], [T1].[Tooth],
> [T1].[ProcedureSuffix], [T1].[ProcedureID], [T1].[StudentID],
> [T1].[PatientType], [T1].[PatientID]))
>                       |    |    |--Table
> Scan(OBJECT:([QuickRecovery].[dbo].[tbl_AHSTDN] AS [T1]),
> WHERE:([T1].[Status]='A'))
>                       |    |--Parallelism(Repartition Streams, PARTITION
> COLUMNS:([T2].[OFFICE_NUM], [T2].[Surface], [T2].[Tooth],
> [T2].[ProcedureSuffix], [T2].[ProcedureID], [T2].[StudentID],
> [T2].[PatientType], [T2].[PatientID]))
>                       |         |--Table
> Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory] AS [T2]))
>                       |--Parallelism(Distribute Streams, PARTITION
> COLUMNS:([tblProcedureHistory].[OFFICE_NUM],
> [tblProcedureHistory].[Surface], [tblProcedureHistory].[Tooth],
> [tblProcedureHistory].[ProcedureSuffix],
> [tblProcedureHistory].[ProcedureID], [t
>                            |--Table
> Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory]))
> --------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> --SQL Server 2000 database (does not run to completion):
>
>   |--Table Update(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory]),
> SET:([tblProcedureHistory].[Status_A]=[Expr1002]))
>        |--Top(ROWCOUNT est 0)
>             |--Stream Aggregate(GROUP BY:([Bmk1003])
> DEFINE:([Expr1002]=MAX([T1].[PostingDate])))
>                  |--Sort(ORDER BY:([Bmk1003] ASC))
>
> |--Filter(WHERE:((((([T1].[PatientID]=[tblProcedureHistory].[PatientID]
> AND [T1].[PatientType]=[tblProcedureHistory].[PatientType]) AND
> [T1].[ProcedureSuffix]=[tblProcedureHistory].[ProcedureSuffix]) AND
> [T1].[Tooth]=[tblProcedureHist
>                            |--Bookmark Lookup(BOOKMARK:([Bmk1000]),
> OBJECT:([QuickRecovery].[dbo].[tbl_AHSTDN] AS [T1]) WITH PREFETCH)
>                                 |--Nested Loops(Inner Join, OUTER
> REFERENCES:([tblProcedureHistory].[ProcedureID],
> [tblProcedureHistory].[StudentID], [tblProcedureHistory].[OFFICE_NUM])
> WITH PREFETCH)
>                                      |--Hash Match(Inner Join,
> HASH:([T2].[OFFICE_NUM], [T2].[PatientID], [T2].[PatientType],
> [T2].[StudentID], [T2].[ProcedureID], [T2].[ProcedureSuffix],
> [T2].[Tooth], [T2].[Surface])=([tblProcedureHistory].[OFFICE_NUM], [
>                                      |    |--Table
> Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory] AS [T2]))
>                                      |    |--Table
> Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory]))
>                                      |--Index
> Seek(OBJECT:([QuickRecovery].[dbo].[tbl_AHSTDN].[ATTENDANCE_REPORT_INDEX]
> AS [T1]), SEEK:([T1].[ProcedureID]=[tblProcedureHistory].[ProcedureID]),
> WHERE:([T1].[OFFICE_NUM]=[tblProcedureHistory].[OFFICE_NUM] AN
Author
20 Jul 2006 4:22 PM
Carl Imthurn
Thanks for getting back to me Steve. I changed the setting in Tools|Options so that next time I need help I won't truncate the output   ;-)
I downloaded the zip file and it was almost like sitting in your classroom!  BTW, that's a good thing.
Thanks for your time and expertise -- I really appreciate it.

Carl

Steve Kass wrote:
Show quote
> There is a Tools|Options setting for the number of characters to display per
> column.  You can set this to 8000 to avoid the truncation.
>
> These plans are quite different, and I would guess that the Bookmark Lookup
> is what's killing you. When a query plan needs to go access another table to
> find a column value, and it does this one row at a time (as it identifies
> those rows elsewhere in the plan), it typically causes very scattered access
> to the data pages of the lookup table.  If the rowcount estimates are not
> very good (and there are many situations where they will not be, because
> statistics just can't provide good estimates, not because there is a flaw in
> the product), the optimizer will think only a few lookups will be needed,
> but perhaps thousands of times as many will be needed.  This can quite
> easily result in a 1000's-fold increase in the time the query takes to
> complete.  I can't quite picture the whole query plan from what's below and
> without all the CREATE statements for tables and indexes, but typically you
> can avoid the bookmark plan with better indexing (a bookmark often means
> there is no suitable covering index) or an index hint.  If you are
> interested in more detail about the problems that can come up with bookmark
> plans, see
> http://www.users.drew.edu/skass/SQL/ConnectionsFall2003/Kass_SDB418_Statistics_SQLConnectionsFall2003.zip.
>
> SK
>
>
>
> "Carl Imthurn" <nospam@all.thanks> wrote in message
> news:eioOHt$qGHA.3908@TK2MSFTNGP05.phx.gbl...
>
>>Here are the two execution plans for the SQL Server 7 database and the SQL
>>Server 2000 database.
>>Sorry about the line wrap. And, it appears that the longer lines have been
>>truncated.
>>How do I keep them from being truncated in Query Analyzer?
>>As always, thanks in advance for your help.
>>
>>Carl
>>
>>-- SQL Server 7 database (runs to completion):
>>
>>  |--Table Update(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory]),
>>SET:([tblProcedureHistory].[Status_A]=[Expr1002]))
>>       |--Top(ROWCOUNT est 0)
>>            |--Parallelism(Gather Streams)
>>                 |--Hash Match Root(Inner Join, HASH:([T1].[OFFICE_NUM],
>>[T1].[Surface], [T1].[Tooth], [T1].[ProcedureSuffix], [T1].[ProcedureID],
>>[T1].[StudentID], [T1].[PatientType],
>>[T1].[PatientID])=([tblProcedureHistory].[OFFICE_NUM], [tblProcedureHis
>>                      |--Hash Match Team(Inner Join,
>>HASH:([T1].[OFFICE_NUM], [T1].[Surface], [T1].[Tooth],
>>[T1].[ProcedureSuffix], [T1].[ProcedureID], [T1].[StudentID],
>>[T1].[PatientType], [T1].[PatientID])=([T2].[OFFICE_NUM], [T2].[Surface],
>>[T2].[Tooth]
>>                      |    |--Parallelism(Repartition Streams, PARTITION
>>COLUMNS:([T1].[OFFICE_NUM], [T1].[Surface], [T1].[Tooth],
>>[T1].[ProcedureSuffix], [T1].[ProcedureID], [T1].[StudentID],
>>[T1].[PatientType], [T1].[PatientID]))
>>                      |    |    |--Table
>>Scan(OBJECT:([QuickRecovery].[dbo].[tbl_AHSTDN] AS [T1]),
>>WHERE:([T1].[Status]='A'))
>>                      |    |--Parallelism(Repartition Streams, PARTITION
>>COLUMNS:([T2].[OFFICE_NUM], [T2].[Surface], [T2].[Tooth],
>>[T2].[ProcedureSuffix], [T2].[ProcedureID], [T2].[StudentID],
>>[T2].[PatientType], [T2].[PatientID]))
>>                      |         |--Table
>>Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory] AS [T2]))
>>                      |--Parallelism(Distribute Streams, PARTITION
>>COLUMNS:([tblProcedureHistory].[OFFICE_NUM],
>>[tblProcedureHistory].[Surface], [tblProcedureHistory].[Tooth],
>>[tblProcedureHistory].[ProcedureSuffix],
>>[tblProcedureHistory].[ProcedureID], [t
>>                           |--Table
>>Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory]))
>>--------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>>--SQL Server 2000 database (does not run to completion):
>>
>>  |--Table Update(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory]),
>>SET:([tblProcedureHistory].[Status_A]=[Expr1002]))
>>       |--Top(ROWCOUNT est 0)
>>            |--Stream Aggregate(GROUP BY:([Bmk1003])
>>DEFINE:([Expr1002]=MAX([T1].[PostingDate])))
>>                 |--Sort(ORDER BY:([Bmk1003] ASC))
>>
>>|--Filter(WHERE:((((([T1].[PatientID]=[tblProcedureHistory].[PatientID]
>>AND [T1].[PatientType]=[tblProcedureHistory].[PatientType]) AND
>>[T1].[ProcedureSuffix]=[tblProcedureHistory].[ProcedureSuffix]) AND
>>[T1].[Tooth]=[tblProcedureHist
>>                           |--Bookmark Lookup(BOOKMARK:([Bmk1000]),
>>OBJECT:([QuickRecovery].[dbo].[tbl_AHSTDN] AS [T1]) WITH PREFETCH)
>>                                |--Nested Loops(Inner Join, OUTER
>>REFERENCES:([tblProcedureHistory].[ProcedureID],
>>[tblProcedureHistory].[StudentID], [tblProcedureHistory].[OFFICE_NUM])
>>WITH PREFETCH)
>>                                     |--Hash Match(Inner Join,
>>HASH:([T2].[OFFICE_NUM], [T2].[PatientID], [T2].[PatientType],
>>[T2].[StudentID], [T2].[ProcedureID], [T2].[ProcedureSuffix],
>>[T2].[Tooth], [T2].[Surface])=([tblProcedureHistory].[OFFICE_NUM], [
>>                                     |    |--Table
>>Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory] AS [T2]))
>>                                     |    |--Table
>>Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory]))
>>                                     |--Index
>>Seek(OBJECT:([QuickRecovery].[dbo].[tbl_AHSTDN].[ATTENDANCE_REPORT_INDEX]
>>AS [T1]), SEEK:([T1].[ProcedureID]=[tblProcedureHistory].[ProcedureID]),
>>WHERE:([T1].[OFFICE_NUM]=[tblProcedureHistory].[OFFICE_NUM] AN
>
>
>

AddThis Social Bookmark Button