|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Solution and question about indexes in SQL 7 vs. SQL 2000In a nutshell: I am in the process of converting a database from 7 to 2000. Here's the steps I followed: 1) detached the db from the SQL 7 instance *with* update statistics 2) made a copy of the physical (.mdf) file -- I wanted SQL Server to create a new .ldf file 3) attached the copy to the SQL 2000 instance 4) ran an UPDATE STATISTICS {mytablename} WITH FULLSCAN on the SQL 2000 database for tbl_AHSTDN (see below). After having done that, I started running the stored procedures that comprise a nightly download/import process. They ran fine until I got to one which contained the following code (there's more to the sp, but this statement was the one I narrowed it down to): UPDATE tblProcedureHistory SET tblProcedureHistory.Status_A = [derived].MaxPostingDate FROM ( SELECT T1.OFFICE_NUM, T1.PatientID, T1.PatientType, T1.StudentID, T1.ProcedureID, T1.ProcedureSuffix, T1.Tooth, T1.Surface, MAX(T1.PostingDate) AS MaxPostingDate FROM tbl_AHSTDN AS T1 INNER JOIN tblProcedureHistory AS T2 ON T1.OFFICE_NUM = T2.OFFICE_NUM AND T1.PatientID = T2.PatientID AND T1.PatientType = T2.PatientType AND T1.StudentID = T2.StudentID AND T1.ProcedureID = T2.ProcedureID AND T1.ProcedureSuffix = T2.ProcedureSuffix AND T1.Tooth = T2.Tooth AND T1.Surface = T2.Surface AND T1.Status = 'A' GROUP BY T1.OFFICE_NUM, T1.PatientID, T1.PatientType, T1.StudentID, T1.ProcedureID, T1.ProcedureSuffix, T1.Tooth, T1.Surface ) AS [derived] WHERE tblProcedureHistory.OFFICE_NUM = [derived].OFFICE_NUM AND tblProcedureHistory.PatientID = [derived].PatientID AND tblProcedureHistory.PatientType = [derived].PatientType AND tblProcedureHistory.StudentID = [derived].StudentID AND tblProcedureHistory.ProcedureID = [derived].ProcedureID AND tblProcedureHistory.ProcedureSuffix = [derived].ProcedureSuffix AND tblProcedureHistory.Tooth = [derived].Tooth AND tblProcedureHistory.Surface = [derived].Surface I know about the proprietary aspects of UPDATE...FROM and the possibility of unexpected update results. *For purposes of this discussion only*, please assume that because of the existing data, that is not an issue. Thanks. The preceding SQL statement ran in about 90 seconds in SQL 7. In the new database, it will run until I cancel execution. I let it run once for about 2 hours just for the sheer thrill of it and had to finally cancel execution. I tried: 1) deleting and re-creating the stored procedure 2) exec sp_updatestats on the database 3) deleting and re-creating the database and re-attaching the .mdf file Here's what I finally tried that worked: I created a new composite index in tbl_AHSTDN on the following columns: OFFICE_NUM, PatientID, PatientType, StudentID, ProcedureID, ProcedureSuffix, Tooth, Surface, Status and the waters parted, and the angels sang, and the SQL statement above ran in 21 seconds. This raised a number of questions in my mind: First, why in the world would this work in SQL 7 *without* that index but SQL 2000 had to have that index? Second, why would the sp not run to completion at all? I can understand the possibility of significantly decreased performance (taking 5 minutes to run as opposed to 21 seconds) but not running to completion? Makes no sense to me . . . Third, most (but not all) of the 9 columns in the new composite index were already indexed individually in tbl_AHSTDN. Do I have to have a covering index for every possible combination of columns referenced in the WHERE clause and JOIN operations of every stored procedure I run? ANY insight into this issue will be incredibly helpful. This drove me crazy, and I was able to figure it out, but I have *absolutely* no idea why this happened, nor why the index fixed it, and in a way that drives me even crazier . . . Thanks in advance -- Carl Either your query found a bug in SQL Server 2000, or SQL Server is
generating a simply terrible execution plan - until you added the index, that is. Looking at your query, I can not see any purpose for joining to tblProcedureHistory in the derived table. No columns are referenced other than in the join process, and the outer WHERE clause joining accomplishes the same thing. I believe it could be simplified (below) and just maybe the optimizer would approach it differently even without the index. UPDATE tblProcedureHistory SET tblProcedureHistory.Status_A = [derived].MaxPostingDate FROM ( SELECT T1.OFFICE_NUM, T1.PatientID, T1.PatientType, T1.StudentID, T1.ProcedureID, T1.ProcedureSuffix, T1.Tooth, T1.Surface, MAX(T1.PostingDate) AS MaxPostingDate FROM tbl_AHSTDN AS T1 GROUP BY T1.OFFICE_NUM, T1.PatientID, T1.PatientType, T1.StudentID, T1.ProcedureID, T1.ProcedureSuffix, T1.Tooth, T1.Surface ) AS [derived] WHERE tblProcedureHistory.OFFICE_NUM = [derived].OFFICE_NUM AND tblProcedureHistory.PatientID = [derived].PatientID AND tblProcedureHistory.PatientType = [derived].PatientType AND tblProcedureHistory.StudentID = [derived].StudentID AND tblProcedureHistory.ProcedureID = [derived].ProcedureID AND tblProcedureHistory.ProcedureSuffix = [derived].ProcedureSuffix AND tblProcedureHistory.Tooth = [derived].Tooth AND tblProcedureHistory.Surface = [derived].Surface Roy Harvey Beacon Falls, CT On Wed, 19 Jul 2006 10:35:35 -0700, Carl Imthurn <nospam@all.thanks> wrote: Show quote >I solved the following problem posted a while back, but I now have big questions in my mind about indexes . . . > >In a nutshell: >I am in the process of converting a database from 7 to 2000. Here's the steps I followed: > >1) detached the db from the SQL 7 instance *with* update statistics >2) made a copy of the physical (.mdf) file -- I wanted SQL Server to create a new .ldf file >3) attached the copy to the SQL 2000 instance >4) ran an UPDATE STATISTICS {mytablename} WITH FULLSCAN on the SQL 2000 database for tbl_AHSTDN (see below). > >After having done that, I started running the stored procedures that comprise a nightly download/import process. >They ran fine until I got to one which contained the following code (there's more to the sp, but this statement was the one I narrowed it down to): > >UPDATE tblProcedureHistory >SET tblProcedureHistory.Status_A = [derived].MaxPostingDate >FROM > ( > SELECT T1.OFFICE_NUM, > T1.PatientID, > T1.PatientType, > T1.StudentID, > T1.ProcedureID, > T1.ProcedureSuffix, > T1.Tooth, > T1.Surface, > MAX(T1.PostingDate) AS MaxPostingDate > > FROM tbl_AHSTDN AS T1 > INNER JOIN tblProcedureHistory AS T2 > ON T1.OFFICE_NUM = T2.OFFICE_NUM > AND T1.PatientID = T2.PatientID > AND T1.PatientType = T2.PatientType > AND T1.StudentID = T2.StudentID > AND T1.ProcedureID = T2.ProcedureID > AND T1.ProcedureSuffix = T2.ProcedureSuffix > AND T1.Tooth = T2.Tooth > AND T1.Surface = T2.Surface > AND T1.Status = 'A' > > GROUP BY > T1.OFFICE_NUM, > T1.PatientID, > T1.PatientType, > T1.StudentID, > T1.ProcedureID, > T1.ProcedureSuffix, > T1.Tooth, > T1.Surface > ) >AS [derived] > >WHERE >tblProcedureHistory.OFFICE_NUM = [derived].OFFICE_NUM >AND tblProcedureHistory.PatientID = [derived].PatientID >AND tblProcedureHistory.PatientType = [derived].PatientType >AND tblProcedureHistory.StudentID = [derived].StudentID >AND tblProcedureHistory.ProcedureID = [derived].ProcedureID >AND tblProcedureHistory.ProcedureSuffix = [derived].ProcedureSuffix >AND tblProcedureHistory.Tooth = [derived].Tooth >AND tblProcedureHistory.Surface = [derived].Surface > >I know about the proprietary aspects of UPDATE...FROM and the possibility of unexpected update results. >*For purposes of this discussion only*, please assume that because of the existing data, that is not an issue. Thanks. > >The preceding SQL statement ran in about 90 seconds in SQL 7. In the new database, it will run until I cancel execution. >I let it run once for about 2 hours just for the sheer thrill of it and had to finally cancel execution. > >I tried: >1) deleting and re-creating the stored procedure >2) exec sp_updatestats on the database >3) deleting and re-creating the database and re-attaching the .mdf file > >Here's what I finally tried that worked: I created a new composite index in tbl_AHSTDN on the following columns: >OFFICE_NUM, PatientID, PatientType, StudentID, ProcedureID, ProcedureSuffix, Tooth, Surface, Status > >and the waters parted, and the angels sang, and the SQL statement above ran in 21 seconds. > >This raised a number of questions in my mind: >First, why in the world would this work in SQL 7 *without* that index but SQL 2000 had to have that index? >Second, why would the sp not run to completion at all? I can understand the possibility of significantly decreased performance (taking 5 minutes to run as opposed to 21 seconds) but not running to completion? Makes no sense to me . . . >Third, most (but not all) of the 9 columns in the new composite index were already indexed individually in tbl_AHSTDN. Do I have to have a covering index for every possible combination of columns referenced in the WHERE clause and JOIN operations of every >stored procedure I run? > >ANY insight into this issue will be incredibly helpful. This drove me crazy, and I was able to figure it out, but I have *absolutely* no idea why this happened, nor why the index fixed it, and in a way that drives me even crazier . . . >Thanks in advance -- > >Carl Thanks for taking the time to reply Roy -- I see what you're saying about joining to tblProcedureHistory.
I will make some time to test that theory (delete the index, modify the stored procedure, see if it runs to completion). Roy Harvey wrote: Show quote > Either your query found a bug in SQL Server 2000, or SQL Server is > generating a simply terrible execution plan - until you added the > index, that is. > > Looking at your query, I can not see any purpose for joining to > tblProcedureHistory in the derived table. No columns are referenced > other than in the join process, and the outer WHERE clause joining > accomplishes the same thing. I believe it could be simplified (below) > and just maybe the optimizer would approach it differently even > without the index. > > UPDATE tblProcedureHistory > SET tblProcedureHistory.Status_A = [derived].MaxPostingDate > FROM > ( > SELECT T1.OFFICE_NUM, > T1.PatientID, > T1.PatientType, > T1.StudentID, > T1.ProcedureID, > T1.ProcedureSuffix, > T1.Tooth, > T1.Surface, > MAX(T1.PostingDate) AS MaxPostingDate > > FROM tbl_AHSTDN AS T1 > GROUP BY > T1.OFFICE_NUM, > T1.PatientID, > T1.PatientType, > T1.StudentID, > T1.ProcedureID, > T1.ProcedureSuffix, > T1.Tooth, > T1.Surface > ) > AS [derived] > > WHERE > tblProcedureHistory.OFFICE_NUM = [derived].OFFICE_NUM > AND tblProcedureHistory.PatientID = [derived].PatientID > AND tblProcedureHistory.PatientType = [derived].PatientType > AND tblProcedureHistory.StudentID = [derived].StudentID > AND tblProcedureHistory.ProcedureID = [derived].ProcedureID > AND tblProcedureHistory.ProcedureSuffix = [derived].ProcedureSuffix > AND tblProcedureHistory.Tooth = [derived].Tooth > AND tblProcedureHistory.Surface = [derived].Surface > > Roy Harvey > Beacon Falls, CT > > On Wed, 19 Jul 2006 10:35:35 -0700, Carl Imthurn <nospam@all.thanks> > wrote: > Roy,
Your query is logically different from the OP's, I think. Your query will update all rows of tblProcedureHistory for which there is a matching row in tbl_AHSTDN, and update them with the latest posting date among all the matches. Carl's will only update those rows of tblProcedureHistory for which there is a matching row in tbl_AHSTDN having status = 'A', and it will update them with the latest status_A posting date. I think he wants the latter, given that the column being updated is called Status_A, but your query doesn't refer to the Status column. I can't guess why the query ran so slowly after the upgrade, but if Carl posts the query plans, I could hazard a guess. Steve Kass Drew University www.stevekass.com Show quote "Roy Harvey" <roy_har***@snet.net> wrote in message news:mk3tb2pb88t9h6mcplh22tonl3je4osbg8@4ax.com... > Either your query found a bug in SQL Server 2000, or SQL Server is > generating a simply terrible execution plan - until you added the > index, that is. > > Looking at your query, I can not see any purpose for joining to > tblProcedureHistory in the derived table. No columns are referenced > other than in the join process, and the outer WHERE clause joining > accomplishes the same thing. I believe it could be simplified (below) > and just maybe the optimizer would approach it differently even > without the index. > > UPDATE tblProcedureHistory > SET tblProcedureHistory.Status_A = [derived].MaxPostingDate > FROM > ( > SELECT T1.OFFICE_NUM, > T1.PatientID, > T1.PatientType, > T1.StudentID, > T1.ProcedureID, > T1.ProcedureSuffix, > T1.Tooth, > T1.Surface, > MAX(T1.PostingDate) AS MaxPostingDate > > FROM tbl_AHSTDN AS T1 > GROUP BY > T1.OFFICE_NUM, > T1.PatientID, > T1.PatientType, > T1.StudentID, > T1.ProcedureID, > T1.ProcedureSuffix, > T1.Tooth, > T1.Surface > ) > AS [derived] > > WHERE > tblProcedureHistory.OFFICE_NUM = [derived].OFFICE_NUM > AND tblProcedureHistory.PatientID = [derived].PatientID > AND tblProcedureHistory.PatientType = [derived].PatientType > AND tblProcedureHistory.StudentID = [derived].StudentID > AND tblProcedureHistory.ProcedureID = [derived].ProcedureID > AND tblProcedureHistory.ProcedureSuffix = [derived].ProcedureSuffix > AND tblProcedureHistory.Tooth = [derived].Tooth > AND tblProcedureHistory.Surface = [derived].Surface > > Roy Harvey > Beacon Falls, CT > > On Wed, 19 Jul 2006 10:35:35 -0700, Carl Imthurn <nospam@all.thanks> > wrote: > >>I solved the following problem posted a while back, but I now have big >>questions in my mind about indexes . . . >> >>In a nutshell: >>I am in the process of converting a database from 7 to 2000. Here's the >>steps I followed: >> >>1) detached the db from the SQL 7 instance *with* update statistics >>2) made a copy of the physical (.mdf) file -- I wanted SQL Server to >>create a new .ldf file >>3) attached the copy to the SQL 2000 instance >>4) ran an UPDATE STATISTICS {mytablename} WITH FULLSCAN on the SQL 2000 >>database for tbl_AHSTDN (see below). >> >>After having done that, I started running the stored procedures that >>comprise a nightly download/import process. >>They ran fine until I got to one which contained the following code >>(there's more to the sp, but this statement was the one I narrowed it down >>to): >> >>UPDATE tblProcedureHistory >>SET tblProcedureHistory.Status_A = [derived].MaxPostingDate >>FROM >> ( >> SELECT T1.OFFICE_NUM, >> T1.PatientID, >> T1.PatientType, >> T1.StudentID, >> T1.ProcedureID, >> T1.ProcedureSuffix, >> T1.Tooth, >> T1.Surface, >> MAX(T1.PostingDate) AS MaxPostingDate >> >> FROM tbl_AHSTDN AS T1 >> INNER JOIN tblProcedureHistory AS T2 >> ON T1.OFFICE_NUM = T2.OFFICE_NUM >> AND T1.PatientID = T2.PatientID >> AND T1.PatientType = T2.PatientType >> AND T1.StudentID = T2.StudentID >> AND T1.ProcedureID = T2.ProcedureID >> AND T1.ProcedureSuffix = T2.ProcedureSuffix >> AND T1.Tooth = T2.Tooth >> AND T1.Surface = T2.Surface >> AND T1.Status = 'A' >> >> GROUP BY >> T1.OFFICE_NUM, >> T1.PatientID, >> T1.PatientType, >> T1.StudentID, >> T1.ProcedureID, >> T1.ProcedureSuffix, >> T1.Tooth, >> T1.Surface >> ) >>AS [derived] >> >>WHERE >>tblProcedureHistory.OFFICE_NUM = [derived].OFFICE_NUM >>AND tblProcedureHistory.PatientID = [derived].PatientID >>AND tblProcedureHistory.PatientType = [derived].PatientType >>AND tblProcedureHistory.StudentID = [derived].StudentID >>AND tblProcedureHistory.ProcedureID = [derived].ProcedureID >>AND tblProcedureHistory.ProcedureSuffix = [derived].ProcedureSuffix >>AND tblProcedureHistory.Tooth = [derived].Tooth >>AND tblProcedureHistory.Surface = [derived].Surface >> >>I know about the proprietary aspects of UPDATE...FROM and the possibility >>of unexpected update results. >>*For purposes of this discussion only*, please assume that because of the >>existing data, that is not an issue. Thanks. >> >>The preceding SQL statement ran in about 90 seconds in SQL 7. In the new >>database, it will run until I cancel execution. >>I let it run once for about 2 hours just for the sheer thrill of it and >>had to finally cancel execution. >> >>I tried: >>1) deleting and re-creating the stored procedure >>2) exec sp_updatestats on the database >>3) deleting and re-creating the database and re-attaching the .mdf file >> >>Here's what I finally tried that worked: I created a new composite index >>in tbl_AHSTDN on the following columns: >>OFFICE_NUM, PatientID, PatientType, StudentID, ProcedureID, >>ProcedureSuffix, Tooth, Surface, Status >> >>and the waters parted, and the angels sang, and the SQL statement above >>ran in 21 seconds. >> >>This raised a number of questions in my mind: >>First, why in the world would this work in SQL 7 *without* that index but >>SQL 2000 had to have that index? >>Second, why would the sp not run to completion at all? I can understand >>the possibility of significantly decreased performance (taking 5 minutes >>to run as opposed to 21 seconds) but not running to completion? Makes no >>sense to me . . . >>Third, most (but not all) of the 9 columns in the new composite index were >>already indexed individually in tbl_AHSTDN. Do I have to have a covering >>index for every possible combination of columns referenced in the WHERE >>clause and JOIN operations of every >>stored procedure I run? >> >>ANY insight into this issue will be incredibly helpful. This drove me >>crazy, and I was able to figure it out, but I have *absolutely* no idea >>why this happened, nor why the index fixed it, and in a way that drives me >>even crazier . . . >>Thanks in advance -- >> >>Carl
Show quote
On Wed, 19 Jul 2006 16:51:46 -0400, "Steve Kass" <sk***@drew.edu> Correct, thanks for catching that. I needed to preserve the testwrote: >Roy, > >Your query is logically different from the OP's, I think. Your >query will update all rows of tblProcedureHistory for which >there is a matching row in tbl_AHSTDN, and update them >with the latest posting date among all the matches. Carl's will >only update those rows of tblProcedureHistory for which there >is a matching row in tbl_AHSTDN having status = 'A', and it >will update them with the latest status_A posting date. I think >he wants the latter, given that the column being updated is >called Status_A, but your query doesn't refer to the Status >column. T1.Status = 'A'. Revised, hopefully correct, version: UPDATE tblProcedureHistory SET tblProcedureHistory.Status_A = [derived].MaxPostingDate FROM ( SELECT T1.OFFICE_NUM, T1.PatientID, T1.PatientType, T1.StudentID, T1.ProcedureID, T1.ProcedureSuffix, T1.Tooth, T1.Surface, MAX(T1.PostingDate) AS MaxPostingDate FROM tbl_AHSTDN AS T1 WHERE T1.Status = 'A' --Was missing GROUP BY T1.OFFICE_NUM, T1.PatientID, T1.PatientType, T1.StudentID, T1.ProcedureID, T1.ProcedureSuffix, T1.Tooth, T1.Surface ) AS [derived] WHERE tblProcedureHistory.OFFICE_NUM = [derived].OFFICE_NUM AND tblProcedureHistory.PatientID = [derived].PatientID AND tblProcedureHistory.PatientType = [derived].PatientType AND tblProcedureHistory.StudentID = [derived].StudentID AND tblProcedureHistory.ProcedureID = [derived].ProcedureID AND tblProcedureHistory.ProcedureSuffix = [derived].ProcedureSuffix AND tblProcedureHistory.Tooth = [derived].Tooth AND tblProcedureHistory.Surface = [derived].Surface Roy Thanks Roy and Steve for your time and assistance. Very much appreciated. Here's the latest:
I removed the 9-column composite index that I had created earlier and ran the SQL statement again. As expected, it did not run to completion. I cancelled it after about 12 minutes. Then, I removed the inner join in the subquery as you suggested, Roy. I also added back in the WHERE T1.Status = 'A' clause so that the SQL statement was identical to Roy's corrected version below. And wouldn't you know it -- it ran to completion! And this was *without* the additional 9-column index. Now -- I don't even pretend to understand indexing or optimizing, so I don't know what's going on, but it's working now. Thanks to both of you for your help. I'll try to post the query plans this afternoon or tomorrow, Steve. I think I remember trying to do this in the past and having a real tough time because the plan came out in a graphical mode which didn't translate well to a text-based newsgroup, so any assistance you can render about posting the query plans will be appreciated. Thanks again -- you guys are great. Carl Roy Harvey wrote: Show quote > On Wed, 19 Jul 2006 16:51:46 -0400, "Steve Kass" <sk***@drew.edu> > wrote: > > >>Roy, >> >>Your query is logically different from the OP's, I think. Your >>query will update all rows of tblProcedureHistory for which >>there is a matching row in tbl_AHSTDN, and update them >>with the latest posting date among all the matches. Carl's will >>only update those rows of tblProcedureHistory for which there >>is a matching row in tbl_AHSTDN having status = 'A', and it >>will update them with the latest status_A posting date. I think >>he wants the latter, given that the column being updated is >>called Status_A, but your query doesn't refer to the Status >>column. > > > Correct, thanks for catching that. I needed to preserve the test > T1.Status = 'A'. > > Revised, hopefully correct, version: > > UPDATE tblProcedureHistory > SET tblProcedureHistory.Status_A = [derived].MaxPostingDate > FROM > ( > SELECT T1.OFFICE_NUM, > T1.PatientID, > T1.PatientType, > T1.StudentID, > T1.ProcedureID, > T1.ProcedureSuffix, > T1.Tooth, > T1.Surface, > MAX(T1.PostingDate) AS MaxPostingDate > > FROM tbl_AHSTDN AS T1 > WHERE T1.Status = 'A' --Was missing > GROUP BY > T1.OFFICE_NUM, > T1.PatientID, > T1.PatientType, > T1.StudentID, > T1.ProcedureID, > T1.ProcedureSuffix, > T1.Tooth, > T1.Surface > ) > AS [derived] > > WHERE > tblProcedureHistory.OFFICE_NUM = [derived].OFFICE_NUM > AND tblProcedureHistory.PatientID = [derived].PatientID > AND tblProcedureHistory.PatientType = [derived].PatientType > AND tblProcedureHistory.StudentID = [derived].StudentID > AND tblProcedureHistory.ProcedureID = [derived].ProcedureID > AND tblProcedureHistory.ProcedureSuffix = [derived].ProcedureSuffix > AND tblProcedureHistory.Tooth = [derived].Tooth > AND tblProcedureHistory.Surface = [derived].Surface > > Roy
Show quote
"Carl Imthurn" <nospam@all.thanks> wrote in message In query analyzer, surround the query withnews:uIxEXA4qGHA.3380@TK2MSFTNGP04.phx.gbl... > Thanks Roy and Steve for your time and assistance. Very much appreciated. > Here's the latest: > > I removed the 9-column composite index that I had created earlier and ran > the SQL statement again. As expected, it did not run to completion. I > cancelled it after about 12 minutes. Then, I removed the inner join in the > subquery as you suggested, Roy. I also added back in the WHERE T1.Status = > 'A' clause so that the SQL statement was identical to Roy's corrected > version below. > And wouldn't you know it -- it ran to completion! And this was *without* > the additional 9-column index. > Now -- I don't even pretend to understand indexing or optimizing, so I > don't know what's going on, but it's working now. > Thanks to both of you for your help. > > I'll try to post the query plans this afternoon or tomorrow, Steve. I > think I remember trying to do this in the past and having a real tough > time because the plan came out in a graphical mode which didn't translate > well to a text-based newsgroup, so any assistance you can render about > posting the query plans will be appreciated. go set showplan_text on go <the query> go set showplan_text off go You'll get a text version of the plan in the results/messages pane. SK Show quote > > Thanks again -- you guys are great. > > Carl > > Roy Harvey wrote: >> On Wed, 19 Jul 2006 16:51:46 -0400, "Steve Kass" <sk***@drew.edu> >> wrote: >> >> >>>Roy, >>> >>>Your query is logically different from the OP's, I think. Your >>>query will update all rows of tblProcedureHistory for which >>>there is a matching row in tbl_AHSTDN, and update them >>>with the latest posting date among all the matches. Carl's will >>>only update those rows of tblProcedureHistory for which there >>>is a matching row in tbl_AHSTDN having status = 'A', and it >>>will update them with the latest status_A posting date. I think >>>he wants the latter, given that the column being updated is >>>called Status_A, but your query doesn't refer to the Status >>>column. >> >> >> Correct, thanks for catching that. I needed to preserve the test >> T1.Status = 'A'. >> >> Revised, hopefully correct, version: >> >> UPDATE tblProcedureHistory >> SET tblProcedureHistory.Status_A = [derived].MaxPostingDate >> FROM >> ( >> SELECT T1.OFFICE_NUM, >> T1.PatientID, >> T1.PatientType, >> T1.StudentID, >> T1.ProcedureID, >> T1.ProcedureSuffix, >> T1.Tooth, >> T1.Surface, >> MAX(T1.PostingDate) AS MaxPostingDate >> >> FROM tbl_AHSTDN AS T1 >> WHERE T1.Status = 'A' --Was missing >> GROUP BY >> T1.OFFICE_NUM, >> T1.PatientID, >> T1.PatientType, >> T1.StudentID, >> T1.ProcedureID, >> T1.ProcedureSuffix, >> T1.Tooth, >> T1.Surface >> ) >> AS [derived] >> >> WHERE >> tblProcedureHistory.OFFICE_NUM = [derived].OFFICE_NUM >> AND tblProcedureHistory.PatientID = [derived].PatientID >> AND tblProcedureHistory.PatientType = [derived].PatientType >> AND tblProcedureHistory.StudentID = [derived].StudentID >> AND tblProcedureHistory.ProcedureID = [derived].ProcedureID >> AND tblProcedureHistory.ProcedureSuffix = [derived].ProcedureSuffix >> AND tblProcedureHistory.Tooth = [derived].Tooth >> AND tblProcedureHistory.Surface = [derived].Surface >> >> Roy Here are the two execution plans for the SQL Server 7 database and the SQL Server 2000 database.
Sorry about the line wrap. And, it appears that the longer lines have been truncated. How do I keep them from being truncated in Query Analyzer? As always, thanks in advance for your help. Carl -- SQL Server 7 database (runs to completion): |--Table Update(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory]), SET:([tblProcedureHistory].[Status_A]=[Expr1002])) --SQL Server 2000 database (does not run to completion):|--Top(ROWCOUNT est 0) |--Parallelism(Gather Streams) |--Hash Match Root(Inner Join, HASH:([T1].[OFFICE_NUM], [T1].[Surface], [T1].[Tooth], [T1].[ProcedureSuffix], [T1].[ProcedureID], [T1].[StudentID], [T1].[PatientType], [T1].[PatientID])=([tblProcedureHistory].[OFFICE_NUM], [tblProcedureHis |--Hash Match Team(Inner Join, HASH:([T1].[OFFICE_NUM], [T1].[Surface], [T1].[Tooth], [T1].[ProcedureSuffix], [T1].[ProcedureID], [T1].[StudentID], [T1].[PatientType], [T1].[PatientID])=([T2].[OFFICE_NUM], [T2].[Surface], [T2].[Tooth] | |--Parallelism(Repartition Streams, PARTITION COLUMNS:([T1].[OFFICE_NUM], [T1].[Surface], [T1].[Tooth], [T1].[ProcedureSuffix], [T1].[ProcedureID], [T1].[StudentID], [T1].[PatientType], [T1].[PatientID])) | | |--Table Scan(OBJECT:([QuickRecovery].[dbo].[tbl_AHSTDN] AS [T1]), WHERE:([T1].[Status]='A')) | |--Parallelism(Repartition Streams, PARTITION COLUMNS:([T2].[OFFICE_NUM], [T2].[Surface], [T2].[Tooth], [T2].[ProcedureSuffix], [T2].[ProcedureID], [T2].[StudentID], [T2].[PatientType], [T2].[PatientID])) | |--Table Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory] AS [T2])) |--Parallelism(Distribute Streams, PARTITION COLUMNS:([tblProcedureHistory].[OFFICE_NUM], [tblProcedureHistory].[Surface], [tblProcedureHistory].[Tooth], [tblProcedureHistory].[ProcedureSuffix], [tblProcedureHistory].[ProcedureID], [t |--Table Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory])) -------------------------------------------------------------------------------------------------------------------------------------------------------------- Show quote |--Table Update(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory]), SET:([tblProcedureHistory].[Status_A]=[Expr1002])) |--Top(ROWCOUNT est 0) |--Stream Aggregate(GROUP BY:([Bmk1003]) DEFINE:([Expr1002]=MAX([T1].[PostingDate]))) |--Sort(ORDER BY:([Bmk1003] ASC)) |--Filter(WHERE:((((([T1].[PatientID]=[tblProcedureHistory].[PatientID] AND [T1].[PatientType]=[tblProcedureHistory].[PatientType]) AND [T1].[ProcedureSuffix]=[tblProcedureHistory].[ProcedureSuffix]) AND [T1].[Tooth]=[tblProcedureHist |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([QuickRecovery].[dbo].[tbl_AHSTDN] AS [T1]) WITH PREFETCH) |--Nested Loops(Inner Join, OUTER REFERENCES:([tblProcedureHistory].[ProcedureID], [tblProcedureHistory].[StudentID], [tblProcedureHistory].[OFFICE_NUM]) WITH PREFETCH) |--Hash Match(Inner Join, HASH:([T2].[OFFICE_NUM], [T2].[PatientID], [T2].[PatientType], [T2].[StudentID], [T2].[ProcedureID], [T2].[ProcedureSuffix], [T2].[Tooth], [T2].[Surface])=([tblProcedureHistory].[OFFICE_NUM], [ | |--Table Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory] AS [T2])) | |--Table Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory])) |--Index Seek(OBJECT:([QuickRecovery].[dbo].[tbl_AHSTDN].[ATTENDANCE_REPORT_INDEX] AS [T1]), SEEK:([T1].[ProcedureID]=[tblProcedureHistory].[ProcedureID]), WHERE:([T1].[OFFICE_NUM]=[tblProcedureHistory].[OFFICE_NUM] AN There is a Tools|Options setting for the number of characters to display per
column. You can set this to 8000 to avoid the truncation. These plans are quite different, and I would guess that the Bookmark Lookup is what's killing you. When a query plan needs to go access another table to find a column value, and it does this one row at a time (as it identifies those rows elsewhere in the plan), it typically causes very scattered access to the data pages of the lookup table. If the rowcount estimates are not very good (and there are many situations where they will not be, because statistics just can't provide good estimates, not because there is a flaw in the product), the optimizer will think only a few lookups will be needed, but perhaps thousands of times as many will be needed. This can quite easily result in a 1000's-fold increase in the time the query takes to complete. I can't quite picture the whole query plan from what's below and without all the CREATE statements for tables and indexes, but typically you can avoid the bookmark plan with better indexing (a bookmark often means there is no suitable covering index) or an index hint. If you are interested in more detail about the problems that can come up with bookmark plans, see http://www.users.drew.edu/skass/SQL/ConnectionsFall2003/Kass_SDB418_Statistics_SQLConnectionsFall2003.zip. SK Show quote "Carl Imthurn" <nospam@all.thanks> wrote in message news:eioOHt$qGHA.3908@TK2MSFTNGP05.phx.gbl... > Here are the two execution plans for the SQL Server 7 database and the SQL > Server 2000 database. > Sorry about the line wrap. And, it appears that the longer lines have been > truncated. > How do I keep them from being truncated in Query Analyzer? > As always, thanks in advance for your help. > > Carl > > -- SQL Server 7 database (runs to completion): > > |--Table Update(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory]), > SET:([tblProcedureHistory].[Status_A]=[Expr1002])) > |--Top(ROWCOUNT est 0) > |--Parallelism(Gather Streams) > |--Hash Match Root(Inner Join, HASH:([T1].[OFFICE_NUM], > [T1].[Surface], [T1].[Tooth], [T1].[ProcedureSuffix], [T1].[ProcedureID], > [T1].[StudentID], [T1].[PatientType], > [T1].[PatientID])=([tblProcedureHistory].[OFFICE_NUM], [tblProcedureHis > |--Hash Match Team(Inner Join, > HASH:([T1].[OFFICE_NUM], [T1].[Surface], [T1].[Tooth], > [T1].[ProcedureSuffix], [T1].[ProcedureID], [T1].[StudentID], > [T1].[PatientType], [T1].[PatientID])=([T2].[OFFICE_NUM], [T2].[Surface], > [T2].[Tooth] > | |--Parallelism(Repartition Streams, PARTITION > COLUMNS:([T1].[OFFICE_NUM], [T1].[Surface], [T1].[Tooth], > [T1].[ProcedureSuffix], [T1].[ProcedureID], [T1].[StudentID], > [T1].[PatientType], [T1].[PatientID])) > | | |--Table > Scan(OBJECT:([QuickRecovery].[dbo].[tbl_AHSTDN] AS [T1]), > WHERE:([T1].[Status]='A')) > | |--Parallelism(Repartition Streams, PARTITION > COLUMNS:([T2].[OFFICE_NUM], [T2].[Surface], [T2].[Tooth], > [T2].[ProcedureSuffix], [T2].[ProcedureID], [T2].[StudentID], > [T2].[PatientType], [T2].[PatientID])) > | |--Table > Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory] AS [T2])) > |--Parallelism(Distribute Streams, PARTITION > COLUMNS:([tblProcedureHistory].[OFFICE_NUM], > [tblProcedureHistory].[Surface], [tblProcedureHistory].[Tooth], > [tblProcedureHistory].[ProcedureSuffix], > [tblProcedureHistory].[ProcedureID], [t > |--Table > Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory])) > -------------------------------------------------------------------------------------------------------------------------------------------------------------- > > --SQL Server 2000 database (does not run to completion): > > |--Table Update(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory]), > SET:([tblProcedureHistory].[Status_A]=[Expr1002])) > |--Top(ROWCOUNT est 0) > |--Stream Aggregate(GROUP BY:([Bmk1003]) > DEFINE:([Expr1002]=MAX([T1].[PostingDate]))) > |--Sort(ORDER BY:([Bmk1003] ASC)) > > |--Filter(WHERE:((((([T1].[PatientID]=[tblProcedureHistory].[PatientID] > AND [T1].[PatientType]=[tblProcedureHistory].[PatientType]) AND > [T1].[ProcedureSuffix]=[tblProcedureHistory].[ProcedureSuffix]) AND > [T1].[Tooth]=[tblProcedureHist > |--Bookmark Lookup(BOOKMARK:([Bmk1000]), > OBJECT:([QuickRecovery].[dbo].[tbl_AHSTDN] AS [T1]) WITH PREFETCH) > |--Nested Loops(Inner Join, OUTER > REFERENCES:([tblProcedureHistory].[ProcedureID], > [tblProcedureHistory].[StudentID], [tblProcedureHistory].[OFFICE_NUM]) > WITH PREFETCH) > |--Hash Match(Inner Join, > HASH:([T2].[OFFICE_NUM], [T2].[PatientID], [T2].[PatientType], > [T2].[StudentID], [T2].[ProcedureID], [T2].[ProcedureSuffix], > [T2].[Tooth], [T2].[Surface])=([tblProcedureHistory].[OFFICE_NUM], [ > | |--Table > Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory] AS [T2])) > | |--Table > Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory])) > |--Index > Seek(OBJECT:([QuickRecovery].[dbo].[tbl_AHSTDN].[ATTENDANCE_REPORT_INDEX] > AS [T1]), SEEK:([T1].[ProcedureID]=[tblProcedureHistory].[ProcedureID]), > WHERE:([T1].[OFFICE_NUM]=[tblProcedureHistory].[OFFICE_NUM] AN Thanks for getting back to me Steve. I changed the setting in Tools|Options so that next time I need help I won't truncate the output ;-)
I downloaded the zip file and it was almost like sitting in your classroom! BTW, that's a good thing. Thanks for your time and expertise -- I really appreciate it. Carl Steve Kass wrote: Show quote > There is a Tools|Options setting for the number of characters to display per > column. You can set this to 8000 to avoid the truncation. > > These plans are quite different, and I would guess that the Bookmark Lookup > is what's killing you. When a query plan needs to go access another table to > find a column value, and it does this one row at a time (as it identifies > those rows elsewhere in the plan), it typically causes very scattered access > to the data pages of the lookup table. If the rowcount estimates are not > very good (and there are many situations where they will not be, because > statistics just can't provide good estimates, not because there is a flaw in > the product), the optimizer will think only a few lookups will be needed, > but perhaps thousands of times as many will be needed. This can quite > easily result in a 1000's-fold increase in the time the query takes to > complete. I can't quite picture the whole query plan from what's below and > without all the CREATE statements for tables and indexes, but typically you > can avoid the bookmark plan with better indexing (a bookmark often means > there is no suitable covering index) or an index hint. If you are > interested in more detail about the problems that can come up with bookmark > plans, see > http://www.users.drew.edu/skass/SQL/ConnectionsFall2003/Kass_SDB418_Statistics_SQLConnectionsFall2003.zip. > > SK > > > > "Carl Imthurn" <nospam@all.thanks> wrote in message > news:eioOHt$qGHA.3908@TK2MSFTNGP05.phx.gbl... > >>Here are the two execution plans for the SQL Server 7 database and the SQL >>Server 2000 database. >>Sorry about the line wrap. And, it appears that the longer lines have been >>truncated. >>How do I keep them from being truncated in Query Analyzer? >>As always, thanks in advance for your help. >> >>Carl >> >>-- SQL Server 7 database (runs to completion): >> >> |--Table Update(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory]), >>SET:([tblProcedureHistory].[Status_A]=[Expr1002])) >> |--Top(ROWCOUNT est 0) >> |--Parallelism(Gather Streams) >> |--Hash Match Root(Inner Join, HASH:([T1].[OFFICE_NUM], >>[T1].[Surface], [T1].[Tooth], [T1].[ProcedureSuffix], [T1].[ProcedureID], >>[T1].[StudentID], [T1].[PatientType], >>[T1].[PatientID])=([tblProcedureHistory].[OFFICE_NUM], [tblProcedureHis >> |--Hash Match Team(Inner Join, >>HASH:([T1].[OFFICE_NUM], [T1].[Surface], [T1].[Tooth], >>[T1].[ProcedureSuffix], [T1].[ProcedureID], [T1].[StudentID], >>[T1].[PatientType], [T1].[PatientID])=([T2].[OFFICE_NUM], [T2].[Surface], >>[T2].[Tooth] >> | |--Parallelism(Repartition Streams, PARTITION >>COLUMNS:([T1].[OFFICE_NUM], [T1].[Surface], [T1].[Tooth], >>[T1].[ProcedureSuffix], [T1].[ProcedureID], [T1].[StudentID], >>[T1].[PatientType], [T1].[PatientID])) >> | | |--Table >>Scan(OBJECT:([QuickRecovery].[dbo].[tbl_AHSTDN] AS [T1]), >>WHERE:([T1].[Status]='A')) >> | |--Parallelism(Repartition Streams, PARTITION >>COLUMNS:([T2].[OFFICE_NUM], [T2].[Surface], [T2].[Tooth], >>[T2].[ProcedureSuffix], [T2].[ProcedureID], [T2].[StudentID], >>[T2].[PatientType], [T2].[PatientID])) >> | |--Table >>Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory] AS [T2])) >> |--Parallelism(Distribute Streams, PARTITION >>COLUMNS:([tblProcedureHistory].[OFFICE_NUM], >>[tblProcedureHistory].[Surface], [tblProcedureHistory].[Tooth], >>[tblProcedureHistory].[ProcedureSuffix], >>[tblProcedureHistory].[ProcedureID], [t >> |--Table >>Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory])) >>-------------------------------------------------------------------------------------------------------------------------------------------------------------- >> >>--SQL Server 2000 database (does not run to completion): >> >> |--Table Update(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory]), >>SET:([tblProcedureHistory].[Status_A]=[Expr1002])) >> |--Top(ROWCOUNT est 0) >> |--Stream Aggregate(GROUP BY:([Bmk1003]) >>DEFINE:([Expr1002]=MAX([T1].[PostingDate]))) >> |--Sort(ORDER BY:([Bmk1003] ASC)) >> >>|--Filter(WHERE:((((([T1].[PatientID]=[tblProcedureHistory].[PatientID] >>AND [T1].[PatientType]=[tblProcedureHistory].[PatientType]) AND >>[T1].[ProcedureSuffix]=[tblProcedureHistory].[ProcedureSuffix]) AND >>[T1].[Tooth]=[tblProcedureHist >> |--Bookmark Lookup(BOOKMARK:([Bmk1000]), >>OBJECT:([QuickRecovery].[dbo].[tbl_AHSTDN] AS [T1]) WITH PREFETCH) >> |--Nested Loops(Inner Join, OUTER >>REFERENCES:([tblProcedureHistory].[ProcedureID], >>[tblProcedureHistory].[StudentID], [tblProcedureHistory].[OFFICE_NUM]) >>WITH PREFETCH) >> |--Hash Match(Inner Join, >>HASH:([T2].[OFFICE_NUM], [T2].[PatientID], [T2].[PatientType], >>[T2].[StudentID], [T2].[ProcedureID], [T2].[ProcedureSuffix], >>[T2].[Tooth], [T2].[Surface])=([tblProcedureHistory].[OFFICE_NUM], [ >> | |--Table >>Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory] AS [T2])) >> | |--Table >>Scan(OBJECT:([QuickRecovery].[dbo].[tblProcedureHistory])) >> |--Index >>Seek(OBJECT:([QuickRecovery].[dbo].[tbl_AHSTDN].[ATTENDANCE_REPORT_INDEX] >>AS [T1]), SEEK:([T1].[ProcedureID]=[tblProcedureHistory].[ProcedureID]), >>WHERE:([T1].[OFFICE_NUM]=[tblProcedureHistory].[OFFICE_NUM] AN > > > |
|||||||||||||||||||||||