|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Slow SELECT INPlease can you tell me why the following query takes 4-5 mins: SELECT * FROM db1.dbo.table1 WHERE column1 IN (SELECT column1 FROM db2.dbo.table1 WHERE column2 = 'x') The IN () query SELECT column1 FROM db2.dbo.table1 WHERE column2 = 'x' gives: field a1 a2 a3 a4 Whereas the query below takes 0-1 sec.? SELECT * FROM db1.dbo.table1 WHERE column1 IN ('a1','a2','a3','a4') db1.dbo.table1 has 25,000,000 rows, an indexed identity column and column1 is indexed db2.dbo.table1 has 1,300 rows and no indices. Both dbs are on the same disk system (RAID5) and cannot go anywhere else. Server is MS SBS 2003 (SP1), Twin 3.2gHz Xeon, 2gB RAM, 4x160gB SCSI RAID5 I worked around it using a stored procedure: Building a string variable via a cursor from the IN () query. The string variable is then used in the main query. Running this stored procedure takes a second. Cheers, Simon. Simon,
Have you tried using EXISTS instead of IN? HTH Jerry Show quote "SimonC" <Sim***@discussions.microsoft.com> wrote in message news:6231236C-1AFD-44B6-892F-22541B50B6F0@microsoft.com... > Dear All, > > Please can you tell me why the following query takes 4-5 mins: > SELECT * FROM db1.dbo.table1 > WHERE column1 IN (SELECT column1 FROM db2.dbo.table1 WHERE column2 = 'x') > > The IN () query > SELECT column1 FROM db2.dbo.table1 WHERE column2 = 'x' > gives: > field > a1 > a2 > a3 > a4 > > Whereas the query below takes 0-1 sec.? > SELECT * FROM db1.dbo.table1 > WHERE column1 IN ('a1','a2','a3','a4') > > db1.dbo.table1 has 25,000,000 rows, an indexed identity column and column1 > is indexed > db2.dbo.table1 has 1,300 rows and no indices. > > Both dbs are on the same disk system (RAID5) and cannot go anywhere else. > Server is MS SBS 2003 (SP1), Twin 3.2gHz Xeon, 2gB RAM, 4x160gB SCSI RAID5 > > I worked around it using a stored procedure: > Building a string variable via a cursor from the IN () query. > The string variable is then used in the main query. > Running this stored procedure takes a second. > > Cheers, > Simon. It will speed up a lot if you can add index for column2 in db2.dbo.table1.
Perayu Show quote "SimonC" <Sim***@discussions.microsoft.com> wrote in message news:6231236C-1AFD-44B6-892F-22541B50B6F0@microsoft.com... > Dear All, > > Please can you tell me why the following query takes 4-5 mins: > SELECT * FROM db1.dbo.table1 > WHERE column1 IN (SELECT column1 FROM db2.dbo.table1 WHERE column2 = 'x') > > The IN () query > SELECT column1 FROM db2.dbo.table1 WHERE column2 = 'x' > gives: > field > a1 > a2 > a3 > a4 > > Whereas the query below takes 0-1 sec.? > SELECT * FROM db1.dbo.table1 > WHERE column1 IN ('a1','a2','a3','a4') > > db1.dbo.table1 has 25,000,000 rows, an indexed identity column and column1 > is indexed > db2.dbo.table1 has 1,300 rows and no indices. > > Both dbs are on the same disk system (RAID5) and cannot go anywhere else. > Server is MS SBS 2003 (SP1), Twin 3.2gHz Xeon, 2gB RAM, 4x160gB SCSI RAID5 > > I worked around it using a stored procedure: > Building a string variable via a cursor from the IN () query. > The string variable is then used in the main query. > Running this stored procedure takes a second. > > Cheers, > Simon. PS
Index on identity col in db1.dbo.table1 is clustered, column1 index is ordinary. Did execution plans...: Quick one (1sec): Non Clustered Index Seek ->13k rows (0%) then Bookmark Lookup -> 13k rows(100%) Slow one (4min): Branch1- Table Scan of db2.dbo.table1 -> 4 rows Branch2- Clustered Index Scan -> 25m rows (76%) Streams together then Hash Match/Right Semi Join ->13k rows (24%) Show quote "SimonC" wrote: > Dear All, > > Please can you tell me why the following query takes 4-5 mins: > SELECT * FROM db1.dbo.table1 > WHERE column1 IN (SELECT column1 FROM db2.dbo.table1 WHERE column2 = 'x') > > The IN () query > SELECT column1 FROM db2.dbo.table1 WHERE column2 = 'x' > gives: > field > a1 > a2 > a3 > a4 > > Whereas the query below takes 0-1 sec.? > SELECT * FROM db1.dbo.table1 > WHERE column1 IN ('a1','a2','a3','a4') > > db1.dbo.table1 has 25,000,000 rows, an indexed identity column and column1 > is indexed > db2.dbo.table1 has 1,300 rows and no indices. > > Both dbs are on the same disk system (RAID5) and cannot go anywhere else. > Server is MS SBS 2003 (SP1), Twin 3.2gHz Xeon, 2gB RAM, 4x160gB SCSI RAID5 > > I worked around it using a stored procedure: > Building a string variable via a cursor from the IN () query. > The string variable is then used in the main query. > Running this stored procedure takes a second. > > Cheers, > Simon. Thanks for suggestions so far, but...
EXISTS gave same execution plan and time as IN (plus I had to think a bit more about syntax!) Indices on columns in db2.dbo.table1 had no effect. Show quote "SimonC" wrote: > PS > Index on identity col in db1.dbo.table1 is clustered, column1 index is > ordinary. > Did execution plans...: > Quick one (1sec): > Non Clustered Index Seek ->13k rows (0%) > then Bookmark Lookup -> 13k rows(100%) > Slow one (4min): > Branch1- Table Scan of db2.dbo.table1 -> 4 rows > Branch2- Clustered Index Scan -> 25m rows (76%) > Streams together > then Hash Match/Right Semi Join ->13k rows (24%) > > > "SimonC" wrote: > > > Dear All, > > > > Please can you tell me why the following query takes 4-5 mins: > > SELECT * FROM db1.dbo.table1 > > WHERE column1 IN (SELECT column1 FROM db2.dbo.table1 WHERE column2 = 'x') > > > > The IN () query > > SELECT column1 FROM db2.dbo.table1 WHERE column2 = 'x' > > gives: > > field > > a1 > > a2 > > a3 > > a4 > > > > Whereas the query below takes 0-1 sec.? > > SELECT * FROM db1.dbo.table1 > > WHERE column1 IN ('a1','a2','a3','a4') > > > > db1.dbo.table1 has 25,000,000 rows, an indexed identity column and column1 > > is indexed > > db2.dbo.table1 has 1,300 rows and no indices. > > > > Both dbs are on the same disk system (RAID5) and cannot go anywhere else. > > Server is MS SBS 2003 (SP1), Twin 3.2gHz Xeon, 2gB RAM, 4x160gB SCSI RAID5 > > > > I worked around it using a stored procedure: > > Building a string variable via a cursor from the IN () query. > > The string variable is then used in the main query. > > Running this stored procedure takes a second. > > > > Cheers, > > Simon. The reason you're query is slow is that both IN and EXISTS are correlated
subqueries. To speed this up I would try adding a covering index on db2.dbo.table1(column2, column1). You could also try OPTION(HASH JOIN) to coerce the optimizer to flatten the correlated subquery into a join. Show quote "SimonC" <Sim***@discussions.microsoft.com> wrote in message news:6231236C-1AFD-44B6-892F-22541B50B6F0@microsoft.com... > Dear All, > > Please can you tell me why the following query takes 4-5 mins: > SELECT * FROM db1.dbo.table1 > WHERE column1 IN (SELECT column1 FROM db2.dbo.table1 WHERE column2 = 'x') > > The IN () query > SELECT column1 FROM db2.dbo.table1 WHERE column2 = 'x' > gives: > field > a1 > a2 > a3 > a4 > > Whereas the query below takes 0-1 sec.? > SELECT * FROM db1.dbo.table1 > WHERE column1 IN ('a1','a2','a3','a4') > > db1.dbo.table1 has 25,000,000 rows, an indexed identity column and column1 > is indexed > db2.dbo.table1 has 1,300 rows and no indices. > > Both dbs are on the same disk system (RAID5) and cannot go anywhere else. > Server is MS SBS 2003 (SP1), Twin 3.2gHz Xeon, 2gB RAM, 4x160gB SCSI RAID5 > > I worked around it using a stored procedure: > Building a string variable via a cursor from the IN () query. > The string variable is then used in the main query. > Running this stored procedure takes a second. > > Cheers, > Simon. Brian,
Tried index on db2.dbo.table1(column2) and one on db2.dbo.table1(column1) haven't tried db2.dbo.table1(column2,column1) Tried doing join instead of subquery but had same (long) time Show quote "Brian Selzer" wrote: > The reason you're query is slow is that both IN and EXISTS are correlated > subqueries. To speed this up I would try adding a covering index on > db2.dbo.table1(column2, column1). You could also try OPTION(HASH JOIN) to > coerce the optimizer to flatten the correlated subquery into a join. > > "SimonC" <Sim***@discussions.microsoft.com> wrote in message > news:6231236C-1AFD-44B6-892F-22541B50B6F0@microsoft.com... > > Dear All, > > > > Please can you tell me why the following query takes 4-5 mins: > > SELECT * FROM db1.dbo.table1 > > WHERE column1 IN (SELECT column1 FROM db2.dbo.table1 WHERE column2 = 'x') > > > > The IN () query > > SELECT column1 FROM db2.dbo.table1 WHERE column2 = 'x' > > gives: > > field > > a1 > > a2 > > a3 > > a4 > > > > Whereas the query below takes 0-1 sec.? > > SELECT * FROM db1.dbo.table1 > > WHERE column1 IN ('a1','a2','a3','a4') > > > > db1.dbo.table1 has 25,000,000 rows, an indexed identity column and column1 > > is indexed > > db2.dbo.table1 has 1,300 rows and no indices. > > > > Both dbs are on the same disk system (RAID5) and cannot go anywhere else. > > Server is MS SBS 2003 (SP1), Twin 3.2gHz Xeon, 2gB RAM, 4x160gB SCSI RAID5 > > > > I worked around it using a stored procedure: > > Building a string variable via a cursor from the IN () query. > > The string variable is then used in the main query. > > Running this stored procedure takes a second. > > > > Cheers, > > Simon. > > > Simon,
you could try this: - Make sure there is an index on db2.dbo.table1(column2, column1), or a nonclustered index on db2.dbo.table1(column2) and a clustered index on db2.dbo.table1(column1). - Make sure there is an index on db1.dbo.table1(column1), preferably clustered - Rewrite the query as below: SELECT db1.* FROM ( SELECT column1 FROM db2.dbo.table1 WHERE column2='x' ) AS db2 INNER JOIN db1.dbo.table1 AS db1 ON db1.Column1=db2.Column1 If the performance is still 4-5 minutes, then you could try to force a particular plan, in this case: SELECT db1.* FROM ( SELECT column1 FROM db2.dbo.table1 WHERE column2='x' ) AS db2 INNER LOOP JOIN db1.dbo.table1 AS db1 ON db1.Column1=db2.Column1 OPTION (force order) Hope this helps, Gert-Jan SimonC wrote: Show quote > > Dear All, > > Please can you tell me why the following query takes 4-5 mins: > SELECT * FROM db1.dbo.table1 > WHERE column1 IN (SELECT column1 FROM db2.dbo.table1 WHERE column2 = 'x') > > The IN () query > SELECT column1 FROM db2.dbo.table1 WHERE column2 = 'x' > gives: > field > a1 > a2 > a3 > a4 > > Whereas the query below takes 0-1 sec.? > SELECT * FROM db1.dbo.table1 > WHERE column1 IN ('a1','a2','a3','a4') > > db1.dbo.table1 has 25,000,000 rows, an indexed identity column and column1 > is indexed > db2.dbo.table1 has 1,300 rows and no indices. > > Both dbs are on the same disk system (RAID5) and cannot go anywhere else. > Server is MS SBS 2003 (SP1), Twin 3.2gHz Xeon, 2gB RAM, 4x160gB SCSI RAID5 > > I worked around it using a stored procedure: > Building a string variable via a cursor from the IN () query. > The string variable is then used in the main query. > Running this stored procedure takes a second. > > Cheers, > Simon. |
|||||||||||||||||||||||