|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
A question about execution plansI have a large table Call_Record (5 million rows) that has three indexes: 1. A clustered index on account_no ASC, date_start DESC 2. A non-clustered index on date_end DESC 3. A non-clustered index on call_record_id ASC I'm querying it for failed calls in the last hour: SELECT COUNT(*) AS failure_count, C.master_id_carrier, C.location_name, D.description AS disconnect_reason FROM dbo.Call_Record c INNER JOIN dbo.Disconnect_Code D ON C.disconnect_code = D.code WHERE (c.date_end >= (GetUtcDate() - (1.0/24.0)) ) AND D.is_failure = 1 GROUP BY C.master_id_carrier, C.location_name, D.description The problem is that the execution plan shows that it is using index 1 to perform this query, whereas index 2 is clearly the best choice. If I replace the call to GetUtcDate() with a literal date constant like so: SELECT COUNT(*) AS failure_count, C.master_id_carrier, C.location_name, D.description AS disconnect_reason FROM dbo.Call_Record c INNER JOIN dbo.Disconnect_Code D ON C.disconnect_code = D.code WHERE (c.date_end >= '2005/11/23') AND D.is_failure = 1 GROUP BY C.master_id_carrier, C.location_name, D.description then it does use index (2) as expected, and executes in a fraction of the time. My question is, why does it pick the "incorrect" index for the first query, and is there any way to force it to pick index (2)? Mike Mike
I tried to rewrite a little bit your SELECT DBCC FREEPROCCACHE GO SELECT COUNT(*) AS failure_count, C.master_id_carrier, C.location_name, D.description AS disconnect_reason FROM dbo.Call_Record c INNER JOIN dbo.Disconnect_Code D ON C.disconnect_code = D.code WHERE c.date_end >=dateadd(hour,-1,GetUtcDate()) and c.date_end < dateadd(day,+1,GetUtcDate()) ----replace with the date that is relevant for the searching (GetUtcDate() - (1.0/24.0)) ) AND D.is_failure = 1 GROUP BY C.master_id_carrier, C.location_name, D.description Do you see now any changes in the execution plan , I'd put the CI on date_end column since your criteria is based on range date seraching and CI is probably a good choice for it, but you'll have to test it. Show quote "Mike Chamberlain" <n***@hotmail.com> wrote in message news:%23ibexQI8FHA.2676@TK2MSFTNGP15.phx.gbl... > Hello, I'm using SQL 2000 with the latest updates. > > I have a large table Call_Record (5 million rows) that has three indexes: > > 1. A clustered index on account_no ASC, date_start DESC > 2. A non-clustered index on date_end DESC > 3. A non-clustered index on call_record_id ASC > > I'm querying it for failed calls in the last hour: > > SELECT COUNT(*) AS failure_count, C.master_id_carrier, C.location_name, > D.description AS disconnect_reason > FROM dbo.Call_Record c > INNER JOIN dbo.Disconnect_Code D ON C.disconnect_code = D.code > WHERE (c.date_end >= (GetUtcDate() - (1.0/24.0)) ) > AND D.is_failure = 1 > GROUP BY C.master_id_carrier, C.location_name, D.description > > The problem is that the execution plan shows that it is using index 1 to > perform this query, whereas index 2 is clearly the best choice. If I > replace the call to GetUtcDate() with a literal date constant like so: > > SELECT COUNT(*) AS failure_count, C.master_id_carrier, C.location_name, > D.description AS disconnect_reason > FROM dbo.Call_Record c > INNER JOIN dbo.Disconnect_Code D ON C.disconnect_code = D.code > WHERE (c.date_end >= '2005/11/23') > AND D.is_failure = 1 > GROUP BY C.master_id_carrier, C.location_name, D.description > > then it does use index (2) as expected, and executes in a fraction of the > time. My question is, why does it pick the "incorrect" index for the > first query, and is there any way to force it to pick index (2)? > > Mike Mike Chamberlain (n***@hotmail.com) writes:
Show quote > I have a large table Call_Record (5 million rows) that has three indexes: When making the choice between scanning a clustered index, or using a> > 1. A clustered index on account_no ASC, date_start DESC > 2. A non-clustered index on date_end DESC > 3. A non-clustered index on call_record_id ASC > > I'm querying it for failed calls in the last hour: > > SELECT COUNT(*) AS failure_count, C.master_id_carrier, C.location_name, > D.description AS disconnect_reason > FROM dbo.Call_Record c > INNER JOIN dbo.Disconnect_Code D ON C.disconnect_code = D.code > WHERE (c.date_end >= (GetUtcDate() - (1.0/24.0)) ) > AND D.is_failure = 1 > GROUP BY C.master_id_carrier, C.location_name, D.description > > The problem is that the execution plan shows that it is using index 1 to > perform this query, whereas index 2 is clearly the best choice. If I > replace the call to GetUtcDate() with a literal date constant like so: > > SELECT COUNT(*) AS failure_count, C.master_id_carrier, C.location_name, > D.description AS disconnect_reason > FROM dbo.Call_Record c > INNER JOIN dbo.Disconnect_Code D ON C.disconnect_code = D.code > WHERE (c.date_end >= '2005/11/23') > AND D.is_failure = 1 > GROUP BY C.master_id_carrier, C.location_name, D.description > > then it does use index (2) as expected, and executes in a fraction of > the time. My question is, why does it pick the "incorrect" index for > the first query, and is there any way to force it to pick index (2)? non-clustered index + bookmark lookup, the optimizer always have a delicate choice. If the condition on the column in the NC-index hits few rows is small, the NC index is good. But if the condition hits many rows, the NC index is a lot worse than the table scan, as SQL Server would have to access many data pages more than once. To determine which to use, SQL Server makes estimates from statistics saved for the table. When you put in a date literal, SQL Server can see that the query will only hit a small number of rows, and thus the index is good. But for the first query, the problem is that getutcdate() is a non- deterministic function, and thus will return different values each time. I guess, therefore, the optimizer does not care about the expression, but uses the clustered index instead. Since you have a condition with >= there could potentially be many rows that are hit in the condition. In a situation like this an index hint may be a good idea: FROM dbo.Call_Record c WITH (INDEX = DateEnd_ix) -- Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx |
|||||||||||||||||||||||