Home All Groups Group Topic Archive Search About
Author
21 Jul 2006 8:58 PM
Kevin Jackson
SQL Server 2005 SP1

I have a table with a datetime column called CreatedOn .  CreatedOn is
indexed.

This query uses the index and is lightning fast

select
    sum(compressedfilesize)  / (1024 * 1024) as 'Repo growth today'
from
    dmsrepositorydigitalassets
where
    createdon > '2006-07-21 00:00:00'
    and createdon <= '2006-07-21 23:59:59'

This query does a clustered index scan on the primary key and is slower than
pond scum

declare @StartDate datetime
set @StartDate = '2006-07-21 00:00:00'
declare @EndDate datetime
set @EndDate = '2006-07-21 23:59:59'

select
  sum(compressedfilesize) / (1024 * 1024) as 'Repo growth today'
from
  dmsrepositorydigitalassets
where
  createdon > @StartDate
  and createdon <= @EndDate

WHY????

Author
21 Jul 2006 10:18 PM
Erland Sommarskog
Kevin Jackson (kjack***@powerwayinc.com) writes:
Show quote
> I have a table with a datetime column called CreatedOn .  CreatedOn is
> indexed.
>
> This query uses the index and is lightning fast
>
> select
>     sum(compressedfilesize)  / (1024 * 1024) as 'Repo growth today'
> from
>     dmsrepositorydigitalassets
> where
>     createdon > '2006-07-21 00:00:00'
>     and createdon <= '2006-07-21 23:59:59'
>
> This query does a clustered index scan on the primary key and is slower
> than pond scum
>
> declare @StartDate datetime
> set @StartDate = '2006-07-21 00:00:00'
> declare @EndDate datetime
> set @EndDate = '2006-07-21 23:59:59'
>
> select
>   sum(compressedfilesize) / (1024 * 1024) as 'Repo growth today'
> from
>   dmsrepositorydigitalassets
> where
>   createdon > @StartDate
>   and createdon <= @EndDate
>
> WHY????

Because in the first query, the optimizer knows exactly which values
you run the query for, and can determine that the query will only
hit a minor share of the rows, and thus that the non-clustered index
is the best choice.

In the latter case, SQL Server has no idea what's the variables, because
the batch is optimised before it is executed, and thus the variable
values are unknown. Instead the optimizer makes a standard assumption
that you will hit, say, 20% of the rows. And for that hit rate, using
the non-clustered index will give a response time which is far higher
than what the clustered index scan gives.

You can add

   OPTION (RECOMPILE)

at the end of the query. This will force the optimizer to recompile
the statement each time is executed. Then the query will be compiled
when the parameter values are known.



--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

AddThis Social Bookmark Button