|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Performance and IndexHi,
In my stored procedures I make selections based on a 'combined' key. I.e.: WHERE [Postcode] = @PostCode AND [No] = @No 1) When there are less [No]'s is it then smart to select first the [No] and then [PostCode]? I.e.: WHERE [No] = @No AND [Postcode] = @PostCode 2) How do I make the index for this? I.e.: for both one? Or one and then first select [No] and then [Postcode]? 3) Do I need to 'index' uniqueidentifier columns? Thanks! You would create an index on (PostCode, No) or (No, PostCode). Make the
more-selective column the first in the index. The order in which you reference the columns in the WHERE clause is immaterial. You *may* need to index a uniqueidentifier column if you are going to be making retrievals based on that column. -- Tom ---------------------------------------------------- Thomas A. Moreau, BSc, PhD, MCSE, MCDBA SQL Server MVP Toronto, ON Canada .. "Arjen" <boah***@hotmail.com> wrote in message In my stored procedures I make selections based on a 'combined' key.news:ecus9m$rio$1@news5.zwoll1.ov.home.nl... Hi, I.e.: WHERE [Postcode] = @PostCode AND [No] = @No 1) When there are less [No]'s is it then smart to select first the [No] and then [PostCode]? I.e.: WHERE [No] = @No AND [Postcode] = @PostCode 2) How do I make the index for this? I.e.: for both one? Or one and then first select [No] and then [Postcode]? 3) Do I need to 'index' uniqueidentifier columns? Thanks! Okay, thanks all.
If SQL server is only using one index then I will create one index for (a part of) the where statement. Sounds like this is the best. @Tom: what do you mean exactly with *more-selective column*? (My selection is always based on the same columns, so in this case it does not matter?) Thanks, Arjen Tom Moreau wrote: Show quote > You would create an index on (PostCode, No) or (No, PostCode). Make the > more-selective column the first in the index. The order in which you > reference the columns in the WHERE clause is immaterial. > > You *may* need to index a uniqueidentifier column if you are going to be > making retrievals based on that column. > > -- > Tom > > ---------------------------------------------------- > Thomas A. Moreau, BSc, PhD, MCSE, MCDBA > SQL Server MVP > Toronto, ON Canada > . > "Arjen" <boah***@hotmail.com> wrote in message > news:ecus9m$rio$1@news5.zwoll1.ov.home.nl... > Hi, > > In my stored procedures I make selections based on a 'combined' key. > I.e.: WHERE [Postcode] = @PostCode AND [No] = @No > > > 1) When there are less [No]'s is it then smart to select first the [No] and > then [PostCode]? > I.e.: WHERE [No] = @No AND [Postcode] = @PostCode > > 2) How do I make the index for this? > I.e.: for both one? Or one and then first select [No] and then [Postcode]? > > 3) Do I need to 'index' uniqueidentifier columns? > > Thanks! Check out p 885 of Inside SQL Server 2000 by Kalen Delaney for all the
details. www.insidesqlserver.com. -- Tom ---------------------------------------------------- Thomas A. Moreau, BSc, PhD, MCSE, MCDBA SQL Server MVP Toronto, ON Canada .. <boah***@hotmail.com> wrote in message news:1156795808.230974.251760@i3g2000cwc.googlegroups.com... Okay, thanks all.If SQL server is only using one index then I will create one index for (a part of) the where statement. Sounds like this is the best. @Tom: what do you mean exactly with *more-selective column*? (My selection is always based on the same columns, so in this case it does not matter?) Thanks, Arjen Tom Moreau wrote: Show quote > You would create an index on (PostCode, No) or (No, PostCode). Make the > more-selective column the first in the index. The order in which you > reference the columns in the WHERE clause is immaterial. > > You *may* need to index a uniqueidentifier column if you are going to be > making retrievals based on that column. > > -- > Tom > > ---------------------------------------------------- > Thomas A. Moreau, BSc, PhD, MCSE, MCDBA > SQL Server MVP > Toronto, ON Canada > . > "Arjen" <boah***@hotmail.com> wrote in message > news:ecus9m$rio$1@news5.zwoll1.ov.home.nl... > Hi, > > In my stored procedures I make selections based on a 'combined' key. > I.e.: WHERE [Postcode] = @PostCode AND [No] = @No > > > 1) When there are less [No]'s is it then smart to select first the [No] > and > then [PostCode]? > I.e.: WHERE [No] = @No AND [Postcode] = @PostCode > > 2) How do I make the index for this? > I.e.: for both one? Or one and then first select [No] and then [Postcode]? > > 3) Do I need to 'index' uniqueidentifier columns? > > Thanks! Hi Tom,
But Kalen doesn't explain why defining the most selective column first would be beneficial for queries where all indexed columns are filtered in the WHERE clause, like in the example query of the OP (and in the example below). To test things, I created a 10M row table (2.2 GB) with a clustered index on a unique integer column. Next I created two nonclustered indexes. One on (low, high) and one on (high, low). Low is a tinyint column with 103 distinct values. High is an int column with some 8.3M distinct values. There is no relation between the two column's values. Next I ran queries like this on SQL Server 2000 EE: SELECT * FROM MyTestTable (index=my_index) WHERE low = 46 AND high = 376607327 I used values for low in "the middle" range, near "the end" and near "the beginning". I did the same for high, and in combination. Each query was run against both indexes. In all cases, the statistics io showed the same amount of logical reads for the two indexes. In other words, they had the same performance. The index on (high, low) did not outperform the index on (low, high). Basically, all that this experiment proves is that index pages do not contain incomplete compound index keys. This is also proved by the fact that both indexes use the exact same amount of disk space (otherwise the index on (low, high) should have been smaller). Because of that, some smart mathematician should be able to prove that there can be no performance difference between the two indexes (in the above scenario). From personal experience, I must admit that I know of one situation in SQL Server 7 where putting the most selective column first is more efficient (an outer join on a compound primary key), but that is basically because of a flaw, which has been fixed in SQL Server 2000. Arjen, my advice would be to put the column that you are least likely to filter on at the end of the index definition. In your case you don't really have to choose based on which theory you believe, because column No is probably the least selective column as well as the column you are least likely to filter on. So you should make Postcode the first column in the index's definition. Gert-Jan Tom Moreau wrote: Show quote > > Check out p 885 of Inside SQL Server 2000 by Kalen Delaney for all the > details. www.insidesqlserver.com. > > -- > Tom > > ---------------------------------------------------- > Thomas A. Moreau, BSc, PhD, MCSE, MCDBA > SQL Server MVP > Toronto, ON Canada > . > <boah***@hotmail.com> wrote in message > news:1156795808.230974.251760@i3g2000cwc.googlegroups.com... > Okay, thanks all. > > If SQL server is only using one index then I will create one index for > (a part of) the where statement. Sounds like this is the best. > > @Tom: what do you mean exactly with *more-selective column*? > (My selection is always based on the same columns, so in this case it > does not matter?) > > Thanks, > Arjen > > Tom Moreau wrote: > > You would create an index on (PostCode, No) or (No, PostCode). Make the > > more-selective column the first in the index. The order in which you > > reference the columns in the WHERE clause is immaterial. > > > > You *may* need to index a uniqueidentifier column if you are going to be > > making retrievals based on that column. > > > > -- > > Tom > > > > ---------------------------------------------------- > > Thomas A. Moreau, BSc, PhD, MCSE, MCDBA > > SQL Server MVP > > Toronto, ON Canada > > . > > "Arjen" <boah***@hotmail.com> wrote in message > > news:ecus9m$rio$1@news5.zwoll1.ov.home.nl... > > Hi, > > > > In my stored procedures I make selections based on a 'combined' key. > > I.e.: WHERE [Postcode] = @PostCode AND [No] = @No > > > > > > 1) When there are less [No]'s is it then smart to select first the [No] > > and > > then [PostCode]? > > I.e.: WHERE [No] = @No AND [Postcode] = @PostCode > > > > 2) How do I make the index for this? > > I.e.: for both one? Or one and then first select [No] and then [Postcode]? > > > > 3) Do I need to 'index' uniqueidentifier columns? > > > > Thanks! Gert-Jan,
I would agree. More to the point, in some cases placing a less selective column first is actually a better choice. Suppose ProvinceCode is less selective than Postal. Suppose you have a frequently run query SELECT ProvinceCode, Postal FROM yourTable WHERE ProvinceCode = @ProvinceCode An index on (ProvinceCode, Postal) will be more useful *for this particular query* than the one on (Postal, ProvinceCode), right? So, if this query is a significant part of your workload, you *might* be better off breaking the rule of thumb and having an index on (ProvinceCode, Postal). For some cases, you will find that if you have a (very) non-selective
leading column (for example, gender), then you can end up negating much of the benefit of an index for seeking since you have to seek and then scan many more pages. If you want to do equality searching for multiple columns (as in this example), putting the most selective field first will cause the seek to have to scan fewer pages when the second column is a range predicate. Another side effect that is not as easy to see is that SQL Server only has histograms for the first column of an index (or any multi-column statistic, which can also be created manually). A non-selective leading key will cause the histogram to not be very useful to the optimizer, and the cardinality estimations may be further off as a result. All of this assumes that you are filtering on those columns - as you point out, if you don't ever filter on a column, putting it as a leading column in an index makes less sense (it still could be useful, but not as much) You should also check the query plans for this case. The "*" may be causing bookmark lookups if your index is not covering. (I can't tell from what you posted). Conor Show quote "Gert-Jan Strik" <sorry@toomuchspamalready.nl> wrote in message news:44F36E4A.27DD83AA@toomuchspamalready.nl... > Hi Tom, > > But Kalen doesn't explain why defining the most selective column first > would be beneficial for queries where all indexed columns are filtered > in the WHERE clause, like in the example query of the OP (and in the > example below). > > To test things, I created a 10M row table (2.2 GB) with a clustered > index on a unique integer column. > Next I created two nonclustered indexes. One on (low, high) and one on > (high, low). Low is a tinyint column with 103 distinct values. High is > an int column with some 8.3M distinct values. There is no relation > between the two column's values. > > Next I ran queries like this on SQL Server 2000 EE: > > SELECT * > FROM MyTestTable (index=my_index) > WHERE low = 46 > AND high = 376607327 > > I used values for low in "the middle" range, near "the end" and near > "the beginning". I did the same for high, and in combination. Each query > was run against both indexes. > > In all cases, the statistics io showed the same amount of logical reads > for the two indexes. In other words, they had the same performance. The > index on (high, low) did not outperform the index on (low, high). > > > Basically, all that this experiment proves is that index pages do not > contain incomplete compound index keys. This is also proved by the fact > that both indexes use the exact same amount of disk space (otherwise the > index on (low, high) should have been smaller). Because of that, some > smart mathematician should be able to prove that there can be no > performance difference between the two indexes (in the above scenario). > > > From personal experience, I must admit that I know of one situation in > SQL Server 7 where putting the most selective column first is more > efficient (an outer join on a compound primary key), but that is > basically because of a flaw, which has been fixed in SQL Server 2000. > > > Arjen, my advice would be to put the column that you are least likely to > filter on at the end of the index definition. In your case you don't > really have to choose based on which theory you believe, because column > No is probably the least selective column as well as the column you are > least likely to filter on. So you should make Postcode the first column > in the index's definition. > > Gert-Jan > > > > Tom Moreau wrote: >> >> Check out p 885 of Inside SQL Server 2000 by Kalen Delaney for all the >> details. www.insidesqlserver.com. >> >> -- >> Tom >> >> ---------------------------------------------------- >> Thomas A. Moreau, BSc, PhD, MCSE, MCDBA >> SQL Server MVP >> Toronto, ON Canada >> . >> <boah***@hotmail.com> wrote in message >> news:1156795808.230974.251760@i3g2000cwc.googlegroups.com... >> Okay, thanks all. >> >> If SQL server is only using one index then I will create one index for >> (a part of) the where statement. Sounds like this is the best. >> >> @Tom: what do you mean exactly with *more-selective column*? >> (My selection is always based on the same columns, so in this case it >> does not matter?) >> >> Thanks, >> Arjen >> >> Tom Moreau wrote: >> > You would create an index on (PostCode, No) or (No, PostCode). Make >> > the >> > more-selective column the first in the index. The order in which you >> > reference the columns in the WHERE clause is immaterial. >> > >> > You *may* need to index a uniqueidentifier column if you are going to >> > be >> > making retrievals based on that column. >> > >> > -- >> > Tom >> > >> > ---------------------------------------------------- >> > Thomas A. Moreau, BSc, PhD, MCSE, MCDBA >> > SQL Server MVP >> > Toronto, ON Canada >> > . >> > "Arjen" <boah***@hotmail.com> wrote in message >> > news:ecus9m$rio$1@news5.zwoll1.ov.home.nl... >> > Hi, >> > >> > In my stored procedures I make selections based on a 'combined' key. >> > I.e.: WHERE [Postcode] = @PostCode AND [No] = @No >> > >> > >> > 1) When there are less [No]'s is it then smart to select first the [No] >> > and >> > then [PostCode]? >> > I.e.: WHERE [No] = @No AND [Postcode] = @PostCode >> > >> > 2) How do I make the index for this? >> > I.e.: for both one? Or one and then first select [No] and then >> > [Postcode]? >> > >> > 3) Do I need to 'index' uniqueidentifier columns? >> > >> > Thanks! "Conor Cunningham [MS]" wrote: If all columns in the index are filtered (for equality), for example> > For some cases, you will find that if you have a (very) non-selective > leading column (for example, gender), then you can end up negating much of > the benefit of an index for seeking since you have to seek and then scan > many more pages. "WHERE low = 8 AND high = 376607327" on a table with index on (low, high) then there is no scanning involved. Even if the low column is the low selectivy column with many index pages for one particular low value. The seek will be on both columns. Each index page will know exactly which (low,high) ranges are covered. So an exact match seek will navigate down the index tree and only read one page of each level. It involves no scanning. A seek based on an index on (high, low) will do exactly the same, the index will have the same index depth, for each level one index page will be read, and so the result will be the same number of reads. > If you want to do equality searching for multiple columns Yes, if the second column is a range predicate, then it is an entirely> (as in this example), putting the most selective field first will cause the > seek to have to scan fewer pages when the second column is a range > predicate. different story, because then it would involve (partial) index scanning. > Another side effect that is not as easy to see is that SQL Server only has That is a good point.> histograms for the first column of an index (or any multi-column statistic, > which can also be created manually). A non-selective leading key will cause > the histogram to not be very useful to the optimizer, and the cardinality > estimations may be further off as a result. When the optimizer is determining whether or not to use the index, then *in general* the estimation based on the index that starts with the most selective column will be more accurate. But a bit more in-depth, it depends. In the experiment I mentioned, the column "low" had 103 distinct values. In a 10 million row table, that means very poor selectivity (especially when compared to the 8.3 million distinct values in column "high"). But the statistics histogram of index (high, low) has no more than 124 groups. That is just 20 percent more than the 103 for the (low, high) index. Now the AVG_RANGE_ROWS value adds a lot of value to the "high" histogram (because it gives an estimate of the density within the group), but will only be accurate if the data distribution in that group is uniform. So the (high, low) index would still give better estimates ... .... unless the low column has a non-uniform data distribution and the high column a uniform data distribution. In that case an index on (low, high) should do better. The histogram of the low column should take care of any distribution oddities and the density estimate for the related high values would be accurate. If the (high, low) index were used in this example, the high value would probably only be found in a range group in the histogram and the use of the density / AVG_RANGE_ROWS information would result in bad estimates for many values because it does not (cannot) take the skewed data distribution of the low column into account. Gert-Jan Show quote > All of this assumes that you are filtering on those columns - as you point > out, if you don't ever filter on a column, putting it as a leading column in > an index makes less sense (it still could be useful, but not as much) > > You should also check the query plans for this case. The "*" may be causing > bookmark lookups if your index is not covering. (I can't tell from what you > posted). > > Conor [snip] 1 and 2: the order of conditions in the WHERE clause is irrelevant to whether
the index will be used. Whether an index is going to be used depends on the selectivity of the data (among other things). 3. If you need to query the data by specifying a particular column in a condition (WHERE, ON) an index may improve performance. Again, this depends on other factors. Indexes are explained in more detail in Books Online. ML --- http://milambda.blogspot.com/ My guess is that the question was not whether the index is going to be
used or not, but more like if there is going to be a performance difference by switching the order of the parameters in the where clause assuming that the index is (or is not) used in both scenarios. And also what would produce better results: defining a composite index on both columns or defining two separate ones on each column? *** Sent via Developersdex http://www.developersdex.com *** How the index is or is not used is dependent on a lot of things, so testing
is the only way to be certain of what works and what does not, but in general... If your criteria is on two columns, and both criteria are equalities, then using one index on both columns would be the most efficient way to retrieve the results. SQL Server will determine on its own whether or not to use the index, but it will use only one index when searching the table, not two. If you create two indexes, one on each column, then SQL Server may use one or the other, or neither, but never both. Of course, if you change the query to use ORs or unions, SQL Server could decide to query the table twice, once with each index, but this has nothing to do with the proposed scenario. Show quote "mEmENT0m0RI" <nospam@devdex.com> wrote in message news:%23GBsTBsyGHA.4844@TK2MSFTNGP04.phx.gbl... > My guess is that the question was not whether the index is going to be > used or not, but more like if there is going to be a performance > difference by switching the order of the parameters in the where clause > assuming that the index is (or is not) used in both scenarios. > And also what would produce better results: defining a composite index > on both columns or defining two separate ones on each column? > > > *** Sent via Developersdex http://www.developersdex.com *** SQL Server will use both indexes if it decides it is more efficient. For
example, if I run: Use Northwind go Set ShowPlan_Text On go Select CustomerID, PostalCode, Region From Customers Where Region = 'SP' Or PostalCode = '1010' go Set ShowPlan_Text Off go The query plan generated is: StmtText -------------------------------------------------------------------------------------------------------------------------------------------- |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([Northwind].[dbo].[Customers])) |--Sort(DISTINCT ORDER BY:([Bmk1000] ASC)) Seek(OBJECT:([Northwind].[dbo].[Customers].[Region]), |--Concatenation |--Index SEEK:([Customers].[Region]='SP') ORDERED FORWARD) |--Index Seek(OBJECT:([Northwind].[dbo].[Customers].[PostalCode]), SEEK:([Customers].[PostalCode]='1010') ORDERED FORWARD) which is using both the Region and PostalCode indexes on the Customers table. Tom Show quote "Jim Underwood" <james.underwoodATfallonclinic.com> wrote in message news:uCRPbRsyGHA.2036@TK2MSFTNGP05.phx.gbl... > How the index is or is not used is dependent on a lot of things, so > testing > is the only way to be certain of what works and what does not, but in > general... > > If your criteria is on two columns, and both criteria are equalities, then > using one index on both columns would be the most efficient way to > retrieve > the results. SQL Server will determine on its own whether or not to use > the > index, but it will use only one index when searching the table, not two. > If you create two indexes, one on each column, then SQL Server may use one > or the other, or neither, but never both. > > Of course, if you change the query to use ORs or unions, SQL Server could > decide to query the table twice, once with each index, but this has > nothing > to do with the proposed scenario. > > "mEmENT0m0RI" <nospam@devdex.com> wrote in message > news:%23GBsTBsyGHA.4844@TK2MSFTNGP04.phx.gbl... >> My guess is that the question was not whether the index is going to be >> used or not, but more like if there is going to be a performance >> difference by switching the order of the parameters in the where clause >> assuming that the index is (or is not) used in both scenarios. >> And also what would produce better results: defining a composite index >> on both columns or defining two separate ones on each column? >> >> >> *** Sent via Developersdex http://www.developersdex.com *** > > yup, that falls into the category of using ORs/unions which allows SQL
server to break the query into two queries then concatenate the results. When using two criteria with AND, however, I can't think of a case where this would be possible. Show quote "Tom Cooper" <tomcooper@comcast.no.spam.please.net> wrote in message news:IMednXOqvvKlo27ZnZ2dnUVZ_sudnZ2d@comcast.com... > SQL Server will use both indexes if it decides it is more efficient. For > example, if I run: > Use Northwind > go > Set ShowPlan_Text On > go > Select CustomerID, PostalCode, Region > From Customers > Where Region = 'SP' > Or PostalCode = '1010' > go > Set ShowPlan_Text Off > go > > The query plan generated is: > > StmtText > -------------------------------------------------------------------------- ------------------------------------------------------------------ > |--Bookmark Lookup(BOOKMARK:([Bmk1000]), > OBJECT:([Northwind].[dbo].[Customers])) > |--Sort(DISTINCT ORDER BY:([Bmk1000] ASC)) > |--Concatenation > |--Index > Seek(OBJECT:([Northwind].[dbo].[Customers].[Region]), > SEEK:([Customers].[Region]='SP') ORDERED FORWARD) > |--Index > Seek(OBJECT:([Northwind].[dbo].[Customers].[PostalCode]), > SEEK:([Customers].[PostalCode]='1010') ORDERED FORWARD) > > which is using both the Region and PostalCode indexes on the Customers > table. > > Tom > > "Jim Underwood" <james.underwoodATfallonclinic.com> wrote in message > news:uCRPbRsyGHA.2036@TK2MSFTNGP05.phx.gbl... > > How the index is or is not used is dependent on a lot of things, so > > testing > > is the only way to be certain of what works and what does not, but in > > general... > > > > If your criteria is on two columns, and both criteria are equalities, then > > using one index on both columns would be the most efficient way to > > retrieve > > the results. SQL Server will determine on its own whether or not to use > > the > > index, but it will use only one index when searching the table, not two. > > If you create two indexes, one on each column, then SQL Server may use one > > or the other, or neither, but never both. > > > > Of course, if you change the query to use ORs or unions, SQL Server could > > decide to query the table twice, once with each index, but this has > > nothing > > to do with the proposed scenario. > > > > "mEmENT0m0RI" <nospam@devdex.com> wrote in message > > news:%23GBsTBsyGHA.4844@TK2MSFTNGP04.phx.gbl... > >> My guess is that the question was not whether the index is going to be > >> used or not, but more like if there is going to be a performance > >> difference by switching the order of the parameters in the where clause > >> assuming that the index is (or is not) used in both scenarios. > >> And also what would produce better results: defining a composite index > >> on both columns or defining two separate ones on each column? > >> > >> > >> *** Sent via Developersdex http://www.developersdex.com *** > > > > > > Jim Underwood wrote:
> yup, that falls into the category of using ORs/unions which allows SQL Not exactly - I think it merges 2 sets of bookmarks, then does bookmark> server to break the query into two queries then concatenate the results. > When using two criteria with AND, however, I can't think of a case where > this would be possible. lookup just once per row. It is called index ORing. Index ANDing is also quite possible - it joins 2 sets of bookmarks first, then uses the result set to perform bookmark lookups. Are you certain about the index ANDing? It seems to me that it would work
something like this... Select column1, column2 --(each column has its own index) from SomeTable where column1 = 10 and column2 = 'blue' Lets say the result is 50 rows of data, but 100 rows match the 1st criteria and 100 match the second criteria. The optimizer would join 200 rows of bookmarks, rather than filter only 100 from one index or the other? It seems that this would always be slower than using only one index. Or am I simply missing another situation in which this would be efficient? Show quote "Alexander Kuznetsov" <AK_TIREDOFSPAM@hotmail.COM> wrote in message news:1156794157.300651.223530@74g2000cwt.googlegroups.com... > Jim Underwood wrote: > > yup, that falls into the category of using ORs/unions which allows SQL > > server to break the query into two queries then concatenate the results. > > When using two criteria with AND, however, I can't think of a case where > > this would be possible. > > Not exactly - I think it merges 2 sets of bookmarks, then does bookmark > lookup just once per row. It is called index ORing. Index ANDing is > also quite possible - it joins 2 sets of bookmarks first, then uses the > result set to perform bookmark lookups. > Jim,
the optimizer will join bookmarks from one index (selectivity S1) against bookmarks from the other (selectivity S2). It will expect to get S1*S2 selectivity SET SHOWPLAN_TEXT ON go SELECT Make, Color, SalePrice FROM Vehicle WHERE Color = 'Red' AND Make = 'Honda' ------------------ StmtText -------------------------------------------------------------------------------------------------------------------------------- |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([Sandbox].[dbo].[Vehicle])) |--Hash Match(Inner Join, HASH:([Bmk1000])=([Bmk1000]), RESIDUAL:([Bmk1000]=[Bmk1000])) |--Index Seek(OBJECT:([Sandbox].[dbo].[Vehicle].[VehicleColor]),SEEK:([Vehicle].[Color]='Red') ORDERED FORWARD) |--Index Seek(OBJECT:([Sandbox].[dbo].[Vehicle].[VehicleMake]),SEEK:([Vehicle].[Make]='Honda') ORDERED FORWARD) In this particular case the selectivity of Color = 'Red' alone is low, SELECT Make, Color, SalePrice FROM Vehicle WHERE Color = 'Red' executes as a clustered index scan. Similarly, the selectivity of Make = 'Honda' alone is low, SELECT Make, Color, SalePrice FROM Vehicle WHERE Make = 'Honda' also executes as a clustered index scan. Thanks for the example, I think I just figured out why this is faster...
The problem was I couldn't figure out how joining two sets of bookmarks would be more efficient than filtering a single set, but I was not considering the timing of each action and the resources involved. If I understand correctly, it can efficiently check two indexes and join the bookmarks together without ever reading a row of the actual table. This could result in a lot less IO than having to read every row in order to apply the filter. Do I have that right? Show quote "Alexander Kuznetsov" <AK_TIREDOFSPAM@hotmail.COM> wrote in message news:1156796206.350067.294390@i3g2000cwc.googlegroups.com... > Jim, > > the optimizer will join bookmarks from one index (selectivity S1) > against bookmarks from the other (selectivity S2). It will expect to > get S1*S2 selectivity > > SET SHOWPLAN_TEXT ON > go > SELECT Make, Color, SalePrice FROM Vehicle WHERE Color = 'Red' AND Make > = 'Honda' > > ------------------ > > StmtText > > -------------------------------------------------------------------------- ------------------------------------------------------ > > |--Bookmark Lookup(BOOKMARK:([Bmk1000]), > OBJECT:([Sandbox].[dbo].[Vehicle])) > |--Hash Match(Inner Join, HASH:([Bmk1000])=([Bmk1000]), > RESIDUAL:([Bmk1000]=[Bmk1000])) > |--Index > Seek(OBJECT:([Sandbox].[dbo].[Vehicle].[VehicleColor]), > SEEK:([Vehicle].[Color]='Red') ORDERED FORWARD) > |--Index > Seek(OBJECT:([Sandbox].[dbo].[Vehicle].[VehicleMake]), > SEEK:([Vehicle].[Make]='Honda') ORDERED FORWARD) > > In this particular case the selectivity of Color = 'Red' alone is low, > > SELECT Make, Color, SalePrice FROM Vehicle WHERE Color = 'Red' > > executes as a clustered index scan. Similarly, the selectivity of Make > = 'Honda' alone is low, > > SELECT Make, Color, SalePrice FROM Vehicle WHERE Make = 'Honda' > > also executes as a clustered index scan. > > Do I have that right? Yep.-- Show quoteTibor Karaszi, SQL Server MVP http://www.karaszi.com/sqlserver/default.asp http://www.solidqualitylearning.com/ "Jim Underwood" <james.underwoodATfallonclinic.com> wrote in message news:eOBdzFuyGHA.1256@TK2MSFTNGP04.phx.gbl... > Thanks for the example, I think I just figured out why this is faster... > > The problem was I couldn't figure out how joining two sets of bookmarks > would be more efficient than filtering a single set, but I was not > considering the timing of each action and the resources involved. > > If I understand correctly, it can efficiently check two indexes and join the > bookmarks together without ever reading a row of the actual table. This > could result in a lot less IO than having to read every row in order to > apply the filter. > > Do I have that right? > > "Alexander Kuznetsov" <AK_TIREDOFSPAM@hotmail.COM> wrote in message > news:1156796206.350067.294390@i3g2000cwc.googlegroups.com... >> Jim, >> >> the optimizer will join bookmarks from one index (selectivity S1) >> against bookmarks from the other (selectivity S2). It will expect to >> get S1*S2 selectivity >> >> SET SHOWPLAN_TEXT ON >> go >> SELECT Make, Color, SalePrice FROM Vehicle WHERE Color = 'Red' AND Make >> = 'Honda' >> >> ------------------ >> >> StmtText >> >> -------------------------------------------------------------------------- > ------------------------------------------------------ >> >> |--Bookmark Lookup(BOOKMARK:([Bmk1000]), >> OBJECT:([Sandbox].[dbo].[Vehicle])) >> |--Hash Match(Inner Join, HASH:([Bmk1000])=([Bmk1000]), >> RESIDUAL:([Bmk1000]=[Bmk1000])) >> |--Index >> Seek(OBJECT:([Sandbox].[dbo].[Vehicle].[VehicleColor]), >> SEEK:([Vehicle].[Color]='Red') ORDERED FORWARD) >> |--Index >> Seek(OBJECT:([Sandbox].[dbo].[Vehicle].[VehicleMake]), >> SEEK:([Vehicle].[Make]='Honda') ORDERED FORWARD) >> >> In this particular case the selectivity of Color = 'Red' alone is low, >> >> SELECT Make, Color, SalePrice FROM Vehicle WHERE Color = 'Red' >> >> executes as a clustered index scan. Similarly, the selectivity of Make >> = 'Honda' alone is low, >> >> SELECT Make, Color, SalePrice FROM Vehicle WHERE Make = 'Honda' >> >> also executes as a clustered index scan. >> > > As always, thank you for the insight. I would never learn half as much in
any course as I do in this newsgroup. Show quote "Tibor Karaszi" <tibor_please.no.email_kara***@hotmail.nomail.com> wrote in message news:euQGtKuyGHA.4816@TK2MSFTNGP06.phx.gbl... > > Do I have that right? > > Yep. > > -- > Tibor Karaszi, SQL Server MVP > http://www.karaszi.com/sqlserver/default.asp > http://www.solidqualitylearning.com/ > > > "Jim Underwood" <james.underwoodATfallonclinic.com> wrote in message > news:eOBdzFuyGHA.1256@TK2MSFTNGP04.phx.gbl... > > Thanks for the example, I think I just figured out why this is faster... > > > > The problem was I couldn't figure out how joining two sets of bookmarks > > would be more efficient than filtering a single set, but I was not > > considering the timing of each action and the resources involved. > > > > If I understand correctly, it can efficiently check two indexes and join the > > bookmarks together without ever reading a row of the actual table. This > > could result in a lot less IO than having to read every row in order to > > apply the filter. > > > > Do I have that right? > > > > "Alexander Kuznetsov" <AK_TIREDOFSPAM@hotmail.COM> wrote in message > > news:1156796206.350067.294390@i3g2000cwc.googlegroups.com... > >> Jim, > >> > >> the optimizer will join bookmarks from one index (selectivity S1) > >> against bookmarks from the other (selectivity S2). It will expect to > >> get S1*S2 selectivity > >> > >> SET SHOWPLAN_TEXT ON > >> go > >> SELECT Make, Color, SalePrice FROM Vehicle WHERE Color = 'Red' AND Make > >> = 'Honda' > >> > >> ------------------ > >> > >> StmtText > >> > >> ------------------------------------------------------------------------- - > > ------------------------------------------------------ > >> > >> |--Bookmark Lookup(BOOKMARK:([Bmk1000]), > >> OBJECT:([Sandbox].[dbo].[Vehicle])) > >> |--Hash Match(Inner Join, HASH:([Bmk1000])=([Bmk1000]), > >> RESIDUAL:([Bmk1000]=[Bmk1000])) > >> |--Index > >> Seek(OBJECT:([Sandbox].[dbo].[Vehicle].[VehicleColor]), > >> SEEK:([Vehicle].[Color]='Red') ORDERED FORWARD) > >> |--Index > >> Seek(OBJECT:([Sandbox].[dbo].[Vehicle].[VehicleMake]), > >> SEEK:([Vehicle].[Make]='Honda') ORDERED FORWARD) > >> > >> In this particular case the selectivity of Color = 'Red' alone is low, > >> > >> SELECT Make, Color, SalePrice FROM Vehicle WHERE Color = 'Red' > >> > >> executes as a clustered index scan. Similarly, the selectivity of Make > >> = 'Honda' alone is low, > >> > >> SELECT Make, Color, SalePrice FROM Vehicle WHERE Make = 'Honda' > >> > >> also executes as a clustered index scan. > >> > > > > > > I would never learn half as much in As a trainer, how should I respond to that? <g>> any course as I do in this newsgroup. I hope all learn as much as I do in these newsgroups, with the possible exception of Joe Celko, who pick up material for his books. -- Show quoteTibor Karaszi, SQL Server MVP http://www.karaszi.com/sqlserver/default.asp http://www.solidqualitylearning.com/ "Jim Underwood" <james.underwoodATfallonclinic.com> wrote in message news:OfgfdOuyGHA.4596@TK2MSFTNGP02.phx.gbl... > As always, thank you for the insight. I would never learn half as much in > any course as I do in this newsgroup. > > "Tibor Karaszi" <tibor_please.no.email_kara***@hotmail.nomail.com> wrote in > message news:euQGtKuyGHA.4816@TK2MSFTNGP06.phx.gbl... >> > Do I have that right? >> >> Yep. >> >> -- >> Tibor Karaszi, SQL Server MVP >> http://www.karaszi.com/sqlserver/default.asp >> http://www.solidqualitylearning.com/ >> >> >> "Jim Underwood" <james.underwoodATfallonclinic.com> wrote in message >> news:eOBdzFuyGHA.1256@TK2MSFTNGP04.phx.gbl... >> > Thanks for the example, I think I just figured out why this is faster... >> > >> > The problem was I couldn't figure out how joining two sets of bookmarks >> > would be more efficient than filtering a single set, but I was not >> > considering the timing of each action and the resources involved. >> > >> > If I understand correctly, it can efficiently check two indexes and join > the >> > bookmarks together without ever reading a row of the actual table. This >> > could result in a lot less IO than having to read every row in order to >> > apply the filter. >> > >> > Do I have that right? >> > >> > "Alexander Kuznetsov" <AK_TIREDOFSPAM@hotmail.COM> wrote in message >> > news:1156796206.350067.294390@i3g2000cwc.googlegroups.com... >> >> Jim, >> >> >> >> the optimizer will join bookmarks from one index (selectivity S1) >> >> against bookmarks from the other (selectivity S2). It will expect to >> >> get S1*S2 selectivity >> >> >> >> SET SHOWPLAN_TEXT ON >> >> go >> >> SELECT Make, Color, SalePrice FROM Vehicle WHERE Color = 'Red' AND Make >> >> = 'Honda' >> >> >> >> ------------------ >> >> >> >> StmtText >> >> >> >>> ------------------------------------------------------------------------- > - >> > ------------------------------------------------------ >> >> >> >> |--Bookmark Lookup(BOOKMARK:([Bmk1000]), >> >> OBJECT:([Sandbox].[dbo].[Vehicle])) >> >> |--Hash Match(Inner Join, HASH:([Bmk1000])=([Bmk1000]), >> >> RESIDUAL:([Bmk1000]=[Bmk1000])) >> >> |--Index >> >> Seek(OBJECT:([Sandbox].[dbo].[Vehicle].[VehicleColor]), >> >> SEEK:([Vehicle].[Color]='Red') ORDERED FORWARD) >> >> |--Index >> >> Seek(OBJECT:([Sandbox].[dbo].[Vehicle].[VehicleMake]), >> >> SEEK:([Vehicle].[Make]='Honda') ORDERED FORWARD) >> >> >> >> In this particular case the selectivity of Color = 'Red' alone is low, >> >> >> >> SELECT Make, Color, SalePrice FROM Vehicle WHERE Color = 'Red' >> >> >> >> executes as a clustered index scan. Similarly, the selectivity of Make >> >> = 'Honda' alone is low, >> >> >> >> SELECT Make, Color, SalePrice FROM Vehicle WHERE Make = 'Honda' >> >> >> >> also executes as a clustered index scan. >> >> >> > >> > >> > > Right, the optimizer AND's the bookmarks retrieved from each indexes and
only goes to get the actual row if the rows appears in the result from both indexes. In some cases, it can even do better than that. If, between the two indexes (and the columns in the clustered index), the query is covered, it nevered needs to go to the table at all. So, if in Alexander's example, the clustered index was VIN (the vehicle id number) and the query was SELECT VIN FROM Vehicle WHERE Color = 'Red AND Make = 'Honda' than that query is covered by the two indexes and access to the table is not needed. For example, try running the following script (WARNING, it creates a database named Foo, loads data into it and then drops that database, so be careful where you run it - if for example, you already have a database named Foo, this script would drop it). Running this script takes about 3 minutes on my server. It creates a database, loads a table in that database with 1,000,000 or so rows and a clustered index on the primary key (PK), sets up two columns (Status and Region) each with evenly distributed values of about 0-1000, then creates nonclustered indexes on those two columns. It then shows the query play that the optimizer would pick if you did Select PK, Status, Region From FooTable Where Region = 54 And Status = 123 It then actually does this query using index hints - once using both indexes, once using the clustered index, and once using only one of the indexes. As you will see, using both indexes is the fastest. Set NoCount On Create Database Foo go Use Foo go Create Table FooTable (PK int Primary Key Identity, Status int Null, Region int Null, CharColumn char(100)) go Declare @Loop int Insert FooTable (Status, Region, CharColumn) Values (Null, Null, '') Set @Loop = 0 While @Loop < 20 Begin Insert FooTable (Status, Region, CharColumn) Select Region, Status, CharColumn From FooTable Set @Loop = @Loop + 1 End Select Count(*) From FooTable Update FooTable Set Status = PK % 1000, Region = PK % 1001 Create Index Status On FooTable (Status) Create Index Region On FooTable (Region) go Set ShowPlan_Text On go Select PK, Status, Region From FooTable Where Region = 54 And Status = 123 go Set ShowPlan_Text Off go Declare @Time1 datetime Declare @Time2 datetime Declare @Time3 datetime Declare @Time4 datetime Select @Time1 = GetDate() Select PK, Status, Region From FooTable (index (Region, Status)) Where Region = 54 And Status = 123 Select @Time2 = GetDate() Select PK, Status, Region From FooTable (index=1) Where Region = 54 And Status = 123 Select @Time3 = GetDate() Select PK, Status, Region From FooTable (index (Region)) Where Region = 54 And Status = 123 Select @Time4 = GetDate() Select DateDiff(ms, @Time1, @Time2) As BothIndexes, DateDiff(ms, @Time2, @Time3) As ClusteredIndex, DateDiff(ms, @Time3, @Time4) As OneIndex go Use master go Drop Database Foo Tom Show quote "Jim Underwood" <james.underwoodATfallonclinic.com> wrote in message news:eOBdzFuyGHA.1256@TK2MSFTNGP04.phx.gbl... > Thanks for the example, I think I just figured out why this is faster... > > The problem was I couldn't figure out how joining two sets of bookmarks > would be more efficient than filtering a single set, but I was not > considering the timing of each action and the resources involved. > > If I understand correctly, it can efficiently check two indexes and join > the > bookmarks together without ever reading a row of the actual table. This > could result in a lot less IO than having to read every row in order to > apply the filter. > > Do I have that right? > > "Alexander Kuznetsov" <AK_TIREDOFSPAM@hotmail.COM> wrote in message > news:1156796206.350067.294390@i3g2000cwc.googlegroups.com... >> Jim, >> >> the optimizer will join bookmarks from one index (selectivity S1) >> against bookmarks from the other (selectivity S2). It will expect to >> get S1*S2 selectivity >> >> SET SHOWPLAN_TEXT ON >> go >> SELECT Make, Color, SalePrice FROM Vehicle WHERE Color = 'Red' AND Make >> = 'Honda' >> >> ------------------ >> >> StmtText >> >> -------------------------------------------------------------------------- > ------------------------------------------------------ >> >> |--Bookmark Lookup(BOOKMARK:([Bmk1000]), >> OBJECT:([Sandbox].[dbo].[Vehicle])) >> |--Hash Match(Inner Join, HASH:([Bmk1000])=([Bmk1000]), >> RESIDUAL:([Bmk1000]=[Bmk1000])) >> |--Index >> Seek(OBJECT:([Sandbox].[dbo].[Vehicle].[VehicleColor]), >> SEEK:([Vehicle].[Color]='Red') ORDERED FORWARD) >> |--Index >> Seek(OBJECT:([Sandbox].[dbo].[Vehicle].[VehicleMake]), >> SEEK:([Vehicle].[Make]='Honda') ORDERED FORWARD) >> >> In this particular case the selectivity of Color = 'Red' alone is low, >> >> SELECT Make, Color, SalePrice FROM Vehicle WHERE Color = 'Red' >> >> executes as a clustered index scan. Similarly, the selectivity of Make >> = 'Honda' alone is low, >> >> SELECT Make, Color, SalePrice FROM Vehicle WHERE Make = 'Honda' >> >> also executes as a clustered index scan. >> > > > If I understand correctly, it can efficiently check two indexes and join the Jim,> bookmarks together without ever reading a row of the actual table. This > could result in a lot less IO than having to read every row in order to > apply the filter. > > Do I have that right? Yes, you got it. Using your own words, "it *could* result in a lot less IO", with a very strong emphasis on could. Suppose the selectivities for both conditions are 0.1. The optimizer will assume that the selectivity for both conditions together is 0.01 (derived as 0.1*0.1). Suppose Make and COlor are not correlated, then the estimate of selectivity is more or less correct and the plan involving index ANDing is good. Very frequently that's just the case. But. SOmetimes such optimizer's logic just does not work right. For instance, Make and Model are strongly correlated. Suppose that selectivity of condition (Model = 'Accord') is 0.05. The optimizer will assume that the selectivity for (Make = 'Honda' AND Model = 'Accord') is 0.1 times 0.05 is 0.005 when in fact it is 0.05. (We humans know that all Accords are Hondas, the optimizer does not know it). The consequenses are dire: selecting 5% rows using index ANDing is much slower as compared to selecting the same set of rows by just scanning the whole clustered index. The optimizer's estimate of selectivity was a magnitude of order wrong - 0.005 instead of 0.05. Optimizing queries with condition on correlated columns is a very interesting challenge. If both columns are in one table, sometimes I use an index on a computed column MakeModel AS (Make + ' '+ Model) In other cases sometimes saving intermediate results in a temporary table is an easy way out. Alexander Kuznetsov wrote:
[snip] > Optimizing queries with condition on correlated columns is a very If you know you will be querying this combination frequently, then why> interesting challenge. > If both columns are in one table, sometimes I use an index on a > computed column MakeModel AS (Make + ' '+ Model) > In other cases sometimes saving intermediate results in a temporary > table is an easy way out. not create a compound index? A compound index will outperform single index AND'ing any day of the week. In fact, for index AND'ing to be worthwile, each index needs to be in a specific selectivity range (quite selective, but not too selective). I see use in it if you have many columns indexed, and you cannot predict which combinations will be queries on. Just my 5 cents, Gert-Jan Gert-Jan,
I oversimplified my example. In fact I was selecting from several tables, for instance: SELECT Make, Color, SalePrice, RepairType FROM Vehicle JOIN Repair ON Vehicle.VIN = Repair.VIN WHERE Model = 'Accord' AND Make = 'Honda' It should be a merge join or a hash join, but the optimizer was expecting just a handful of rows and went for nested loops. Selecting a lot of rows with a nested loops plan was very slow. I was not able to have the optimizer use a compound index and correctly estimate number of rows matching both conditions (Model = 'Accord' AND Make = 'Honda'). I handled such situations several times. In some cases an index on computed column 'Honda Accord' helped. In other cases the solution was to save intermediate results in a temporary table and build an index on it - that means statistics, therefore a very good estimate of cardinality. SELECT VIN, Make, Color, SalePrice INTO #t FROM Vehicle WHERE Mileage BETWEEN 30000 AND 50000 AND Price BETWEEN 7000 AND 15000 Unlike the previous example with Make and Model, Mileage and Price are also strongly correlated, but they have a lot of distinct values. In this case the only approach which I know that works is to save intermediate results in a temporary table and build an index on it. Makes sense? See inline
Alexander Kuznetsov wrote: Show quote > If the table has a skewed distribution this can happen. Especially with> Gert-Jan, > > I oversimplified my example. In fact I was selecting from several > tables, for instance: > > SELECT Make, Color, SalePrice, RepairType FROM Vehicle JOIN Repair ON > Vehicle.VIN = Repair.VIN > WHERE Model = 'Accord' AND Make = 'Honda' > > It should be a merge join or a hash join, but the optimizer was > expecting just a handful of rows and went for nested loops. > Selecting a lot of rows with a nested loops plan was very slow. I was > not able to have the optimizer use a compound index and > correctly estimate number of rows matching both conditions (Model = > 'Accord' AND Make = 'Honda'). > I handled such situations several times. In some cases an index on > computed column 'Honda Accord' helped. > In other cases the solution was to save intermediate results in a > temporary table and build an index on it - that means statistics, > therefore a very good estimate of cardinality. stored procedures in combination with parameter sniffing. In those cases, creating the stored procedure WITH RECOMPILE can help. Also, the use of the compound index requires that its statistics are up to date. > SELECT VIN, Make, Color, SalePrice Yes. Although you if the values are fixed, then you should consider an> INTO #t > FROM Vehicle WHERE Mileage BETWEEN 30000 AND 50000 AND Price BETWEEN > 7000 AND 15000 > > Unlike the previous example with Make and Model, Mileage and Price are > also strongly correlated, but they have a lot of distinct values. In > this case the only approach which I know that works is to save > intermediate results in a temporary table and build an index on it. > > Makes sense? indexed view instead of a temporary table. This way the data will always be up to date, and saves you a lot of coding and debugging. Gert-Jan > If the table has a skewed distribution this can happen. Especially with No! The reason is entirely different. Suppose the threshold between> stored procedures in combination with parameter sniffing. In those > cases, creating the stored procedure WITH RECOMPILE can help. > > Also, the use of the compound index requires that its statistics are up > to date. non-clustered index seek and clustered index scan is 0.015 selectivity. Suppose you have 10 different (Make, Model) pairs, 10% rows each, and different models have different makes. There is no data skew at all - each Make has selectivity 0.1, and each Model also has selectivity 0.1. The problem is that the optimizer does not know that there are only 10 different (Make, Model) pairs - it assumes that there are 100 different (Make, Model) pairs. It assumes that 1% rows are (Honda, Accord), and 1% rows are (Toyota, Accord). Let me copy and paste from Conor Cunningham's post in the same thread: "SQL Server only has histograms for the first column of an index (or any multi-column statistic, which can also be created manually)" This means that the optimizer cannot derive from statistics on an index on (Model, Make) that there are 10% Honda Accords. This is why I am using a computed column on Model+' '+Make. > I am always considering indexed views, but in OLTP encironments I am> Yes. Although you if the values are fixed, then you should consider an > indexed view instead of a temporary table. This way the data will always > be up to date, and saves you a lot of coding and debugging. > usually deciding against them, because they introduce lock contention. Suppose Vehicle.Model and Repair.Type are correlated. Without an indexed view containing both columns, 2 connections can simultaneously modify Connection 1: Vehicle with VIN=123456 Connection 2: Repairs for VIN=123456 Once you have created an indexed view containing both Vehicle.Model and Repair.Type, Connection 1 will acquire an update lock on the corresponding row in the indexed view, adn Connection 2 will have to wait. Also in many cases the query I need to optimize is not run too frequently. As such, it might be overall cheaper to not maintain an indexed view necessary to speed up one query only. Alexander Kuznetsov wrote:
Show quote > There is some truth to this, but it is not as clear cut as you put it.> > If the table has a skewed distribution this can happen. Especially with > > stored procedures in combination with parameter sniffing. In those > > cases, creating the stored procedure WITH RECOMPILE can help. > > > > Also, the use of the compound index requires that its statistics are up > > to date. > > No! The reason is entirely different. Suppose the threshold between > non-clustered index seek and clustered index scan is 0.015 selectivity. > Suppose you have 10 different (Make, Model) pairs, 10% rows each, and > different models have different makes. > There is no data skew at all - each Make has selectivity 0.1, and each > Model also has selectivity 0.1. > The problem is that the optimizer does not know that there are only 10 > different (Make, Model) pairs - > it assumes that there are 100 different (Make, Model) pairs. It assumes > that 1% rows are (Honda, Accord), > and 1% rows are (Toyota, Accord). Let me copy and paste from Conor > Cunningham's post in the same thread: > > "SQL Server only has histograms for the first column of an index (or > any multi-column statistic, > which can also be created manually)" > > This means that the optimizer cannot derive from statistics on an index > on (Model, Make) that there are 10% Honda Accords. > This is why I am using a computed column on Model+' '+Make. [snip] See my response to Conor. But your example is not a good one, because the implementation script of it (below) shows a 0.1 density estimation for both the "Make" and the "Make, Model" combination. So the optimizer could (and should) know that there is perfect correlation between Make and Model. Gert-Jan --drop table Vehicle create table Vehicle (Make char(10) not null ,Model char(10) not null ,constraint PK_Vehicle primary key clustered (Make, Model) ) insert into Vehicle values ('Honda', 'Accord') insert into Vehicle values ('Toyota','Corolla') insert into Vehicle values ('c','c') insert into Vehicle values ('d','d') insert into Vehicle values ('e','e') insert into Vehicle values ('f','f') insert into Vehicle values ('g','g') insert into Vehicle values ('h','h') insert into Vehicle values ('i','i') insert into Vehicle values ('j','j') go update statistics Vehicle go dbcc show_statistics (Vehicle,PK_Vehicle) > But your example is not a good one, because the implementation script of you are right. I inflated Vehicle table to 100K rows, and the optimizer> it (below) shows a 0.1 density estimation for both the "Make" and the > "Make, Model" combination. So the optimizer could (and should) know that > there is perfect correlation between Make and Model. Gert-Jan, did recognize the 100% correlation. Thanks for the correction. mEmENT0m0RI wrote:
> My guess is that the question was not whether the index is going to be Thats correct!> used or not, but more like if there is going to be a performance > difference by switching the order of the parameters in the where clause > assuming that the index is (or is not) used in both scenarios. > And also what would produce better results: defining a composite index > on both columns or defining two separate ones on each column? > > > *** Sent via Developersdex http://www.developersdex.com *** Arjen |
|||||||||||||||||||||||