|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Basic Indexing questionQuick question about indexes. I've never really understood this so I'm hoping someone can help me out. If I have a table with 4 columns like this. Column1 Column2 Column3 Column4 ....and this table has an index that covers Columns 1,2 & 3 Question is, does a query that has a Where clause only on columns 1 & 2 use such an index? eg. Select * From Table Where Column1 = "dog" and Column2 = "cat" Notice there is no reference to column3. Does the query also need to reference Column3 in order to use this index? Does it matter the order in which the columns in the Where clause are? eg. Is Select * From Table Where Column1 = "dog" and Column2 = "cat" the same as Select * From Table Where Column2 = "cat" and Column1 = "dog" Thanks Answers in line.
<rod.w***@gmail.com> wrote in message Show quote news:1156478106.472527.75770@p79g2000cwp.googlegroups.com... Yes, they are treated the same> Hi there, > > Quick question about indexes. I've never really understood this so I'm > hoping someone can help me out. > > If I have a table with 4 columns like this. > > Column1 > Column2 > Column3 > Column4 > > ...and this table has an index that covers Columns 1,2 & 3 > > Question is, does a query that has a Where clause only on columns 1 & 2 > use such an index? Yes > > eg. > Select * From Table Where Column1 = "dog" and Column2 = "cat" > > Notice there is no reference to column3. Does the query also need to > reference Column3 in order to use this index? No > Does it matter the order in which the columns in the Where clause are? > eg. > > Is > > Select * From Table Where Column1 = "dog" and Column2 = "cat" > the same as > Select * From Table Where Column2 = "cat" and Column1 = "dog" > Tom> Thanks > Hi Tom
Your first 'Yes' is not really accurate. It would be more correct so state that it *might* use the index, depending on the selectivity of the query predicates (where clause) For example, if every row in the table has the values "dog" for column1 and "cat" for column2, then it's more likely that SQL Server will scan the table (or its clustered index if it has one). It is also likely to make this decision if a reasonably high proportion of the rows in the table meet this criteria.. The reason for this is that the query is requesting all columns be returned ("SELECT * "). If all / many rows are to be returned, the index only covers three columns but the query is requesting all 4 be returned (due to the "SELECT * "). If the query was driven via the index, it would then have to look up the base table to retrieve Column4 FOR EVERY ROW, via either a bookmark lookup if a clustered index exists, via page pointers to the heap storage if no clustered index exists or other merge / hash operation. This plan would generally (depending on how many rows met the predicates) be far less efficient than simply scanning the table once. At the other extreme, if a unique constraint exists on column1, column2, then it's very likely that the index will be seek'd & a bookmark / page lookup performed to retrieved column4. When evaluating index usefulness, it is critical to take into account ALL columns involved in the query, not just those in the WHERE clause. This is why using "SELECT * " is such a bad practice. It's also important to consider the selectivity of the query predicates as this can also be a big factor. Regards, Greg Linwood SQL Server MVP Show quote "Tom Cooper" <tomcooper@comcast.no.spam.please.net> wrote in message news:8N6dnTB3jfnw6HPZnZ2dnUVZ_vWdnZ2d@comcast.com... > Answers in line. > > <rod.w***@gmail.com> wrote in message > news:1156478106.472527.75770@p79g2000cwp.googlegroups.com... >> Hi there, >> >> Quick question about indexes. I've never really understood this so I'm >> hoping someone can help me out. >> >> If I have a table with 4 columns like this. >> >> Column1 >> Column2 >> Column3 >> Column4 >> >> ...and this table has an index that covers Columns 1,2 & 3 >> >> Question is, does a query that has a Where clause only on columns 1 & 2 >> use such an index? > Yes > >> >> eg. >> Select * From Table Where Column1 = "dog" and Column2 = "cat" >> >> Notice there is no reference to column3. Does the query also need to >> reference Column3 in order to use this index? > No > >> Does it matter the order in which the columns in the Where clause are? >> eg. >> >> Is >> >> Select * From Table Where Column1 = "dog" and Column2 = "cat" >> the same as >> Select * From Table Where Column2 = "cat" and Column1 = "dog" > Yes, they are treated the same > >> >> Thanks >> > > Tom > Hi Greg,
You are, or course, correct. And I knew that. I just thought, that from the OP's questions, he was concerned that the fact that he did not use column3 in the WHERE clause would prevent the optimizer from using the index. The point I was trying to make is that SQL Server would be able to use this index even though column3 was not used in the WHERE clause. I didn't want to go too deep for fear of confusing the OP who, from his questions, seemed to me to be relatively new to how SQL Server uses indexes. Of course, the optimizer won't use the index if it decides there is a faster way to retrieve the result set and that is true whether or not the WHERE clause uses all the columns in the index. It's always hard to decide where to stop in answering questions like this. For instance, we could add to your answer a discussion of statistics, problems that occur when the statistics are out of date, methods for updataing statistics, estimated and actual execution plans, etc, etc. But the time you get done, you have written a book. Tom Show quote "Greg Linwood" <g_linw***@hotmail.com> wrote in message news:eOMWwjByGHA.1304@TK2MSFTNGP05.phx.gbl... > Hi Tom > > Your first 'Yes' is not really accurate. It would be more correct so state > that it *might* use the index, depending on the selectivity of the query > predicates (where clause) > > For example, if every row in the table has the values "dog" for column1 > and "cat" for column2, then it's more likely that SQL Server will scan the > table (or its clustered index if it has one). It is also likely to make > this decision if a reasonably high proportion of the rows in the table > meet this criteria.. > > The reason for this is that the query is requesting all columns be > returned ("SELECT * "). If all / many rows are to be returned, the index > only covers three columns but the query is requesting all 4 be returned > (due to the "SELECT * "). If the query was driven via the index, it would > then have to look up the base table to retrieve Column4 FOR EVERY ROW, via > either a bookmark lookup if a clustered index exists, via page pointers to > the heap storage if no clustered index exists or other merge / hash > operation. This plan would generally (depending on how many rows met the > predicates) be far less efficient than simply scanning the table once. > > At the other extreme, if a unique constraint exists on column1, column2, > then it's very likely that the index will be seek'd & a bookmark / page > lookup performed to retrieved column4. > > When evaluating index usefulness, it is critical to take into account ALL > columns involved in the query, not just those in the WHERE clause. This is > why using "SELECT * " is such a bad practice. It's also important to > consider the selectivity of the query predicates as this can also be a big > factor. > > Regards, > Greg Linwood > SQL Server MVP > > "Tom Cooper" <tomcooper@comcast.no.spam.please.net> wrote in message > news:8N6dnTB3jfnw6HPZnZ2dnUVZ_vWdnZ2d@comcast.com... >> Answers in line. >> >> <rod.w***@gmail.com> wrote in message >> news:1156478106.472527.75770@p79g2000cwp.googlegroups.com... >>> Hi there, >>> >>> Quick question about indexes. I've never really understood this so I'm >>> hoping someone can help me out. >>> >>> If I have a table with 4 columns like this. >>> >>> Column1 >>> Column2 >>> Column3 >>> Column4 >>> >>> ...and this table has an index that covers Columns 1,2 & 3 >>> >>> Question is, does a query that has a Where clause only on columns 1 & 2 >>> use such an index? >> Yes >> >>> >>> eg. >>> Select * From Table Where Column1 = "dog" and Column2 = "cat" >>> >>> Notice there is no reference to column3. Does the query also need to >>> reference Column3 in order to use this index? >> No >> >>> Does it matter the order in which the columns in the Where clause are? >>> eg. >>> >>> Is >>> >>> Select * From Table Where Column1 = "dog" and Column2 = "cat" >>> the same as >>> Select * From Table Where Column2 = "cat" and Column1 = "dog" >> Yes, they are treated the same >> >>> >>> Thanks >>> >> >> Tom >> > > In hear you Tom & I agree in general about trying to provide simple answers
where appropriate but in cases like this, an over-simplified answer can easily be more mis-leading than a more thorough one, especially to someone who's new to indexing. You're also correct to point out that there are other factors - I was actually trying to keep things as simple as possible (c: Bottom line with this particular scenario is that the optimiser could easily choose to either use the index or not & the decision would be made entirely based on statistics. I felt this was worth pointing out.. Regards, Greg Linwood SQL Server MVP Show quote "Tom Cooper" <tomcooper@comcast.no.spam.please.net> wrote in message news:quqdnY-IUdkOh3LZnZ2dnUVZ_umdnZ2d@comcast.com... > Hi Greg, > > You are, or course, correct. And I knew that. I just thought, that from > the OP's questions, he was concerned that the fact that he did not use > column3 in the WHERE clause would prevent the optimizer from using the > index. The point I was trying to make is that SQL Server would be able to > use this index even though column3 was not used in the WHERE clause. I > didn't want to go too deep for fear of confusing the OP who, from his > questions, seemed to me to be relatively new to how SQL Server uses > indexes. > > Of course, the optimizer won't use the index if it decides there is a > faster way to retrieve the result set and that is true whether or not the > WHERE clause uses all the columns in the index. > > It's always hard to decide where to stop in answering questions like this. > For instance, we could add to your answer a discussion of statistics, > problems that occur when the statistics are out of date, methods for > updataing statistics, estimated and actual execution plans, etc, etc. But > the time you get done, you have written a book. > > Tom > > "Greg Linwood" <g_linw***@hotmail.com> wrote in message > news:eOMWwjByGHA.1304@TK2MSFTNGP05.phx.gbl... >> Hi Tom >> >> Your first 'Yes' is not really accurate. It would be more correct so >> state that it *might* use the index, depending on the selectivity of the >> query predicates (where clause) >> >> For example, if every row in the table has the values "dog" for column1 >> and "cat" for column2, then it's more likely that SQL Server will scan >> the table (or its clustered index if it has one). It is also likely to >> make this decision if a reasonably high proportion of the rows in the >> table meet this criteria.. >> >> The reason for this is that the query is requesting all columns be >> returned ("SELECT * "). If all / many rows are to be returned, the index >> only covers three columns but the query is requesting all 4 be returned >> (due to the "SELECT * "). If the query was driven via the index, it would >> then have to look up the base table to retrieve Column4 FOR EVERY ROW, >> via either a bookmark lookup if a clustered index exists, via page >> pointers to the heap storage if no clustered index exists or other merge >> / hash operation. This plan would generally (depending on how many rows >> met the predicates) be far less efficient than simply scanning the table >> once. >> >> At the other extreme, if a unique constraint exists on column1, column2, >> then it's very likely that the index will be seek'd & a bookmark / page >> lookup performed to retrieved column4. >> >> When evaluating index usefulness, it is critical to take into account ALL >> columns involved in the query, not just those in the WHERE clause. This >> is why using "SELECT * " is such a bad practice. It's also important to >> consider the selectivity of the query predicates as this can also be a >> big factor. >> >> Regards, >> Greg Linwood >> SQL Server MVP >> >> "Tom Cooper" <tomcooper@comcast.no.spam.please.net> wrote in message >> news:8N6dnTB3jfnw6HPZnZ2dnUVZ_vWdnZ2d@comcast.com... >>> Answers in line. >>> >>> <rod.w***@gmail.com> wrote in message >>> news:1156478106.472527.75770@p79g2000cwp.googlegroups.com... >>>> Hi there, >>>> >>>> Quick question about indexes. I've never really understood this so I'm >>>> hoping someone can help me out. >>>> >>>> If I have a table with 4 columns like this. >>>> >>>> Column1 >>>> Column2 >>>> Column3 >>>> Column4 >>>> >>>> ...and this table has an index that covers Columns 1,2 & 3 >>>> >>>> Question is, does a query that has a Where clause only on columns 1 & 2 >>>> use such an index? >>> Yes >>> >>>> >>>> eg. >>>> Select * From Table Where Column1 = "dog" and Column2 = "cat" >>>> >>>> Notice there is no reference to column3. Does the query also need to >>>> reference Column3 in order to use this index? >>> No >>> >>>> Does it matter the order in which the columns in the Where clause are? >>>> eg. >>>> >>>> Is >>>> >>>> Select * From Table Where Column1 = "dog" and Column2 = "cat" >>>> the same as >>>> Select * From Table Where Column2 = "cat" and Column1 = "dog" >>> Yes, they are treated the same >>> >>>> >>>> Thanks >>>> >>> >>> Tom >>> >> >> > > Hi Greg and Tom,
Many thanks for your answers. This is really helpful information. I wish someone would write a book specifically on indexes, with plenty of examples of both good and bad indexes for a given query. There seems to be a lot of chapters written on indexes within just about every SQL book I've read, but I really think that a dedicated book on this would be great. I'd certainly purchase it. |
|||||||||||||||||||||||