|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
SP clause - where (field = @Param or @param is null) is this fast?It would SEEM to me that this is a rather neat way to write SPs Create proc mysp @LastName varchar(32) = NULL @Date datetime As Select Fieldlist from tblOrders where TranDate > @Date and ( @Lastname is null or name = @LastName) i..e if @Lastname is provided then filter on it otherwise do not filter on Name field. This is obvioulsy good for SP length (lines of code) but does it have a performance hit? >From my testing doesn't seem to on SQL2000 - am I missing something? before I write all my SPs like this..Am 3 Dec 2005 10:21:21 -0800 schrieb yitzak:
Show quote > Hi Hi,> > It would SEEM to me that this is a rather neat way to write SPs > > Create proc mysp > > @LastName varchar(32) = NULL > @Date datetime > > As > > Select Fieldlist from tblOrders > where TranDate > @Date > and ( @Lastname is null or name = @LastName) > > i..e if @Lastname is provided then filter on it otherwise do not filter > on Name field. > > This is obvioulsy good for SP length (lines of code) but does it have a > performance hit? > >>From my testing doesn't seem to on SQL2000 - am I missing something? > before I write all my SPs like this.. i make the same in some of my procedures and never saw a performance problem. I think, the query optimizer can work very well with this condition, especially if the NULL test is the first part. But i am interested too, what specialists can tell us about this. bye, Helmut I think it depends on your situation. In some cases
if (@Lastname is null) Select Fieldlist from tblOrders where TranDate > @Date else Select Fieldlist from tblOrders where TranDate > @Date and (name = @LastName) will give you 2 different plans for 2 branches of the if statement. This way you might get better performance... --CELKO-- wrote:
> name = COALESCE (@my_name, name) Keep forgetting about Coalesce - nice idea. Definetly use it for some> > Is a little neater code. SPs. But would it be as fast doing a separate branch of execution? e.g. IF @Param is null don't use it else use Param in where clause I suggest you read: http://www.sommarskog.se/dyn-search.html. Pay attention
to the section, Umachandar's Bag of Tricks -- Anith Neater? Does it matter if it's neat if it causes performance to go into the
toilet? name = COALESCE(@my_name, name) causes a table or clustered index scan in SQL Server. So will @my_name IS NULL OR name = @my_name. Both can have a drastic negative impact on performance. The correct way to handle this is with an IF statement. Show quote "--CELKO--" <jcelko***@earthlink.net> wrote in message news:1133661115.432608.245470@g49g2000cwa.googlegroups.com... > name = COALESCE (@my_name, name) > > Is a little neater code. > The problem with « @my_name IS NULL OR name = @my_name » is not necessarily
true in the case of SQL-2000 is there is a clustered index on the column name and is no longer true with SQL-2005 even if there is no clustered index on the column name. However, it's true in all cases with the Coalesce() function. Show quote "Brian Selzer" <br***@selzer-software.com> wrote in message news:e93WRKO%23FHA.1148@tk2msftngp13.phx.gbl... > Neater? Does it matter if it's neat if it causes performance to go into > the toilet? > > name = COALESCE(@my_name, name) causes a table or clustered index scan in > SQL Server. So will @my_name IS NULL OR name = @my_name. Both can have a > drastic negative impact on performance. The correct way to handle this is > with an IF statement. > > > "--CELKO--" <jcelko***@earthlink.net> wrote in message > news:1133661115.432608.245470@g49g2000cwa.googlegroups.com... >> name = COALESCE (@my_name, name) >> >> Is a little neater code. >> > > Here's the results I received on SQL 2000, SP4:
I get the exact same execution plan for both « @my_name IS NULL OR name = @my_name » and COALESCE; however, it appears that « @my_name IS NULL OR name = @my_name » short circuits, whereas COALESCE doesn't. The IF statement generated a clustered index scan and a clustered index seek, which makes sense. With COALESCE the elapsed time was 220ms, with « @my_name IS NULL OR name = @my_name » the elapsed time was 156ms, and with the IF statement, the elapsed time was 0ms (negligible). I have a clustered index on the primary key column for the table, which has roughly 450,000 rows. The comparisons go against the primary key column. I haven't tested this on SQL 2005. Show quote "Sylvain Lafontaine" <sylvain aei ca (fill the blanks, no spam please)> wrote in message news:Oeb3eUW%23FHA.3296@TK2MSFTNGP10.phx.gbl... > The problem with « @my_name IS NULL OR name = @my_name » is not > necessarily true in the case of SQL-2000 is there is a clustered index on > the column name and is no longer true with SQL-2005 even if there is no > clustered index on the column name. > > However, it's true in all cases with the Coalesce() function. > > -- > Sylvain Lafontaine, ing. > MVP - Technologies Virtual-PC > E-mail: http://cerbermail.com/?QugbLEWINF > > > "Brian Selzer" <br***@selzer-software.com> wrote in message > news:e93WRKO%23FHA.1148@tk2msftngp13.phx.gbl... >> Neater? Does it matter if it's neat if it causes performance to go into >> the toilet? >> >> name = COALESCE(@my_name, name) causes a table or clustered index scan in >> SQL Server. So will @my_name IS NULL OR name = @my_name. Both can have >> a drastic negative impact on performance. The correct way to handle this >> is with an IF statement. >> >> >> "--CELKO--" <jcelko***@earthlink.net> wrote in message >> news:1133661115.432608.245470@g49g2000cwa.googlegroups.com... >>> name = COALESCE (@my_name, name) >>> >>> Is a little neater code. >>> >> >> > > Oh, I'm sorry to have not been more explicit: I was thinking that with a
clustered index on the Name column, only a partial range clustering index scan would have been done by SQL-Server, therefore giving the same result as with the IF statement. However, having a design where the clustering index is set for the Name column instead of setting it on the primary key would probably be a bad design for many cases; not because it is fundamentally flawed by itself but because it is very likely that the Name column won't be the only column for which a searching function is required. But the case for SQL-Server is much more interesting. With the WITH COMPILE option set ON, the clustered index on the primary key and a secondary index on the Name column; SQL 2005 clearly perform a Clusterex Index Scan when the Null value is passed as the parameter and an Index Scan on the secondary index followed by a Clustered Index Seek (to retrieve the other columns) when something else is passed as the parameters. Of course, doing this could become very bad if we don't use the WITH COMPILE and the argument passed for the first time is the Null value followed by something and much less bad if we go in the opposite direction (the query with the Null value coming into second position) because usually we will be less concerned by performance when all the rows must be retrieved anyway. Thankfully, SQL-2005 is also providing new query optimizer hints to alleviate this problem. Show quote "Brian Selzer" <br***@selzer-software.com> wrote in message news:%23Qx7LOZ%23FHA.2324@TK2MSFTNGP11.phx.gbl... > Here's the results I received on SQL 2000, SP4: > > I get the exact same execution plan for both « @my_name IS NULL OR name = > @my_name » and COALESCE; however, it appears that « @my_name IS NULL OR > name = @my_name » short circuits, whereas COALESCE doesn't. The IF > statement generated a clustered index scan and a clustered index seek, > which makes sense. > > With COALESCE the elapsed time was 220ms, with « @my_name IS NULL OR name > = @my_name » the elapsed time was 156ms, and with the IF statement, the > elapsed time was 0ms (negligible). > > I have a clustered index on the primary key column for the table, which > has roughly 450,000 rows. The comparisons go against the primary key > column. > > I haven't tested this on SQL 2005. > > "Sylvain Lafontaine" <sylvain aei ca (fill the blanks, no spam please)> > wrote in message news:Oeb3eUW%23FHA.3296@TK2MSFTNGP10.phx.gbl... >> The problem with « @my_name IS NULL OR name = @my_name » is not >> necessarily true in the case of SQL-2000 is there is a clustered index on >> the column name and is no longer true with SQL-2005 even if there is no >> clustered index on the column name. >> >> However, it's true in all cases with the Coalesce() function. >> >> -- >> Sylvain Lafontaine, ing. >> MVP - Technologies Virtual-PC >> E-mail: http://cerbermail.com/?QugbLEWINF >> >> >> "Brian Selzer" <br***@selzer-software.com> wrote in message >> news:e93WRKO%23FHA.1148@tk2msftngp13.phx.gbl... >>> Neater? Does it matter if it's neat if it causes performance to go into >>> the toilet? >>> >>> name = COALESCE(@my_name, name) causes a table or clustered index scan >>> in SQL Server. So will @my_name IS NULL OR name = @my_name. Both can >>> have a drastic negative impact on performance. The correct way to >>> handle this is with an IF statement. >>> >>> >>> "--CELKO--" <jcelko***@earthlink.net> wrote in message >>> news:1133661115.432608.245470@g49g2000cwa.googlegroups.com... >>>> name = COALESCE (@my_name, name) >>>> >>>> Is a little neater code. >>>> >>> >>> >> >> > > >> Of course, doing this could become very bad if we don't use the WITH COMPILE and the argument passed for the first time is the Null value followedby something and much less bad if we go in the opposite direction (the query with the Null value coming into second position) because usually we will be less concerned by performance when all the rows must be retrieved anyway. << I would respectfully disagree. In many cases a table scan can easily be 5 or more times more efficient than reading all the rows via a bookmark. And I mean a 4 byte integer bookmark, about as efficient as it can be. The difference in response time between 15 seconds and a minute may matter, right? Not necessarily, it depends on your application and of the use it makes of
the queries. For example, usually when you don't have Null values in your parameters it's because your users are using forms in the application and want to see/edit the data about a customer in the fastest way possible. However, when a lot of Null are passed, it's often because reports are generated and then if this is really the case, it doesn't matter if the query take a few more seconds to execute when the printing process for the reports may take 15 minutes, 1 hour or worse. This is only a crude example but I think it may be safe to say that the chance of having complaints from your client if much less likely if you have a SP that has been optimized for particular values and sometimes is used for all values then the inverse. Of course, we can say that the best option is to always use the IF clause; however, it's not always feasible: when you have a query with a lot of possible filters; you cannot undertake the task of writing all the possible combinations way, say, something like ten levels of imbricated IF ... ELSE. For cases like this, it is often suggested to use instead dynamically build SQL strings and EXEC them so to make sure to have the best query plan possible for each case. However, SQL-Server 2005 seems to give us another solution in this regard that will be much more elegant than building string. Show quote "Alexander Kuznetsov" <AK_TIREDOFSPAM@hotmail.COM> wrote in message news:1133825144.992178.231980@f14g2000cwb.googlegroups.com... >>> Of course, doing this could become very bad if we don't use the WITH >>> COMPILE > and the argument passed for the first time is the Null value followed > by > something and much less bad if we go in the opposite direction (the > query > with the Null value coming into second position) because usually we > will be > less concerned by performance when all the rows must be retrieved > anyway. << > > I would respectfully disagree. In many cases a table scan can easily be > 5 or more times more efficient than reading all the rows via a > bookmark. And I mean a 4 byte integer bookmark, about as efficient as > it can be. > The difference in response time between 15 seconds and a minute may > matter, right? > Thanks EVERYONE!! BRain, Alexander - I know have some topics to read up
- have ordered book :-) I cannot comment on this becuase I'd never heard of param sniffing etc. Sylvain - mine are report SPs and I cannot do all these nested IFs - would get too complicated.. OFF topic - but I got much better performance using Dynamic SQL in an SP - (I know all the reasons why its bad). E.G. set @Paramlist = '1,2,3,4' IF @Paramlist = '' @SSQL = @SQL + ' shopID in ( ' + @ParamList + ')' much faster than joing to a temp table conatining 1,2,3 or joining to a table returning function. I never understood this - I would like to use table returning funcs for this and not use dynamic SQL - but its slow (well I've found it slow) thanks Sylvain Lafontaine (fill the blanks, no spam please) wrote: Show quote > Not necessarily, it depends on your application and of the use it makes of > the queries. For example, usually when you don't have Null values in your > parameters it's because your users are using forms in the application and > want to see/edit the data about a customer in the fastest way possible. > > However, when a lot of Null are passed, it's often because reports are > generated and then if this is really the case, it doesn't matter if the > query take a few more seconds to execute when the printing process for the > reports may take 15 minutes, 1 hour or worse. > > This is only a crude example but I think it may be safe to say that the > chance of having complaints from your client if much less likely if you have > a SP that has been optimized for particular values and sometimes is used for > all values then the inverse. > > Of course, we can say that the best option is to always use the IF clause; > however, it's not always feasible: when you have a query with a lot of > possible filters; you cannot undertake the task of writing all the possible > combinations way, say, something like ten levels of imbricated IF ... ELSE. > For cases like this, it is often suggested to use instead dynamically build > SQL strings and EXEC them so to make sure to have the best query plan > possible for each case. However, SQL-Server 2005 seems to give us another > solution in this regard that will be much more elegant than building string. > > -- > Sylvain Lafontaine, ing. > MVP - Technologies Virtual-PC > E-mail: http://cerbermail.com/?QugbLEWINF > > > "Alexander Kuznetsov" <AK_TIREDOFSPAM@hotmail.COM> wrote in message > news:1133825144.992178.231980@f14g2000cwb.googlegroups.com... > >>> Of course, doing this could become very bad if we don't use the WITH > >>> COMPILE > > and the argument passed for the first time is the Null value followed > > by > > something and much less bad if we go in the opposite direction (the > > query > > with the Null value coming into second position) because usually we > > will be > > less concerned by performance when all the rows must be retrieved > > anyway. << > > > > I would respectfully disagree. In many cases a table scan can easily be > > 5 or more times more efficient than reading all the rows via a > > bookmark. And I mean a 4 byte integer bookmark, about as efficient as > > it can be. > > The difference in response time between 15 seconds and a minute may > > matter, right? > > Dynamic SQL is not bad at all; this is a misconception about the fact that
they must be recompiled every time for each use (with the exception of the effect of cached parameterized queries) while SP are not. However, you must take into account that it's probably a lesser job for the SQL-Server to recompile a query than to execute a bad query plan. Perfect SP - SP that take into account every possibility with IF blocks - are probably always the best thing over Dynamic SQL but you must be able to write them; which is not always possible for various reasons. If you want to understand why doing something in a way will be faster or slower than doing it in another way (as for exemple your use of Dynamic SQL with @Paramlist) then you must start studying query plans. This is the only way of truly understanding many aspects of SQL-Server. However, it looks like that many changes have been made between SQL-2000 and 2005 in this regard; so some of the conclusions that you will draw from a study of SQL-2000 won't be valid under SQL-2005. Show quote "yitzak" <terrysha***@yahoo.co.uk> wrote in message news:1133878655.295465.74400@o13g2000cwo.googlegroups.com... > Thanks EVERYONE!! BRain, Alexander - I know have some topics to read up > - have ordered book :-) > > I cannot comment on this becuase I'd never heard of param sniffing etc. > > Sylvain - mine are report SPs and I cannot do all these nested IFs - > would get too complicated.. > > OFF topic - but I got much better performance using Dynamic SQL in an > SP - (I know all the reasons why its bad). > > E.G. > set @Paramlist = '1,2,3,4' > > IF @Paramlist = '' > @SSQL = @SQL + ' shopID in ( ' + @ParamList + ')' > > much faster than joing to a temp table conatining 1,2,3 or joining to a > table returning function. > > I never understood this - I would like to use table returning funcs for > this and not use dynamic SQL - but its slow (well I've found it slow) > > thanks > > > > Sylvain Lafontaine (fill the blanks, no spam please) wrote: >> Not necessarily, it depends on your application and of the use it makes >> of >> the queries. For example, usually when you don't have Null values in >> your >> parameters it's because your users are using forms in the application and >> want to see/edit the data about a customer in the fastest way possible. >> >> However, when a lot of Null are passed, it's often because reports are >> generated and then if this is really the case, it doesn't matter if the >> query take a few more seconds to execute when the printing process for >> the >> reports may take 15 minutes, 1 hour or worse. >> >> This is only a crude example but I think it may be safe to say that the >> chance of having complaints from your client if much less likely if you >> have >> a SP that has been optimized for particular values and sometimes is used >> for >> all values then the inverse. >> >> Of course, we can say that the best option is to always use the IF >> clause; >> however, it's not always feasible: when you have a query with a lot of >> possible filters; you cannot undertake the task of writing all the >> possible >> combinations way, say, something like ten levels of imbricated IF ... >> ELSE. >> For cases like this, it is often suggested to use instead dynamically >> build >> SQL strings and EXEC them so to make sure to have the best query plan >> possible for each case. However, SQL-Server 2005 seems to give us >> another >> solution in this regard that will be much more elegant than building >> string. >> >> -- >> Sylvain Lafontaine, ing. >> MVP - Technologies Virtual-PC >> E-mail: http://cerbermail.com/?QugbLEWINF >> >> >> "Alexander Kuznetsov" <AK_TIREDOFSPAM@hotmail.COM> wrote in message >> news:1133825144.992178.231980@f14g2000cwb.googlegroups.com... >> >>> Of course, doing this could become very bad if we don't use the WITH >> >>> COMPILE >> > and the argument passed for the first time is the Null value followed >> > by >> > something and much less bad if we go in the opposite direction (the >> > query >> > with the Null value coming into second position) because usually we >> > will be >> > less concerned by performance when all the rows must be retrieved >> > anyway. << >> > >> > I would respectfully disagree. In many cases a table scan can easily be >> > 5 or more times more efficient than reading all the rows via a >> > bookmark. And I mean a 4 byte integer bookmark, about as efficient as >> > it can be. >> > The difference in response time between 15 seconds and a minute may >> > matter, right? >> > > A simple example when it is not feasible to "take into account every
possibility with IF blocks": where order_date between @date_from and @date_to Should the optimizer use a non-clustered index on order_date? If @date_from and @date_to are 2 days apart (0.1% selectivity), then yes. If they are 2 years apart (35% selectivity) then no. If the table is big, and the price of running the wrong plan is high, it is cheaper to provide the optimizer with literals: where order_date between '20050101' and '20050104' Another example: Where last_name like 'S%' is not selective enough to justify the use of an index on last_name, while Where last_name like 'Kuznets%' is very selective. yitzak,
Using an IF statement to handle this will result in good performance, but if you use it in a stored procedure, you might have to use local variables to avoid parameter sniffing. If you know @my_name will never use 'special' characters and the column name does not contain NULLs, then you could also consider the following approach, because this will also perform very well, and you will only have to write your query once. WHERE name LIKE COALESCE(@my_name,'%') HTH, Gert-Jan Brian Selzer wrote: Show quote > > Neater? Does it matter if it's neat if it causes performance to go into the > toilet? > > name = COALESCE(@my_name, name) causes a table or clustered index scan in > SQL Server. So will @my_name IS NULL OR name = @my_name. Both can have a > drastic negative impact on performance. The correct way to handle this is > with an IF statement. > > "--CELKO--" <jcelko***@earthlink.net> wrote in message > news:1133661115.432608.245470@g49g2000cwa.googlegroups.com... > > name = COALESCE (@my_name, name) > > > > Is a little neater code. > > > this will also perform very well, and you will only Gert-Jan,> have to write your query once. > WHERE name LIKE COALESCE(@my_name,'%') I don't see that performing well in all the cases, unless there is a clustering index on name. Depending on @my_name, it may be more efficient either to scan the whole table, not using the index on name at all (name like '%'), or use the index on name to select a small percentage of rows (name like 'Gran%'). Makes sense? Indeed, if this is the only predicate in the query, then returning all
rows might not be efficient. But it depends on the situation. In many situations where you don't know up front whether there will be a search criterium or not (the case of the OP), returning more than a few hundred rows will cause performance problems anyway, and this is often countered by selecting only the TOP x rows with a message that there were more results. And in that scenario, it is not unusual to have the results sorted by search column. If this is for a report, and all rows are necessary, then using the IF statement will be faster. But then again: in that case the developer/DBA should already know upfront whether or not a search criterium is used. Gert-Jan Alexander Kuznetsov wrote: Show quote > > > this will also perform very well, and you will only > > have to write your query once. > > > WHERE name LIKE COALESCE(@my_name,'%') > > Gert-Jan, > > I don't see that performing well in all the cases, unless there is a > clustering index on name. Depending on @my_name, it may be more > efficient either to scan the whole table, not using the index on name > at all (name like '%'), or use the index on name to select a small > percentage of rows (name like 'Gran%'). > > Makes sense? For simple queries like that, you'll never see a performance problem.
However, for more complex queries, you may be hit by a big performance penalty if the SQL-Server chose a bad execution plan. This may be particularly true if you don't use the option WITH RECOMPILE because then it will use the Null value for its « parameters sniffing » and its execution plan will be skewed toward the use of the Null value, even if you pass another value instead. See the following article from Ken Henderson: http://blogs.msdn.com/khen1234/archive/2005/06/02/424228.aspx You can keep the test « ... @Lastname is null ... » but if I were you, I would remove the default value of Null. (BTW, its always a better idea to explicitely set all required parameters when calling a function or a SP; make things clearer when you re-read yourself a few months later.) Show quote "yitzak" <terrysha***@yahoo.co.uk> wrote in message news:1133634081.845124.272450@g14g2000cwa.googlegroups.com... > Hi > > It would SEEM to me that this is a rather neat way to write SPs > > Create proc mysp > > @LastName varchar(32) = NULL > @Date datetime > > As > > Select Fieldlist from tblOrders > where TranDate > @Date > and ( @Lastname is null or name = @LastName) > > i..e if @Lastname is provided then filter on it otherwise do not filter > on Name field. > > This is obvioulsy good for SP length (lines of code) but does it have a > performance hit? > >>From my testing doesn't seem to on SQL2000 - am I missing something? > before I write all my SPs like this.. > |
|||||||||||||||||||||||