|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
I smell a big, fat compiler bugSQLServer performance tuning. The last couple of weeks I've been doing this at a place using a whole lot of inline table-valued UDFs, an issue both in itself and because the database does a LOT of joining, and in this place, I've been finding one technique is working for me just a little too often. Time and again, there is some complex query that looks like it should be OK, but is running 100k's of logical reads and taking ten, twenty, thirty seconds to return a handfull of records. No doubt someone who can really, really read execution plans might be able to glance at the plan and see what's wrong, but I haven't reached quite that point yet. Time and again, what I find is that if I break out a chunk of code and run it stand-alone in QA, suddenly it runs "correctly", that is, with 99% fewer reads and 80-99% faster execution. OK, we've all seen that, we understand how much more the optimizer knows with explicit constants and such, and how SPs have to live with "parameter sniffing" and such. But I've gone the next step, wrapped ten or twenty or fifty lines in a relatively small SP or UDF, and darned if it doesn't still run fast, with all the same parameters and variables that it had in the original big SP. So, I accuse the compiler/optimizer of SQLServer 2000 sp 3x: given a large SP, it is not properly resetting the flags or counters or other internal doodads it uses to compute optimizations. Suggestion to speed freaks: break apart large queries inside the SP, often that works, but when that doesn't work, try moving chunks into separately compiled SPs, and you may find significantly better performance falls your way. Microsoft, take a look, wouldya? J. Hi
Without seeing the procedures then it is hard to comment on this. You may want to read Kalens articles on Reuse of Query Plans and Conditional Recompilation in SQL Server Magazine also http://www.sql-server-performance.com/mw_sql_server_udfs.asp may be useful reading. John Show quote "JXStern" wrote: > Last two or three years now I've been spending a lot of time doing > SQLServer performance tuning. The last couple of weeks I've been > doing this at a place using a whole lot of inline table-valued UDFs, > an issue both in itself and because the database does a LOT of > joining, and in this place, I've been finding one technique is working > for me just a little too often. > > Time and again, there is some complex query that looks like it should > be OK, but is running 100k's of logical reads and taking ten, twenty, > thirty seconds to return a handfull of records. No doubt someone who > can really, really read execution plans might be able to glance at the > plan and see what's wrong, but I haven't reached quite that point yet. > Time and again, what I find is that if I break out a chunk of code and > run it stand-alone in QA, suddenly it runs "correctly", that is, with > 99% fewer reads and 80-99% faster execution. OK, we've all seen that, > we understand how much more the optimizer knows with explicit > constants and such, and how SPs have to live with "parameter sniffing" > and such. But I've gone the next step, wrapped ten or twenty or fifty > lines in a relatively small SP or UDF, and darned if it doesn't still > run fast, with all the same parameters and variables that it had in > the original big SP. > > So, I accuse the compiler/optimizer of SQLServer 2000 sp 3x: given a > large SP, it is not properly resetting the flags or counters or other > internal doodads it uses to compute optimizations. > > Suggestion to speed freaks: break apart large queries inside the SP, > often that works, but when that doesn't work, try moving chunks into > separately compiled SPs, and you may find significantly better > performance falls your way. > > Microsoft, take a look, wouldya? > > J. > > On Thu, 11 Aug 2005 23:31:16 -0700, John Bell
<jbellnewspo***@hotmail.com> wrote: >Without seeing the procedures then it is hard to comment on this. You may Yes, I'm aware of the threat and menace of scalar UDFs in where>want to read Kalens articles on Reuse of Query Plans and Conditional >Recompilation in SQL Server Magazine also >http://www.sql-server-performance.com/mw_sql_server_udfs.asp may be useful >reading. clauses and the like, but that isn't what I'm talking about here. I'm not going to try to post actual code and plans from the app, cuz (even if the code were not proprietary) it would take the schema and stats as well to make any sense out of it. What I'm fishing for here are similar stories and suspicions, or any official or unofficial further descriptions of the problem or workarounds. If I find myself with lots of time, I may try to synthesize an entire duplicatable scenario, but it's liable to require a bunch of tables and data to get the thing off the ground. J. Hi
Well, can you post some data that we will be able to reprocude the problem/bug? Also, at least, can you show an execution plan of the query? Show quote "JXStern" <JXSternChange***@gte.net> wrote in message news:aieof11bdp2vfid400cslukgm94ivkqnjh@4ax.com... > Last two or three years now I've been spending a lot of time doing > SQLServer performance tuning. The last couple of weeks I've been > doing this at a place using a whole lot of inline table-valued UDFs, > an issue both in itself and because the database does a LOT of > joining, and in this place, I've been finding one technique is working > for me just a little too often. > > Time and again, there is some complex query that looks like it should > be OK, but is running 100k's of logical reads and taking ten, twenty, > thirty seconds to return a handfull of records. No doubt someone who > can really, really read execution plans might be able to glance at the > plan and see what's wrong, but I haven't reached quite that point yet. > Time and again, what I find is that if I break out a chunk of code and > run it stand-alone in QA, suddenly it runs "correctly", that is, with > 99% fewer reads and 80-99% faster execution. OK, we've all seen that, > we understand how much more the optimizer knows with explicit > constants and such, and how SPs have to live with "parameter sniffing" > and such. But I've gone the next step, wrapped ten or twenty or fifty > lines in a relatively small SP or UDF, and darned if it doesn't still > run fast, with all the same parameters and variables that it had in > the original big SP. > > So, I accuse the compiler/optimizer of SQLServer 2000 sp 3x: given a > large SP, it is not properly resetting the flags or counters or other > internal doodads it uses to compute optimizations. > > Suggestion to speed freaks: break apart large queries inside the SP, > often that works, but when that doesn't work, try moving chunks into > separately compiled SPs, and you may find significantly better > performance falls your way. > > Microsoft, take a look, wouldya? > > J. > Yes , I also experienced this. I have a sp 1000 lines long and when I run it
took too much time and when I devided it in 4 small sps it run about 150% faster. Even execution plan is almost same I can not find what causes the sp to run faster when it is in small code. May be locks are not released when they should be ? Show quote "Uri Dimant" <u***@iscar.co.il> wrote in message news:eZssQiwnFHA.1948@TK2MSFTNGP12.phx.gbl... > Hi > Well, can you post some data that we will be able to reprocude the > problem/bug? > > Also, at least, can you show an execution plan of the query? > > > > > > "JXStern" <JXSternChange***@gte.net> wrote in message > news:aieof11bdp2vfid400cslukgm94ivkqnjh@4ax.com... > > Last two or three years now I've been spending a lot of time doing > > SQLServer performance tuning. The last couple of weeks I've been > > doing this at a place using a whole lot of inline table-valued UDFs, > > an issue both in itself and because the database does a LOT of > > joining, and in this place, I've been finding one technique is working > > for me just a little too often. > > > > Time and again, there is some complex query that looks like it should > > be OK, but is running 100k's of logical reads and taking ten, twenty, > > thirty seconds to return a handfull of records. No doubt someone who > > can really, really read execution plans might be able to glance at the > > plan and see what's wrong, but I haven't reached quite that point yet. > > Time and again, what I find is that if I break out a chunk of code and > > run it stand-alone in QA, suddenly it runs "correctly", that is, with > > 99% fewer reads and 80-99% faster execution. OK, we've all seen that, > > we understand how much more the optimizer knows with explicit > > constants and such, and how SPs have to live with "parameter sniffing" > > and such. But I've gone the next step, wrapped ten or twenty or fifty > > lines in a relatively small SP or UDF, and darned if it doesn't still > > run fast, with all the same parameters and variables that it had in > > the original big SP. > > > > So, I accuse the compiler/optimizer of SQLServer 2000 sp 3x: given a > > large SP, it is not properly resetting the flags or counters or other > > internal doodads it uses to compute optimizations. > > > > Suggestion to speed freaks: break apart large queries inside the SP, > > often that works, but when that doesn't work, try moving chunks into > > separately compiled SPs, and you may find significantly better > > performance falls your way. > > > > Microsoft, take a look, wouldya? > > > > J. > > > > Hi
You may want to look at http://www.sql-server-performance.com/rd_optimizing_sp_recompiles.asp and subsequent links if you are getting recompiles. John Show quote "AM" wrote: > > > Yes , I also experienced this. I have a sp 1000 lines long and when I run it > took too much time and when I devided it in 4 small sps it run about 150% > faster. > Even execution plan is almost same I can not find what causes the sp to run > faster when it is in small code. > May be locks are not released when they should be ? > > "Uri Dimant" <u***@iscar.co.il> wrote in message > news:eZssQiwnFHA.1948@TK2MSFTNGP12.phx.gbl... > > Hi > > Well, can you post some data that we will be able to reprocude the > > problem/bug? > > > > Also, at least, can you show an execution plan of the query? > > > > > > > > > > > > "JXStern" <JXSternChange***@gte.net> wrote in message > > news:aieof11bdp2vfid400cslukgm94ivkqnjh@4ax.com... > > > Last two or three years now I've been spending a lot of time doing > > > SQLServer performance tuning. The last couple of weeks I've been > > > doing this at a place using a whole lot of inline table-valued UDFs, > > > an issue both in itself and because the database does a LOT of > > > joining, and in this place, I've been finding one technique is working > > > for me just a little too often. > > > > > > Time and again, there is some complex query that looks like it should > > > be OK, but is running 100k's of logical reads and taking ten, twenty, > > > thirty seconds to return a handfull of records. No doubt someone who > > > can really, really read execution plans might be able to glance at the > > > plan and see what's wrong, but I haven't reached quite that point yet. > > > Time and again, what I find is that if I break out a chunk of code and > > > run it stand-alone in QA, suddenly it runs "correctly", that is, with > > > 99% fewer reads and 80-99% faster execution. OK, we've all seen that, > > > we understand how much more the optimizer knows with explicit > > > constants and such, and how SPs have to live with "parameter sniffing" > > > and such. But I've gone the next step, wrapped ten or twenty or fifty > > > lines in a relatively small SP or UDF, and darned if it doesn't still > > > run fast, with all the same parameters and variables that it had in > > > the original big SP. > > > > > > So, I accuse the compiler/optimizer of SQLServer 2000 sp 3x: given a > > > large SP, it is not properly resetting the flags or counters or other > > > internal doodads it uses to compute optimizations. > > > > > > Suggestion to speed freaks: break apart large queries inside the SP, > > > often that works, but when that doesn't work, try moving chunks into > > > separately compiled SPs, and you may find significantly better > > > performance falls your way. > > > > > > Microsoft, take a look, wouldya? > > > > > > J. > > > > > > > > > > I had much the same problem on a project a couple of years ago. It
usually turned out that a less than optimal plan was cached for the original SP - mostly due to the SP being run the first time with non-typical params. I used DBCC FREEPROCCACHE and DBCC DROPCLEANBUFFERS to get a more accurate benchmark. Good luck. Payson Thanks for your feedback.
I will try to give you a few pieces of information that may help explain what you are seeing (which, as others have mentioned, would be more definitive with the plan + additional information in question) - so, I am guessing. User latency for ad hoc queries is generally a function of compilation time + execution time. If your query is cached and run multiple times, it is likely a function of execution time alone. In either case, there are a number of algorithmic problems that can impact plan choice and thus user latency, even if the query is run multiple times. For example, if you join N tables together, there are a large number of ways to actually execute this query based on the join order picked. Once you have a sufficiently large number of tables, it actually becomes algorithmically intractible to search the complete set of possible cases (in a reasonable time - however, eventually it just becomes impossible). So, heuristics are used to try to find a good join order relatively quickly. This may or may not be the exact, optimal plan for a particular machine. Alternatives are considered from this "spot" in the universe of possible join orders. In some cases, it may be that the initial heuristic order is not "close" to the optimal choice and significant searching is required to get to the right plan. Additionally, it may be that the optimal plan is so far away from the optimal choice as to make it impossible to consider that choice in a reasonable amount of time using the heuristics I mentioned. If your query is one of these cases, manually rewriting the query to remove some of the joins could possibly make things run faster for your query + data. It's not impossible to have such a case. I'll suggest that you can also use query hints, in many cases, to force the join order into one that may be more optimal for your query. Root causes for picking an initial, heuristic order that is suboptimal include missing statistics, statistics with a sample rate that is too low, physical grouping of values on disk that causes problems in sampled statistics interpolation, correlations between columns that cause the optimizer to over- or under-estimate cardinality for a join or filter, or others. In some cases, the technique you describe (moving part of the query into a temporary result) can correct errors because part of the query becomes materialized - the cardinality of that intermediate result thus does not contain some of the errors and the rest of the query may have enough information to pick a good join order. Join order is one way in which this condition can happen - others also exist, but the basic impact is generally the same. I hope that this gives you some insight into the complexities involved in this process. Sincerely, Conor Cunningham SQL Server Query Optimization Development Lead Show quote "JXStern" <JXSternChange***@gte.net> wrote in message news:aieof11bdp2vfid400cslukgm94ivkqnjh@4ax.com... > Last two or three years now I've been spending a lot of time doing > SQLServer performance tuning. The last couple of weeks I've been > doing this at a place using a whole lot of inline table-valued UDFs, > an issue both in itself and because the database does a LOT of > joining, and in this place, I've been finding one technique is working > for me just a little too often. > > Time and again, there is some complex query that looks like it should > be OK, but is running 100k's of logical reads and taking ten, twenty, > thirty seconds to return a handfull of records. No doubt someone who > can really, really read execution plans might be able to glance at the > plan and see what's wrong, but I haven't reached quite that point yet. > Time and again, what I find is that if I break out a chunk of code and > run it stand-alone in QA, suddenly it runs "correctly", that is, with > 99% fewer reads and 80-99% faster execution. OK, we've all seen that, > we understand how much more the optimizer knows with explicit > constants and such, and how SPs have to live with "parameter sniffing" > and such. But I've gone the next step, wrapped ten or twenty or fifty > lines in a relatively small SP or UDF, and darned if it doesn't still > run fast, with all the same parameters and variables that it had in > the original big SP. > > So, I accuse the compiler/optimizer of SQLServer 2000 sp 3x: given a > large SP, it is not properly resetting the flags or counters or other > internal doodads it uses to compute optimizations. > > Suggestion to speed freaks: break apart large queries inside the SP, > often that works, but when that doesn't work, try moving chunks into > separately compiled SPs, and you may find significantly better > performance falls your way. > > Microsoft, take a look, wouldya? > > J. > Connor,
Thank you for your response. I'll comment on a few of your notes below. However, my accusation remains in place, that I am seeing too many cases where simply removing a lump of code to a separate SP gets a totally different (and better) result from the compiler/optimizer, such that I suspect it is not properly finding cutpoints in the text of large SPs. Joshua Stern On Fri, 12 Aug 2005 13:01:35 -0700, "Conor Cunningham [MS]" <conorc_remov***@online.microsoft.com> wrote: >Thanks for your feedback. I make certain to null that out through reruns, which also generally> >I will try to give you a few pieces of information that may help explain >what you are seeing (which, as others have mentioned, would be more >definitive with the plan + additional information in question) - so, I am >guessing. > >User latency for ad hoc queries is generally a function of compilation time >+ execution time. If your query is cached and run multiple times, it is >likely a function of execution time alone. ends up caching the data and getting down to zero physical reads. Most of the SPs I'm talking about here have runtimes of at least a minute when I begin to work on them, and still take five or ten seconds after improvements are in place, so the compilation is not likely a huge factor in any case (well, except where there are a lot of temp tables involved, but even there it's a fairly minor point, which I realize I could check further with the profiler and such). Show quote >In either case, there are a number of algorithmic problems that can impact All true in theory and practice, yet I wonder if there isn't some room>plan choice and thus user latency, even if the query is run multiple times. > >For example, if you join N tables together, there are a large number of ways >to actually execute this query based on the join order picked. Once you >have a sufficiently large number of tables, it actually becomes >algorithmically intractible to search the complete set of possible cases (in >a reasonable time - however, eventually it just becomes impossible). So, >heuristics are used to try to find a good join order relatively quickly. >This may or may not be the exact, optimal plan for a particular machine. >Alternatives are considered from this "spot" in the universe of possible >join orders. In some cases, it may be that the initial heuristic order is >not "close" to the optimal choice and significant searching is required to >get to the right plan. Additionally, it may be that the optimal plan is so >far away from the optimal choice as to make it impossible to consider that >choice in a reasonable amount of time using the heuristics I mentioned. for improvement here. Where I have had success breaking a join of six or sixteen tables into multiple queries, it hasn't seemed all that difficult for me to guess how to decompose them best, and I do wonder at how much better the heuristics could be. >If your query is one of these cases, manually rewriting the query to remove Never had much luck with hints.>some of the joins could possibly make things run faster for your query + >data. It's not impossible to have such a case. I'll suggest that you can >also use query hints, in many cases, to force the join order into one that >may be more optimal for your query. What I'd like are optimizer hints, like "hey, this temp table is going to only hold ten rows, so please use it as the root of your plan!") >Root causes for picking an initial, heuristic order that is suboptimal I don't believe I've had much of the cardinality issue, fwiw.>include missing statistics, statistics with a sample rate that is too low, >physical grouping of values on disk that causes problems in sampled >statistics interpolation, correlations between columns that cause the >optimizer to over- or under-estimate cardinality for a join or filter, or >others. In some cases, the technique you describe (moving part of the query >into a temporary result) can correct errors because part of the query >becomes materialized - the cardinality of that intermediate result thus does >not contain some of the errors and the rest of the query may have enough >information to pick a good join order. >Join order is one way in which this condition can happen - others also Any hints here, should the major tables be listed *last*? I think>exist, but the basic impact is generally the same. most developers tend to list them first. I recall back in 6.x days there was some order-dependency in where clauses (?) that I believe went away as of SQL7. >I hope that this gives you some insight into the complexities involved in Yes, it does. Until I started to specialize in this area, I had no>this process. proper appreciation for just what it takes to turn a couple of lines of SQL into an appropriate plan. I guess I would now wish for some additional "transparency" on just WHY the optimizer chooses what it does, so that I could use that as a basis for reorganizing code as necessary. No doubt there are all sorts of additional topics to explore in the area. But all in all, I *still* think there's a big, fat bug in the compiler right now that is preventing it from doing proper optimization of (logically independent) chunks of code in large SPs. Joshua Stern |
|||||||||||||||||||||||