|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Using ranking functions in a subquery to feed a CLR function values in a specified order.calculations which are based on running values (which imply an order). Of course, using SQL Server for this is not a good idea, since all operations are set-based. However, with the introduction of ranking functions in SQL Server 2005, I was wondering if it was possible to do such a thing now. BTW, the understanding of some of the things I present here might be a little rudimentary, if not downright wrong, so I apologize in advance, and welcome any clarification or correction. So, say you have a select statement with a subquery: select t1.field1, t1.field2 from ( select t2.field1, t2.field2, t2.field3 from MyTable as t2 ) as t1 My understanding is that the subquery is going to feed the outer query. There is no set order to how the outer query will process the rows from the inner query. I obviously can't place an order on the subquery either. I can, however, do this: select t1.field1, t1.field2 from ( select t2.field1, t2.field2, t2.field3, row_number() over (t2.field1, t2.field2) as rowNumber from MyTable as t2 ) as t1 Now, the results from the subquery will be fed to the outer query ordered by field1, then field2. So far so good. At this point I can ask my first question, is this behavior well-defined, or is this an implementation detail? If it is an implementation detail, then the following questions don't matter, since I can't build a dependable solution upon this. If it is well-defined behavior, great. Now, say I want to perform a running calc which requires an order to the processing. For this example, I will use average cost after a trade. For those that are unfamiliar, average cost is basically a weighted average taking into account all the trades that have come before it. Now, say I have a CLR function which I have exposed to SQL Server that, given the inputs, I can produce an average cost for a given row. Of course, this function would take a correlation id of some sort so that it can store information it needs between calls (as well as a cleanup function called after the query to remove information that is stored statically based on the same correlation id). The function declaration in T-SQL would be something like this (I'm omitting the specifics of how it's bound to the CLR function for clarification purposes): create function xfn_AverageCost ( @correlationId uniqueidientifier, @rowNumber int, @price decimal(12, 8), @amount decimal(38, 2) ) returns decimal(12, 8) as ... Putting it together with the correlated sub-query, I could then do something like this in a stored procedure (assume that the TradeDate column is always unique and is a tie-breaker so that the order is clearly established): --- Get an id for correlation. declare @correlationId uniqueidentifier set @correlationId = newid() --- The query. select ot.TradeDate, ot.Amount, ot.Price, dbo.xfn_AverageCost(@correlationId, ot.RowNumber, ot.Price, ot.Amount from ( select t.TradeDate, t.Amount, t.Price, row_number() over(order by t.TradeDate) as RowNumber from Trades as t ) as ot --- Call another clr function to release resources that are held in calls using the correlation id. exec xsp_CleanupStoredClrResources @correlationId Given the above query, when the call to xfn_AverageCost is made on each row passed to it, will it be fed the values in the specified order from the sub-query? Also, do I have to worry about parallelism here? In other words, might the processing of the query be broken up into chunks which would have the values fed to the function out of order? Finally, if this is possible, is it valid, in that the behavior is well-defined, or no, because it is an implementation detail? Thanks in advance. -- - Nicholas Paldino [.NET/C# MVP] - mvp@spam.guard.caspershouse.com |
|||||||||||||||||||||||