Home All Groups Group Topic Archive Search About

Computing several columns for each row in source table and joining to get result

Author
5 Jan 2006 3:33 PM
dustbort
I have come across this several times now, and I cannot figure out how to do
it better.  Say I have a simple table called SourceTable:

DECLARE @sourceTable TABLE
(
    data1 INT,
    data2 INT,
    data3 INT,
    data4 INT
)

I need to create a table (view, tv function, etc.) that looks something like

DECLARE @resultTable TABLE
(
    data1 INT,
    data2 INT,
    data3 INT,
    data4 INT,
    date1 SMALLDATETIME,
    date2 SMALLDATETIME
)

where date1 and date2 are calculated (with functions) using data1...data4
from the same row plus another parameter supplied by the user.  So you see
what I want is so simple:  For each row in @sourceTable, evaluate a
table-valued function getDates() that returns a single row containing date1
and date2, and join the result to produce @resultTable.  However, I can't
figure out any syntax to do this straightforwardly.

In some cases where date2 depends on date1, I can use nested queries, so I
can do something like

SELECT
    data1,
    data2,
    data3,
    data4,
    date1,
    date2 = getDate2(@userInput, date1, data3, data4)
FROM (
    SELECT
        data1,
        data2,
        data3,
        data4,
        date1 = getDate1(@userInput, data1, data2)
    FROM
        @sourceTable
) T1

But recently, I have had several problems where it would be more efficient
and maintainable if I could return both date1 and date2 from a table-valued
function as a single row with two columns.  This is because the relationship
between date1 and date2 is more complicated and they can't just be computed
sequentially.  My first attempt was to write a TV function that basically
was

CREATE FUNCTION getDates (@userInput INT, @data1 INT, @data2 INT, @data3
INT, @data4 INT)
RETURNS @dates TABLE (date1 SMALLDATETIME, date2 SMALLDATETIME) AS
BEGIN
    DECLARE @date1 SMALLDATETIME
    SET @date1 = getDate1(@userInput, @data1, @data2)

    DECLARE @date2 SMALLDATETIME
    SET @date2 = getDate2(@userInput, @data3, @data4)

    IF (@date1 < @date2)
        SET @date1 = getDate1(@date2, @data1, @data2)

    INSERT INTO @dates
    SELECT @date1, @date2

    RETURN
END

I tried to join the function with the source table to get my result table as
follows:

SELECT
    ST.data1,
    ST.data2,
    ST.data3,
    ST.data4,
    D.date1,
    D.date2
FROM @sourceTable ST
    INNER JOIN getDates(
        @userInput,
        ST.data1,
        ST.data2,
        ST.data3,
        ST.data4) D

but SQL Server always complains when it reaches the 'ST' in the second
argument of getDates(), because apparently ST is not available in that
context.  I tried using a cursor to evaluate getDates() for each row in
@sourceTable and join the result to produce @resultTable, but something was
just wrong and the query batch would never finish executing in query
analyzer.  (I debugged and found that the cursor was implemented properly,
it was just extremely slow or was hanging in QA.)  For now, I am using a
several-level-deep nested query that performs the logic of of my getDates()
function.  Each query level performs one calculation or condition on one of
the two dates, and the rest of the columns just get carried along. For
example:

SELECT
    data1,
    data2,
    data3,
    data4,
    date1 = CASE WHEN (date1 < date2)
        THEN getDate1(date2, data1, data2)
        ELSE date1
    END,
    date2
FROM (
    SELECT
        data1,
        data2,
        data3,
        data4,
        date1,
        date2 = getDate2(@userInput, data3, data4)
    FROM (
        SELECT
            data1,
            data2,
            data3,
            data4,
            date1 = getDate1(@userInput, data1, data2)
        FROM
            @sourceTable
    ) RT1
) RT2

The query is actually a few levels deeper because I have to calculate other
things based on date1, and there are many more columns. This is horrible in
terms of readability and maintanability because the logic is distributed
throughout each level of the query, and I have to repeat all the columns at
each level.  If I could return more than one column from a correlated
subquery, I would be fine, but I don't believe this is possible.  Can
someone please help?

Author
5 Jan 2006 3:51 PM
Steve Kass
Dustbort,

SQL Server 2000 and earlier do not support "correlated joins",
which is what you are trying to write.  In your example, the
right-hand table is a table-valued function that is a different
table for each row of the left-hand table.

In SQL Server 2005, this can be done with the new
APPLY operator.  In 2000, there is no easy way,
though it's possible that there is an easier way to solve
your specific problem.

Steve Kass
Drew University



dustbort wrote:

Show quote
>I have come across this several times now, and I cannot figure out how to do
>it better.  Say I have a simple table called SourceTable:
>
>DECLARE @sourceTable TABLE
>(
>    data1 INT,
>    data2 INT,
>    data3 INT,
>    data4 INT
>)
>
>I need to create a table (view, tv function, etc.) that looks something like
>
>DECLARE @resultTable TABLE
>(
>    data1 INT,
>    data2 INT,
>    data3 INT,
>    data4 INT,
>    date1 SMALLDATETIME,
>    date2 SMALLDATETIME
>)
>
>where date1 and date2 are calculated (with functions) using data1...data4
>from the same row plus another parameter supplied by the user.  So you see
>what I want is so simple:  For each row in @sourceTable, evaluate a
>table-valued function getDates() that returns a single row containing date1
>and date2, and join the result to produce @resultTable.  However, I can't
>figure out any syntax to do this straightforwardly.
>
>In some cases where date2 depends on date1, I can use nested queries, so I
>can do something like
>
>SELECT
>    data1,
>    data2,
>    data3,
>    data4,
>    date1,
>    date2 = getDate2(@userInput, date1, data3, data4)
>FROM (
>    SELECT
>        data1,
>        data2,
>        data3,
>        data4,
>        date1 = getDate1(@userInput, data1, data2)
>    FROM
>        @sourceTable
>) T1
>
>But recently, I have had several problems where it would be more efficient
>and maintainable if I could return both date1 and date2 from a table-valued
>function as a single row with two columns.  This is because the relationship
>between date1 and date2 is more complicated and they can't just be computed
>sequentially.  My first attempt was to write a TV function that basically
>was
>
>CREATE FUNCTION getDates (@userInput INT, @data1 INT, @data2 INT, @data3
>INT, @data4 INT)
>RETURNS @dates TABLE (date1 SMALLDATETIME, date2 SMALLDATETIME) AS
>BEGIN
>    DECLARE @date1 SMALLDATETIME
>    SET @date1 = getDate1(@userInput, @data1, @data2)
>
>    DECLARE @date2 SMALLDATETIME
>    SET @date2 = getDate2(@userInput, @data3, @data4)
>
>    IF (@date1 < @date2)
>        SET @date1 = getDate1(@date2, @data1, @data2)
>
>    INSERT INTO @dates
>    SELECT @date1, @date2
>
>    RETURN
>END
>
>I tried to join the function with the source table to get my result table as
>follows:
>
>SELECT
>    ST.data1,
>    ST.data2,
>    ST.data3,
>    ST.data4,
>    D.date1,
>    D.date2
>FROM @sourceTable ST
>    INNER JOIN getDates(
>        @userInput,
>        ST.data1,
>        ST.data2,
>        ST.data3,
>        ST.data4) D
>
>but SQL Server always complains when it reaches the 'ST' in the second
>argument of getDates(), because apparently ST is not available in that
>context.  I tried using a cursor to evaluate getDates() for each row in
>@sourceTable and join the result to produce @resultTable, but something was
>just wrong and the query batch would never finish executing in query
>analyzer.  (I debugged and found that the cursor was implemented properly,
>it was just extremely slow or was hanging in QA.)  For now, I am using a
>several-level-deep nested query that performs the logic of of my getDates()
>function.  Each query level performs one calculation or condition on one of
>the two dates, and the rest of the columns just get carried along. For
>example:
>
>SELECT
>    data1,
>    data2,
>    data3,
>    data4,
>    date1 = CASE WHEN (date1 < date2)
>        THEN getDate1(date2, data1, data2)
>        ELSE date1
>    END,
>    date2
>FROM (
>    SELECT
>        data1,
>        data2,
>        data3,
>        data4,
>        date1,
>        date2 = getDate2(@userInput, data3, data4)
>    FROM (
>        SELECT
>            data1,
>            data2,
>            data3,
>            data4,
>            date1 = getDate1(@userInput, data1, data2)
>        FROM
>            @sourceTable
>    ) RT1
>) RT2
>
>The query is actually a few levels deeper because I have to calculate other
>things based on date1, and there are many more columns. This is horrible in
>terms of readability and maintanability because the logic is distributed
>throughout each level of the query, and I have to repeat all the columns at
>each level.  If I could return more than one column from a correlated
>subquery, I would be fine, but I don't believe this is possible.  Can
>someone please help?
>
>

>
Author
5 Jan 2006 4:17 PM
dustbort
Well, at least I know it wasn't just me.  Thanks!

Show quote
"Steve Kass" <sk***@drew.edu> wrote in message
news:%23hvbu$gEGHA.2012@TK2MSFTNGP14.phx.gbl...
> Dustbort,
>
> SQL Server 2000 and earlier do not support "correlated joins",
> which is what you are trying to write.  In your example, the
> right-hand table is a table-valued function that is a different
> table for each row of the left-hand table.
>
> In SQL Server 2005, this can be done with the new
> APPLY operator.  In 2000, there is no easy way,
> though it's possible that there is an easier way to solve
> your specific problem.
>
> Steve Kass
> Drew University
>
>
>
> dustbort wrote:
>
>>I have come across this several times now, and I cannot figure out how to
>>do it better.  Say I have a simple table called SourceTable:
>>
>>DECLARE @sourceTable TABLE
>>(
>>    data1 INT,
>>    data2 INT,
>>    data3 INT,
>>    data4 INT
>>)
>>
>>I need to create a table (view, tv function, etc.) that looks something
>>like
>>
>>DECLARE @resultTable TABLE
>>(
>>    data1 INT,
>>    data2 INT,
>>    data3 INT,
>>    data4 INT,
>>    date1 SMALLDATETIME,
>>    date2 SMALLDATETIME
>>)
>>
>>where date1 and date2 are calculated (with functions) using data1...data4
>>from the same row plus another parameter supplied by the user.  So you see
>>what I want is so simple:  For each row in @sourceTable, evaluate a
>>table-valued function getDates() that returns a single row containing
>>date1 and date2, and join the result to produce @resultTable.  However, I
>>can't figure out any syntax to do this straightforwardly.
>>
>>In some cases where date2 depends on date1, I can use nested queries, so I
>>can do something like
>>
>>SELECT
>>    data1,
>>    data2,
>>    data3,
>>    data4,
>>    date1,
>>    date2 = getDate2(@userInput, date1, data3, data4)
>>FROM (
>>    SELECT
>>        data1,
>>        data2,
>>        data3,
>>        data4,
>>        date1 = getDate1(@userInput, data1, data2)
>>    FROM
>>        @sourceTable
>>) T1
>>
>>But recently, I have had several problems where it would be more efficient
>>and maintainable if I could return both date1 and date2 from a
>>table-valued function as a single row with two columns.  This is because
>>the relationship between date1 and date2 is more complicated and they
>>can't just be computed sequentially.  My first attempt was to write a TV
>>function that basically was
>>
>>CREATE FUNCTION getDates (@userInput INT, @data1 INT, @data2 INT, @data3
>>INT, @data4 INT)
>>RETURNS @dates TABLE (date1 SMALLDATETIME, date2 SMALLDATETIME) AS
>>BEGIN
>>    DECLARE @date1 SMALLDATETIME
>>    SET @date1 = getDate1(@userInput, @data1, @data2)
>>
>>    DECLARE @date2 SMALLDATETIME
>>    SET @date2 = getDate2(@userInput, @data3, @data4)
>>
>>    IF (@date1 < @date2)
>>        SET @date1 = getDate1(@date2, @data1, @data2)
>>
>>    INSERT INTO @dates
>>    SELECT @date1, @date2
>>
>>    RETURN
>>END
>>
>>I tried to join the function with the source table to get my result table
>>as follows:
>>
>>SELECT
>>    ST.data1,
>>    ST.data2,
>>    ST.data3,
>>    ST.data4,
>>    D.date1,
>>    D.date2
>>FROM @sourceTable ST
>>    INNER JOIN getDates(
>>        @userInput,
>>        ST.data1,
>>        ST.data2,
>>        ST.data3,
>>        ST.data4) D
>>
>>but SQL Server always complains when it reaches the 'ST' in the second
>>argument of getDates(), because apparently ST is not available in that
>>context.  I tried using a cursor to evaluate getDates() for each row in
>>@sourceTable and join the result to produce @resultTable, but something
>>was just wrong and the query batch would never finish executing in query
>>analyzer.  (I debugged and found that the cursor was implemented properly,
>>it was just extremely slow or was hanging in QA.)  For now, I am using a
>>several-level-deep nested query that performs the logic of of my
>>getDates() function.  Each query level performs one calculation or
>>condition on one of the two dates, and the rest of the columns just get
>>carried along. For example:
>>
>>SELECT
>>    data1,
>>    data2,
>>    data3,
>>    data4,
>>    date1 = CASE WHEN (date1 < date2)
>>        THEN getDate1(date2, data1, data2)
>>        ELSE date1
>>    END,
>>    date2
>>FROM (
>>    SELECT
>>        data1,
>>        data2,
>>        data3,
>>        data4,
>>        date1,
>>        date2 = getDate2(@userInput, data3, data4)
>>    FROM (
>>        SELECT
>>            data1,
>>            data2,
>>            data3,
>>            data4,
>>            date1 = getDate1(@userInput, data1, data2)
>>        FROM
>>            @sourceTable
>>    ) RT1
>>) RT2
>>
>>The query is actually a few levels deeper because I have to calculate
>>other things based on date1, and there are many more columns. This is
>>horrible in terms of readability and maintanability because the logic is
>>distributed throughout each level of the query, and I have to repeat all
>>the columns at each level.  If I could return more than one column from a
>>correlated subquery, I would be fine, but I don't believe this is
>>possible.  Can someone please help?
>>
>>

AddThis Social Bookmark Button