Home All Groups Group Topic Archive Search About

set theory operations on two tables containing date range data

Author
13 Jan 2006 9:01 AM
peter walker
Hi everyone, I was hoping anyone came across implementing set theory
operations on two tables containing date ranges.

In specific the tables are set up as so:

Table A
[client] [start date] [ end date]
a   1   3
a   8   12
b   6    8
b   10  12
c   15   31


Table B
[client] [start date] [ end date]
a   5   9
a   2  4
b   20    21
b   23  25
c   25   35

I require three sets of operations:

1. A UNION B: which would give:

[client] [start date] [ end date]
a   1   4
a   5   12
b   6    8
b   10  12
b   20    21
b   23  25
c   15   35


2. A AND B: which would give:

[client] [start date] [ end date]
a   2   3
a   8   9
c   25  35


3. A MINUS B: which would give:

[client] [start date] [ end date]
a   1   1
a   10   12
c   15  24

These set operations take place between the same clients on both tables i.e.
client A.a operates on B.a etc

I have been thinking of making one huge time dimension table table with Time
in one column and a value of type bit in the other, and use this to do the
operations. However I feel that this is not the most effective approa ch.
Would anyone have any idea how to go about this?

Many thanks
Peter

Author
15 Jan 2006 10:54 PM
Hugo Kornelis
On Fri, 13 Jan 2006 19:01:46 +1000, peter walker wrote:

>Hi everyone, I was hoping anyone came across implementing set theory
>operations on two tables containing date ranges.

Hi Peter,

A numbers table is your friend. See http://www.aspfaq.com/2516 for info
on how to make a numbers table and some examples of its use.

Show quote
>In specific the tables are set up as so:
>
>Table A
>[client] [start date] [ end date]
>a   1   3
>a   8   12
>b   6    8
>b   10  12
>c   15   31
>
>
>Table B
>[client] [start date] [ end date]
>a   5   9
>a   2  4
>b   20    21
>b   23  25
>c   25   35

Too bad you didn't post these as CREATE TABLE and INSERT statements;
that would have made it a whole lot easier to test the solutions below.
See www.aspfaq.com/5006 for more information on the best way to ask
questions in theses groups.

>I require three sets of operations:
>
>1. A UNION B: which would give:
>
>[client] [start date] [ end date]
>a   1   4
>a   5   12
>b   6    8
>b   10  12
>b   20    21
>b   23  25
>c   15   35

Why are the first two rows (a 1 4 and a 5 12) not combined into one row
(a 1 12)?

This view will give you the rough data. I'll get to the query that
transforms this back into ranges in a minute.

CREATE VIEW A_Union_B
AS
SELECT      C.Client, N.Number
FROM       (SELECT Client
            FROM   A
            UNION
            SELECT Client
            FROM   B) AS C
CROSS JOIN  Numbers AS N
-- Note: If you have a large Numbers table, use the line below instead
--       This is for performance (the OR condition forces a table scan).
--CROSS JOIN (SELECT Number
--            FROM   Numbers
--            WHERE  Number BETWEEN 1 AND 50) AS N
WHERE       EXISTS (SELECT *
                    FROM   A
                    WHERE  C.Client = A.Client
                    AND    N.Number BETWEEN A.StartDate AND A.EndDate)
OR          EXISTS (SELECT *
                    FROM   B
                    WHERE  C.Client = B.Client
                    AND    N.Number BETWEEN B.StartDate AND B.EndDate)
go

>2. A AND B: which would give:
>
>[client] [start date] [ end date]
>a   2   3
>a   8   9
>c   25  35

Surely, the last row should have read c 25 31?

Again, here's the view for the rough data:

CREATE VIEW A_And_B
AS
SELECT      C.Client, N.Number
FROM       (SELECT Client
            FROM   A
            UNION
            SELECT Client
            FROM   B) AS C
CROSS JOIN  Numbers AS N
INNER JOIN  A
      ON    C.Client = A.Client
      AND   N.Number BETWEEN A.StartDate AND A.EndDate
INNER JOIN  B
      ON    C.Client = B.Client
      AND   N.Number BETWEEN B.StartDate AND B.EndDate
go

>3. A MINUS B: which would give:
>
>[client] [start date] [ end date]
>a   1   1
>a   10   12
>c   15  24

Why are the rows b 6 8 and b 10 12 not included in this output?

CREATE VIEW A_Minus_B
AS
SELECT      C.Client, N.Number
FROM       (SELECT Client
            FROM   A
            UNION
            SELECT Client
            FROM   B) AS C
CROSS JOIN  Numbers AS N
INNER JOIN  A
      ON    C.Client = A.Client
      AND   N.Number BETWEEN A.StartDate AND A.EndDate
WHERE NOT   EXISTS (SELECT *
                    FROM   B
                    WHERE  C.Client = B.Client
                    AND    N.Number BETWEEN B.StartDate AND B.EndDate)
go

The last step would be to find continuous ranges in the views and output
them on a one row per range basis. (Skip this step if it's not required
for your actual application, since it introduces some overhead - if you
can iterate through the rows on the client and transform into ranges in
the output there, do so!)

I'll show you the pattern for the last of the views. It'll be the same
pattern for them all.

SELECT   Client, MIN(Number) AS StartDate, MAX(Number) AS EndDate
FROM    (SELECT     A.Client, A.Number, COUNT(*) AS Cnt
         FROM       A_Minus_B AS A
         INNER JOIN Numbers   AS N
               ON   N.Number  <= A.Number
         LEFT  JOIN A_Minus_B AS B
               ON   B.Client   = A.Client
               AND  B.Number   = N.Number
         WHERE      B.Client  IS NULL
         GROUP BY   A.Client, A.Number) AS D
GROUP BY Client, Cnt
ORDER BY Client, StartDate

--
Hugo Kornelis, SQL Server MVP
Author
17 Jan 2006 3:19 AM
--CELKO--
Go to the University of Ariziona website and download a copy of the
Rick Snodgrass book on temporal queries in SQL and the code that comes
with it. 

Do not bother with Chris Date's book on this topic.

AddThis Social Bookmark Button