Home All Groups Group Topic Archive Search About

MINUS operation between table date ranges, is my algorithm sound?

Author
14 Jan 2006 2:55 AM
peter walker
Hi everyone, i was hoping someone could help me.

i am looking to implement a MINUS operation between two tables which contain
date ranges.

table a:
[start] [finish] [group]
1   10    0
18 19    1
23  26   2
28 31    3


table b:
[start] [finish]
4   5
18 18
25  28

Result of table a - table b:
[start] [finish]
1   3
6   10
19 19
23  24
29 31

Would anyone know how to implement this? I have done this by firstly
performing an AND operation between the two tables:

table a AND table b = table c (here the grouping colums tells me which group
from table a the date range is associated with

[start] [finish] [group]
4   5     0
18 18   1
25  26   2
28 28    3

Then I throw table a and c into another table (d). Here I assign:

start date of table a: status = 1
end date of table a: status = 0
start date of table b: status = 0
end date of table b: status = 1
start of date range in table a: boundary = 0
end of date range for table a: boundary = 2
boundary = 1 for all data coming from table b


table d:
[date] [status] [group] [boundary]
1    1      0   1
10  0      0   3
18  1      1   1
19  0      1   3
23   1     2    1
26   0     2   3
28   1     3   1
31   0     3   3
4     0     0   2
5    1      0   2
18   0     1   2
18   1     1   2
25    0     2  2
26   1      2   2
28    0     3   2
28   1      3   2

I then sort using the SQL statement: order by 3,1,2 to get:

table d:
[date] [status] [group] [boundary]
1    1      0   1
4     0     0   2
5    1      0   2
10  0      0   3

18  1      1   1
18   0     1   2
18   1     1   2
19  0      1   3

23   1     2    1
25    0     2  2
26   1      2   2
26   0     2   3

28   1     3   1
28    0     3   2
28   1      3   2
31   0     3   3

I then update the table according to:

case
when boundary = 2 and status = 0 then data = date -1
when boundary = 2 and status = 1 then data = date +1
end

and I get:

table d:
[date] [status] [group] [boundary]
1    1      0   1
3     0     0   2
6    1      0   2
10  0      0   3

18  1      1   1
17   0     1   2
19   1     1   2
19  0      1   3

23   1     2    1
24    0     2  2
27   1      2   2
26   0     2   3

28   1     3   1
27    0     3   2
29   1      3   2
31   0     3   3

I then filter the table and look for adjacient row pairs where row(i)=1 and
row(i+1)=0 :

table d:
[date] [status] [group] [boundary]
1    1      0   1
3     0     0   2

6    1      0   2
10  0      0   3

18  1      1   1
17   0     1   2

19   1     1   2
19  0      1   3

23   1     2    1
24    0     2  2

27   1      2   2
26   0     2   3

28   1     3   1
27    0     3   2

29   1      3   2
31   0     3   3


I delete those pairs where (date(i) < date(i+1) and boundary(i) >
boundary(i+1) to get:

table d:
[date] [status] [group] [boundary]
1    1      0   1
3     0     0   2

6    1      0   2
10  0      0   3

19   1     1   2
19  0      1   3

23   1     2    1
24    0     2  2

29   1      3   2
31   0     3   3

This then gives me the resultant table im looking for:

table d recast:
[start] [finish]
1    3
6    10
19  19
23   24
29  31

Is my approach sound? I am yet to encode the above in SQL. Maybe there is a
faster approach to getting this result?

Any help most appreciated!

cheers, peter

Author
14 Jan 2006 9:06 AM
David Portas
Show quote
"peter walker" <pwalker@nospam.com> wrote in message
news:ucoD$XLGGHA.3056@TK2MSFTNGP09.phx.gbl...
> Hi everyone, i was hoping someone could help me.
>
> i am looking to implement a MINUS operation between two tables which
> contain date ranges.
>
> table a:
> [start] [finish] [group]
> 1   10    0
> 18 19    1
> 23  26   2
> 28 31    3
>
>
> table b:
> [start] [finish]
> 4   5
> 18 18
> 25  28
>
> Result of table a - table b:
> [start] [finish]
> 1   3
> 6   10
> 19 19
> 23  24
> 29 31
>
> Would anyone know how to implement this? I have done this by firstly

Read my signature to see how much information you left out.

Let's assume your tables look like this:

CREATE TABLE a (start INTEGER NOT NULL PRIMARY KEY /* ?? Was not specified
*/, finish INTEGER NOT NULL, CHECK (start<=finish) /* ??? Was not specified
*/, grp INTEGER NOT NULL /* "GROUP" is a reserved word - not a good column
name */);

CREATE TABLE b (start INTEGER NOT NULL PRIMARY KEY /* ?? */, finish INTEGER
NOT NULL, CHECK (start<=finish));

Your sample data:

INSERT INTO a (start,finish,grp)
SELECT 1, 10, 0 UNION ALL
SELECT 18, 19, 1 UNION ALL
SELECT 23,  26, 2 UNION ALL
SELECT 28, 31, 3;

INSERT INTO b (start, finish)
SELECT 4, 5 UNION ALL
SELECT 18, 18 UNION ALL
SELECT 25, 28;

I'll also assume you have a table of numbers - all integers from 0 to some
arbitrarily large number. One way to get the missing numbers would be like
this:

SELECT n.num
FROM numbers AS n
LEFT JOIN b
  ON n.num BETWEEN b.start AND b.finish
WHERE b.start IS NULL
  AND n.num BETWEEN
  (SELECT MIN(start)
   FROM a) AND
  (SELECT MAX(finish)
   FROM a);

I don't quite understand the significance of the "group" column here. Please
give us a better spec if you need more help.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--
Author
14 Jan 2006 9:45 AM
mag1kus
Hi, thanks for the post. Im the original poster of this topic (at work
using a friend's account).

Please disregard the [group] column in my initial description of the
problem. I used a [group] column within my algorithm. In effect I would
like to know the best way to implement a subtraction of date ranges as
follows:

table a:
[start] [finish]
1   10
18 19
23  26
28 31


table b:
[start] [finish]
4   5
18 18
25  28


Result of table a - table b:
[start] [finish]
1   3
6   10
19 19
23  24
29 31


in effect, if a date range in table_b intersects a date range in
table_a, then that data in the intersection is removed from the date
range in table_a

for example, the following are example cases where we subtract from a
range in table_a where there are ranges in table_b which intersect with
that range in table_a:

1. {23.....37}  - {25....29} = {23....24}
2. {23.....37}  - {26....32} = {23....25}, {33....37}
3. {23.....37}  - [ {25...27} ,  {31...33} ] = {23....24}, {28...30},
{34...37}

Any help on this would be great!

Many thanks.
peter
Author
14 Jan 2006 12:09 PM
Andrey Odegov
Peter,

this is very ugly :)
but the SELECT statement should work
i use  a table of natural numbers proposed by David
David, pardon me for the use of your idea :)

SELECT MIN(U.seq) as start, MAX(U.seq) as finish
  FROM (SELECT G.seq, G.seq - COUNT(*)
          FROM (SELECT GS.seq
                  FROM TableA AS GA, Sequence AS GS
                 WHERE GS.seq BETWEEN GA.start AND GA.finish
                   AND NOT EXISTS(SELECT *
                                    FROM TableB AS GB, Sequence AS GSS
                                   WHERE GSS.seq BETWEEN GB.start AND
GB.finish
                                     AND GSS.seq = GS.seq)) AS G,
               (SELECT LS.seq
                  FROM TableA AS LA, Sequence AS LS
                 WHERE LS.seq BETWEEN LA.start AND LA.finish
                   AND NOT EXISTS(SELECT *
                                    FROM TableB AS LB, Sequence AS LSS
                                   WHERE LSS.seq BETWEEN LB.start AND
LB.finish
                                     AND LSS.seq = LS.seq)) AS L
         WHERE L.seq <= G.seq
         GROUP BY G.seq) AS U(seq, gb)
GROUP BY U.gb;

---
Andrey Odegov
avode***@yandex.ru
(remove GOV to respond)
Author
15 Jan 2006 10:54 PM
Hugo Kornelis
On Sat, 14 Jan 2006 12:55:36 +1000, peter walker wrote:

>Hi everyone, i was hoping someone could help me.
(snip)

Hi Peter,

I just posted a reply to your first thread about this issue.

--
Hugo Kornelis, SQL Server MVP
Author
24 Jan 2006 8:39 AM
jsfromynr
Hello There,
I hope this might solve your problem.
Create Table TableA
(
[start] int,
[finish] int,
[group] int
)
Go
Insert into TableA
Select 1   ,10    ,0
Union All
Select 18, 19    ,1
Union All
Select 23,  26   ,2
Union All
Select 28 ,31    ,3
Go

Create Table TableB
(
[start] int,
[finish] int
)
Go
Insert into TableB
Select 4   ,5
Union All
Select 18 ,18
Union All
Select 25  ,28
Go
Create View vwTmpData
As
Select * From (
Select 1 N
Union All
Select 2
Union All
Select 3
Union All
Select 4
Union All
Select 5
Union All
Select 6
Union All
Select 7
Union All
Select 8
Union All
Select 9
Union All
Select 10
Union All
Select 11
Union All
Select 12
Union All
Select 13
Union All
Select 14
Union All
Select 15
Union All
Select 16
Union All
Select 17
Union All
Select 18
Union All
Select 19
Union All
Select 20
Union All
Select 21
Union All
Select 22
Union All
Select 23
Union All
Select 24
Union All
Select 25
Union All
Select 26
Union All
Select 27
Union All
Select 28
Union All
Select 29
Union All
Select 30
Union All
Select 31
Union All
Select 32
Union All
Select 33
) Seq Inner Join TableA T1 On N Between T1.start and T1.finish
Where N Not In (Select Start From tableB Union Select Finish From
TableB)

Go
Select identity(int,1,1) N1,* into tmpData From vwTmpData

Update tmpData Set [group] = [group] + 1
Where N - N1> 0

Select Min(N) Start,Max(N) Finish From tmpData Group by [group]


Drop Table tmpData
Drop View vwTmpData
Drop Table TableA
Drop Table TableB

With Warm regards
Jatinder Singh

AddThis Social Bookmark Button