Home All Groups Group Topic Archive Search About
Author
30 Jun 2005 4:16 PM
Roy
Hi all,

I need to identify duplicate data (based on the non-key columns) in two
tables that have a child parent relationship:
Table1(ColID, col1, col2, col3, col4)
Table2(CID, colID, c1, c2, c3)
These are columns that could have same data: Table1(col2, col3, col4) and
Table2(c2, c3) where Table1.colID = Table2.colID
For example:
Table1
ColID    col1    col2    col3    col4
1    'John'    'ABC'    34     0
2    'Joe'    'ABC'    34    0
3    'jeff'    'XYZ'    23    1

In this table row 1 and 2 have same data for col2, col3, and col4, so we
have colID=1,2.
Now in Table2 we have:
CID    colID    c1    c2    c3
1    1    'A'    'AB'    'BC'
2    1    'B'    'AB'    'BC'
3    1    'C'    'AC'    'AD'
4    2    'D'    'CD'    'CE'
5    3    'E'    'EC'    'EF'

colID = 1 has three records in this table which the first two records have
same data in c2 and c3.
Query needs to identify this record and returns:
ColID col1    col2   col3  col4  cid  c1  c2   c3
1     'John'  'ABC'  34    0     1    'A' 'AB' 'BC'

1     'John'  'ABC'  34    0     2    'B' 'AB' 'BC'

Any help would be appreciated.
Roy

Author
30 Jun 2005 4:50 PM
Alejandro Mesa
Try,

select
    *
from
    (
    select
        t1.ColID,
        t1.col1,
        t1.col2,
        t1.col3,
        t1.col4
    from
        table1 as t1
        inner join
        (
        select
            col2,
            col3,
            col4
        from
            table1
        group by
            col2,
            col3,
            col4
        having
            count(*) > 1
        ) as t
        on t1.col2 = t.col2
        and t1.col3 = t.col3
        and t1.col4 = t.col4
    ) as a
    inner join
    (
    select
        t2.cid,
        t2.colid,
        t2.c1,
        t2.c2,
        t2.c3
    from
        table2 as t2
        inner join
        (
        select
            colid,
            c2,
            c3
        from
            table2
        group by
            colid,
            c2,
            c3
        having
            count(*) > 1
        ) as t
        on t2.colid = t.colid
        and t2.c2 = t.2
        and t2.c3 = t.3
    ) as b
    on a.colid = b.colid


AMB


Show quote
"Roy" wrote:

> Hi all,
>
> I need to identify duplicate data (based on the non-key columns) in two
> tables that have a child parent relationship:
> Table1(ColID, col1, col2, col3, col4)
> Table2(CID, colID, c1, c2, c3)
> These are columns that could have same data: Table1(col2, col3, col4) and
> Table2(c2, c3) where Table1.colID = Table2.colID
> For example:
> Table1
> ColID    col1    col2    col3    col4
> 1    'John'    'ABC'    34     0
> 2    'Joe'    'ABC'    34    0
> 3    'jeff'    'XYZ'    23    1
>
> In this table row 1 and 2 have same data for col2, col3, and col4, so we
> have colID=1,2.
> Now in Table2 we have:
> CID    colID    c1    c2    c3
> 1    1    'A'    'AB'    'BC'
> 2    1    'B'    'AB'    'BC'
> 3    1    'C'    'AC'    'AD'
> 4    2    'D'    'CD'    'CE'
> 5    3    'E'    'EC'    'EF'
>
> colID = 1 has three records in this table which the first two records have
> same data in c2 and c3.
> Query needs to identify this record and returns:
> ColID col1    col2   col3  col4  cid  c1  c2   c3
> 1     'John'  'ABC'  34    0     1    'A' 'AB' 'BC'
>
> 1     'John'  'ABC'  34    0     2    'B' 'AB' 'BC'
>
> Any help would be appreciated.
> Roy
Author
30 Jun 2005 4:55 PM
Alejandro Mesa
Correction,

>         and t2.c2 = t.2
>         and t2.c3 = t.3

use northwind
go

create table table1 (
ColID int,
col1 varchar(15),
col2 varchar(15),
col3 int,
col4 int
)
go

insert into table1 values(1,    'John',    'ABC',    34, 0)
insert into table1 values(2,    'Joe',    'ABC',    34,    0)
insert into table1 values(3,    'jeff',    'XYZ',    23,    1)
go

create table table2 (
CID int,
colID int,
c1 varchar(15),
c2 varchar(15),
c3 varchar(15)
)
go

insert into table2 values(1,    1,    'A',    'AB',    'BC')
insert into table2 values(2,    1,    'B',    'AB',    'BC')
insert into table2 values(3,    1,    'C',    'AC',    'AD')
insert into table2 values(4,    2,    'D',    'CD',    'CE')
insert into table2 values(5,    3,    'E',    'EC',    'EF')
go

select
    *
from
    (
    select
        t1.ColID,
        t1.col1,
        t1.col2,
        t1.col3,
        t1.col4
    from
        table1 as t1
        inner join
        (
        select
            col2,
            col3,
            col4
        from
            table1
        group by
            col2,
            col3,
            col4
        having
            count(*) > 1
        ) as t
        on t1.col2 = t.col2
        and t1.col3 = t.col3
        and t1.col4 = t.col4
    ) as a
    inner join
    (
    select
        t2.cid,
        t2.colid,
        t2.c1,
        t2.c2,
        t2.c3
    from
        table2 as t2
        inner join
        (
        select
            colid,
            c2,
            c3
        from
            table2
        group by
            colid,
            c2,
            c3
        having
            count(*) > 1
        ) as t
        on t2.colid = t.colid
        and t2.c2 = t.c2
        and t2.c3 = t.c3
    ) as b
    on a.colid = b.colid
go

drop table table1, table2
go


AMB


Show quote
"Alejandro Mesa" wrote:

> Try,
>
> select
>     *
> from
>     (
>     select
>         t1.ColID,
>         t1.col1,
>         t1.col2,
>         t1.col3,
>         t1.col4
>     from
>         table1 as t1
>         inner join
>         (
>         select
>             col2,
>             col3,
>             col4
>         from
>             table1
>         group by
>             col2,
>             col3,
>             col4
>         having
>             count(*) > 1
>         ) as t
>         on t1.col2 = t.col2
>         and t1.col3 = t.col3
>         and t1.col4 = t.col4
>     ) as a
>     inner join
>     (
>     select
>         t2.cid,
>         t2.colid,
>         t2.c1,
>         t2.c2,
>         t2.c3
>     from
>         table2 as t2
>         inner join
>         (
>         select
>             colid,
>             c2,
>             c3
>         from
>             table2
>         group by
>             colid,
>             c2,
>             c3
>         having
>             count(*) > 1
>         ) as t
>         on t2.colid = t.colid
>         and t2.c2 = t.2
>         and t2.c3 = t.3
>     ) as b
>     on a.colid = b.colid
>
>
> AMB
>
>
> "Roy" wrote:
>
> > Hi all,
> >
> > I need to identify duplicate data (based on the non-key columns) in two
> > tables that have a child parent relationship:
> > Table1(ColID, col1, col2, col3, col4)
> > Table2(CID, colID, c1, c2, c3)
> > These are columns that could have same data: Table1(col2, col3, col4) and
> > Table2(c2, c3) where Table1.colID = Table2.colID
> > For example:
> > Table1
> > ColID    col1    col2    col3    col4
> > 1    'John'    'ABC'    34     0
> > 2    'Joe'    'ABC'    34    0
> > 3    'jeff'    'XYZ'    23    1
> >
> > In this table row 1 and 2 have same data for col2, col3, and col4, so we
> > have colID=1,2.
> > Now in Table2 we have:
> > CID    colID    c1    c2    c3
> > 1    1    'A'    'AB'    'BC'
> > 2    1    'B'    'AB'    'BC'
> > 3    1    'C'    'AC'    'AD'
> > 4    2    'D'    'CD'    'CE'
> > 5    3    'E'    'EC'    'EF'
> >
> > colID = 1 has three records in this table which the first two records have
> > same data in c2 and c3.
> > Query needs to identify this record and returns:
> > ColID col1    col2   col3  col4  cid  c1  c2   c3
> > 1     'John'  'ABC'  34    0     1    'A' 'AB' 'BC'
> >
> > 1     'John'  'ABC'  34    0     2    'B' 'AB' 'BC'
> >
> > Any help would be appreciated.
> > Roy
Author
30 Jun 2005 4:55 PM
--CELKO--
Please post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, data types, etc. in
your schema are. Sample data is also a good idea, along with clear
specifications. Is this what you meant?

CREATE TABLE Foobar
(foo_id INTEGER NOT NULL PRIMARY KEY,
col1 CHAR(5) NOT NULL,
col2 CHAR(5) NOT NULL,
col3 INTEGER NOT NULL,
col4 INTEGER NOT NULL);

>> In this table row 1 and 2 have same data for col2, col3, and col4, so we  have foo_id = {1, 2}. <<

This kind of duplication is usually the result of using an IDENTITY
column instead of a key.

SELECT F1.foo_id, F1.col1,  F1.col2,  F1.col3,  F1.col4
  FROM Foobar AS F1
       INNER JOIN
       (SELECT col2, col3, col4
          FROM Foobar AS F1
         GROUP BY col2, col3, col4
        HAVING COUNT(*) > 1)) AS M1(col2, col3, col4)
       ON F1.col2 = M1.col2
          AND F1.col3 = M1.col3
          AND F1.col4 = M1.col4;

And do the same thing for the second table.  Since they have no
relationship to each other, you not put them in the same query -- or di
you fail to tell us something about that?

AddThis Social Bookmark Button