Home All Groups Group Topic Archive Search About

SQL query to find repeat entries

Author
17 Aug 2006 5:34 PM
Greg Smith
Can you run a query that will return only results that have the data in a
target column duplicated in another row?

i.e. Duplicate address data

Original table tblPeople:
RecordID    Last_name               First_name              Address
======== ============== ============= ================
1                 FlintStone                 Fred                       123
Main
2                 FlintStone                 Wilma                    123
Main
3                 Rubble                      Barney                   321
Elm
4                 Rubble                      Betty                     321
Elm
5                 Flintstone                  Dino                      123
1/2 Main


Run a query to return all records that have an address duplicated in another
record.

SELECT blah,blah.....

The result table contains:
RecordID    Last_name               First_name              Address
======== ============== ============= ================
1                 FlintStone                 Fred                       123
Main
2                 FlintStone                 Wilma                    123
Main
3                 Rubble                      Barney                   321
Elm
4                 Rubble                      Betty                     321
Elm

Record 5 was not returned because the address was unique.

Is there a query that would do this?

Any help is greatly appreciated.

Author
17 Aug 2006 5:41 PM
Aaron Bertrand [SQL Server MVP]
You could try this using a correlated subquery:

SELECT *
FROM tblPeople p
WHERE EXISTS
(
    SELECT Address
    FROM tblPeople
    WHERE Address = p.Address
    GROUP BY Address
    HAVING COUNT(*) > 1
) x


Show quote
"Greg Smith" <g**@umn.edu> wrote in message
news:ewX2WOiwGHA.1216@TK2MSFTNGP03.phx.gbl...
> Can you run a query that will return only results that have the data in a
> target column duplicated in another row?
>
> i.e. Duplicate address data
>
> Original table tblPeople:
> RecordID    Last_name               First_name              Address
> ======== ============== ============= ================
> 1                 FlintStone                 Fred
> 123 Main
> 2                 FlintStone                 Wilma                    123
> Main
> 3                 Rubble                      Barney                   321
> Elm
> 4                 Rubble                      Betty
> 321 Elm
> 5                 Flintstone                  Dino
> 123 1/2 Main
>
>
> Run a query to return all records that have an address duplicated in
> another record.
>
> SELECT blah,blah.....
>
> The result table contains:
> RecordID    Last_name               First_name              Address
> ======== ============== ============= ================
> 1                 FlintStone                 Fred
> 123 Main
> 2                 FlintStone                 Wilma                    123
> Main
> 3                 Rubble                      Barney                   321
> Elm
> 4                 Rubble                      Betty
> 321 Elm
>
> Record 5 was not returned because the address was unique.
>
> Is there a query that would do this?
>
> Any help is greatly appreciated.
>
>
Author
17 Aug 2006 6:55 PM
Greg Smith
Just what I was looking for Aaron.

Thanks muchly.

Show quote
"Aaron Bertrand [SQL Server MVP]" <ten.xoc@dnartreb.noraa> wrote in message
news:%23DhY5QiwGHA.4032@TK2MSFTNGP04.phx.gbl...
> You could try this using a correlated subquery:
>
> SELECT *
> FROM tblPeople p
> WHERE EXISTS
> (
>    SELECT Address
>    FROM tblPeople
>    WHERE Address = p.Address
>    GROUP BY Address
>    HAVING COUNT(*) > 1
> ) x
>
>
> "Greg Smith" <g**@umn.edu> wrote in message
> news:ewX2WOiwGHA.1216@TK2MSFTNGP03.phx.gbl...
>> Can you run a query that will return only results that have the data in a
>> target column duplicated in another row?
>>
>> i.e. Duplicate address data
>>
>> Original table tblPeople:
>> RecordID    Last_name               First_name              Address
>> ======== ============== ============= ================
>> 1                 FlintStone                 Fred 123 Main
>> 2                 FlintStone                 Wilma                    123
>> Main
>> 3                 Rubble                      Barney
>> 321 Elm
>> 4                 Rubble                      Betty 321 Elm
>> 5                 Flintstone                  Dino 123 1/2 Main
>>
>>
>> Run a query to return all records that have an address duplicated in
>> another record.
>>
>> SELECT blah,blah.....
>>
>> The result table contains:
>> RecordID    Last_name               First_name              Address
>> ======== ============== ============= ================
>> 1                 FlintStone                 Fred 123 Main
>> 2                 FlintStone                 Wilma                    123
>> Main
>> 3                 Rubble                      Barney
>> 321 Elm
>> 4                 Rubble                      Betty 321 Elm
>>
>> Record 5 was not returned because the address was unique.
>>
>> Is there a query that would do this?
>>
>> Any help is greatly appreciated.
>>
>>
>
>
Author
17 Aug 2006 8:11 PM
--CELKO--
If this is really address data, then look at specialized products
(Group 1, SSA, Melissa Data, etc.) for scrubbing the data.  It is much
harder than it looks at first.

AddThis Social Bookmark Button