|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Removing duplicate dataI have a table that I need to remove duplicates from and the table includes
an identity column. The table contains 12 fields but our business rules are that only 3 of the fields can make the record a duplicate. We would like to keep the record with the min identity column. I have done this in the past, but it was about 5 years ago and I did not insert the dups into another table, I did it strictly with a query. Can anybody please help me out with some code to take care of this or any suggestions on a better way of doing this? So something along the lines of grouping the data by the fields that we are interested in and then deleting the records where the identity column is greater than the min identity column. Thanks. This is the duplicates along with the
minimum id per group SELECT MIN(id),Col1,Col2,Col3 FROM sometable GROUP BY Col1,Col2,Col3 HAVING COUNT(*)>1 Join this to the original table to give the ids of the duplicates excluding the minimum id. SELECT a.id FROM sometable a INNER JOIN ( SELECT MIN(id),Col1,Col2,Col3 FROM sometable GROUP BY Col1,Col2,Col3 HAVING COUNT(*)>1) b(id,Col1,Col2,Col3) ON b.id<>a.id AND b.Col1=a.Col1 AND b.Col2=a.Col2 AND b.Col3=a.Col3 Put it all together to delete them DELETE FROM sometable WHERE id IN ( SELECT a.id FROM sometable a INNER JOIN ( SELECT MIN(id),Col1,Col2,Col3 FROM sometable GROUP BY Col1,Col2,Col3 HAVING COUNT(*)>1) b(id,Col1,Col2,Col3) ON b.id<>a.id AND b.Col1=a.Col1 AND b.Col2=a.Col2 AND b.Col3=a.Col3 ) Here's an example. This will delete all but 1 of the dupes. You will need to
change table / col name accordingly DELETE FROM _Holding_table WHERE EXISTS(SELECT NULL FROM _Holding_table s1 WHERE s1.PhoneNumber= _Holding_table.PhoneNumber and s1.PK_Holding_TableID > _Holding_table.PK_Holding_TableID) -- Show quoteHTH. Ryan "Andy" <A***@discussions.microsoft.com> wrote in message news:32ACE9AA-67CE-454C-9711-956595CB7A8D@microsoft.com... >I have a table that I need to remove duplicates from and the table includes > an identity column. The table contains 12 fields but our business rules > are > that only 3 of the fields can make the record a duplicate. We would like > to > keep the record with the min identity column. I have done this in the > past, > but it was about 5 years ago and I did not insert the dups into another > table, I did it strictly with a query. Can anybody please help me out > with > some code to take care of this or any suggestions on a better way of doing > this? So something along the lines of grouping the data by the fields > that > we are interested in and then deleting the records where the identity > column > is greater than the min identity column. > > Thanks. Do you need to export the removed duplicates to another table?
Try something like this: INSERT INTO RemDup (...) SELECT ... FROM MyTab AS t1 WHERE EXISTS (SELECT NULL FROM MyTab AS t2 WHERE t1.Col1=t2.Col1 AND t1.Col2=t2.Col2 AND t1.Col2=t2.Col2 AND t1.ID>t2.ID) DELETE FROM MyTab AS t1 WHERE EXISTS (SELECT NULL FROM MyTab AS t2 WHERE t1.Col1=t2.Col1 AND t1.Col2=t2.Col2 AND t1.Col2=t2.Col2 AND t1.ID>t2.ID) HTH, Axel Dahmen --------------- Show quote "Andy" <A***@discussions.microsoft.com> wrote in message news:32ACE9AA-67CE-454C-9711-956595CB7A8D@microsoft.com... > I have a table that I need to remove duplicates from and the table includes > an identity column. The table contains 12 fields but our business rules are > that only 3 of the fields can make the record a duplicate. We would like to > keep the record with the min identity column. I have done this in the past, > but it was about 5 years ago and I did not insert the dups into another > table, I did it strictly with a query. Can anybody please help me out with > some code to take care of this or any suggestions on a better way of doing > this? So something along the lines of grouping the data by the fields that > we are interested in and then deleting the records where the identity column > is greater than the min identity column. > > Thanks. |
|||||||||||||||||||||||