|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Deleting Duplicate DataI have a table with 60 columns and more than one million rows. i have to implement composite primary key but it gives me error of duplicate data. i have used following query to detect the duplicate rows select NID,output_No from tbl_Data group by NID,output_No having count(*) > 1 it gives me 2526 duplicate dows. Now i wants to delete the duplicate rows what will be the query for deleting the duplicate records. Thanx *** Sent via Developersdex http://www.developersdex.com *** This script has written by Itzik Ben-Gan
CREATE TABLE #Demo ( idNo int identity(1,1), colA int, colB int ) INSERT INTO #Demo(colA,colB) VALUES (1,6) INSERT INTO #Demo(colA,colB) VALUES (1,6) INSERT INTO #Demo(colA,colB) VALUES (2,4) INSERT INTO #Demo(colA,colB) VALUES (3,3) INSERT INTO #Demo(colA,colB) VALUES (4,2) INSERT INTO #Demo(colA,colB) VALUES (3,3) INSERT INTO #Demo(colA,colB) VALUES (5,1) INSERT INTO #Demo(colA,colB) VALUES (8,1) PRINT 'Table' SELECT * FROM #Demo PRINT 'Duplicates in Table' SELECT * FROM #Demo WHERE idNo IN (SELECT B.idNo FROM #Demo A JOIN #Demo B ON A.idNo <> B.idNo AND A.colA = B.colA AND A.colB = B.colB) PRINT 'Duplicates to Delete' SELECT * FROM #Demo WHERE idNo IN (SELECT B.idNo FROM #Demo A JOIN #Demo B ON A.idNo < B.idNo -- < this time, not <> AND A.colA = B.colA AND A.colB = B.colB) DELETE FROM #Demo WHERE idNo IN (SELECT B.idNo FROM #Demo A JOIN #Demo B ON A.idNo < B.idNo -- < this time, not <> AND A.colA = B.colA AND A.colB = B.colB) PRINT 'Cleaned-up Table' SELECT * FROM #Demo DROP TABLE #Demo Show quote "Ghulam Farid" <gfa***@yahoo.com> wrote in message news:umVO5durFHA.3444@TK2MSFTNGP12.phx.gbl... > > > Hi to All! > > I have a table with 60 columns and more than one million rows. i have to > implement composite primary key but it gives me error of duplicate data. > i have used following query to detect the duplicate rows > > select NID,output_No from tbl_Data > group by NID,output_No > having count(*) > 1 > > it gives me 2526 duplicate dows. Now i wants to delete the duplicate > rows what will be the query for deleting the duplicate records. > > > Thanx > > > > *** Sent via Developersdex http://www.developersdex.com *** Try this:
(1) SELECT columnList INTO workTable FROM tableName GROUP BY keyColumns HAVING COUNT(*) > 1 (2) DELETE tableName FROM workTable WHERE tableName.keyColumns = workTable.keyColumns (3) INSERT tableName (columnList) SELECT columnList FROM workTable (4) DROP workTable You might want to wrap this in a transaction, but if you don't then a temp table for the work table is contraindicated because if power goes out between steps 2 and 3, you will lose the duplicate rows altogether. If the table were tiny, you could use something like: SET ROWCOUNT 1 AGAIN: DELETE tableName FROM (SELECT keyColumns FROM tableName GROUP BY keyColumns HAVING COUNT(*) > 1) a WHERE tableName.keyColumns = a.keyColumns IF @@ROWCOUNT > 0 GOTO AGAIN SET ROWCOUNT 0 Show quote "Ghulam Farid" <gfa***@yahoo.com> wrote in message news:umVO5durFHA.3444@TK2MSFTNGP12.phx.gbl... > > > Hi to All! > > I have a table with 60 columns and more than one million rows. i have to > implement composite primary key but it gives me error of duplicate data. > i have used following query to detect the duplicate rows > > select NID,output_No from tbl_Data > group by NID,output_No > having count(*) > 1 > > it gives me 2526 duplicate dows. Now i wants to delete the duplicate > rows what will be the query for deleting the duplicate records. > > > Thanx > > > > *** Sent via Developersdex http://www.developersdex.com *** |
|||||||||||||||||||||||