Home All Groups Group Topic Archive Search About
Author
18 May 2006 8:05 PM
bic
I am trying to clean up a set of a table where there are some duplicate data
with different ids.  What's the easiest way to  eliminate the duplicate rows,
for there are several hundred rows and only about 400 rows are unique.  Thanks
--
bic

Author
18 May 2006 8:16 PM
Mark Williams
One way would be to create a new table from the original one, minus the
duplicate rows:

SELECT MIN(id) AS id, Column1, Column2, Column3
INTO [New Table]
FROM [Old Table]
GROUP BY Column1, Column2, Column3

Then delete the duplicate rows in the original table:
/*Untested, make a backup */

DELETE FROM [Old Table] 
WHERE EXISTS
(SELECT 1 FROM [New Table]
WHERE [Old Table].Column1 = [New Table].Column1
AND [Old Table].Column2 = [New Table].Column2
AND [Old Table].Column2 = [New Table].Column2
AND [Old Table].id > [New Table].id )

--

Show quote
"bic" wrote:

> I am trying to clean up a set of a table where there are some duplicate data
> with different ids.  What's the easiest way to  eliminate the duplicate rows,
> for there are several hundred rows and only about 400 rows are unique.  Thanks
> --
> bic
Author
18 May 2006 9:55 PM
bic
It worked. Thanks for the help, Mark.
--
bic


Show quote
"Mark Williams" wrote:

> One way would be to create a new table from the original one, minus the
> duplicate rows:
>
> SELECT MIN(id) AS id, Column1, Column2, Column3
> INTO [New Table]
> FROM [Old Table]
> GROUP BY Column1, Column2, Column3
>
> Then delete the duplicate rows in the original table:
> /*Untested, make a backup */
>
> DELETE FROM [Old Table] 
> WHERE EXISTS
> (SELECT 1 FROM [New Table]
> WHERE [Old Table].Column1 = [New Table].Column1
> AND [Old Table].Column2 = [New Table].Column2
> AND [Old Table].Column2 = [New Table].Column2
> AND [Old Table].id > [New Table].id )
>
> --
>
> "bic" wrote:
>
> > I am trying to clean up a set of a table where there are some duplicate data
> > with different ids.  What's the easiest way to  eliminate the duplicate rows,
> > for there are several hundred rows and only about 400 rows are unique.  Thanks
> > --
> > bic
Author
18 May 2006 8:18 PM
curtmorrisonemail@gmail.com
Something like the following should work, but check the code before you
run it:

-- Create a temporary table
CREATE TABLE #TMPTABLE (col1 datatype(size), col1 datatype(size), etc.)

-- Insert rows in to a temporary table by grouping duplicate records
together or use DISTINCT, thus eliminating the duplicate records
INSERT INTO #TMPTABLE
--SELECT DISTINCT * FROM [source-table]
--SELECT * FROM [source-table] GROUP BY col1, col2, col3, col4, etc.

-- Delete all the records from the source table
DELETE FROM [source-table]

-- Insert the records from the temporary table in to the source table
INSERT INTO [source-table]
SELECT * FROM #TMPTABLE

-- Drop the temporary table
DROP TABLE #TMPTABLE

Something like that should be fine. Sorry if my code isn't up to spec.

- Curt Morrison

AddThis Social Bookmark Button