Thursday, March 14, 2013

Different Ways to Delete Duplicate Rows in Table

In this article I am going to show different ways of deleting duplicating records from the table. It is a common question in interviews which is asked frequently.

Consider the following table with rows as an example:
Table Name: Products

ProductId Price
1         10
1         10
2         20
3         30
3         30

Here assume that productId column should be unique after deleting. Now we see how to delete the duplicate records from the products table in different ways.

1. Using Rowid

The following Delete statement deletes the rows using the Rowid.

Delete from <tablename>
where rowid not in (select max(rowid) from <tablename> group by <unique columns or primary key>);


Delete from products
where rowid not in (select max(rowid) from products group by productid);

2. Using temp table and Distinct

Here, first create a temp table and insert distinct rows in the temp table. Then truncate the main table and insert records from the temp table.
Create temporary table products_temp As
Select Distinct ProductID, Price
From   Products;

Truncate table Products;

Insert into products
Select * 
From   products_temp;

3. Using temp table and Row Number analytic function.

The row_number analytic function is used to rank the rows. Here we use the row_number function to rank the rows for each group of productId and then select only record from the group.
Create temporary table products_temp As
Select productid, price
   Select productid, price,
          row_number() over (partition by productId order by price) group_rank
   From   products
Where  group_rank = 1;

No comments:

Post a Comment