Table of Contents
PostgreSQL – How to find and delete duplicate records
PostgreSQL – how to find and delete duplicate records ? Lot of developers have asked this question. Therefore, in this tutorial I have explained these steps in very simple manner. In the below mentioned video, you can learn these steps while watching rather than reading.
Demo to remove duplicate records in PostgreSQL 9.6
Create a table and insert some records
create table students (name varchar(10)) ;
Now let’s insert dummy data i.e. data with duplicate records
insert into students values (‘Chris’) ;
insert into students values (‘Chris’) ;
Now let’s insert few more duplicate records for another user
insert into students values (‘Den’) ;
insert into students values (‘Den’) ;
let’s insert one distinct user records
insert into students values (‘Shiva’) ;
Current data in table
select name from students ;
Confirm if there are duplicate records in table
select name, count(1) from students group by name;
We can identify each record in a table with PostgreSQL inbuilt column ctid
select name, ctid from students ;
list down required i.e. distinct rows.
select name, max(ctid) as max_ctid from students group by name ;
Now we can delete duplicate records with below query.
delete from students s
using
(select name, max(ctid) as max_ctid from students group by name ) t
where s.ctid <> t.max_ctid
and s.name=t.name ;
Crosscheck that duplicate records have been deleted from table.
select name from students ;
Checkout more PostgreSQL posts