Open Source Databases: June 2020

I recently lost my admin password of ubuntu postgresql, this is how I did a reset to access my postgresql again

Step 1 : Run following command change the authentication method in the configuration file and then reload

sudo sed -ibak 's/^\([^#]*\)md5/\1trust/g' pg_hba.conf
/etc/init.d/postgresql reload

Step 2 : Once the authentication method is reset then you can login to postgres using following and set the new password value

psql -U postgres
postgres=# alter user postgres with password 'NEW_PASSWORD';
postgresl=# \q

Step 3 : Once the new password is set, change the authentication method back to md5

sudo sed -i 's/^\([^#]*\)trust/\1md5/g' pg_hba.conf
sudo -u postgres pg_ctl reload or /etc/init.d/postgresql reload

You are all set to login using your new password

One of the biggest problem with analysis or business intelligence is clean data. In many real world scenarios you will find that data is corrupt, missing or duplicate. In this article we are going to demonstrate multiple ways to delete duplicate data from a table.

Step 1: Lets create a dummy table.

create table demo (id int, name char(10))

Step 2: Insert duplicate values

insert into demo(id,name) values(1,'abc');
insert into demo(id,name) values(2,'abd');
insert into demo(id,name) values(3,'vijay');
insert into demo(id,name) values(4,'vijay');
insert into demo(id,name) values(5,'ajay');
insert into demo(id,name) values(6,'sanjay');

Step 3: Select * from demo;

select * from demo;
+------+--------+
| id   | name   |
+------+--------+
|    1 | abc    |
|    2 | abd    |
|    3 | vijay  |
|    4 | vijay  |
|    5 | ajay   |
|    6 | sanjay |
+------+--------+
6 rows in set (0.00 sec)

So the table now contains duplicate values, id 3,4 with name vijay is duplicate. Now lets look at the options to delete this duplicate value

Step 4: Find duplicate value using select

select id, count(name) from demo group by name having count(name)>1;
+------+-------------+
| id   | count(name) |
+------+-------------+
|    3 |           2 |
+------+-------------+
1 row in set (0.00 sec)

So the output shows that id 3 with name vijay has two entries. The simple way to find this was using group by and count on column where you think duplicate are there and should not be. Lets delete this


Step 5: Delete duplicate using inner join
delete d1 from demo d1 inner join demo d2 where d1.id>d2.id AND d1.name=d2.name;
Query OK, 1 row affected (0.00 sec)
Step 6: Check if the query ran successfully, repeat step 4.select id, count(name) from demo group by name having count(name)>1;
Empty set (0.00 sec)

So no output that means duplicate data has been deleted success. Reconfirm with select on table

select id,name from demo;
+------+--------+
| id   | name   |
+------+--------+
|    1 | abc    |
|    2 | abd    |
|    3 | vijay  |
|    5 | ajay   |
|    6 | sanjay |
+------+--------+
5 rows in set (0.00 sec)

In next article, I am going to show other queries to delete duplicate.

Open Source Databases

Wednesday, June 17, 2020

Reset PostgreSQL master password ubuntu bash

Thursday, June 4, 2020

MySQL delete duplicate

Step 1: Lets create a dummy table.

Step 2: Insert duplicate values

Step 3: Select * from demo;

Step 4: Find duplicate value using select

Step 5: Delete duplicate using inner join