We are given email table which includes user_id, user_name and email
and our task is to find out duplicate record from this table.
How can we get this result using windows function?
Idea:
And then if we partition them by user_name (which should be unique for each user), and if there are any duplicate user_name then row number would be different for same user_name.
Something like this...
And now, it is an easy task to pick a row number the one which is not unique
So we know that Keeny is a duplicate user_name.
Using this method we can find any duplicates in the column. Just have to make sure that the column that we are taking as a reference (user_name in this case) is supposed to have unique records.
In this table, user_id is also unique and user_name is also unique. But after checking for duplicates in user_id, we could not get any duplicates so we checked for the other unique value column and found duplicate.
In case if we did not get duplicate using user_name then we could check duplicate in email column also as each user_name is supposed to have a unique email address.
No comments:
Post a Comment