Remove duplicate records/objects uniquely identified by multiple attributes
Posted
by keruilin
on Stack Overflow
Published on 2010-05-07T20:10:02Z
I have a model called HeroStatus with the following attributes:
- id
- user_id
- recordable_type
- hero_type (can be NULL!)
- recordable_id
- created_at
There are over 100 hero_statuses, and a user can have many hero_statuses, but can't have the same hero_status more than once.
A user's hero_status is uniquely identified by the combination of recordable_type + hero_type + recordable_id. In other words, a given user can't have the same hero_status twice.
Unfortunately, I didn't have a validation in place to ensure this, so some users ended up with duplicate hero_statuses after I made some code changes. For example:
user_id = 18
recordable_type = 'Evil'
hero_type = 'Halitosis'
recordable_id = 1
created_at = '2010-05-03 18:30:30'

user_id = 18
recordable_type = 'Evil'
hero_type = 'Halitosis'
recordable_id = 1
created_at = '2009-03-03 15:30:00'

user_id = 18
recordable_type = 'Good'
hero_type = 'Hugs'
recordable_id = 1
created_at = '2009-02-03 12:30:00'

user_id = 18
recordable_type = 'Good'
hero_type = NULL
recordable_id = 2
created_at = '2009-012-03 08:30:00'
(The last two are obviously not duplicates; the first two are.) So what I want to do is get rid of the duplicate hero_status. Which one? The one with the most recent created_at date.
I have three questions:
1. How do I remove the duplicates using a SQL-only approach?
2. How do I remove the duplicates using a pure Ruby solution? Something similar to this: http://stackoverflow.com/questions/2790004/removing-duplicate-objects.
3. How do I put a validation in place to prevent duplicate entries in the future?
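
To make the rule concrete, here is a minimal plain-Ruby sketch of what I mean by a duplicate. It is not tied to ActiveRecord: HeroStatusRow and duplicates_to_remove are hypothetical names, and the ids and the fourth row's timestamp are made up for illustration. Rows are grouped by user_id + recordable_type + hero_type + recordable_id, and everything except the oldest row in each group is marked for removal.

```ruby
require 'time'

# Hypothetical stand-in for the HeroStatus model (assumption: the real
# objects expose the same attribute names).
HeroStatusRow = Struct.new(:id, :user_id, :recordable_type,
                           :hero_type, :recordable_id, :created_at)

# Returns the rows to delete: within each group that shares the same
# composite key, every row except the one with the earliest created_at.
# Ties on created_at are broken by id so the result is deterministic.
def duplicates_to_remove(rows)
  rows.group_by { |r| [r.user_id, r.recordable_type, r.hero_type, r.recordable_id] }
      .values
      .flat_map { |group| group.sort_by { |r| [r.created_at, r.id] }.drop(1) }
end

rows = [
  HeroStatusRow.new(1, 18, 'Evil', 'Halitosis', 1, Time.parse('2010-05-03 18:30:30')),
  HeroStatusRow.new(2, 18, 'Evil', 'Halitosis', 1, Time.parse('2009-03-03 15:30:00')),
  HeroStatusRow.new(3, 18, 'Good', 'Hugs',      1, Time.parse('2009-02-03 12:30:00')),
  HeroStatusRow.new(4, 18, 'Good', nil,         2, Time.parse('2009-01-03 08:30:00')),
]

p duplicates_to_remove(rows).map(&:id)  # => [1]
```

Note that Ruby treats nil == nil as true, so rows with a NULL hero_type fall into the same group here. A SQL comparison of NULL = NULL does not evaluate to true, so a SQL-only version would need to handle hero_type separately.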
© Stack Overflow or respective owner