I'm trying to update a table to remove all but the first instance of a group. I have a table of vehicle data related to insurance policies. Each policy should have only one power unit (power_unit = 1); every other vehicle on the policy should be a towed unit. Unfortunately, a bug has been duplicating power units, and now I need to clean this up. There are ~10k records in the database, and ~4k of them have duplicated power units.
The important bits of my table (call it test1 for now) are:
+------------+---------+------+-----+---------+----------------+
| Field      | Type    | Null | Key | Default | Extra          |
+------------+---------+------+-----+---------+----------------+
| id         | int(10) | NO   | PRI | NULL    | auto_increment |
| policy_id  | int(10) | NO   |     | NULL    |                |
| power_unit | int(1)  | NO   |     | 0       |                |
+------------+---------+------+-----+---------+----------------+
And some sample data:
+----+-----------+------------+
| id | policy_id | power_unit |
+----+-----------+------------+
|  1 |         1 |          1 |
|  2 |         1 |          1 |
|  3 |         1 |          1 |
|  4 |         2 |          1 |
|  5 |         2 |          1 |
|  6 |         2 |          1 |
|  7 |         4 |          1 |
|  8 |         4 |          1 |
|  9 |         4 |          1 |
| 10 |         5 |          1 |
| 11 |         5 |          1 |
| 12 |         6 |          1 |
+----+-----------+------------+
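In case anyone wants to reproduce this, here is a rough sketch that loads the sample rows into an in-memory SQLite database and lists the policies that currently have more than one power unit. SQLite is only a stand-in for my real MySQL table, and the function names (build_sample_db, policies_with_extra_power_units) are made up for illustration.

```python
import sqlite3

# Sample rows copied from the table above: (id, policy_id, power_unit).
SAMPLE_ROWS = [
    (1, 1, 1), (2, 1, 1), (3, 1, 1),
    (4, 2, 1), (5, 2, 1), (6, 2, 1),
    (7, 4, 1), (8, 4, 1), (9, 4, 1),
    (10, 5, 1), (11, 5, 1),
    (12, 6, 1),
]


def build_sample_db():
    """Create an in-memory SQLite copy of test1 (assumption: stand-in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test1 ("
        " id INTEGER PRIMARY KEY,"
        " policy_id INTEGER NOT NULL,"
        " power_unit INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO test1 (id, policy_id, power_unit) VALUES (?, ?, ?)",
        SAMPLE_ROWS,
    )
    conn.commit()
    return conn


def policies_with_extra_power_units(conn):
    """Return (policy_id, power_unit_count) for policies with more than one power unit."""
    return conn.execute(
        "SELECT policy_id, SUM(power_unit) FROM test1"
        " GROUP BY policy_id HAVING SUM(power_unit) > 1"
        " ORDER BY policy_id"
    ).fetchall()


if __name__ == "__main__":
    conn = build_sample_db()
    print(policies_with_extra_power_units(conn))
    # prints [(1, 3), (2, 3), (4, 3), (5, 2)]
```

Policy 6 does not show up in the result because it has exactly one power unit.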
I'd like to end up with policy_id 1 having only one row with power_unit=1, and the same for policy_id 2, 4, 5, etc. For policy_id 6, nothing should change (there is only one entry, and it is already a power_unit).
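To make the desired end state concrete, here is the logic written out in plain Python rather than SQL. It rests on two assumptions of mine: "first" means the lowest id within a policy, and the extra power units get power_unit set to 0 (towed unit) rather than being deleted. The function name expected_power_units is just for illustration.

```python
# Sample rows copied from the table above: (id, policy_id, power_unit).
SAMPLE_ROWS = [
    (1, 1, 1), (2, 1, 1), (3, 1, 1),
    (4, 2, 1), (5, 2, 1), (6, 2, 1),
    (7, 4, 1), (8, 4, 1), (9, 4, 1),
    (10, 5, 1), (11, 5, 1),
    (12, 6, 1),
]


def expected_power_units(rows):
    """Keep power_unit=1 only on the lowest-id power unit of each policy.

    rows: iterable of (id, policy_id, power_unit) tuples.
    Returns a new list sorted by id; no rows are dropped.
    """
    seen_policies = set()
    result = []
    for row_id, policy_id, power_unit in sorted(rows):  # tuples sort by id first
        if power_unit == 1 and policy_id in seen_policies:
            power_unit = 0  # assumption: an extra power unit becomes a towed unit
        elif power_unit == 1:
            seen_policies.add(policy_id)
        result.append((row_id, policy_id, power_unit))
    return result


if __name__ == "__main__":
    cleaned = expected_power_units(SAMPLE_ROWS)
    # ids that should still be power units after the cleanup
    print([row[0] for row in cleaned if row[2] == 1])
    # prints [1, 4, 7, 10, 12]
```

So ids 1, 4, 7, 10 and 12 would stay as power units and the other seven rows would become towed units.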
I don't know if this is possible, but it was an intriguing problem for me, so I thought you guys might find it interesting too.