How do I get rid of duplicate records?

How do I get rid of duplicate records?

am 02.03.2006 20:47:26 von phillip.s.powell

mysql> select id, student_first_name, student_last_name, email,
application_date, modification_date, unique_key from student where id
in (7268, 862);
+------+--------------------+-------------------+----------- -----------------+---------------------+-------------------- -+------------------+
| id | student_first_name | student_last_name | email
| application_date | modification_date | unique_key |
+------+--------------------+-------------------+----------- -----------------+---------------------+-------------------- -+------------------+
| 862 | Phil | Powell | phil@blah.com |
2006-02-27 00:00:00 | 2006-02-27 00:00:00 | dF0WByrCP0vACftA |
| 7268 | Phil | Powell | phil@blah.com |
2006-02-27 00:00:00 | 2006-02-27 00:00:00 | dF0WByrCP0vACftA |
+------+--------------------+-------------------+----------- -----------------+---------------------+-------------------- -+------------------+


I accidentally created duplicate records upon attempting to migrate
data from one server to another. You will have 2, 3 or more records
with every single field identical except for the ID. Best way to know
they're dups is by "application_date" along with "unique_key".

How do I get rid of the dups?

Thanx
Phil

Re: How do I get rid of duplicate records?

am 03.03.2006 02:16:32 von avidfan

phillip.s.powell@gmail.com wrote:
> mysql> select id, student_first_name, student_last_name, email,
> application_date, modification_date, unique_key from student where id
> in (7268, 862);
> +------+--------------------+-------------------+----------- -----------------+---------------------+-------------------- -+------------------+
> | id | student_first_name | student_last_name | email
> | application_date | modification_date | unique_key |
> +------+--------------------+-------------------+----------- -----------------+---------------------+-------------------- -+------------------+
> | 862 | Phil | Powell | phil@blah.com |
> 2006-02-27 00:00:00 | 2006-02-27 00:00:00 | dF0WByrCP0vACftA |
> | 7268 | Phil | Powell | phil@blah.com |
> 2006-02-27 00:00:00 | 2006-02-27 00:00:00 | dF0WByrCP0vACftA |
> +------+--------------------+-------------------+----------- -----------------+---------------------+-------------------- -+------------------+
>
>
> I accidentally created duplicate records upon attempting to migrate
> data from one server to another. You will have 2, 3 or more records
> with every single field identical except for the ID. Best way to know
> they're dups is by "application_date" along with "unique_key".
>
> How do I get rid of the dups?
>
> Thanx
> Phil
>


This is going to depend on which values you want to keep.


mysql> select * from testa;
+------+------+------+
| a | c | d |
+------+------+------+
| 1 | 2 | 3 |
| 2 | 2 | 3 |
| 3 | 3 | 4 |
| 4 | 3 | 4 |
+------+------+------+
4 rows in set (0.01 sec)

mysql> select c,d,min(a) e,count(*) from testa group by c,d having
count(*) > 1;
+------+------+--------+----------+
| c | d | e | count(*) |
+------+------+--------+----------+
| 2 | 3 | 1 | 2 |
| 3 | 4 | 3 | 2 |
+------+------+--------+----------+
2 rows in set (0.01 sec)

mysql> select c,d,max(a) e from testa group by c,d having count(*) > 1;

+------+------+------+
| c | d | e |
+------+------+------+
| 2 | 3 | 2 |
| 3 | 4 | 4 |
+------+------+------+
2 rows in set (0.00 sec)


test this by changing the "delete from" to "select * from"

DELETE FROM some_table WHERE primaryKey NOT IN
(SELECT MIN(primaryKey) FROM some_table GROUP BY some_column)

Re: How do I get rid of duplicate records?

am 03.03.2006 18:26:13 von phillip.s.powell

noone wrote:
> phillip.s.powell@gmail.com wrote:
> > mysql> select id, student_first_name, student_last_name, email,
> > application_date, modification_date, unique_key from student where id
> > in (7268, 862);
> > +------+--------------------+-------------------+----------- -----------------+---------------------+-------------------- -+------------------+
> > | id | student_first_name | student_last_name | email
> > | application_date | modification_date | unique_key |
> > +------+--------------------+-------------------+----------- -----------------+---------------------+-------------------- -+------------------+
> > | 862 | Phil | Powell | phil@blah.com |
> > 2006-02-27 00:00:00 | 2006-02-27 00:00:00 | dF0WByrCP0vACftA |
> > | 7268 | Phil | Powell | phil@blah.com |
> > 2006-02-27 00:00:00 | 2006-02-27 00:00:00 | dF0WByrCP0vACftA |
> > +------+--------------------+-------------------+----------- -----------------+---------------------+-------------------- -+------------------+
> >
> >
> > I accidentally created duplicate records upon attempting to migrate
> > data from one server to another. You will have 2, 3 or more records
> > with every single field identical except for the ID. Best way to know
> > they're dups is by "application_date" along with "unique_key".
> >
> > How do I get rid of the dups?
> >
> > Thanx
> > Phil
> >
>
>
> This is going to depend on which values you want to keep.
>
>
> mysql> select * from testa;
> +------+------+------+
> | a | c | d |
> +------+------+------+
> | 1 | 2 | 3 |
> | 2 | 2 | 3 |
> | 3 | 3 | 4 |
> | 4 | 3 | 4 |
> +------+------+------+
> 4 rows in set (0.01 sec)
>
> mysql> select c,d,min(a) e,count(*) from testa group by c,d having
> count(*) > 1;
> +------+------+--------+----------+
> | c | d | e | count(*) |
> +------+------+--------+----------+
> | 2 | 3 | 1 | 2 |
> | 3 | 4 | 3 | 2 |
> +------+------+--------+----------+
> 2 rows in set (0.01 sec)
>
> mysql> select c,d,max(a) e from testa group by c,d having count(*) > 1;
>
> +------+------+------+
> | c | d | e |
> +------+------+------+
> | 2 | 3 | 2 |
> | 3 | 4 | 4 |
> +------+------+------+
> 2 rows in set (0.00 sec)
>
>
> test this by changing the "delete from" to "select * from"
>
> DELETE FROM some_table WHERE primaryKey NOT IN
> (SELECT MIN(primaryKey) FROM some_table GROUP BY some_column)

What I wound up doing very late last night (love working at midnight..
*sigh*).. was

CREATE TABLE temp_student SELECT min(id) AS id, student_first_name,
student_last_name, email, application_date, modification_date,
unique_key from student

DELETE s.* FROM student s, temp_student t WHERE s.id != t.id AND
s.student_first_name = t.student_first_name ...

It worked, however, it crashed both MySQL and Apache upon transacting
:(

Phil