Duplicate entries despite group by

Duplicate entries despite group by

am 21.02.2010 04:14:55 von Yang Zhang

I have the following table:

CREATE TABLE `graph` (
`tableid1` varchar(20) NOT NULL,
`tupleid1` int(11) NOT NULL,
`tableid2` varchar(20) NOT NULL,
`tupleid2` int(11) NOT NULL,
`node1` int(11) NOT NULL,
`node2` int(11) NOT NULL,
`weight` int(10) NOT NULL,
PRIMARY KEY (`tableid1`,`tupleid1`,`tableid2`,`tupleid2`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

and I'm running this query (note the 'group by'):

insert into graph (node1, node2, tableid1, tupleid1, tableid2,
tupleid2, weight)
select 0, 0, a.tableid, a.tupleid, b.tableid, b.tupleid, count(*)
from transactionlog a, transactionlog b
where a.transactionid = b.transactionid
and (a.tableid, a.tupleid) {'<'} (b.tableid, b.tupleid)
group by a.tableid, a.tupleid, b.tableid, b.tupleid

However, after running for a few hours, the query fails with the
following error:

com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViol ationException:
Duplicate entry 'new_order-248642-order_line-13126643' for key
'group_key'

How is this possible? There were no concurrently running queries
inserting into 'graph'. I'm using mysql-5.4.3; is this a beta
bug/anyone else happen to know something about this? Thanks in
advance.
--
Yang Zhang
http://www.mit.edu/~y_z/

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Duplicate entries despite group by

am 21.02.2010 10:49:47 von Carsten Pedersen

Is the CREATE TABLE you show the result of SHOW CREATE TABLE or your own
create statement? If the latter, please show the output of SHOW CREATE.

Does SELECT succeed if you remove the INSERT part of the statement?

You might want to consider adding an index on transactionlogid, this
could bring down query time significantly.

/ Carsten

Yang Zhang skrev:
> I have the following table:
>
> CREATE TABLE `graph` (
> `tableid1` varchar(20) NOT NULL,
> `tupleid1` int(11) NOT NULL,
> `tableid2` varchar(20) NOT NULL,
> `tupleid2` int(11) NOT NULL,
> `node1` int(11) NOT NULL,
> `node2` int(11) NOT NULL,
> `weight` int(10) NOT NULL,
> PRIMARY KEY (`tableid1`,`tupleid1`,`tableid2`,`tupleid2`)
> ) ENGINE=MyISAM DEFAULT CHARSET=latin1
>
> and I'm running this query (note the 'group by'):
>
> insert into graph (node1, node2, tableid1, tupleid1, tableid2,
> tupleid2, weight)
> select 0, 0, a.tableid, a.tupleid, b.tableid, b.tupleid, count(*)
> from transactionlog a, transactionlog b
> where a.transactionid = b.transactionid
> and (a.tableid, a.tupleid) {'<'} (b.tableid, b.tupleid)
> group by a.tableid, a.tupleid, b.tableid, b.tupleid
>
> However, after running for a few hours, the query fails with the
> following error:
>
> com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViol ationException:
> Duplicate entry 'new_order-248642-order_line-13126643' for key
> 'group_key'
>
> How is this possible? There were no concurrently running queries
> inserting into 'graph'. I'm using mysql-5.4.3; is this a beta
> bug/anyone else happen to know something about this? Thanks in
> advance.
> --
> Yang Zhang
> http://www.mit.edu/~y_z/
>

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org