Better that `NOT IN`

Better that `NOT IN`

am 15.01.2010 06:01:50 von Junior Ortis

Hi guys i have a problem, 3 big tables: item_instance about 15KK rows,
character_inventory 15KK rows, guild_bank_item 2KK rows.

And i need i clean on item_instance how this query:

DELETE FROM `item_instance` WHERE guid NOT IN(SELECT item FROM
`character_inventory`) AND guid NOT IN(SELECT item_guid FROM
`guild_bank_item`) AND
guid NOT IN(SELECT item_guid FROM `mail_items`) and guid NOT IN(SELECT
itemguid FROM `auctionhouse`);

Well atm is running about 13 hours, State = Sending Data.

I will be a better option ?

Thanks all !

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Better that `NOT IN`

am 15.01.2010 08:00:10 von Peter Brawley

--------------030100080504030101070909
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

For alternatives, have a look at "The unbearable slowness of IN()" at
http://www.artfulsoftware.com/queries.php.

PB

-----

Junior Ortis wrote:
> Hi guys i have a problem, 3 big tables: item_instance about 15KK rows,
> character_inventory 15KK rows, guild_bank_item 2KK rows.
>
> And i need i clean on item_instance how this query:
>
> DELETE FROM `item_instance` WHERE guid NOT IN(SELECT item FROM
> `character_inventory`) AND guid NOT IN(SELECT item_guid FROM
> `guild_bank_item`) AND
> guid NOT IN(SELECT item_guid FROM `mail_items`) and guid NOT IN(SELECT
> itemguid FROM `auctionhouse`);
>
> Well atm is running about 13 hours, State = Sending Data.
>
> I will be a better option ?
>
> Thanks all !
>
>
> ------------------------------------------------------------ ------------
>
>
> No virus found in this incoming message.
> Checked by AVG - www.avg.com
> Version: 8.5.432 / Virus Database: 270.14.139/2620 - Release Date: 01/14/10 07:35:00
>
>

--------------030100080504030101070909--

Re: Better that `NOT IN`

am 15.01.2010 16:36:22 von Tom Worster

On 1/15/10 12:01 AM, "Junior Ortis" wrote:

> Hi guys i have a problem, 3 big tables: item_instance about 15KK rows,
> character_inventory 15KK rows, guild_bank_item 2KK rows.
>
> And i need i clean on item_instance how this query:
>
> DELETE FROM `item_instance` WHERE guid NOT IN(SELECT item FROM
> `character_inventory`) AND guid NOT IN(SELECT item_guid FROM
> `guild_bank_item`) AND
> guid NOT IN(SELECT item_guid FROM `mail_items`) and guid NOT IN(SELECT
> itemguid FROM `auctionhouse`);
>
> Well atm is running about 13 hours, State = Sending Data.
>
> I will be a better option ?

would something like this work?

DELETE FROM item_instance i
LEFT JOIN character_inventory c ON c.item=i.guid
LEFT JOIN guild_bank_item g ON g.item_guid=i.guid
LEFT JOIN mail_items m ON m.item_guid=i.guid
LEFT JOIN auctionhouse a ON a.itemguid=i.guid
WHERE c.item IS NULL
AND g.item_guid IS NULL
AND m.item_guid IS NULL
AND a.itemguid IS NULL



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

RE: Better that `NOT IN`

am 15.01.2010 21:51:55 von Daevid Vincent

Peter, this only addresses a sub-select scenario, which doesn't surprise me
it would be slow. How does IN() fare if you populate it with the ID's
directly? For example: IN(1,3,6,8,19,45,54...) ?

In the example, on the web page, we could have run this as two separate
queries. One for the 'inner' select to get an array of orderID and then
shove those back into the 'outter' query using PHP's implode() or
something. I suspect this would be significantly faster no?

> -----Original Message-----
> From: Peter Brawley [mailto:peter.brawley@earthlink.net]
> Sent: Thursday, January 14, 2010 11:00 PM
> To: Junior Ortis
> Cc: mysql@lists.mysql.com
> Subject: Re: Better that `NOT IN`
>
> For alternatives, have a look at "The unbearable slowness of IN()" at
> http://www.artfulsoftware.com/queries.php.
>
> PB
>
> -----
>
> Junior Ortis wrote:
> > Hi guys i have a problem, 3 big tables: item_instance about
> 15KK rows,
> > character_inventory 15KK rows, guild_bank_item 2KK rows.
> >
> > And i need i clean on item_instance how this query:
> >
> > DELETE FROM `item_instance` WHERE guid NOT IN(SELECT item FROM
> > `character_inventory`) AND guid NOT IN(SELECT item_guid FROM
> > `guild_bank_item`) AND
> > guid NOT IN(SELECT item_guid FROM `mail_items`) and guid
> NOT IN(SELECT
> > itemguid FROM `auctionhouse`);
> >
> > Well atm is running about 13 hours, State = Sending Data.
> >
> > I will be a better option ?
> >
> > Thanks all !
> >
> >
> >
> ------------------------------------------------------------ --
> ----------
> >
> >
> > No virus found in this incoming message.
> > Checked by AVG - www.avg.com
> > Version: 8.5.432 / Virus Database: 270.14.139/2620 -
> Release Date: 01/14/10 07:35:00
> >
> >
>


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Better that `NOT IN`

am 15.01.2010 23:23:22 von Dan Nelson

In the last episode (Jan 15), Daevid Vincent said:
> From: Peter Brawley [mailto:peter.brawley@earthlink.net]
> > Junior Ortis wrote:
> > > Hi guys i have a problem, 3 big tables: item_instance about 15K rows,
> > > character_inventory 15K rows, guild_bank_item 2K rows.
> > >
> > > And i need i clean on item_instance how this query:
> > >
> > > DELETE FROM `item_instance` WHERE guid NOT IN(SELECT item FROM
> > > `character_inventory`) AND guid NOT IN(SELECT item_guid FROM
> > > `guild_bank_item`) AND guid NOT IN(SELECT item_guid FROM `mail_items`)
> > > and guid NOT IN(SELECT itemguid FROM `auctionhouse`);
>
> Peter, this only addresses a sub-select scenario, which doesn't surprise
> me it would be slow. How does IN() fare if you populate it with the ID's
> directly? For example: IN(1,3,6,8,19,45,54...) ?

mysql's subquery optimizer is still pretty bad. It assumes that almost all
subqueries are dependant, and runs them for each outer record. It really
should realize that those queries were independant of the outer query, and
cache the results (or hoist the query out of the loop). Ideally the
performance should be identical to IN(list of constants).

> In the example, on the web page, we could have run this as two separate
> queries. One for the 'inner' select to get an array of orderID and then
> shove those back into the 'outter' query using PHP's implode() or
> something. I suspect this would be significantly faster no?

IN() using constants should be very efficient.

--
Dan Nelson
dnelson@allantgroup.com

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org