RE: MySQL Bug: Outer Joins producing incorrect result on a single
am 02.10.2002 16:44:26 von Jean-Pierre PelletierHi,
I have observed these 4 queries to return the same 4 rows on both MS SQL
Server and DB2 and
I suspect on Oracle 9 too.
As there is very few documentation published on how to process multiple
joins,
I deduced the joining rules myself, I might be wrong on them, but the fact
remains
that Mysql gives a different result than the other DBMS.
How-To-Repeat:
SECOND CASE:
create table T1 (ID varchar(10));
create table T2 (ID varchar(10));
create table T3 (ID varchar(10));
insert into T1 values ('T1');
insert into T1 values ('T1, T2');
insert into T1 values ('T1, T3'); -- NEW
insert into T1 values ('T1, T2, T3');
insert into T2 values ('T2'); -- NEW
insert into T2 values ('T1, T2');
insert into T2 values ('T2, T3');
insert into T2 values ('T1, T2, T3');
insert into T3 values ('T3');
insert into T3 values ('T1, T3');
insert into T3 values ('T2, T3');
insert into T3 values ('T1, T2, T3'); -- NEW
-- (1)
-- ((T1 LEFT JOIN T2) LEFT JOIN T3)
SELECT
T1.ID AS T1,
T2.ID AS T2,
T3.ID AS T3
FROM
T1
LEFT OUTER JOIN T2
ON T1.ID = T2.ID
LEFT OUTER JOIN T3
ON T2.ID = T3.ID
ORDER BY
T1.ID,
T2.ID,
T3.ID;
-- (2)
-- (T3 RIGHT JOIN (T1 LEFT JOIN T2))
-- Uses nested joins, not supported by MySQL yet
SELECT
T1.ID AS T1,
T2.ID AS T2,
T3.ID AS T3
FROM
T3
RIGHT OUTER JOIN (T1
LEFT OUTER JOIN T2
ON T1.ID = T2.ID)
ON T3.ID = T2.ID
ORDER BY
T1.ID,
T2.ID,
T3.ID;
-- (3)
-- ((T2 LEFT JOIN T3) RIGHT JOIN T1)
SELECT
T1.ID AS T1,
T2.ID AS T2,
T3.ID AS T3
FROM
T2
LEFT OUTER JOIN T3
ON T2.ID = T3.ID
RIGHT OUTER JOIN T1
ON T1.ID = T2.ID
ORDER BY
T1.ID,
T2.ID,
T3.ID;
-- (4)
-- (T1 LEFT (T2 LEFT JOIN T3))
-- Uses nested joins, not supported by MySQL yet
SELECT
T1.ID AS T1,
T2.ID AS T2,
T3.ID AS T3
FROM
T1
LEFT OUTER JOIN (T2
LEFT OUTER JOIN T3
ON T2.ID = T3.ID)
ON T1.ID = T2.ID
ORDER BY
T1.ID,
T2.ID,
T3.ID;
All these queries should return these 4 rows:
but on MySQL (1) is ok but (3) is not.
T1 T2 T3
---------- ---------- ----------
T1 NULL NULL
T1, T2 T1, T2 NULL
T1, T2, T3 T1, T2, T3 T1, T2, T3
T1, T3 NULL NULL
Regards
Jean-Pierre Pelletier
-----Original Message-----
From: Vladimir V. Kolpakov [mailto:w@sfgate.com]
Sent: Wednesday, October 02, 2002 12:00 AM
To: Jean-Pierre Pelletier
Subject: Re: MySQL Bug: Outer Joins producing incorrect result on a
single join referencing multiple tables
Jean-Pierre, --
What you wrote can't be true, Monty pointed you:
> SELECT
> T1.ID AS T1,
> T2.ID AS T2,
> T3.ID AS T3
> FROM
> T2
>
> LEFT OUTER JOIN T3
> ON T2.ID = T3.ID
>
> RIGHT OUTER JOIN T1
> ON T1.ID = T2.ID
> ORDER BY
> T1.ID,
> T2.ID,
> T3.ID;
>
> The following 3 rows should be returned
>
> +------+------+------+
> | T1 | T2 | T3 |
> +------+------+------+
> | 1 | NULL | NULL |
> | 2 | 2 | NULL |
> | 3 | 3 | 3 |
> +------+------+------+
There T2 is leading table in joins,
so it's ID can't be NULL.
Seems to me you actually meant something like this
FROM
T1
LEFT JOIN T2
ON T2.ID = T1.ID
LEFT JOIN T1
ON T3.ID = T2.ID
isn't it?
It should give you this result:
+------+------+------+
| T1 | T2 | T3 |
+------+------+------+
| 1 | NULL | NULL |
| 2 | 2 | NULL |
| 3 | 3 | 3 |
+------+------+------+
Regards,
--Vladimir
On Tue, Oct 01, 2002 at 09:06:03AM -0700, Jean-Pierre Pelletier wrote:
> Hi,
>
> Here is how I think it should work.
> - Joins should be processed from left to right (in the absence of nested
> joins).
>
> - Each join is for exactly 2 table expressions.
>
> - The table expression on the left side of a join is the result of all
> preceding joins
> or the first table in the from clause if there are no preceding joins.
>
> - The table expression on the right side of a join is the single table
named
> unless
> nested joins are used.
>
> - I believed than to be consistent with this, in the on clause of a join,
> forward references to column of tables that haven't yet been joined
should
> be disallowed.
>
>
> These rules are for SQL-92 inner/outer join, I don't use much old style
> joins with commas.
> I am not sure how the joins should be ordered when old-style joins and
> outer-joins are used
> as in your last example.
>
>
> I haven't found much documentation about the join processing
> so what I know come from experience with Ms SQL Server
> and more recently DB2. I have used Oracle until 8i but it didn't have
> SQL-92 compliant outer joins before 9i.
>
>
> Here is the only reference I found about join processing
> but interestingly it's about multi-table joins and
> it's published by Oracle which interpretation of the standard
> shouldn't be taken lightly.
>
> Mutable Joins are those where more than two tables are joined.
> The SQL: 1999 standard assumes the tables are joined from the left to the
> right,
> with the join conditions only being able to reference columns relating to
> the current join and any previous joins to the left
> http://www.oracle-base.com/Articles/9i/ANSIISOSQLSupport.asp
>
>
> The above rules applied to the 2 cases I submitted would give
> what I called the right answers.
>
>
> You said:
> "you would get a lot of problems using LEFT join in
> combination with many tables where you only want to guard against
> missing rows in one the tables (which I believe is the most common way
> to use LEFT join)."
>
> It's possible to apply the rules I described without problems
> that's what MS SQL Server and DB2 do and Oracle has published
> their interpretation of the SQL-1999 rules to be along the same lines.
>
> How-To-Repeat:
>
> FIRST CASE:
>
> CREATE TABLE Object (Object_ID INTEGER NOT NULL, Object_Table VARCHAR(6)
NOT
> NULL);
> CREATE TABLE NMID (Appliance_ID INTEGER NOT NULL, NMID_NMID INTEGER,
> Object_ID INTEGER NOT NULL);
> CREATE TABLE Health (Appliance_ID INTEGER NOT NULL, Health_NMID INTEGER
NOT
> NULL, Health_DeviceOrPort VARCHAR(6) NOT NULL);
>
> INSERT INTO Object(Object_ID, Object_Table) VALUES (101,'Device');
> INSERT INTO Object(Object_ID, Object_Table) VALUES (201,'Device');
> INSERT INTO Object(Object_ID, Object_Table) VALUES (202,'Port');
> INSERT INTO Object(Object_ID, Object_Table) VALUES (302,'Port');
>
> INSERT INTO NMID (Appliance_ID, NMID_NMID, Object_ID) VALUES (0, 1, 101);
> INSERT INTO NMID (Appliance_ID, NMID_NMID, Object_ID) VALUES (0, 2, 201);
> INSERT INTO NMID (Appliance_ID, NMID_NMID, Object_ID) VALUES (0, 2, 202);
> INSERT INTO NMID (Appliance_ID, NMID_NMID, Object_ID) VALUES (0, 3, 302);
>
> INSERT INTO Health (Appliance_ID, Health_NMID, Health_DeviceOrPort) VALUES
> (0, 1, 'Device');
> INSERT INTO Health (Appliance_ID, Health_NMID, Health_DeviceOrPort) VALUES
> (0, 2, 'Device');
> INSERT INTO Health (Appliance_ID, Health_NMID, Health_DeviceOrPort) VALUES
> (0, 3, 'Device');
> INSERT INTO Health (Appliance_ID, Health_NMID, Health_DeviceOrPort) VALUES
> (0, 4, 'Device');
>
> ===> THE RIGHT JOIN REFERENCES ALL THREE TABLES, it can't be processed
first
>
> SELECT DISTINCT
> H.Appliance_ID, H.Health_NMID, H.Health_DeviceOrPort
> FROM
> Object O
>
> INNER JOIN NMID N
> ON O.Object_ID = N.Object_ID
>
> RIGHT OUTER JOIN Health H
> ON H.Appliance_ID = N.Appliance_ID
> AND H.Health_NMID = N.NMID_NMID
> AND H.Health_DeviceOrPOrt = O.Object_Table
> WHERE
> H.Appliance_ID = 0
> AND O.Object_ID IS NULL;
>
> The following 2 rows should be returned
>
> Appliance_ID Health_NMID Health_DeviceOrPort
> ------------ ----------- -------------------
> 0 3 Device
> 0 4 Device
>
>
> On MYSQL, This incorrectly returns no rows.
>
>
> SECOND CASE)
>
>
> create table T1 (ID integer);
> create table T2 (ID integer);
> create table T3 (ID integer);
>
> insert into T1 values (1);
> insert into T1 values (2);
> insert into T1 values (3);
>
> insert into T2 values (2);
> insert into T2 values (3);
> insert into T2 values (4);
>
> insert into T3 values (3);
> insert into T3 values (4);
> insert into T3 values (5);
>
> SELECT
> T1.ID AS T1,
> T2.ID AS T2,
> T3.ID AS T3
> FROM
> T2
>
> LEFT OUTER JOIN T3
> ON T2.ID = T3.ID
>
> RIGHT OUTER JOIN T1
> ON T1.ID = T2.ID
> ORDER BY
> T1.ID,
> T2.ID,
> T3.ID;
>
> The following 3 rows should be returned
>
> +------+------+------+
> | T1 | T2 | T3 |
> +------+------+------+
> | 1 | NULL | NULL |
> | 2 | 2 | NULL |
> | 3 | 3 | 3 |
> +------+------+------+
>
> MySQL incorrectly returns these 9 rows.
>
> +------+------+------+
> | T1 | T2 | T3 |
> +------+------+------+
> | 1 | 2 | NULL |
> | 1 | 3 | NULL |
> | 1 | 4 | NULL |
> | 2 | 2 | NULL |
> | 2 | 3 | NULL |
> | 2 | 4 | NULL |
> | 3 | 2 | NULL |
> | 3 | 3 | 3 |
> | 3 | 4 | NULL |
> +------+------+------+
>
>
> Regards,
> Jean-Pierre Pelletier
>
> -----Original Message-----
> From: Michael Widenius [mailto:monty@mysql.com]
> Sent: Tuesday, October 01, 2002 2:04 AM
> To: Jean-Pierre Pelletier
> Cc: bugs@lists.mysql.com; Mark Matthews; Robert Berman; Eric Neron;
> Matthew Darwin; Allan Mertner
> Subject: RE: MySQL Bug: Outer Joins producing incorrect result on a
> single join referencing multiple tables
>
>
>
> Hi!
>
> >>>>> "Jean-Pierre" == Jean-Pierre Pelletier
>
>
> JP> How-To-Repeat:
>
> JP> FIRST CASE:
>
> JP> CREATE TABLE Object (Object_ID INTEGER NOT NULL, Object_Table
VARCHAR(6)
> NOT
> JP> NULL);
>
>
>
> JP> ===> THE RIGHT JOIN REFERENCES ALL THREE TABLES, it can't be processed
> first
>
> JP> SELECT DISTINCT
> JP> H.Appliance_ID, H.Health_NMID, H.Health_DeviceOrPort
> JP> FROM
> JP> Object O
>
> JP> INNER JOIN NMID N
> JP> ON O.Object_ID = N.Object_ID
>
> JP> RIGHT OUTER JOIN Health H
> JP> ON H.Appliance_ID = N.Appliance_ID
> JP> AND H.Health_NMID = N.NMID_NMID
> JP> AND H.Health_DeviceOrPOrt = O.Object_Table
> JP> WHERE
> JP> H.Appliance_ID = 0
> JP> AND O.Object_ID IS NULL;
>
> JP> The following 2 rows should be returned
>
> JP> Appliance_ID Health_NMID Health_DeviceOrPort
> JP> ------------ ----------- -------------------
> JP> 0 3 Device
> JP> 0 4 Device
>
>
> JP> On MYSQL, This incorrectly returns no rows.
>
> If you run EXPLAIN on the above query you get:
>
> +-----------------------------------------------------+
> | Comment |
> +-----------------------------------------------------+
> | Impossible WHERE noticed after reading const tables |
> +-----------------------------------------------------+
> 1 row in set (0.00 sec)
>
> This is because MySQL notices that O.Object_ID can never be NULL and
> thus the query is impossible.
>
> The logic MySQL uses with RIGHT JOIN is that if the RIGHT table (in
> this case table H) doesn't have a any rows that satisfies the ON
> clause then we generate a NULL-row for this table. MySQL doesn't
> generate any NULL-rows for the O table as this is not part of the
> RIGHT join.
>
> I don't know if this is the correct way to do this; I have always
> thought if you don't have braces RIGHT JOIN and LEFT JOIN will only
> generate a NULL-row for the table that is immediate before or after,
> not to a set of previous tables.
>
> I did a new check in 'SQL-99 Complete, Really' but could not find
> anything that proves or disapproves this logic :(
> Do you have any reference that shows how this should work when there
> is more than two tables involved ?
>
>
>
> JP> create table T1 (ID integer);
> JP> create table T2 (ID integer);
> JP> create table T3 (ID integer);
>
> JP> insert into T1 values (1);
> JP> insert into T1 values (2);
> JP> insert into T1 values (3);
>
> JP> insert into T2 values (2);
> JP> insert into T2 values (3);
> JP> insert into T2 values (4);
>
> JP> insert into T3 values (3);
> JP> insert into T3 values (4);
> JP> insert into T3 values (5);
>
> JP> SELECT
> JP> T1.ID AS T1,
> JP> T2.ID AS T2,
> JP> T3.ID AS T3
> JP> FROM
> JP> T2
>
> JP> LEFT OUTER JOIN T3
> JP> ON T2.ID = T3.ID
>
> JP> RIGHT OUTER JOIN T1
> JP> ON T1.ID = T2.ID
> JP> ORDER BY
> JP> T1.ID,
> JP> T2.ID,
> JP> T3.ID;
>
> JP> The following 3 rows should be returned
>
> JP> +------+------+------+
> JP> | T1 | T2 | T3 |
> JP> +------+------+------+
> JP> | 1 | NULL | NULL |
> JP> | 2 | 2 | NULL |
> JP> | 3 | 3 | 3 |
> JP> +------+------+------+
>
> JP> MySQL incorrectly returns these 9 rows.
>
> JP> +------+------+------+
> JP> | T1 | T2 | T3 |
> JP> +------+------+------+
> JP> | 1 | 2 | NULL |
> JP> | 1 | 3 | NULL |
> JP> | 1 | 4 | NULL |
> JP> | 2 | 2 | NULL |
> JP> | 2 | 3 | NULL |
> JP> | 2 | 4 | NULL |
> JP> | 3 | 2 | NULL |
> JP> | 3 | 3 | 3 |
> JP> | 3 | 4 | NULL |
> JP> +------+------+------+
>
> First, the result you gave can't be correct as T2 can never be NULL.
>
> An EXPLAIN for the above shows the join order MySQL uses:
>
> ----------------+
> | table | type | possible_keys | key | key_len | ref | rows | Extra
> |
>
+-------+------+---------------+------+---------+------+---- --+-------------
> --------------------+
> | T2 | ALL | NULL | NULL | NULL | NULL | 3 | Using
> temporary; Using filesort |
> | T1 | ALL | NULL | NULL | NULL | NULL | 3 |
> |
> | T3 | ALL | NULL | NULL | NULL | NULL | 3 |
> |
>
+-------+------+---------------+------+---------+------+---- --+-------------
> --------------------+
>
> In other words, MySQL will solve the JOIN the following way:
>
> - Read a row from T2
> - Read through all rows from T1
> - Read through all rows from T3
> If there is no rows matching the ON clause generate an extra
> NULL-row in T3
>
> - As T2 is used on the LEFT side of a LEFT join, we don't generate
> NULL-rows for this.
> - As T1 is on the RIGHT side of a RIGHT join, we don't generate NULL
> rows for this.
>
> I think the above is the only logical way to do this. If this would
> not be the logic, you would get a lot of problems using LEFT join in
> combination with many tables where you only want to guard against
> missing rows in one the tables (which I belive is the most common way
> to use LEFT join).
>
> For example:
>
> SELECT * from customer_group, order LEFT JOIN customer USING
> (customer_id), product WHERE ....
>
> In this case you don't want to have generated null-rows for the
> customer_group, product or order tables as these is not part of the
> left join.
>
> To repeat: MySQL interpretes the above the above LEFT join to mean
>
> - You have to read 'order' before customer.
> - If there is no matching customer, generate a null row in the
> customer table.
> - Other tables are not affected by the LEFT join.
>
>
> Regards,
> Monty
>
> --
> For technical support contracts, goto https://order.mysql.com/
> __ ___ ___ ____ __
> / |/ /_ __/ __/ __ \/ / Mr. Michael Widenius
> / /|_/ / // /\ \/ /_/ / /__ MySQL AB, CTO
> /_/ /_/\_, /___/\___\_\___/ Helsinki, Finland
> <___/ www.mysql.com
>
> ------------------------------------------------------------ ---------
> Before posting, please check:
> http://www.mysql.com/manual.php (the manual)
> http://lists.mysql.com/ (the list archive)
>
> To request this thread, e-mail bugs-thread12632@lists.mysql.com
> To unsubscribe, e-mail
--
--w ----
Vladimir Kolpakov (415) 865-2882 /|\/|\ 201 Potrero Avenue
Software Engineer (415) 999-0985 ~I~~I~ San Francisco, CA 94103
------------------------------------------------------------ --------
------------------------------------------------------------ ---------
Before posting, please check:
http://www.mysql.com/manual.php (the manual)
http://lists.mysql.com/ (the list archive)
To request this thread, e-mail bugs-thread12633@lists.mysql.com
To unsubscribe, e-mail