Subselect optimieren...
am 04.02.2007 19:17:16 von plutoplanetHello World!
Ich habe zwei Tabellen: Eine für die Kunden und eine aller ihrer
Bestellungen. Letztendlich will nur die letzten Bestellungen meiner
Kunden sehen. Die Query hab ich unten angeführt. Bei höheren
Datenmengen ist diese aber sehr, sehr langsam. Kann mir da jemand
helfen?
Kundentabelle:
CREATE TABLE Customer (
Id INTEGER NOT NULL AUTO_INCREMENT,
Name VARCHAR(10),
INDEX (Id),
PRIMARY KEY (Id)
) ENGINE=3DInnoDB;
INSERT INTO Customer ( Name ) VALUES ( 'Peter' );
INSERT INTO Customer ( Name ) VALUES ( 'Paul' );
INSERT INTO Customer ( Name ) VALUES ( 'Mary' );
INSERT INTO Customer ( Name ) VALUES ( 'John' );
INSERT INTO Customer ( Name ) VALUES ( 'Travolta' );
SELECT * FROM Customer;
+----+----------+
| Id | Name |
+----+----------+
| 1 | Peter |
| 2 | Paul |
| 3 | Mary |
| 4 | John |
| 5 | Travolta |
+----+----------+
Tabelle mit Bestellungen:
CREATE TABLE Orders (
Id INTEGER NOT NULL AUTO_INCREMENT,
Date DATETIME,
CustomerId INTEGER NOT NULL,
INDEX (Id),
INDEX (Date),
INDEX (CustomerId),
FOREIGN KEY (CustomerId) REFERENCES Customer (Id),
PRIMARY KEY (Id)
) ENGINE=3DInnoDB;
Speichern wir mal ein paar Zufallsdaten in die Bestelltabelle:
INSERT INTO Orders ( DATE, CustomerId ) VALUES ( DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) ) );
INSERT INTO Orders ( DATE, CustomerId ) VALUES ( DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) ) );
INSERT INTO Orders ( DATE, CustomerId ) VALUES ( DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) ) );
INSERT INTO Orders ( DATE, CustomerId ) VALUES ( DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) ) );
INSERT INTO Orders ( DATE, CustomerId ) VALUES ( DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) ) );
INSERT INTO Orders ( DATE, CustomerId ) VALUES ( DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) ) );
INSERT INTO Orders ( DATE, CustomerId ) VALUES ( DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) ) );
INSERT INTO Orders ( DATE, CustomerId ) VALUES ( DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) ) );
INSERT INTO Orders ( DATE, CustomerId ) VALUES ( DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) ) );
INSERT INTO Orders ( DATE, CustomerId ) VALUES ( DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) ) );
INSERT INTO Orders ( DATE, CustomerId ) VALUES ( DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) ) );
mysql> SELECT * FROM Orders ORDER BY CustomerId, Date;
+----+---------------------+------------+
| Id | Date | CustomerId |
+----+---------------------+------------+
| 3 | 2006-12-09 20:57:05 | 1 |
| 2 | 2007-01-11 16:57:05 | 1 |
| 8 | 2007-01-18 08:57:05 | 1 |
| 10 | 2007-02-02 15:57:05 | 1 |
| 5 | 2007-01-31 16:57:05 | 2 |
| 9 | 2006-12-01 09:57:05 | 4 |
| 6 | 2006-12-22 20:57:05 | 4 |
| 4 | 2007-01-04 05:57:05 | 4 |
| 1 | 2007-01-19 00:57:05 | 4 |
| 7 | 2006-12-18 17:57:05 | 5 |
| 11 | 2006-12-31 10:57:05 | 5 |
+----+---------------------+------------+
11 rows in set (0.00 sec)
Wenn ich nun die jeweils letzten Bestellungen der Kunden Abfragen
will, dann tu ich das so:
SELECT
Customer.Id, Customer.Name,
Orders.Id, Orders.Date
FROM Orders
JOIN
(
SELECT CustomerId, MAX( Date ) AS OrdersDate
FROM Orders
GROUP BY CustomerId
) MaxOrder
ON Orders.CustomerId =3D MaxOrder.CustomerId AND Orders.Date =3D
MaxOrder.OrdersDate
JOIN Customer ON Customer.Id =3D Orders.CustomerId
GROUP BY Orders.CustomerId
ORDER BY Customer.Id
;
+----+----------+----+---------------------+
| Id | Name | Id | Date |
+----+----------+----+---------------------+
| 1 | Peter | 10 | 2007-02-02 15:57:05 |
| 2 | Paul | 5 | 2007-01-31 16:57:05 |
| 4 | John | 1 | 2007-01-19 00:57:05 |
| 5 | Travolta | 11 | 2006-12-31 10:57:05 |
+----+----------+----+---------------------+
5 rows in set (0.00 sec)
Das ist toll, denn das Ergebnis ist richtig!
Da wir das ganze aber nicht nur zum Spass machen, und wir uns viele
Bestellungen erwarten, werde ich diese mal vervielfachen:
INSERT INTO Orders ( DATE, CustomerId ) SELECT DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) )
FROM Orders;
INSERT INTO Orders ( DATE, CustomerId ) SELECT DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) )
FROM Orders;
INSERT INTO Orders ( DATE, CustomerId ) SELECT DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) )
FROM Orders;
INSERT INTO Orders ( DATE, CustomerId ) SELECT DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) )
FROM Orders;
INSERT INTO Orders ( DATE, CustomerId ) SELECT DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) )
FROM Orders;
INSERT INTO Orders ( DATE, CustomerId ) SELECT DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) )
FROM Orders;
INSERT INTO Orders ( DATE, CustomerId ) SELECT DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) )
FROM Orders;
INSERT INTO Orders ( DATE, CustomerId ) SELECT DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) )
FROM Orders;
INSERT INTO Orders ( DATE, CustomerId ) SELECT DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) )
FROM Orders;
INSERT INTO Orders ( DATE, CustomerId ) SELECT DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) )
FROM Orders;
INSERT INTO Orders ( DATE, CustomerId ) SELECT DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) )
FROM Orders;
INSERT INTO Orders ( DATE, CustomerId ) SELECT DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) )
FROM Orders;
INSERT INTO Orders ( DATE, CustomerId ) SELECT DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) )
FROM Orders;
INSERT INTO Orders ( DATE, CustomerId ) SELECT DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) )
FROM Orders;
INSERT INTO Orders ( DATE, CustomerId ) SELECT DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) )
FROM Orders;
INSERT INTO Orders ( DATE, CustomerId ) SELECT DATE_SUB( NOW(),
INTERVAL FLOOR( RAND() * 2000 ) HOUR ), FLOOR( 1 + ( RAND() * 5 ) )
FROM Orders;
SELECT COUNT(*) FROM Orders;
+----------+
| COUNT(*) |
+----------+
| 720896 |
+----------+
1 row in set (0.33 sec)
Bei der Abfrage nach den letzten Bestellungen bekomme ich auch ein
richtiges Ergebnis, aber es lässt mich sehr lange warten:
+----+----------+--------+---------------------+
| Id | Name | Id | Date |
+----+----------+--------+---------------------+
| 1 | Peter | 364660 | 2007-02-04 18:58:18 |
| 2 | Paul | 369953 | 2007-02-04 18:58:18 |
| 3 | Mary | 417947 | 2007-02-04 18:58:18 |
| 4 | John | 396817 | 2007-02-04 18:58:18 |
| 5 | Travolta | 366755 | 2007-02-04 18:58:18 |
+----+----------+--------+---------------------+
5 rows in set (6.77 sec)
Kann meine Query in irgendeiner Weise noch optimiert werden?
Danke,
Herwig