You track orders and their items in orders and orderdetails tables, as in the NorthWind database. How many of your orders have been for multiple items? We can use the standard SQL IN() operator to answer the question:
SELECT orderID FROM orders WHERE orderID IN ( SELECT orderID FROM orderdetails GROUP BY orderID HAVING COUNT(orderID) > 1 );Trouble is, in MySQL 5.0 this query takes 130 times longer to execute than a JOIN version of the same logic:
SELECT orderID FROM orders o JOIN orderdetails od USING (orderID) GROUP BY orderID HAVING COUNT(od.orderID) > 1Why? The query optimiser decides to execute the IN() query outside-in: it runs the subquery once for each outer query row it finds, so if there are 800 orders and 2000 items, it does 1,600,000 reads. On the other hand, it optimises the JOIN very well.
But here's a surprise. If a query has a EXPLAIN EXTENDED SELECT SQL_NO_CACHE orderID FROM orders WHERE orderID IN ( SELECT orderID FROM ( -- redundant FROM subquery SELECT orderID FROM orderdetails GROUP BY orderID HAVING COUNT(orderID) > 1 ) AS tmp );Sure enough, this version of IN() executes as fast as the JOIN version because the total number of reads is 800+2000=2,800, not 800*2000=1,600,000.
Optimisations in MySQL 5.5 reduce the speed difference from 130:1 to 5:1. With large tables, that will still be significant.
You wonder if SELECT orderID FROM orders o WHERE EXISTS ( SELECT orderID FROM orderdetails WHERE orderID = o.orderID GROUP BY orderID HAVING COUNT(orderID)>1 );In 5.0, EXISTS() is much faster than IN() , but slower than JOIN . In 5.5, EXISTS() performs about as well as JOIN .
For SELECT ... FROM a WHERE a.x NOT IN ( SELECT y FROM b );becomes ... SELECT ... FROM a LEFT JOIN b ON a.x=b.y WHERE b.y IS NULL;One more surprise. Running EXPLAIN EXTENDED on these queries does not predict all the performance differences. You have to benchmark the queries. So for problems like this, experiment with the MySQL 5.5/5.6 optimizer_switch variable to see whether your version of this problem is best solved by IN() , IN() with the extra subquery, EXISTS() , or JOIN .
For comparative analysis of |