Join or subquery?

from the Artful Common Queries page


Usually, a JOIN is faster than an uncorrelated subquery. For example in the sakila test database, customer is a parent of rental (via customer_id) which in turn is a parent of payment (via rental_id). The subquery version of a query for whether a customer has made payments and rentals...


SELECT DISTINCT c.customer_id

FROM customer c

WHERE c.customer_id IN (

   SELECT r.customer_id 

   FROM rental r

   JOIN payment p USING (rental_id) 

   WHERE c.customer_id = 599;

);



is eight times slower than the join version...

 

SELECT DISTINCT c.customer_id

FROM customer c

JOIN rental r USING (customer_id)

JOIN payment p USING (rental_id) 

WHERE c.customer_id = 599;



Running EXPLAIN on the two queries reveals why: the subquery version has to read most customer rows, while the join version proceeds inside out and discovers it needs to read just one customer row.

Return to the Artful Common Queries page