What exams did a student not register for?

from the Artful Common Queries page


We have a students table, an exams table, and a registrations table in which each row registers one student for one exam. How do we find the exams for which a particular student is not registered?
DROP TABLE IF EXISTS students, exams, registrations;
CREATE TABLE students (
  sid int(10) unsigned PRIMARY KEY auto_increment,
  firstname varchar(45) NOT NULL default '',
  lastname varchar(45) NOT NULL default ''
);
INSERT INTO students VALUES 
(1, 'Jack', 'Malone'),(2, 'Hiro', 'Nakamura'),(3, 'Bree', 'Van de Kamp'),
(4, 'Susan', 'Mayer'),(5, 'Matt', 'Parkman'),(6, 'Claire', 'Bennet');

CREATE TABLE exams (
  eid int(10) unsigned PRIMARY KEY auto_increment,
  exam_name varchar(45) NOT NULL default '',
  active smallint(5) unsigned NOT NULL default '0'
);
INSERT INTO exams VALUES 
(1, 'Javascript Expert', 1),(2, 'Lost Survival Course', 0),(3, 'Zend PHP Certification', 1),
(4, 'Superhero Advanced Skills', 1),(5, 'Desperation Certificate', 1);

CREATE TABLE registrations (
  registration_id int(11) PRIMARY KEY auto_increment,
  eid int(10) unsigned NOT NULL default '0',
  sid int(10) unsigned NOT NULL default '0',
  registration_date datetime NOT NULL default '0000-00-00 00:00:00'
);
INSERT INTO registrations (registration_id, eid, sid, registration_date) VALUES 
(1, 5, 14, '2007-10-25 00:00:00'),(2, 5, 3, '0000-00-00 00:00:00'),
(3, 5, 4, '2007-10-23 00:00:00'),(4, 4, 2, '2007-10-16 00:00:00'),
(5, 4, 5, '2007-10-22 00:00:00'),(6, 4, 6, '2007-10-23 00:00:00'), 
(7, 5, 2, '2007-10-23 00:00:00');
This is another instance of the All X for which there is no Y pattern, easily solved with either a Not Exists subquery or an Exclusion join. Exclusion joins usually perform better, but here is a wrinkle (thanks to Pascal Mitride for this example): we might expect to join registrations to students to get student info into the result, yet the registrations table will be the object of the exclusion join, so how do we retrieve the required student info?

A solution is to left join exams to a subquery: students left join registrations, restricted by a WHERE clause specifying the target student. The subquery encapsulates conditions on the joined table. Then we can impose the IS NULL condition on the subquery:

SELECT e.exam_name FROM exams AS e       -- all exams
LEFT JOIN (
  SELECT eid                             -- Hiro's exams
  FROM students s LEFT JOIN registrations r ON s.sid=r.sid 
  WHERE s.firstname='Hiro' AND s.lastname='Nakamura'
) AS tmp ON e.eid=tmp.eid
WHERE tmp.eid IS NULL;                   -- exclusion
+------------------------+
| exam_name              |
+------------------------+
| Javascript Expert      |
| Lost Survival Course   |
| Zend PHP Certification |
+------------------------+
A query showing Hiro's registration or not for all exams proves our logic correct:
SELECT e.exam_name,IF(tmp.sid IS NULL, 'No', 'Yes') AS 'Hiro registered'
FROM exams e
LEFT JOIN (
  SELECT eid 
  FROM students s LEFT JOIN registrations r ON s.sid=r.sid 
  WHERE s.firstname='Hiro' AND s.lastname='Nakamura'
) tmp ON e.eid=tmp.eid;
+---------------------------+-----------------+
| exam_name                 | Hiro registered |
+---------------------------+-----------------+
| Javascript Expert         | No              |
| Lost Survival Course      | No              |
| Zend PHP Certification    | No              |
| Superhero Advanced Skills | Yes             |
| Desperation Certificate   | Yes             |
+---------------------------+-----------------+
Beginners often have trouble working out how to write this kind of query, especially when the LEFT JOIN condition requires comparison with a literal value on the right side, thereby making the LEFT JOIN, effectively, an INNER JOIN and undermining the query design.

The method is to subtract one set (here, the exams Hiro registered for) from another (all exams). Start by writing a query expression for the set to be subtracted, then write the query expression to be subtracted from, then LEFT JOIN the two, then add the IS NULL condition:

1. Write a query expression to retrieve rows which positively match the exclusion condition. In simple cases, this may be a simple table reference. Here we need it to be an inner query for the exams Hiro Nakamura did register for...

  SELECT eid 
  FROM students s LEFT JOIN registrations r ON s.sid=r.sid 
  WHERE s.firstname='Hiro' AND s.lastname='Nakamura'
2. Write a query to retrieve the rows from which the result of [1] is to be subtracted--here, simply all exams...
SELECT e.exam_name FROM exams e 
3. Left join [2] to [1] on the appropriate key...
SELECT e.exam_name FROM exams AS e
LEFT JOIN (
  SELECT eid 
  FROM students s LEFT JOIN registrations r ON s.sid=r.sid 
  WHERE s.firstname='Hiro' AND s.lastname='Nakamura'
) AS tmp ON e.eid=tmp.eid
4. Add a WHERE clause condition requiring that some column in the inner query evaluates to NULL.
WHERE tmp.eid IS NULL;

Return to the Artful Common Queries page