Find missing numbers in a sequence

from the Artful Common Queries page


You have a table tbl(id int) with values (1,2,4,18,19,20,21), and you wish to find the first missing number in its sequence of id values:
SELECT t1.id+1 AS Missing
FROM tbl      AS t1
LEFT JOIN tbl AS t2 ON t1.id+1 = t2.id
WHERE t2.id IS NULL
ORDER BY id LIMIT 1;
+---------+
| Missing |
+---------+
|       3 |
+---------+
For all the gaps, including gaps of more than 1 value, aggregate on an inequality join ...
SELECT a.id+1 AS 'Missing From', MIN(b.id)-1 AS 'Through'
FROM tbl AS a
JOIN tbl AS b ON a.id < b.id
GROUP BY a.id
HAVING a.id+1 < through;
+--------------+-----------+
| Missing From | Through   |
+--------------+-----------+
|            3 |         3 |
|            5 |        17 |
+--------------+-----------+
We often need such lists, so the query is a natural for a stored procedure that finds missing sequence values in any table:
DROP PROCEDURE IF EXISTS MissingInSeq;
DELIMITER |
CREATE PROCEDURE 
  MissingInSeq( db VARCHAR(64), tbl VARCHAR(64), col VARCHAR(64) 
)
BEGIN
  SET @sql = CONCAT( "SELECT  a.", col, 
                     "+1 AS 'Missing From',", 
                     "MIN(b.", 
                     col, 
                     ") - 1 AS 'Through' FROM ",
                     db, 
                     ".", 
                     tbl, 
                     " AS a,", 
                     db, 
                     ".", 
                     tbl, 
                     " AS b WHERE a.", 
                     col, 
                     " < b.", 
                     col, 
                     " GROUP BY a.", 
                     col, 
                     " HAVING a.", 
                     col, 
                     " < MIN(b.", 
                     col, 
                     ") - 1" 
                   ); 
  -- SELECT @sql;
  PREPARE stmt FROM @sql;
  EXECUTE stmt;
  DROP PREPARE stmt;
END;
|
DELIMITER ;

A much quicker algorithm

For row counts up to a few thousand, the above approach is fine. For 105 or more rows, Dan Black, a performance improvement engineer at IBM, developed this gem ...
set @last=-1 ;
CREATE TEMPORARY TABLE v(gap varbinary(30) PRIMARY KEY, next int unsigned)
  IGNORE
  SELECT IF(@last=-1 OR @last=a.id, 'Exists', CONCAT(@last,'-',id-1)) as gap,
         @last:=a.id+1 as next
  FROM target_table AS a
  ORDER BY id;
SELECT gap FROM v where gap!='Exists';
On a modest laptop it found 8,000 gaps in a 2-million-row table in 0.01 sec. How does it do that? CREATE TABLE...IGNORE...SELECT... silently drops all dupes, so it flies through target_table so long as it can do the whole operation without swapping.

Auto_increment gaps

Auto_increment doesn't guarantee sequentiality, but sometimes—e.g., when you're initialising a production lookup table with predefined values—it's necessary to tighten up a table's auto_increment sequence.

Until someone figures out how to adapt Dan's solution to this problem, a simple method assuming table tbl with id int unsigned primary key auto_increment: drop the auto_increment primary key, set all id values null, then add the auto_increment primary key back in, whereupon MySQL will autofill the column sequentially ...

alter table tbl modify id int unsigned, drop primary key;
update tbl set id=null;
alter table tbl add primary key(id), modify id int unsigned auto_increment;
Note that you have to remove the auto_increment attribute before dropping the primary key, and restore the primary key before restoring the auto_increment attribute.

This won't do, though, if auto_increment values need to be ordered by another column value. To order an auto_increment ID on column othercol ...

alter table tbl modify id int unsigned, drop primary key;
set @count=0;
update tbl set id=(@count:=@count+1) order by othercol;
alter table tbl add primary key(id), modify id int unsigned auto_increment;

Last updated 28 Mar 2020