ISO SQL defines ROW_NUMBER() OVER as a "windowing" or "analytic" function with an optional PARTITION clause for generating a derived row number column in a resultset. Several RDBMSs—including DB2, Oracle and SQL Server—have long implemented it. It came into MariaDB with version 10.0.0, into MySQL with version 8.0.2.
The whole list of windowing/analytic functions:
Here is the simplest possible Row_Number() example. Given a table with two columns i and j, generate a resultset that has a derived sequential row_number column taking the values 1,2,3,... for a defined ordering of j which resets to 1 when the value of i changes: DROP TABLE IF EXISTS test; CREATE TABLE test(i int,j int); INSERT INTO test VALUES (3,31),(1,11),(4,14),(1,13),(2,21),(1,12),(2,22),(3,32),(2,23),(3,33);The result must look like this: +------+------+------------+ | i | j | row_number | +------+------+------------+ | 1 | 11 | 1 | | 1 | 12 | 2 | | 1 | 13 | 3 | | 2 | 21 | 1 | | 2 | 22 | 2 | | 2 | 23 | 3 | | 3 | 31 | 1 | | 3 | 32 | 2 | | 3 | 33 | 3 | | 4 | 14 | 1 | +------+------+------------+The SQL Server query that gives the correct answer is: SELECT i, j, ROW_NUMBER() OVER ( PARTITION BY i ORDER BY j ) AS row_number FROM test ORDER BY i,j;This works in MariaDB 10.2 and MySQL 8.0.2. Are you stuck if you can't use those versions? No. Here are two ways to get the desired result without Row_Number(). First, with user variables: SET @iprev=0, @jprev=0; SELECT i, j, row_number FROM ( SELECT j, @jprev := if(@iprev = i, @jprev+1, 1) AS row_number, @iprev := i AS i FROM test ORDER BY i,j ) AS tmp;The second method uses a join and aggregation, but is correct only if there are no duplicate values of j
SELECT a.i, a.j, count(*) as row_number FROM test a JOIN test b ON a.i=b.i AND a.j >= b.j GROUP BY a.i, a.j ;To simplify generalising these query patterns to multiple ordering columns, use self-documenting column names: DROP TABLE IF EXISTS test; CREATE TABLE test(partition int, ord1 int, ord2 int); INSERT INTO test VALUES (3,31,55),(1,11,19),(4,14,23),(2,22,8),(2,22,42),(1,13,56), (2,21,77),(2,21,7),(1,12,17),(2,23,92),(3,32,24),(3,33,62);The user variable solution easily handles multiple ordering columns: just add the ordering column names to the outer query's SELECT list and the inner query's ORDER BY list:
SET @partitionPrev=0, @ordPrev=0; SELECT partition, ord1, ord2, row_number FROM ( SELECT ord1, ord2, @ordPrev := If(@partitionPrev = partition, @ordPrev+1, 1) as row_number, @partitionPrev := partition AS partition FROM test ORDER BY partition, ord1, ord2 ) AS tmp; +-----------+------+------+------------+ | partition | ord1 | ord2 | row_number | +-----------+------+------+------------+ | 1 | 11 | 19 | 1 | | 1 | 12 | 17 | 2 | | 1 | 13 | 56 | 3 | | 2 | 21 | 7 | 1 | | 2 | 21 | 77 | 2 | | 2 | 22 | 8 | 3 | | 2 | 22 | 42 | 4 | | 2 | 23 | 92 | 5 | | 3 | 31 | 55 | 1 | | 3 | 32 | 24 | 2 | | 3 | 33 | 62 | 3 | | 4 | 14 | 23 | 1 | +-----------+------+------+------------+In the aggregating solution for multiple columns, the theta join gets more complex as ordering columns are added: SELECT a.partition, a.ord1, a.ord2, count(*) as row_number FROM test a JOIN test b ON a.partition=b.partition AND (a.ord1>b.ord1 OR (a.ord1=b.ord1 AND a.ord2>=b.ord2)) GROUP BY a.partition, a.ord1, a.ord2 ; |