Added in MySQL 8 and MariaDB 10, SQL window-analytic functions extend and refine the concept of aggregation.
The windowing function OVER() works with all aggregate functions except GROUP_CONCAT() —essentially making them window-aware—and works also with window-analytic functions:
CUME_DIST() |
Cumulative distribution value |
DENSE_RANK() |
Rank of current row in its window without gaps |
FIRST_VALUE() |
Argument value from first row of window |
LAG() |
Argument value from row lagging current row in window |
LAST_VALUE() |
Argument value from last row of window |
LEAD() |
Argument value from row leading current row in window |
NTH_VALUE() |
Argument value from Nth row of window |
NTILE() |
Bucket number of current row in its window |
PERCENT_RANK() |
Cumulative percent rank of argument value |
RANK() |
Current row rank within its window, with gaps |
ROW_NUMBER() |
Current row number in its window |
This is the toy table from http://mysqlserverteam.com/mysql-8-0-2-introducing-window-functions:
CREATE TABLE sales(employee VARCHAR(8), `date` DATE, sale INT);
INSERT INTO sales VALUES
('odin', '2025-03-01', 200),('odin', '2024-04-01', 300),
('odin', '2024-05-01', 400),('thor', '2025-03-01', 400),
('thor', '2024-04-01', 300),('thor', '2024-05-01', 500);
Aggregate functions suppress display of the rows they aggregate ...
SELECT SUM(sale) AS sum FROM sales;
+------+
| sum |
+------+
| 2100 |
+------+
... but adding in the OVER() windowing function undoes that row suppression, so individual rows that have been aggregated appear in the result ...
SELECT employee, SUM(sale) OVER() AS sum FROM sales;
+----------+------+
| employee | sum |
+----------+------+
| odin | 2100 |
| odin | 2100 |
| odin | 2100 |
| thor | 2100 |
| thor | 2100 |
| thor | 2100 |
+----------+------+
OVER() accepts two optional arguments, a PARTITION BY clause specifying how to partition row groups into windows, and an ORDER BY clause which orders window rows and makes their aggregate results row-by-row cumulative.
OVER() with only a PARTITION BY clause partitions aggregate results, but each row shows the same aggregate result for its window:
SELECT
employee, date, sale,
SUM(sale) OVER(PARTITION BY employee) AS sum
FROM sales;
+----------+------------+------+------+
| employee | date | sale | sum |
+----------+------------+------+------+
| odin | 2025-03-01 | 200 | 900 |
| odin | 2024-04-01 | 300 | 900 |
| odin | 2024-05-01 | 400 | 900 |
| thor | 2025-03-01 | 400 | 1200 |
| thor | 2024-04-01 | 300 | 1200 |
| thor | 2024-05-01 | 500 | 1200 |
+----------+------------+------+------+
As you'd expect, OVER() with only an ORDER BY clause orders by the ORDER BY clause, and accumulates by it:
SELECT employee, date, sale, SUM(sale) OVER(ORDER BY date) AS sum
FROM sales;
+----------+------------+------+------+
| employee | date | sale | sum |
+----------+------------+------+------+
| odin | 2025-03-01 | 200 | 600 | (sum for 2025-03-01)
| thor | 2025-03-01 | 400 | 600 |
| odin | 2024-04-01 | 300 | 1200 | (sum for 2024-04-01)
| thor | 2024-04-01 | 300 | 1200 |
| odin | 2024-05-01 | 400 | 2100 | (sum for 2024-05-01)
| thor | 2024-05-01 | 500 | 2100 |
+----------+------------+------+------+
As you might not expect, when there is no PARTITION BY clause, aggregate results repeat within each window. Why?
ORDER BY accepts two kinds of ordering, ROWS which means by physical row, and RANGE which means by value. RANGE is the default. Therefore ORDER BY x with no modifier, as in the above query, means ORDER BY x RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW .
(How did SQL designers come to take UNBOUNDED to mean "begin with the first item found"? I'd give worlds to know.)
To get ORDER BY x to accumulate by each row value of x instead of by each distinct value found, specify the ROWS option instead:
SELECT
employee, sale, date,
SUM(sale) OVER(ORDER BY date ROWS
BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS cum_sales
FROM sales;
+----------+------+------------+-----------+
| employee | sale | date | cum_sales |
+----------+------+------------+-----------+
| odin | 200 | 2025-03-01 | 200 |
| thor | 400 | 2025-03-01 | 600 |
| odin | 300 | 2024-04-01 | 900 |
| thor | 300 | 2024-04-01 | 1200 |
| odin | 400 | 2024-05-01 | 1600 |
| thor | 500 | 2024-05-01 | 2100 |
+----------+------+------------+-----------+
We get the full potential of OVER() by giving it both PARTITION BY and ORDER BY arguments:
SELECT
employee, date, sale,
SUM(sale) OVER(PARTITION BY employee ORDER BY date) AS sum
FROM sales;
+----------+------------+------+------+
| employee | date | sale | sum |
+----------+------------+------+------+
| odin | 2025-03-01 | 200 | 200 | (odin sum starts to accumulate)
| odin | 2024-04-01 | 300 | 500 |
| odin | 2024-05-01 | 400 | 900 |
| thor | 2025-03-01 | 400 | 400 | (thor sum starts to accumulate)
| thor | 2024-04-01 | 300 | 700 |
| thor | 2024-05-01 | 500 | 1200 |
+----------+------------+------+------+
Windowing functions
OVER() works with all aggregate functions except GROUP_CONCAT() , also with the windowing functions implemented since MySQL 8 and MariaDB 10. Mostly, these functions compute row ranking statistics, either across the whole resultset or within windows created by OVER() .
Adding windowing functionality to a query involves adding at least two elements:
1. A call to any standard aggregating function except GROUP_CONCAT() , or to one of the windowing functions (listed in the table near the top of this article), and
2. A call to OVER() , specifying the rowset (or partition or window) to which the above aggregating or windowing function should be applied.
See "Calling window functions" below for info on calling these functions, and ample queries using them. The article "Within-group quotas (Top N per group)" illustrates how to use ROW_NUMBER() to rank rows within bands or rowsets.
Given the toy table ...
drop table if exists tbl;
create table tbl(foo char(1), bar int);
insert into tbl values
('a',7),('b',5),('c',1),('d',9),('e',6),
('f',3),('g',2),('h',8),('i',5),('j',0);
... this query illustrates some of the row-to-row logic of windowing functions, applied to the whole resultset code>win ...
SELECT
bar as Val,
ROW_NUMBER() OVER win AS 'RowNo',
LAG(bar) OVER win AS 'Lag',
LAST_VALUE(bar) OVER win AS 'LastVal',
CUME_DIST() OVER win AS 'CumeDist',
100*ROUND( PERCENT_RANK() OVER win, 1 ) AS 'PctRank',
DENSE_RANK() OVER win AS 'DenseRank'
FROM tbl
WINDOW win AS (ORDER BY bar);
+-----+-------+------+---------+----------+---------+-----------+
| Val | RowNo | Lag | LastVal | CumeDist | PctRank | DenseRank |
+-----+-------+------+---------+----------+---------+-----------+
| 0 | 1 | NULL | 0 | 0.1 | 0.0 | 2 |
| 1 | 2 | 0 | 1 | 0.2 | 10.0 | 3 |
| 2 | 3 | 1 | 2 | 0.3 | 20.0 | 4 |
| 3 | 4 | 2 | 3 | 0.4 | 30.0 | 5 |
| 5 | 5 | 3 | 5 | 0.6 | 40.0 | 6 |
| 5 | 6 | 5 | 5 | 0.6 | 40.0 | 6 |
| 6 | 7 | 5 | 6 | 0.7 | 70.0 | 7 |
| 7 | 8 | 6 | 7 | 0.8 | 80.0 | 8 |
| 8 | 9 | 7 | 8 | 0.9 | 90.0 | 9 |
| 9 | 10 | 8 | 9 | 1 | 100.0 | 10 |
+-----+-------+------+---------+----------+--------+------------+
Here's another banding example: the toy table c has multiple values of pid per value of id :
create table c ( id smallint primary key auto_increment, pid smallint );
insert into c values(1,1),(2,1),(3,2),(4,2),(5,2),(6,3);
A combination of aggregation with windowing functions retrieves the count, the first id value per pid , and the last id value per pid , without GROUP BY :
select distinct
pid,
count(*) over (partition by pid) as N,
first_value(id) over (partition by pid) as 'first',
last_value(id) over (partition by pid) as 'last'
from c
order by pid;
+------+---+-------+------+
| pid | N | first | last |
+------+---+-------+------+
| 1 | 2 | 1 | 2 |
| 2 | 3 | 3 | 5 |
| 3 | 1 | 6 | 6 |
+------+---+-------+------+
Calling analytic functions
Many of these function take no arguments. The few that do mostly accept an expr argument, a column expression that evaluates to a scalar; other arguments are as shown, brackets indicate the argument is optional.
CUME_DIST() : Returns cumulative distribution between 0 and 1
DENSE_RANK() : Returns 1 + count of distinct ranks before current row
FIRST_VALUE() : Returns argument value from first row of window
LAG(expr[, offset[, nullexpr]]) : Returns value of expr in row lagging current window row by offset rows; show nullexpr for nulls
LAST_VALUE() : Returns argument value from last row of window
LEAD(expr[, offset[, nullexpr]]) : Returns value of expr in row leading current window row by offset rows; show nullexpr for nulls
NTH_VALUE(expr, n) : Returns value of expr in n th window row
NTILE(n) : Returns bucket count of current window row if there are n buckets
PERCENT_RANK() : Returns cumulative percent argument rank value
RANK() : Returns current row rank within its window, with gaps
ROW_NUMBER() : Returns current row number in its window
This query on the sales table illustrates some basics of their use:
SELECT
employee as name, date, sale,
ROW_NUMBER() OVER(PARTITION BY employee ORDER BY date) AS 'Row',
NTILE(2) OVER(PARTITION BY employee ORDER BY date) AS 'Tile',
FIRST_VALUE(sale) OVER(PARTITION BY employee ORDER BY date) AS '1stVal',
NTH_VALUE(sale,1) OVER(PARTITION BY employee ORDER BY date) AS '2ndVal',
LAST_VALUE(sale) OVER(PARTITION BY employee ORDER BY date) AS 'LastVal',
LAG(sale,1,'None') OVER(PARTITION BY employee ORDER BY date) AS 'Lag',
LEAD(sale,1,'None') OVER(PARTITION BY employee ORDER BY date) AS 'Lead',
ROUND( CUME_DIST() OVER(PARTITION BY employee ORDER BY date), 1 ) AS 'CumDist',
100*ROUND( PERCENT_RANK() OVER(PARTITION BY employee ORDER BY date), 1 ) AS 'PctRnk',
DENSE_RANK() OVER(PARTITION BY employee ORDER BY date) AS 'DRnk'
FROM sales;
+------+------------+------+-----+------+--------+--------+---------+------+------+---------+--------+------+
| name | date | sale | Row | Tile | 1stVal | 2ndVal | LastVal | Lag | Lead | CumDist | PctRnk | DRnk |
+------+------------+------+-----+------+--------+--------+---------+------+------+---------+--------+------+
| odin | 2025-03-01 | 200 | 1 | 1 | 200 | 200 | 200 | None | 300 | 0.3 | 0.0 | 1 |
| odin | 2024-04-01 | 300 | 2 | 1 | 200 | 200 | 300 | 200 | 400 | 0.7 | 50.0 | 2 |
| odin | 2024-05-01 | 400 | 3 | 2 | 200 | 200 | 400 | 300 | None | 1.0 | 100.0 | 3 |
| thor | 2025-03-01 | 400 | 1 | 1 | 400 | 400 | 400 | None | 300 | 0.3 | 0.0 | 1 |
| thor | 2024-04-01 | 300 | 2 | 1 | 400 | 400 | 300 | 400 | 500 | 0.7 | 50.0 | 2 |
| thor | 2024-05-01 | 500 | 3 | 2 | 400 | 400 | 500 | 300 | None | 1.0 | 100.0 | 3 |
+------+------------+------+-----+------+--------+--------+---------+------+------+---------+--------+------+
For more on windowing functions see ...
https://dev.mysql.com/doc/refman/8.0/en/window-functions.html
https://www.slideshare.net/DagHWanvik/sql-window-functions-for-mysqlLast updated 6 Feb 2025 |
 |