Median

from the Artful Common Queries page


Statistically, the median is the middle value in a vector or array or list of numbers--the value smaller than in half of all rows, and larger than in the other half. If the row count is even, the median is the average of the two middle values.

MariaDB 10.3.3 introduced a MEDIAN() function. No such function yet in MySQL.

A tiny test table ...


drop table if exists data;

create table data( val decimal(10,2) );

insert into data values(1),(3),(6),(8),(9),(10),(11),(12);



Some published SQL Median functions don't work in some cases, and some don't scale. This query is correct in all datasets we've given it in MySQL versions before 8.0, and it scales at O(log n):


select round( avg(val), 1 ) as Median

from (

  select d.val, @rownum:=@rownum+1 as `row_number`, @total_rows:=@rownum

  from data d, (select @rownum:=0) z 

  where d.val is not null

  order by d.val

) as x

where x.row_number in( floor((@total_rows+1)/2), floor((@total_rows+2)/2) );

+--------+

| Median |

+--------+

|    8.5 |

+--------+



MySQL 8.0 introduced common table expressions and windowing functions including ROW_NUMBER(), and it deprecated iterative use of user variables as in the above query, so an 8.0 version of the above query uses a common table expression for convenience and optimisation, and invokes the ROW_NUMBER() windowing function instead of counting rows with a user variable...


with 

  x as (

    select val, row_number() over() as rownum

    from data

  ),

  total as (

    select count(*)/2 N1, floor((count(*)+2)/2) N2

    from data

  )

select round( avg(x.val), 1 ) as Median

from x

join total on x.rownum in( total.N1, total.N2 );  




Return to the Artful Common Queries page