Percentiles
In the Sakila table film
, retrieve a top-down percentile ranking of film lengths:
SELECT
a.film_id ,
ROUND( 100.0 * ( SELECT COUNT(*) FROM film AS b WHERE b.length <= a.length ) / total.cnt, 1 )
AS percentile
FROM film a
CROSS JOIN (
SELECT COUNT(*) AS cnt
FROM film
) AS total
ORDER BY percentile DESC;
If there are NULL
s, filter them out before computing percentiles.
On his blog, Roland Bouman shows a much faster query; here is a version retrieving the first film at or above the 90th percentile:
SELECT
SUM(g1.r) sr,
g2.length l,
SUM(g1.r)/(SELECT COUNT(*) FROM film) p
FROM (
SELECT COUNT(*) r, length
FROM film
GROUP BY length
) g1
JOIN (
SELECT COUNT(*) r, length
FROM film
GROUP BY length
) g2 ON g1.length < g2.length
GROUP BY g2.length
HAVING p >= 0.9
ORDER BY p LIMIT 1