Pivot table basics: rows to columns
From table tbl( class, member )
, you want to cross-tabulate all classes with their members. In SQL terms, you aggregate members over classes. In MySQL:
SELECT class,GROUP_CONCAT(member)
FROM tbl
GROUP BY class;
With that simple query, you're halfway toward cross-tabulation, halfway to implementing a simple CUBE
, and halfway to basic entity-attribute-value (EAV) logic. This is easier to see if we have two columns, rather than just one, to tabulate against the grouping column:
DROP TABLE IF EXISTS tbl;
CREATE TABLE tbl( id INT, colID INT, value CHAR(20) );
INSERT INTO tbl VALUES
(1,1,'Sampo'),(1,2,'Kallinen'),(1,3,'Office Manager'),
(2,1,'Jakko'),(2,2,'Salovaara'),(2,3,'Vice President');
To tabulate all colID
and value
values against all id
values—that is, to write a reporting CUBE
for the table—write a GROUP_CONCAT()
instruction for each colID
found in the table, then GROUP BY id
:
SELECT
id,
GROUP_CONCAT(if(colID = 1, value, NULL)) AS 'First Name',
GROUP_CONCAT(if(colID = 2, value, NULL)) AS 'Last Name',
GROUP_CONCAT(if(colID = 3, value, NULL)) AS 'Job Title'
FROM tbl
GROUP BY id;
+------+------------+-----------+----------------+
| id | First Name | Last Name | Title |
+------+------------+-----------+----------------+
| 1 | Sampo | Kallinen | Office Manager |
| 2 | Jakko | Salovaara | Vice President |
+------+------------+-----------+----------------+
Since Paul Spinks first taught a spreadsheet how to do that cross-tabulation trick in 1979, this has also been known as a pivot table: we pivot colID
and value
against ID
.
(Of course for a proper EAV representation, we'd add an attributes table:
DROP TABLE IF EXISTS attrs;
CREATE TABLE attrs(colID INT,attr CHAR(12));
INSERT INTO attrs VALUES (1,'First Name'),(2,'Last Name'),(3,'Title');
and write a stored procedure to PREPARE
the above query from table-based attribute names.)
More often, crosstab queries calculate. Here is a simple sales table:
DROP TABLE IF EXISTS sales;
CREATE TABLE Sales (empID INT, yr SMALLINT, sales DECIMAL(10,2));
INSERT sales VALUES
(1, 2005, 12000),(1, 2006, 18000),(1, 2007, 25000),
(2, 2005, 15000),(2, 2006, 6000),(3, 2006, 20000),(3, 2007, 24000);
In Microsoft SQL Server, CUBE/PIVOT
syntax for horizontal and vertical totals is:
SELECT EmpId, [2005], [2006], [2007], [ALL]
FROM (
SELECT
CASE WHEN GROUPING(EmpId)=0 THEN CAST(EmpId AS CHAR(7)) ELSE 'ALL' END AS EmpId,
CASE WHEN GROUPING(Yr)=0 THEN CAST(Yr AS CHAR(7)) ELSE 'ALL' END AS Yr,
SUM(Sales) AS Sales
FROM Sales
GROUP BY EmpId, Yr WITH CUBE
) AS s
PIVOT( SUM(Sales) FOR Yr IN ([2005], [2006], [2007], [ALL]) ) AS p
In MySQL, we write one aggregating expression per reporting column in the inner query, and build the horizontal sums in the outer query:
SELECT
IFNULL(empId,'Totals') AS EmpId,
sums.2005, sums.2006, sums.2007,
sums.2005 + sums.2006 + sums.2007 AS Sums
FROM (
SELECT
EmpID,
SUM(IF(Yr=2005,sales,0)) As '2005',
SUM(IF(Yr=2006,sales,0)) As '2006',
SUM(IF(Yr=2007,sales,0)) As '2007'
FROM Sales
GROUP BY EmpID WITH ROLLUP
) AS sums;
+--------+----------+----------+----------+-----------+
| EmpId | 2005 | 2006 | 2007 | Sums |
+--------+----------+----------+----------+-----------+
| 1 | 12000.00 | 18000.00 | 25000.00 | 55000.00 |
| 2 | 15000.00 | 6000.00 | 0.00 | 21000.00 |
| 3 | 0.00 | 20000.00 | 24000.00 | 44000.00 |
| Totals | 27000.00 | 44000.00 | 49000.00 | 120000.00 |
+--------+----------+----------+----------+-----------+
That approach works beautifully for the common need to tabulate monthly amounts by year, say from an order history table orderhist(orderdate date,amount decimal(10,2))
:
SELECT
IfNull(Year,'Totals') Year,
Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, `Dec`,
Qty AS Count,
Yrly as 'Yrly Total'
FROM (
SELECT
year(orderdate) AS 'Year',
Round(Sum(CASE WHEN Month(orderdate)= 1 THEN amount ELSE 0 END),2) AS Jan,
Round(Sum(CASE WHEN Month(orderdate)= 2 THEN amount ELSE 0 END),2) AS Feb,
Round(Sum(CASE WHEN Month(orderdate)= 3 THEN amount ELSE 0 END),2) AS Mar,
Round(Sum(CASE WHEN Month(orderdate)= 4 THEN amount ELSE 0 END),2) AS Apr,
Round(Sum(CASE WHEN Month(orderdate)= 5 THEN amount ELSE 0 END),2) AS May,
Round(Sum(CASE WHEN Month(orderdate)= 6 THEN amount ELSE 0 END),2) AS Jun,
Round(Sum(CASE WHEN Month(orderdate)= 7 THEN amount ELSE 0 END),2) AS Jul,
Round(Sum(CASE WHEN Month(orderdate)= 8 THEN amount ELSE 0 END),2) AS Aug,
Round(Sum(CASE WHEN Month(orderdate)= 9 THEN amount ELSE 0 END),2) AS Sep,
Round(Sum(CASE WHEN Month(orderdate)=10 THEN amount ELSE 0 END),2) AS Oct,
Round(Sum(CASE WHEN Month(orderdate)=11 THEN amount ELSE 0 END),2) AS Nov,
Round(Sum(CASE WHEN Month(orderdate)=12 THEN amount ELSE 0 END),2) AS `Dec`,
Count(*) AS Qty,
Round(Sum(amount),2) AS Yrly
FROM orderhist
GROUP BY year
WITH ROLLUP
) AS sums ;
More likely the required sums need to be calculated across joins. Build them in using a good text editor. Here is the above query for the orders
and orderdetails
table in the Northwind database:
SELECT
IfNull(Year,'Totals') Year,
Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, `Dec`,
quantity AS Count,
Yrly as 'Yrly Total'
FROM (
SELECT
year(orderdate) AS 'Year',
Round(Sum(If( Month(o.orderdate)= 1, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Jan,
Round(Sum(If( Month(o.orderdate)= 2, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Feb,
Round(Sum(If( Month(o.orderdate)= 3, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Mar,
Round(Sum(If( Month(o.orderdate)= 4, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Apr,
Round(Sum(If( Month(o.orderdate)= 5, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS May,
Round(Sum(If( Month(o.orderdate)= 6, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Jun,
Round(Sum(If( Month(o.orderdate)= 7, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Jul,
Round(Sum(If( Month(o.orderdate)= 8, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Aug,
Round(Sum(If( Month(o.orderdate)= 9, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Sep,
Round(Sum(If( Month(o.orderdate)=10, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Oct,
Round(Sum(If( Month(o.orderdate)=11, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Nov,
Round(Sum(If( Month(o.orderdate)=12, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS `Dec`,
Count(*) AS quantity,
Round(Sum((unitprice*quantity)-discount),2) AS Yrly
FROM orders o
JOIN orderdetails d USING(orderID)
GROUP BY year
WITH ROLLUP
) AS sums ;
With many columns or subtotals, pivot tables get complicated, time-consuming and error-prone. Automation is needed. Oracle and Microsoft SQL Server have CUBE
syntax to simplify the job, especially for big pivot tables. MySQL doesn't. See "Automate pivot table queries" for how to roll your own cube with MySQL.