We have a wages table for wage rates by waiter and startdate,
and a tips table to track daily waiters' hours worked and tips
received. The requirement is to report wages and concurrent tips
per waiter per day.
Why is this different from an ordinary within-groups aggregate? The correct wage rate for a tips row is not the wages row for that waiter with the latest date; it is the wages row having the latest date before the date in the given tips row.
One way to proceed is to build a temporary table from a join of wages to tips on waiter and wages.start < tips.date, then exclusion-join that result to itself to remove all rows except those with the latest wage rate per tips row. A two-step ...
-- wages-tips join removing later wage changes:
DROP TABLE IF EXISTS tmp;
CREATE TABLE tmp
SELECT
t.id AS tid, t.date AS Date,
t.hours_worked AS Hrs,t.tabs,t.tips,
w.id AS wid, w.waiter, w.rate, w.start
FROM tips t
JOIN wages w ON w.waiter=t.waiter AND w.start <= t.date;
-- self-exclusion join to remove obsolete wage rows:
SELECT t1.*
FROM tmp t1
LEFT JOIN tmp t2
ON t1.tid=t2.tid and t1.start < t2.start
WHERE t2.waiter is null
ORDER BY t1.Date;
+------+------------+------+------+-------+------+--------+------+------------+
| tid | Date | Hrs | tabs | tips | wid | waiter | rate | start |
+------+------------+------+------+-------+------+--------+------+------------+
| 1 | 2025-02-29 | 7 | 21 | 65.25 | 2 | 4 | 6.00 | 2025-03-01 |
| 2 | 2025-03-06 | 6 | 15 | 51.75 | 3 | 5 | 7.00 | 2025-01-05 |
| 3 | 2024-03-21 | 3 | 5 | 17.85 | 2 | 4 | 6.00 | 2025-03-01 |
| 4 | 2024-03-22 | 5 | 10 | 39.00 | 4 | 5 | 8.00 | 2024-03-20 |
+------+------------+------+------+-------+------+--------+------+------------+
DROP TABLE tmp;
As of MySQL 8 and MariaDB 10.2, the intermediary table code can be a Common Table Expression ...
WITH tmp AS (
SELECT
t.id AS tid, t.date AS Date,
t.hours_worked AS Hrs,t.tabs,t.tips,
w.id AS wid, w.waiter, w.rate, w.start
FROM tips t
JOIN wages w ON w.waiter=t.waiter AND w.start <= t.date
)
SELECT t1.*
FROM tmp t1
LEFT JOIN tmp t2 ON t1.tid=t2.tid and t1.start < t2.start
WHERE t2.waiter is null
ORDER BY t1.Date;
Can we skip the temp table, event the virtual one? Yes---by adding the condition wages.start <= tips.date to each side of the exclusion join ...
SELECT
t.id AS tid, t.date, t.hours_worked AS Hrs,
t.tabs,t.tips,w.id AS wid, w.waiter, w.rate,
w.start
FROM tips t
JOIN wages w ON w.waiter=t.waiter AND w.start <= t.date
LEFT JOIN wages w2
ON w.waiter=w2.waiter
AND w2.start <= t.date
AND w.start < w2.start
WHERE w2.id IS NULL
ORDER BY t.date;
Simpler, gives the same result as the CTE, and slightly faster.