A superstore is running a promotion: each day, every fifth a customer wins a prize. Each day you're given a text file with data columns for customerID, timestamp and order amount. How do you find every fifth customer?
Load the data into a table, then write the query inside out; with this kind of problem, nested queries are a boon.
1. Using Load Data Infile, load the text file into a table (indexless for speed) named tbl with columns for customerID, timestamp and amount, and index the table on (customerID, timestamp).
2. Write a query that tracks per-customer order by timestamp with user variables:
set @id='', @ord=1;
select
customerID, timestamp,
if(customerID=@ID, @ord := @ord+1, @ord := 1) as custord,
@ID := customerID
from tbl
order by customerID, timestamp;
3. Add an outer query that invokes #2 as a subquery to find every fifth customer row.
set @id='', @ord=1;
select customerID, custord
from (
select
customerID, timestamp,
if(customerID=@ID,
@ord := @ord+1,
@ord := 1
) as custord,
@ID := customerID
from tbl
order by customerID, timestamp
) x
where custord >= 5
and custord % 5 = 0
4. Add an outer query that counts result rows from #3 by customerID:
set @id='', @ord=1;
select customerID, count(*) as tokens
from (
select customerID, custord
from (
select
customerID, timestamp,
if(customerID=@ID,
@ord := @ord+1,
@ord := 1
) as custord,
@ID := customerID
from tbl
order by customerID, timestamp
) x
where custord >= 5 and custord % 5 = 0
) y
group by customerID;