
I have a table that records a row for each unique user per day with some aggregated stats for that user on that day, and I need to produce a report that tells me for each day, the no. of unique users in the last 30 days including that day.
eg.
-
<li>for Aug 31st, it'll count the unique users from Aug 2nd to Aug 31st</li>
<li>for Aug 30th, it'll count the unique users from Aug 1st to Aug 30th</li>
<li>and so on...</li>
</ul>
I've looked at some related questions but they aren't quite what I need - if a user logs in on multiple days in the last 30 days he should be counted only once, so I can't just sum the DAU count for the last 30 days.
Bigquery SQL for sliding window aggregate
BigQuery SQL for 28-day sliding window aggregate (without writing 28 lines of SQL)
So far, my ideas are to either:
-
<li>write a simple script that'll execute a separate BigQuery for each of the relevant days</li>
<li>write a BigQuery UDF that'll execute basically the same query for each day selected from another query</li>
</ul>
but I've not found any examples on how to execute another BigQuery query inside an UDF, or if it's possible at all.
Answer1:
I need to produce a report that tells me for each day, the no. of unique users in the last 30 days including that day.
Below should do this
SELECT
calendar_day,
EXACT_COUNT_DISTINCT(userID) AS unique_users
FROM (
SELECT calendar_day, userID
FROM YourTable
CROSS JOIN (
SELECT DATE(DATE_ADD('2016-08-08', pos - 1, "DAY")) AS calendar_day
FROM (
SELECT ROW_NUMBER() OVER() AS pos, *
FROM (FLATTEN((
SELECT SPLIT(RPAD('', 1 + DATEDIFF('2016-09-08', '2016-08-08'), '.'),'') AS h
FROM (SELECT NULL)),h
)))
) AS calendar
WHERE DATEDIFF(calendar_day, dt) BETWEEN 0 AND 29
)
GROUP BY calendar_day
ORDER BY calendar_day DESC
It assumes YourTable has userID and dt fields (like below for example)
dt userID
2016-09-08 1
2016-09-08 2
...
And you can control:
- reporting dates range by changing respectively 2016-08-08
and 2016-09-08
- aggregation size by changing 29
in BETWEEN 0 AND 29