Pitfall #4: Treating events_* Like a Flat Table
The Trap
Section titled “The Trap”You open the GA4 export in BigQuery, write SELECT * FROM your_project.analytics_123456789.events_*, and immediately something feels off. Some columns return RECORD as their type. Your row counts don’t match what you expected. Or you try to filter on a specific event parameter and BigQuery throws an error about accessing fields on a repeated struct. So you start guessing at syntax, copy-pasting UNNEST patterns from Stack Overflow, and suddenly your 50,000 events became 300,000 rows and you have no idea why.
Why It Happens
Section titled “Why It Happens”If you’ve spent your career working with flat, relational tables — which most analysts have — GA4’s export schema is a genuine shock. Three critical columns are not scalar values but nested, repeated RECORD fields:
event_params— a repeated STRUCT containing every parameter attached to an event (page_location, ga_session_id, engagement_time, and dozens more)user_properties— a repeated STRUCT of user-scoped properties set via the SDKitems— a repeated STRUCT for e-commerce item data
“Repeated” means each event row can contain an array of these records. A single page_view event might carry 15 different event parameters, each stored as a separate element inside the event_params array. That’s not 15 rows in the table — it’s 15 elements nested inside one row.
The confusion deepens because event_params doesn’t use named columns. Every parameter is stored as a generic key-value pair with key (string) and value (another STRUCT with string_value, int_value, float_value, double_value). So there’s no event_params.page_location column you can just reference. You have to dig into the array, find the element where key = 'page_location', and then pull out the correct value field.
This design is flexible — GA4 can add any parameter without changing the schema — but it’s hostile to anyone who just wants to write a WHERE clause.
The Fix
Section titled “The Fix”There are two patterns you’ll use constantly. Know when to reach for each one.
Pattern 1: Extract a single parameter (no row multiplication)
Use a subquery to pluck one value out of the array. This keeps your row count intact — one row in, one row out.
SELECT event_timestamp, event_name, (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location') AS page_location, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS ga_session_idFROM `your_project.analytics_123456789.events_*`WHERE _TABLE_SUFFIX BETWEEN '20260301' AND '20260307'Each (SELECT ... FROM UNNEST(...)) subquery runs independently against the nested array for that row. It returns exactly one value (or NULL if the key doesn’t exist). Your output has the same number of rows as your input. This is the pattern you want 90% of the time.
Pattern 2: Explode items for e-commerce analysis (intentional row multiplication)
When you actually need one row per item — say, for product-level revenue analysis — use CROSS JOIN UNNEST:
SELECT event_timestamp, event_name, items.item_name, items.item_revenue, items.quantityFROM `your_project.analytics_123456789.events_*`, UNNEST(items) AS itemsWHERE _TABLE_SUFFIX BETWEEN '20260301' AND '20260307' AND event_name = 'purchase'Here’s the thing: CROSS JOIN UNNEST (which the comma syntax is shorthand for) multiplies your rows. A purchase event with 3 items becomes 3 rows. That’s exactly what you want for item-level analysis — but if you do this accidentally on event_params, your row count explodes and your aggregations break.
The rule of thumb: use the subquery pattern for event_params and user_properties. Use CROSS JOIN UNNEST for items only when you specifically need item-level granularity.
How to Check If You Have This Problem
Section titled “How to Check If You Have This Problem”Run this query to see the difference side by side. It compares the base event count with what happens when you accidentally unnest event_params:
WITH base AS ( SELECT COUNT(*) AS event_count FROM `your_project.analytics_123456789.events_*` WHERE _TABLE_SUFFIX BETWEEN '20260301' AND '20260301'),unnested AS ( SELECT COUNT(*) AS exploded_count FROM `your_project.analytics_123456789.events_*`, UNNEST(event_params) AS params WHERE _TABLE_SUFFIX BETWEEN '20260301' AND '20260301')SELECT base.event_count, unnested.exploded_count, ROUND( unnested.exploded_count / base.event_count, 1 ) AS multiplication_factorFROM base, unnestedYou’ll typically see a multiplication factor between 10x and 20x. That’s 10-20 event parameters per event, each becoming its own row. If any of your production queries use CROSS JOIN UNNEST(event_params) without a very good reason, you’ve found your problem.