Pitfall #10: Consent Mode Creates Invisible Data Gaps

By Timo Dechau · Last updated March 25, 2026

You’ve implemented Google Consent Mode v2 — the right thing to do for privacy compliance. Your data keeps flowing into BigQuery, your event counts look healthy, and everything seems fine. Then you try to do session analysis and the numbers are off. Or you notice your user count is lower than expected. Or — and this is the sneaky one — your conversion rate looks higher than it should, and you can’t figure out why.

What’s happening is that a chunk of your events are arriving without the identifiers you need. Consent mode is working exactly as designed, and it’s quietly making parts of your dataset unreliable for user-level and session-level analysis.

When a user doesn’t grant consent for analytics cookies, Google Consent Mode does something that’s easy to miss: it still sends events, but it strips out the identifying information.

In Advanced Consent Mode (which is what most implementations use), non-consented hits arrive in your BigQuery export as what I call “cookieless pings.” They have event names, timestamps, page locations — but the user_pseudo_id is either missing or replaced with a temporary value, and ga_session_id is absent. These events count toward your totals but can’t be stitched into sessions or attributed to users.
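
To see how many of your own events fit that description, a rough check like the one below can help. It is a minimal sketch rather than a definitive test: it reuses the placeholder table name from the queries in this article and simply counts events by whether user_pseudo_id and the ga_session_id event parameter are present. The column aliases are names I made up for illustration.

```sql
-- Count events by which identifiers are present (last 7 days).
-- The table name is a placeholder; replace it with your own export dataset.
SELECT
  user_pseudo_id IS NOT NULL AS has_user_pseudo_id,
  EXISTS (
    SELECT 1
    FROM UNNEST(event_params)
    WHERE key = 'ga_session_id'
      AND value.int_value IS NOT NULL
  ) AS has_ga_session_id,
  COUNT(*) AS event_count
FROM
  `your_project.analytics_123456789.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN
    FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY))
    AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
GROUP BY
  has_user_pseudo_id,
  has_ga_session_id
ORDER BY
  event_count DESC;
```

Rows where either flag is false are the events that count toward totals but cannot be stitched into sessions or users.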

Here’s the thing — your event count stays high, which makes everything feel normal. But when you try to aggregate at the session or user level, you’re working with an incomplete picture. And the incompleteness isn’t random. Users who decline consent tend to be more privacy-conscious, often more technically sophisticated, and depending on your market, may represent a specific demographic segment.

So when teams filter to consented-only data (the natural instinct, since it’s the only data with usable identifiers), they’re not just shrinking the sample. They’re introducing a systematic bias toward users who click “Accept All.” In some European markets, I’ve seen consent rates as low as 30-40%. At those rates, filtering to consented-only means you’re basing your analysis on a minority of your traffic.

There’s no magic query that makes this go away. The fix is a conscious strategy, and it starts with understanding your actual consent breakdown.

Step one: know your consent rate. Run the consent breakdown query at the end of this article. It tells you what percentage of your events is fully consented, partially consented, and non-consented.

Step two: decide on a strategy based on that number:

  • High consent rate (>80%): You can mostly filter to consented data and note the limitation. The bias exists but is manageable.
  • Medium consent rate (50-80%): Use consented data for user/session analysis but use all events (including non-consented) for aggregate metrics like page views and event counts. Be explicit about which metrics use which dataset.
  • Low consent rate (<50%): Your consented-only data is a minority sample. Run aggregate, event-level analysis on the full dataset. For user/session work, consider statistical adjustment or server-side collection as a complement.
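
The thresholds above can be turned into a quick lookup. The sketch below assumes that "consent rate" means the share of events with analytics_storage = 'Yes' (an event-based approximation, not a per-user rate) and maps the result onto the three tiers. The strategy labels and column aliases are my own shorthand for illustration.

```sql
-- Approximate consent rate by event share, then map it to a strategy tier.
-- Assumption: consent rate = events with analytics_storage = 'Yes' / all events.
WITH consent AS (
  SELECT
    SAFE_DIVIDE(
      COUNTIF(privacy_info.analytics_storage = 'Yes'),
      COUNT(*)
    ) * 100 AS consent_rate_pct
  FROM
    `your_project.analytics_123456789.events_*`
  WHERE
    _TABLE_SUFFIX BETWEEN
      FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY))
      AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
)
SELECT
  ROUND(consent_rate_pct, 1) AS consent_rate_pct,
  CASE
    WHEN consent_rate_pct IS NULL THEN 'no events in range'
    WHEN consent_rate_pct > 80 THEN 'high: filter to consented data, note the limitation'
    WHEN consent_rate_pct >= 50 THEN 'medium: consented for user/session, all events for aggregates'
    ELSE 'low: aggregate on all events, adjust or complement user/session work'
  END AS suggested_strategy
FROM
  consent;
```

Treat the output as a starting point for the conversation, not a verdict. A rate that sits near a boundary deserves a closer look at the per-market breakdown.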

The key principle: never mix consented and non-consented data in session or user analysis without knowing you’re doing it. And never present consented-only analysis as “total” numbers without disclosing the gap.
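
One way accidental mixing can inflate a conversion rate is counting conversion events from all traffic while counting sessions only from consented traffic. The sketch below uses made-up numbers purely to show the arithmetic; the segment names, counts, and column aliases are hypothetical and do not come from any real property.

```sql
-- Hypothetical numbers, for illustration only.
-- Non-consented sessions are NULL because they cannot be observed without identifiers.
WITH traffic AS (
  SELECT 'consented' AS segment, 400 AS sessions, 20 AS conversion_events
  UNION ALL
  SELECT 'non_consented', CAST(NULL AS INT64), 12
)
SELECT
  -- Mixed: numerator from all events, denominator from consented sessions only.
  ROUND(
    SAFE_DIVIDE(
      SUM(conversion_events),
      SUM(IF(segment = 'consented', sessions, 0))
    ) * 100, 1
  ) AS mixed_cvr_pct,
  -- Consistent: numerator and denominator both from consented traffic.
  ROUND(
    SAFE_DIVIDE(
      SUM(IF(segment = 'consented', conversion_events, 0)),
      SUM(IF(segment = 'consented', sessions, 0))
    ) * 100, 1
  ) AS consented_only_cvr_pct
FROM
  traffic;
```

With these toy inputs the mixed rate is 32 / 400 = 8.0%, while the consistent consented-only rate is 20 / 400 = 5.0%. Neither is the "true" rate for all traffic, but only the second one is internally consistent.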

For page-level or content analysis where you don’t need user identity, use all events regardless of consent status:

-- Aggregate analysis: use all events
SELECT
  (
    SELECT value.string_value
    FROM UNNEST(event_params)
    WHERE key = 'page_location'
  ) AS page,
  COUNT(*) AS pageviews
FROM
  `your_project.analytics_123456789.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN
    FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY))
    AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
  AND event_name = 'page_view'
GROUP BY page
ORDER BY pageviews DESC
LIMIT 50;

For session-level analysis, filter explicitly and document it:

-- Session analysis: consented traffic only
SELECT
  CONCAT(
    user_pseudo_id, '.',
    -- Explicit cast: ga_session_id is stored as an integer
    CAST((
      SELECT value.int_value
      FROM UNNEST(event_params)
      WHERE key = 'ga_session_id'
    ) AS STRING)
  ) AS session_id,
  COUNT(*) AS events_in_session
FROM
  `your_project.analytics_123456789.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN
    FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY))
    AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
  AND privacy_info.analytics_storage = 'Yes'
GROUP BY session_id
-- Drop events where either identifier was missing
HAVING session_id IS NOT NULL;
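
To make the documentation part concrete, one option is to report the coverage of the consented filter alongside the session count, so the gap travels with the number. This is a sketch under the same placeholder table name; the CTE name and column aliases are hypothetical.

```sql
-- Consented session count plus the share of events that filter actually covers.
WITH events AS (
  SELECT
    -- Treat a NULL consent status as not consented
    IFNULL(privacy_info.analytics_storage = 'Yes', FALSE) AS is_consented,
    CONCAT(
      user_pseudo_id, '.',
      CAST((
        SELECT value.int_value
        FROM UNNEST(event_params)
        WHERE key = 'ga_session_id'
      ) AS STRING)
    ) AS session_id
  FROM
    `your_project.analytics_123456789.events_*`
  WHERE
    _TABLE_SUFFIX BETWEEN
      FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY))
      AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
)
SELECT
  COUNT(DISTINCT IF(is_consented, session_id, NULL)) AS consented_sessions,
  COUNTIF(is_consented) AS consented_events,
  COUNT(*) AS all_events,
  ROUND(SAFE_DIVIDE(COUNTIF(is_consented), COUNT(*)) * 100, 1) AS pct_events_covered
FROM
  events;
```

Putting pct_events_covered next to every session-level figure in a report is a cheap way to avoid presenting consented-only numbers as totals.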

Run this to see your consent breakdown right now:

SELECT
  privacy_info.analytics_storage,
  privacy_info.ads_storage,
  COUNT(*) AS event_count,
  COUNT(DISTINCT user_pseudo_id) AS unique_pseudo_ids,
  ROUND(
    COUNT(*) / SUM(COUNT(*)) OVER () * 100, 1
  ) AS pct_of_events
FROM
  `your_project.analytics_123456789.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN
    FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY))
    AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
GROUP BY
  privacy_info.analytics_storage,
  privacy_info.ads_storage
ORDER BY event_count DESC;

Look at the rows where analytics_storage is 'No' or NULL. That’s your invisible data gap. If those rows represent more than 20% of your events, you need a deliberate strategy — not just a WHERE clause.

Also pay attention to the unique_pseudo_ids column for the non-consented rows. COUNT(DISTINCT ...) ignores NULLs, so if that number is zero or suspiciously low relative to event_count, most of those events carry no user_pseudo_id at all. They are truly anonymous and can’t be sessionized or attributed. That’s the portion of your data that’s essentially event-level only. Knowing the size of that segment is the first step to not being surprised by it.