Pitfall #7: Building Audience Segments from Events When GA4 Already Exports Them

By Timo Dechau · Last updated March 25, 2026

Your marketing team wants to know which users are likely to churn. Or which users belong to the “high-value purchasers” audience you painstakingly built in the GA4 interface. So you do what any self-respecting analyst would do: you write a massive SQL query against events_*, reconstruct the user journey, apply your own logic to define the segment, and deliver the results. It takes two days and the numbers don’t quite match what the GA4 UI shows.

Meanwhile, GA4 has been quietly exporting exactly this data into tables you never looked at.

Most teams set up the GA4 BigQuery link, see the events_* tables appear, and stop there. That’s the export everyone talks about. Blog posts, tutorials, conference talks — it’s all about the event data.

But GA4 can also export two additional table types:

  • pseudonymous_users_* — One row per user_pseudo_id (the cookie-based device identifier). Contains user properties, audience memberships, and Google’s predictive metrics.
  • users_* — One row per user_id (your authenticated user identifier, if you set one). Same structure but scoped to known users.

These tables include fields that are genuinely difficult or impossible to reconstruct from raw events:

  • Audience memberships — Every GA4 audience the user currently belongs to, with the timestamp they joined. This is the same audience logic the GA4 UI uses, including Google’s internal processing.
  • Predictive metrics — Purchase probability, churn probability, and predicted revenue. These are machine learning scores Google computes on their side. You can’t replicate them from the event stream because the model is proprietary.
  • User properties — The full set of user-scoped properties, already resolved to their most recent values (no need to deduplicate from the event stream).
  • Lifetime metrics — LTV revenue, session count, engagement metrics, all pre-computed.
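As a sketch of what querying these fields looks like: audiences is a repeated record, so you unnest it to get one row per user-audience pair. The project and dataset names are placeholders, and the nested field names follow the documented user-data export schema — verify them against your own tables before relying on this.

```sql
-- List each audience a user currently belongs to, with the join timestamp.
-- `audiences` is a repeated record in the user-data export; field names
-- here follow the documented schema but should be checked against yours.
SELECT
  u.user_pseudo_id,
  a.name AS audience_name,
  TIMESTAMP_MICROS(a.membership_start_timestamp_micros) AS joined_at
FROM
  `your_project.analytics_123456789.pseudonymous_users_*` AS u,
  UNNEST(u.audiences) AS a
WHERE
  _TABLE_SUFFIX = FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 2 DAY))
ORDER BY u.user_pseudo_id, joined_at
LIMIT 100
```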

Look, I’ll admit: for a long time I didn’t pay much attention to these tables either. They were introduced later, documentation was sparse, and the event tables were already there. But ignoring them means you’re rebuilding wheels — and some of those wheels (predictive scores, audience membership timestamps) you literally cannot rebuild from events alone.

Enabling these exports takes about 30 seconds.

  1. Go to GA4 Admin > Product Links > BigQuery Links
  2. Click on your existing BigQuery link
  3. Under Configure data streams and events, you’ll see the export settings
  4. Check the box for “Include user data in daily export” — this single setting enables both the pseudonymous_users_* and users_* tables
  5. Make sure “Daily” frequency is selected (streaming export is also available but not needed for most use cases)
  6. Save

Now here’s where it gets interesting — and where the caveats live.

Caveat 1: No historical backfill. These tables start populating from the day you enable them. If you enable them on March 25th, your first pseudonymous_users_20260325 table appears the next day. There’s no retroactive export for users who were active before that date. Enable it now, even if you don’t plan to use it immediately.

Caveat 2: User counts won’t match event-derived counts. The pseudonymous_users_* table contains users who were active during the export window, based on Google’s internal definition of “active.” If you count distinct user_pseudo_id values from events_* for the same date range, you’ll get a different number. This is expected — the scoping logic is different. Don’t try to reconcile them; use each table for its intended purpose.
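To see the gap for yourself, you can put the two counts side by side. This is purely illustrative — the numbers are expected to disagree, and the dataset name and date suffix are placeholders.

```sql
-- Compare distinct event users against exported user rows for one day.
-- These are scoped differently by design, so expect a mismatch.
WITH event_users AS (
  SELECT COUNT(DISTINCT user_pseudo_id) AS n
  FROM `your_project.analytics_123456789.events_*`
  WHERE _TABLE_SUFFIX = '20260323'
),
export_users AS (
  SELECT COUNT(*) AS n
  FROM `your_project.analytics_123456789.pseudonymous_users_*`
  WHERE _TABLE_SUFFIX = '20260323'
)
SELECT
  e.n AS distinct_event_users,
  x.n AS exported_user_rows
FROM event_users AS e
CROSS JOIN export_users AS x
```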

Caveat 3: Audience membership reflects a point-in-time snapshot. The export captures which audiences a user belongs to at the time of export. It doesn’t give you a full history of audience membership changes. If a user left and rejoined an audience, you’ll only see the current membership.

Caveat 4: Predictive metrics require minimum thresholds. Google’s purchase probability, churn probability, and revenue prediction models only activate if your property has enough data (roughly 1,000 returning users who triggered the predicted condition and 1,000 who didn’t over the past 28 days, plus other eligibility requirements). Smaller properties will see these fields as NULL.
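A quick way to check whether your property clears the bar is to count how many exported users actually carry a score. The churn_score_7d field name below is taken from the documented predictions record — treat it as an assumption and confirm it in your own schema.

```sql
-- What share of exported users have a churn score? All NULLs usually
-- means the property hasn't met Google's minimum data thresholds.
SELECT
  COUNT(*) AS total_users,
  COUNTIF(predictions.churn_score_7d IS NOT NULL) AS with_churn_score,
  ROUND(COUNTIF(predictions.churn_score_7d IS NOT NULL) / COUNT(*) * 100, 1)
    AS pct_scored
FROM `your_project.analytics_123456789.pseudonymous_users_*`
WHERE _TABLE_SUFFIX = FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 2 DAY))
```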

First, check whether you even have these tables:

SELECT
  table_name,
  TIMESTAMP_MILLIS(creation_time) AS created,
  ROUND(size_bytes / POW(1024, 2), 1) AS size_mb,
  row_count
FROM `your_project.analytics_123456789.__TABLES__`
WHERE
  table_name LIKE 'pseudonymous_users_%'
  OR table_name LIKE 'users_%'
ORDER BY table_name DESC
LIMIT 10

If this returns no rows, the user export isn’t enabled. Go enable it.

If it does return rows, take a look at what you’ve got:

SELECT
  user_pseudo_id,
  audiences,
  predictions,
  user_properties,
  user_ltv
FROM `your_project.analytics_123456789.pseudonymous_users_*`
WHERE _TABLE_SUFFIX = FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 2 DAY))
LIMIT 20

Browse those results. If you see audience arrays populated with the segments you’ve built in the GA4 interface, and prediction scores with actual values instead of NULLs, you’ve got a data source that would have taken weeks to approximate from raw events. If predictions are all NULL, check whether your property meets Google’s minimum thresholds — that’s a volume issue, not a configuration one.
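And to close the loop on the original ask — which users are likely to churn — here’s a sketch of pulling the highest-scoring users from the latest export. The 0.7 cutoff is arbitrary, and the churn_score_7d and revenue_in_usd field names are assumptions from the documented schema; adapt both to your property.

```sql
-- Likely churners from the latest user export, highest risk first.
-- Threshold and nested field names are assumptions -- adjust to taste
-- and verify against your table's actual schema.
SELECT
  user_pseudo_id,
  predictions.churn_score_7d AS churn_probability,
  user_ltv.revenue_in_usd AS ltv_revenue
FROM `your_project.analytics_123456789.pseudonymous_users_*`
WHERE
  _TABLE_SUFFIX = FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 2 DAY))
  AND predictions.churn_score_7d >= 0.7
ORDER BY churn_probability DESC
LIMIT 1000
```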

Either way, now you know what’s there. And knowing is considerably better than spending two days writing SQL to approximate it.