# Server-side analytics + visitor journey

Two things in one pipeline:

1. **Ad-block-resistant GA4**: forward events server-to-server when client-side `gtag.js` is blocked.
2. **Visitor journey reconstruction**: record every event into our own DB, and when a visitor submits the booking form, link their journey to that submission so the owner can review it in the CP dashboard.

## Why each piece exists

### Ad-block fallback
Ad blockers and DNS-level filters (uBlock, Brave, AdGuard, Pi-hole) block requests to `googletagmanager.com` and `google-analytics.com`. Safari ITP is usually grouped with them, although it mainly restricts cookies and storage rather than blocking these requests outright. For NZ consumer traffic that's roughly **20–40% of visits silently lost**. The fix is a first-party endpoint on our own domain that blocklists don't match, which forwards to GA4 via the Measurement Protocol.
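
As a rough illustration of the server-side forward, the sketch below builds a Measurement Protocol request for one event. The `TrackedEvent` and `Ga4Request` shapes and the `buildGa4Request` helper are hypothetical names for this example, not the actual contents of `/api/track`; the endpoint URL and the `client_id` / `events` body fields are the ones the GA4 Measurement Protocol uses.

```typescript
// Hypothetical shapes for illustration; the real /api/track types may differ.
interface TrackedEvent {
  name: string;
  params: Record<string, string | number>;
}

interface Ga4Request {
  url: string;
  body: string;
}

// Build (but do not send) the Measurement Protocol request for a single event.
function buildGa4Request(
  measurementId: string,
  apiSecret: string,
  clientId: string,
  event: TrackedEvent
): Ga4Request {
  const url =
    'https://www.google-analytics.com/mp/collect' +
    '?measurement_id=' + encodeURIComponent(measurementId) +
    '&api_secret=' + encodeURIComponent(apiSecret);
  const body = JSON.stringify({
    client_id: clientId, // here: the anonymous browser ID
    events: [{ name: event.name, params: event.params }]
  });
  return { url, body };
}

// Example with placeholder credentials.
const exampleGa4Request = buildGa4Request('G-XXXXXXX', 'placeholder-secret', 'anon-123', {
  name: 'cta_click',
  params: { page_path: '/pricing' }
});
```

A server route would then POST `exampleGa4Request.body` to `exampleGa4Request.url`. The collect endpoint accepts malformed payloads without an error status, so a response of 2xx on its own does not prove the event was recorded; Google's Measurement Protocol validation endpoint is the usual way to check a payload.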

### Journey reconstruction
GA4 is aggregate. The owner can see "200 hero CTA clicks this week" but not "this specific submission's journey was /pricing → /about → hero CTA → form." For a small services business, knowing what a *specific lead* engaged with before submitting is more useful than another aggregate dashboard.

## Architecture

```
Browser ── trackEvent() ──┬─► gtag (when not blocked) ─► GA4
                          │
                          └─► /api/track (always) ─┬─► session_events table
                                                   │
                                                   └─► GA4 (only when gtag missing)

Browser also keeps a rolling sessionStorage buffer of the last 30 events
as a fallback (in case /api/track is itself blocked at the network layer).

On booking form submit success:
  Browser ── promoteJourney(email) ──► /api/track/promote
                                         │
                                         ├─► reads session_events for this anon_id
                                         ├─► reads sessionStorage buffer from request body
                                         └─► writes one row to submission_journeys

Owner opens enquiry in CP dashboard:
  AdminDashboard ── /api/owner/client-enquiry?email=... ──► mail-api
                                                              │
                                                              ├─► returns enquiry record
                                                              └─► returns submission_journeys row
```
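
The rolling client-side buffer from the diagram can be sketched as follows. This is a minimal sketch, not the code in `src/lib/analytics.ts`: the storage key, the `BufferedEvent` shape, and the helper names are assumptions, and `StorageLike` is a small subset of the Web Storage API so the example also runs outside a browser.

```typescript
// Minimal subset of the Web Storage API (window.sessionStorage satisfies it).
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Assumed event shape for this sketch.
interface BufferedEvent {
  name: string;
  page_path: string;
  ts: number;
}

const BUFFER_KEY = 'journey_buffer'; // hypothetical key name
const MAX_BUFFERED = 30; // "last 30 events" from the diagram

function readBuffer(storage: StorageLike): BufferedEvent[] {
  const raw = storage.getItem(BUFFER_KEY);
  if (raw === null) return [];
  try {
    const parsed = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as BufferedEvent[]) : [];
  } catch (_err) {
    return []; // corrupted value: start over rather than throw
  }
}

// Append one event and drop the oldest entries beyond the cap.
function pushEvent(storage: StorageLike, event: BufferedEvent): BufferedEvent[] {
  const next = readBuffer(storage).concat([event]).slice(-MAX_BUFFERED);
  storage.setItem(BUFFER_KEY, JSON.stringify(next));
  return next;
}

// In-memory stand-in for sessionStorage, used for the example below.
function createMemoryStorage(): StorageLike {
  const data: Record<string, string> = {};
  return {
    getItem: (key: string): string | null => {
      const value = data[key];
      return value === undefined ? null : value;
    },
    setItem: (key: string, value: string): void => {
      data[key] = value;
    }
  };
}

const demoStorage = createMemoryStorage();
for (let i = 0; i < 35; i++) {
  pushEvent(demoStorage, { name: 'page_view', page_path: '/p' + i, ts: i });
}
// readBuffer(demoStorage) now holds the 30 newest events (ts 5 through 34).
```

In a browser the same functions would be called with `window.sessionStorage`; on promotion, the result of `readBuffer` is what the client would post in the request body.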

## Tables (`docker/postgres/init/004-session-events.sql`)

### `session_events`
Every analytics event, keyed by the `anonId` cookie set in `src/hooks.server.ts`. **Pruned after 24h** by a probabilistic cleanup inside `/api/track`: roughly 1 in 200 inserts triggers a query along the lines of `DELETE FROM session_events WHERE created_at < now() - interval '24 hours'`. No cron container is needed because cleanup runs naturally with traffic; the flip side is that on quiet days rows can outlive the 24h mark until the next cleanup fires.
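
The probabilistic cleanup described above could look roughly like this. It is a sketch under assumptions: `runQuery` stands in for whatever database client the route actually uses, and the random source is injectable only so the behaviour can be checked deterministically.

```typescript
const PRUNE_ODDS = 200; // roughly 1 in 200 inserts triggers a cleanup

const PRUNE_SQL =
  "DELETE FROM session_events WHERE created_at < now() - interval '24 hours'";

// Decide whether this insert should also run the cleanup.
function shouldPrune(random: () => number = Math.random): boolean {
  return random() < 1 / PRUNE_ODDS;
}

// Hypothetical hook: `runQuery` is a placeholder for the real DB call.
// Returns true when the cleanup query was issued.
function maybePrune(
  runQuery: (sql: string) => void,
  random: () => number = Math.random
): boolean {
  if (!shouldPrune(random)) return false;
  runQuery(PRUNE_SQL);
  return true;
}

// Example: collect the queries that would have been issued.
const issuedQueries: string[] = [];
maybePrune((sql: string) => { issuedQueries.push(sql); }, () => 0); // forced hit
```

With these odds the cleanup fires on average once every 200 inserts, so the delay between a row turning 24 hours old and its deletion depends on traffic volume.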

### `submission_journeys`
Promoted journeys keyed by email. **Not auto-pruned.** Owner-facing data. Contains:
- `events`: snapshot of `session_events` rows at promotion time (server-captured)
- `client_events`: the sessionStorage buffer the client posted (fallback)

The two lists are merged in the CP UI (`AdminDashboard.mergedJourneyEvents`) and de-duped by the key `name|page_path|ts`.
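
A merge keyed on `name|page_path|ts` might be sketched like this. The `JourneyEvent` shape and the function names are assumptions for illustration, not the actual `mergedJourneyEvents` implementation; the sketch also assumes both sources carry the same `ts` value for the same event, otherwise the key would not match.

```typescript
// Assumed event shape; the stored rows may carry more fields.
interface JourneyEvent {
  name: string;
  page_path: string;
  ts: number;
}

function journeyKey(event: JourneyEvent): string {
  return event.name + '|' + event.page_path + '|' + event.ts;
}

// Merge server-captured and client-buffered events, dropping duplicates.
function mergeJourneyEvents(
  serverEvents: JourneyEvent[],
  clientEvents: JourneyEvent[]
): JourneyEvent[] {
  const seen: Record<string, boolean> = {};
  const merged: JourneyEvent[] = [];
  // Server rows come first, so they win when both sides recorded an event.
  for (const event of serverEvents.concat(clientEvents)) {
    const key = journeyKey(event);
    if (seen[key]) continue;
    seen[key] = true;
    merged.push(event);
  }
  // Present the journey in chronological order.
  return merged.sort((a, b) => a.ts - b.ts);
}

const exampleJourney = mergeJourneyEvents(
  [
    { name: 'page_view', page_path: '/pricing', ts: 1 },
    { name: 'page_view', page_path: '/about', ts: 2 }
  ],
  [
    { name: 'page_view', page_path: '/about', ts: 2 }, // duplicate of a server row
    { name: 'cta_click', page_path: '/about', ts: 3 } // only the client saw this
  ]
);
// exampleJourney has three entries: /pricing, /about, then the CTA click.
```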

## De-duplication

- **GA4** receives each event at most once: from the client when gtag is loaded, from the server when it isn't. The `forward_ga4` flag in the `/api/track` body controls this.
- **session_events** receives every event once (always written server-side).
- **Journey display** merges server + client events with key `name|page_path|ts`.
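
The GA4 rule in the list above can be expressed as two small decisions, one per side. This is a sketch under assumptions: it supposes the client sets `forward_ga4` to `true` only when gtag is unavailable, and that the server additionally requires both GA4 env vars; the function names and the `Ga4Env` type are hypothetical.

```typescript
// Subset of the environment relevant to the GA4 forward.
interface Ga4Env {
  GA4_MEASUREMENT_ID?: string;
  GA4_API_SECRET?: string;
}

// Client side: ask the server to forward only when gtag never loaded,
// so GA4 does not see the same event from both paths.
function clientForwardFlag(gtagLoaded: boolean): boolean {
  return !gtagLoaded;
}

// Server side: forward only when the client asked for it and the
// Measurement Protocol credentials are configured.
function shouldForwardToGa4(forwardGa4: unknown, env: Ga4Env): boolean {
  return forwardGa4 === true && !!env.GA4_MEASUREMENT_ID && !!env.GA4_API_SECRET;
}

// Example: gtag was blocked, credentials are present, so the server forwards.
const exampleForward = shouldForwardToGa4(clientForwardFlag(false), {
  GA4_MEASUREMENT_ID: 'G-XXXXXXX',
  GA4_API_SECRET: 'placeholder-secret'
});
```

Checking `forwardGa4 === true` rather than truthiness means a malformed request body (for example the string `"false"`) never causes a forward.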

## Privacy

Disclosed in `src/lib/content/privacy-policy.ts` under the **Analytics** section. The key promises:

- Browsing record contains pages, clicks, timestamps, and a random browser ID, never name/email/phone or form contents.
- Unsubmitted journeys are deleted within 24h.
- Submitted journeys are linked to the enquiry email, visible only to the Goodwalk team, never shared or used for advertising.
- Users can request deletion at info@goodwalk.co.nz.

**Update the policy in the same PR** if you ever change what's stored or how long it's kept.

## Configuration

```bash
GA4_MEASUREMENT_ID=G-K7TLSFJVP1   # already in deploy.env.template
GA4_API_SECRET=<from GA4 admin>   # required for the GA4 forward
```

To get the API secret: GA4 admin → Data Streams → web stream → Measurement Protocol API secrets → Create. Without it, `/api/track` still records to `session_events` (so the journey works); only the GA4 forward is off.

## Files

- `src/routes/api/track/+server.ts`: main ingest, persists + forwards
- `src/routes/api/track/promote/+server.ts`: links journey to submission email
- `src/lib/analytics.ts`: client `trackEvent`, sessionStorage buffer, `promoteJourney(email)`
- `src/lib/components/BookingWizard.svelte`: calls `promoteJourney(email)` on submit success
- `mail-api/db.py`: `get_submission_journey(email)` reader
- `mail-api/main.py`: `/owner/client-enquiry` returns `{enquiry, journey}`
- `src/lib/components/admin-dashboard/AdminDashboard.svelte`: renders the **Visitor journey** section in the enquiry modal
- `src/lib/content/privacy-policy.ts`: disclosure
- `docker/postgres/init/004-session-events.sql`: table definitions

## Testing locally

Without env vars set (no GA4 forwarding):
```bash
curl -X POST http://localhost:5173/api/track \
  -H 'content-type: application/json' \
  -H 'user-agent: Mozilla/5.0' \
  -d '{"name":"test_event","params":{"label":"manual","page_path":"/"}}'
```

Then check the row landed:
```bash
docker exec -it goodwalk_svelte_db psql -U goodwalk -d goodwalk \
  -c "select event_name, page_path, created_at from session_events order by id desc limit 5;"
```

To test the full journey flow locally, submit a booking through the wizard with a test email, then open `cp.goodwalk.local` (or use `?preview=cp` on localhost) and open the enquiry for that email. The **Visitor journey** panel should list every page view and click that led to the submission.

## What this does NOT do

- **Meta Pixel / Facebook Ads**: same blocker problem, different fix (Conversions API). Not built.
- **Real-time owner notifications**: the journey is visible only after submission, not as a live feed of who's on the site.
- **Cross-device journey**: `anon_id` is per-browser. A visitor who researches on phone then submits on laptop produces two separate (mostly empty) journeys.
- **Consent banner**: NZ has no explicit cookie law today. If we ever serve EU/UK traffic, we need Consent Mode v2 before the GA4 forward in this pipeline is legal there.
- **Pruning of `submission_journeys`**: these are kept indefinitely. If you want a max retention (e.g. delete journeys older than 12 months), add a cron or extend the probabilistic cleanup in `/api/track`.
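
If a retention limit is ever added, the delete could be built as a parameterized query like the sketch below. Everything here is an assumption rather than existing code: the `created_at` column name on `submission_journeys`, the `buildJourneyRetentionQuery` helper, and the PostgreSQL-style `$1` placeholder that the DB client would need to support.

```typescript
interface ParameterizedQuery {
  text: string;
  values: number[];
}

// Build a DELETE for journeys older than `maxAgeMonths` months.
function buildJourneyRetentionQuery(maxAgeMonths: number): ParameterizedQuery {
  const isPositiveInteger =
    isFinite(maxAgeMonths) &&
    maxAgeMonths > 0 &&
    Math.floor(maxAgeMonths) === maxAgeMonths;
  if (!isPositiveInteger) {
    throw new Error('maxAgeMonths must be a positive integer');
  }
  return {
    // `created_at` is an assumed column name on submission_journeys.
    text:
      'DELETE FROM submission_journeys ' +
      'WHERE created_at < now() - make_interval(months => $1)',
    values: [maxAgeMonths]
  };
}

// Example: the 12-month retention mentioned above.
const retentionQuery = buildJourneyRetentionQuery(12);
```

Because this table holds owner-facing data tied to an email, any such change would also fall under the rule in the Privacy section about updating the policy in the same PR.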