#Data Engineering #Cloud Security #Snowflake #GDPR #FinTech

One clean data warehouse for a payments startup — 8 sources unified, GDPR-compliant from day one

8 sources → 1 warehouse · analysts off Excel · GDPR-ready on launch

A 50-person European payments startup had data scattered across Stripe, HubSpot, a Postgres transactional DB, and five other tools. Analysts were stitching CSVs in Excel. The CTO needed a single source of truth — with GDPR controls built in, not added later.

Challenge

Data lived in 8 disconnected tools. Every revenue report was a manual Excel job. No audit trail, no data lineage, and a GDPR compliance gap that blocked enterprise customer contracts.

Approach

Built a Snowflake-based data warehouse with Fivetran connectors for all 8 sources, dbt for transformation and documentation, and GDPR controls — role-based access, data classification, PII masking, audit logging — built into the architecture from day one.

Outcome

Analysts query live data in Snowflake instead of refreshing CSVs. The CTO closed two enterprise contracts that previously stalled on the GDPR question. Revenue reporting time dropped from 2 days to under 15 minutes.

8 → 1: Data sources consolidated into one warehouse
~15 min: Revenue reporting time, down from 2 days
2: Enterprise contracts unblocked by GDPR readiness
0: Manual Excel reports remaining in the workflow

Eight tools, zero shared truth, one GDPR gap blocking growth.

By the time the startup reached 50 people and closed its Series A, its data infrastructure had grown organically into a patchwork. Stripe handled payments, HubSpot owned sales and marketing attribution, a PostgreSQL database ran transactions, and five additional SaaS tools covered support, operations, and finance. None of them talked to each other. Before every revenue meeting, an analyst spent two days pulling CSVs, de-duplicating records, and stitching them together in Excel. The process was fragile, slow, and impossible to audit.

The harder problem was compliance. A pair of enterprise prospects — both with information security review processes — had stalled negotiations over GDPR readiness. The startup had no formal data lineage, no documented retention policies, no role-based access controls on sensitive customer data, and no audit trail. Their legal team had flagged the gap. The CTO needed a solution that didn't just fix reporting — it needed to make GDPR compliance a core property of the data architecture, not a layer bolted on later.

A warehouse, a pipeline, and a compliance layer — wired together.

Adimen scoped and delivered the full stack in a single 6-week retainer engagement:

Ingestion — 8 sources into one place

Fivetran connectors were configured for all 8 sources: Stripe, HubSpot, PostgreSQL (via log-based replication), and five additional SaaS tools. Each connector was configured with incremental sync and column-level filtering to avoid pulling PII fields that weren't needed downstream. All raw data lands in a dedicated raw schema in Snowflake, with no transformations applied at ingestion — preserving a complete, auditable record of what came in and when.
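The two ingestion settings described above, incremental sync and column-level filtering, can be illustrated with a minimal Python sketch. This is not Fivetran's implementation or API. The table name, the blocked columns, and the `updated_at` cursor field are assumptions made for illustration only.

```python
from datetime import datetime, timezone

# Hypothetical blocklist: columns that should never leave the source system.
BLOCKED_COLUMNS = {"customers": {"phone", "date_of_birth"}}


def incremental_batch(rows, table, cursor):
    """Return rows changed after `cursor`, minus blocked columns, plus the new cursor."""
    blocked = BLOCKED_COLUMNS.get(table, set())
    # Incremental sync: only rows modified since the last successful run.
    fresh = [r for r in rows if r["updated_at"] > cursor]
    # Column-level filtering: drop blocked fields before anything is loaded.
    cleaned = [{k: v for k, v in r.items() if k not in blocked} for r in fresh]
    new_cursor = max((r["updated_at"] for r in fresh), default=cursor)
    return cleaned, new_cursor


rows = [
    {"id": 1, "email": "a@example.com", "phone": "555-0100",
     "updated_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"id": 2, "email": "b@example.com", "phone": "555-0101",
     "updated_at": datetime(2024, 1, 3, tzinfo=timezone.utc)},
]
batch, cursor = incremental_batch(
    rows, "customers", datetime(2024, 1, 2, tzinfo=timezone.utc)
)
```

Here only the row updated after the cursor is returned, without its `phone` field, and the cursor advances to that row's timestamp. Dropping unneeded PII before it reaches the raw schema means there is less to mask, restrict, and retain later.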

Transformation — dbt models, documented

A dbt project was built to clean, join, and aggregate raw data into the reporting layer. Each model includes schema tests, freshness assertions, and column-level documentation. The transformation layer exposes three core marts: revenue (Stripe + PostgreSQL), pipeline (HubSpot), and operations. Every model is version-controlled, lineage is fully traceable in the dbt DAG, and all transformations are deterministic and reviewable by the startup's engineering team.
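To show what traceable lineage means in practice, here is a small Python sketch that walks a dependency graph from a mart back to its raw sources. It is not dbt's own code. The model and source names are hypothetical, chosen only to match the marts described above (revenue from Stripe and PostgreSQL, pipeline from HubSpot).

```python
# Hypothetical graph: each model maps to the models or sources it selects from.
DEPENDS_ON = {
    "mart_revenue": ["stg_stripe_charges", "stg_pg_transactions"],
    "mart_pipeline": ["stg_hubspot_deals"],
    "stg_stripe_charges": ["raw.stripe.charges"],
    "stg_pg_transactions": ["raw.postgres.transactions"],
    "stg_hubspot_deals": ["raw.hubspot.deals"],
}


def upstream_sources(model, graph=DEPENDS_ON):
    """Walk the dependency graph and return the raw sources a model is built from."""
    sources, stack, seen = set(), [model], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        parents = graph.get(node)
        if parents is None:
            # No parents recorded: treat the node as a raw source.
            sources.add(node)
        else:
            stack.extend(parents)
    return sorted(sources)


print(upstream_sources("mart_revenue"))
# ['raw.postgres.transactions', 'raw.stripe.charges']
```

The same traversal answers the question an auditor typically asks: which source systems feed a given reported figure.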

GDPR controls built into the schema

PII fields — names, emails, payment identifiers — were classified at ingestion and masked in the transformation layer using Snowflake's dynamic data masking policies. Role-based access was configured at the schema level: analysts get the reporting mart, engineers get the staging layer, and PII access is restricted to a named data owner role. A GDPR Article 30 register was generated from the dbt documentation, mapping data assets to their sources and retention policies.
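The idea behind role-dependent masking can be sketched in a few lines of Python. In Snowflake the equivalent logic lives in SQL masking policies attached to columns, so this is only an illustration of the behavior. The role names, the column names, and the choice of a truncated SHA-256 token are all assumptions, not details from the engagement.

```python
import hashlib

# Hypothetical column and role names; the engagement names only a "data owner" role.
PII_COLUMNS = {"name", "email", "payment_id"}
UNMASKED_ROLES = {"DATA_OWNER"}


def mask_value(value):
    """Replace a PII value with a short, stable token derived from SHA-256."""
    digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    return f"masked_{digest[:12]}"


def apply_masking(row, role):
    """Return the row as the given role would see it."""
    if role in UNMASKED_ROLES:
        return dict(row)
    return {k: mask_value(v) if k in PII_COLUMNS else v for k, v in row.items()}


row = {"name": "Ada Example", "email": "ada@example.com", "amount_eur": 42.0}
analyst_view = apply_masking(row, "ANALYST")
owner_view = apply_masking(row, "DATA_OWNER")
```

The analyst view keeps `amount_eur` intact and replaces `name` and `email` with tokens, while the data owner view is unchanged. Because the token is stable, analysts can still count or join on a masked column without seeing the underlying value. An unsalted hash is used here for brevity and would not by itself be a strong pseudonymization scheme.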

Audit trail & BI layer

Snowflake's access history logging was enabled across all schemas, providing a queryable audit trail of who accessed which data and when. A Looker instance was connected to the reporting mart with row-level security mirroring the Snowflake role structure. Three core dashboards were built: daily revenue, sales pipeline, and a data freshness monitor that alerts when any connector falls behind schedule.
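The freshness monitor's core check can be sketched as follows. This is an illustrative Python version, not the Looker dashboard or any Fivetran feature. The connector names, sync intervals, and grace period are hypothetical, since the actual schedules are not stated.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical sync schedule per connector.
EXPECTED_INTERVAL = {
    "stripe": timedelta(hours=1),
    "hubspot": timedelta(hours=6),
    "postgres": timedelta(minutes=15),
}


def stale_connectors(last_synced, now, grace=timedelta(minutes=10)):
    """Return connectors whose last sync is older than their interval plus a grace period."""
    stale = []
    for name, interval in EXPECTED_INTERVAL.items():
        synced_at = last_synced.get(name)
        # A connector that has never reported a sync is treated as stale.
        if synced_at is None or now - synced_at > interval + grace:
            stale.append(name)
    return sorted(stale)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_synced = {
    "stripe": now - timedelta(minutes=30),
    "hubspot": now - timedelta(hours=8),
    "postgres": now - timedelta(minutes=20),
}
print(stale_connectors(last_synced, now))
# ['hubspot']
```

The grace period keeps a sync that is a few minutes late from triggering an alert, so the monitor flags only connectors that have genuinely fallen behind.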

From 8 scattered sources to one governed warehouse.

[Architecture diagram: Sources (Stripe, HubSpot, PostgreSQL, + 5 others) → Ingest (Fivetran: 8 connectors, incremental sync, raw schema) → Transform (Snowflake + dbt: staging layer, dbt models, RBAC + masking; clean + govern) → Marts (Revenue, Pipeline, Operations) → Visualize (Looker: 3 dashboards, row-level security, live queries). GDPR & Compliance Layer: column-level PII masking, role-based access, data lineage, audit trail, Art. 30 register.]

Eight sources land in Snowflake's raw schema via Fivetran, get cleaned and documented by dbt, and surface as three governed data marts. GDPR controls — PII masking, role-based access, and full audit logging — are built into the architecture at every layer, not added as an afterthought.

One source of truth. Two contracts unblocked.

8 → 1: All data sources consolidated. Analysts have one system to query, not eight to reconcile.
~15 min: Revenue reporting now runs on a scheduled dbt job. The 2-day Excel process is gone entirely.
2: Enterprise contracts that had stalled on GDPR were closed within 6 weeks of the warehouse going live.
0: Manual data exports or Excel stitching jobs remaining in the analyst workflow.

Tech stack

Snowflake · Fivetran · dbt · Looker · PostgreSQL · Stripe · HubSpot · Snowflake RBAC · Column-level masking · Access history logging · GDPR Art. 30 register · AWS S3
