Data Engineering · Cloud Security
8 sources → 1 warehouse · analysts off Excel · GDPR-ready on launch
A 50-person European payments startup had data scattered across Stripe, HubSpot, a Postgres transactional DB, and five other tools. Analysts were stitching CSVs in Excel. The CTO needed a single source of truth — with GDPR controls built in, not added later.
Challenge
Data lived in 8 disconnected tools. Every revenue report was a manual Excel job. No audit trail, no data lineage, and a GDPR compliance gap that blocked enterprise customer contracts.
Approach
Built a Snowflake-based data warehouse with Fivetran connectors for all 8 sources, dbt for transformation and documentation, and GDPR controls — role-based access, data classification, PII masking, audit logging — built into the architecture from day one.
Outcome
Analysts query live data in Snowflake instead of refreshing CSVs. The CTO closed two enterprise contracts that previously stalled on the GDPR question. Revenue reporting time dropped from 2 days to under 15 minutes.
By the time the startup reached 50 people and a Series A close, its data infrastructure had grown organically into a patchwork. Stripe handled payments, HubSpot owned sales and marketing attribution, a Postgres database ran transactions, and five additional SaaS tools covered support, operations, and finance. None of them talked to each other. Before every revenue meeting, an analyst spent two days pulling CSVs, de-duplicating records, and stitching them together in Excel. The process was fragile, slow, and impossible to audit.
The harder problem was compliance. Two enterprise prospects, both with information security review processes, had stalled negotiations over GDPR readiness. The startup had no formal data lineage, no documented retention policies, no role-based access controls on sensitive customer data, and no audit trail. Its legal team had flagged the gap. The CTO needed a solution that did more than fix reporting: it had to make GDPR compliance a core property of the data architecture, not a layer bolted on later.
Adimen scoped and delivered the full stack in a single 6-week retainer engagement:
Fivetran connectors were configured for all 8 sources: Stripe, HubSpot, PostgreSQL (via log-based replication), and five additional SaaS tools. Each connector was configured with incremental sync and column-level filtering to avoid pulling PII fields that weren't needed downstream. All raw data lands in a dedicated raw schema in Snowflake, with no transformations applied at ingestion — preserving a complete, auditable record of what came in and when.
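To make the column-level filtering idea concrete, here is a minimal Python sketch of dropping unneeded PII columns before a row is stored. It is a conceptual illustration only: Fivetran applies this kind of filtering through connector configuration rather than user code, and the column names and the `filter_columns` helper are hypothetical, not taken from the engagement.

```python
# Conceptual sketch only: Fivetran handles column filtering through connector
# configuration, not through user code like this.

# Hypothetical block list; the case study does not name the excluded columns.
EXCLUDED_COLUMNS = frozenset({"customer_phone", "billing_address"})


def filter_columns(row: dict, excluded: frozenset = EXCLUDED_COLUMNS) -> dict:
    """Return a copy of the row without the excluded columns."""
    return {key: value for key, value in row.items() if key not in excluded}


# Invented sample record standing in for one synced row.
sample_row = {
    "charge_id": "ch_001",
    "amount_cents": 4200,
    "customer_phone": "+00 000 0000",
    "billing_address": "1 Example Street",
}

filtered_row = filter_columns(sample_row)
print(filtered_row)  # {'charge_id': 'ch_001', 'amount_cents': 4200}
```

Filtering at the point of ingestion means the excluded fields never reach the warehouse, so there is nothing to mask or delete later.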
A dbt project was built to clean, join, and aggregate raw data into the reporting layer. Each model includes schema tests, freshness assertions, and column-level documentation. The transformation layer exposes three core marts: revenue (Stripe + PostgreSQL), pipeline (HubSpot), and operations. Every model is version-controlled, lineage is fully traceable in the dbt DAG, and all transformations are deterministic and reviewable by the startup's engineering team.
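The lineage claim can be illustrated with a small Python sketch that walks a model graph upward to find the raw sources behind a mart. This is not how dbt stores or exposes its DAG. The mart-to-source mapping follows the description above, while the staging model names and the `upstream_sources` helper are assumptions made for illustration.

```python
# Hypothetical model graph: each key is a model, each value lists its direct parents.
# Only the mart-to-source mapping comes from the case study; staging names are invented.
PARENTS = {
    "mart_revenue": ["stg_stripe_charges", "stg_postgres_transactions"],
    "mart_pipeline": ["stg_hubspot_deals"],
    "stg_stripe_charges": ["raw.stripe"],
    "stg_postgres_transactions": ["raw.postgres"],
    "stg_hubspot_deals": ["raw.hubspot"],
}


def upstream_sources(model: str) -> set:
    """Walk the graph upward and collect nodes with no parents (the raw sources)."""
    parents = PARENTS.get(model, [])
    if not parents:
        return {model}
    sources = set()
    for parent in parents:
        sources |= upstream_sources(parent)
    return sources


print(sorted(upstream_sources("mart_revenue")))  # ['raw.postgres', 'raw.stripe']
```

Answering "where did this number come from?" then becomes a graph lookup rather than a manual trace through spreadsheets.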
PII fields — names, emails, payment identifiers — were classified at ingestion and masked in the transformation layer using Snowflake's dynamic data masking policies. Role-based access was configured at the schema level: analysts get the reporting mart, engineers get the staging layer, and PII access is restricted to a named data owner role. A GDPR Article 30 register was generated from the dbt documentation, mapping data assets to their sources and retention policies.
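The logic of a role-aware masking rule can be sketched in a few lines of Python. In Snowflake this is expressed as a SQL masking policy attached to a column, so the snippet below only mirrors the decision such a policy makes. The role name `DATA_OWNER`, the `mask_email` helper, and the choice to keep the email domain visible are all assumptions for illustration.

```python
# Hypothetical role name; the case study only says PII access is limited to
# "a named data owner role".
PII_ROLE = "DATA_OWNER"


def mask_email(email: str, current_role: str) -> str:
    """Return the real value for the PII role and a masked value for everyone else."""
    if current_role == PII_ROLE:
        return email
    _, separator, domain = email.partition("@")
    if not separator:
        # Not a well-formed address: hide it entirely.
        return "***"
    # Illustrative choice: hide the local part, keep the domain for aggregate analysis.
    return "***@" + domain


print(mask_email("jane@example.com", "ANALYST"))     # ***@example.com
print(mask_email("jane@example.com", "DATA_OWNER"))  # jane@example.com
```

Because the rule depends on the caller's role rather than on a separate copy of the data, analysts and the data owner can query the same table and see different values.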
Snowflake's access history logging was enabled across all schemas, providing a queryable audit trail of who accessed which data and when. A Looker instance was connected to the reporting mart with row-level security mirroring the Snowflake role structure. Three core dashboards were built: daily revenue, sales pipeline, and a data freshness monitor that alerts when any connector falls behind schedule.
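The check behind a data freshness monitor can be sketched as a comparison between each connector's last sync time and an allowed lag. The snippet below is a minimal Python illustration, not the dashboard's actual implementation. The six-hour threshold, the timestamps, and the `stale_connectors` helper are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold; the case study does not state the alerting window.
MAX_LAG = timedelta(hours=6)


def stale_connectors(last_synced: dict, now: datetime, max_lag: timedelta = MAX_LAG) -> list:
    """Return the names of connectors whose last sync is older than max_lag."""
    return sorted(
        name for name, synced_at in last_synced.items() if now - synced_at > max_lag
    )


# Invented timestamps for three of the eight sources.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_synced = {
    "stripe": now - timedelta(hours=1),
    "hubspot": now - timedelta(hours=8),
    "postgres": now - timedelta(minutes=10),
}

print(stale_connectors(last_synced, now))  # ['hubspot']
```

Passing `now` in as an argument, rather than reading the clock inside the function, keeps the check deterministic and easy to test.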
Architecture
Eight sources land in Snowflake's raw schema via Fivetran, get cleaned and documented by dbt, and surface as three governed data marts. GDPR controls — PII masking, role-based access, and full audit logging — are built into the architecture at every layer, not added as an afterthought.