Lumina Bank

Enterprise big-data platform unifying a 12-bank federation on GCP — 5K+ TPS with sub-200ms fraud scoring.

Timeline: Feb to Apr 2026
Status: MSc Big Data · graded 20/20
Type: Cloud architecture · Streaming · ML
5K+ TPS · sustained on payday peaks
<200ms · p95 fraud decision
12 · bank entities unified

A multi-tenant data platform that unifies a 12-entity banking federation (~9M client scenario) into one shared system, sustaining 5K+ TPS on payday peaks with sub-200ms p95 fraud scoring. The business case: cut fraud losses by an estimated 30–50% by detecting cross-bank fraud the fragmented model couldn’t see — with the whole platform declared as code in six Terraform modules.

Highlights

  • Real-time fraud decisioning. An Apache Beam pipeline on Dataflow scores every transaction against a Vertex AI endpoint serving a Spark MLlib GBT model, retrained daily on a 90-day window in ephemeral Dataproc. GBT was chosen over deep nets for microsecond inference and the per-feature explainability bank regulators can demand.
  • Fan-out that de-fragments 12 banks. A single Pub/Sub distribution topic with twelve subscriptions filtered by bank entity, each with its own retention and dead-letter policy — every bank consumes at its own pace while centralized processing catches fraud patterns across all of them.
  • Polyglot storage by access pattern. Memorystore Redis for sub-millisecond decisions, Bigtable profiles with hotspot-free row keys, BigQuery reaching sub-10s freshness, and a Parquet lake with lifecycle rules to Nearline and Coldline. Ephemeral Dataproc cuts batch compute cost by over 95% versus an always-on cluster.
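
Two of the design points above, the per-entity Pub/Sub fan-out and the hotspot-free Bigtable row keys, can be illustrated with a minimal Python sketch. It is not the project's actual code. The entity identifiers (`bank-01` to `bank-12`), the `bank_entity` message attribute, and the row-key layout (a short hash prefix followed by entity and client ID) are all assumptions made for illustration. The filter strings use Pub/Sub's attribute filter syntax, and the hash prefix is one common way to keep sequential client IDs from landing on the same Bigtable node.

```python
import hashlib

# Hypothetical entity identifiers; the write-up only states that there are twelve banks.
BANK_ENTITIES = [f"bank-{i:02d}" for i in range(1, 13)]


def subscription_filter(bank_entity: str) -> str:
    """Build a Pub/Sub subscription filter that matches one bank's messages.

    Assumes publishers set a message attribute named `bank_entity`
    (a hypothetical attribute name).
    """
    return f'attributes.bank_entity = "{bank_entity}"'


def profile_row_key(bank_entity: str, client_id: str) -> str:
    """Build a Bigtable row key for a client profile.

    Assumed layout: <4 hex chars of SHA-256>#<bank_entity>#<client_id>.
    The prefix is derived from the key's own fields, so point lookups can
    recompute it, while sequential client IDs scatter across the key space.
    """
    digest = hashlib.sha256(f"{bank_entity}:{client_id}".encode("utf-8")).hexdigest()
    return f"{digest[:4]}#{bank_entity}#{client_id}"


# One filter per entity: twelve subscriptions on the single distribution topic.
FILTERS = {entity: subscription_filter(entity) for entity in BANK_ENTITIES}

if __name__ == "__main__":
    for entity, flt in FILTERS.items():
        print(entity, "->", flt)
    print(profile_row_key("bank-03", "client-000123"))
```

The trade-off in this sketch is that a hashed prefix gives up ordered range scans per bank in exchange for evenly spread writes, which suits profile lookups by exact client ID.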
Built with
GCP · Terraform · Pub/Sub · Dataflow + Beam · Dataproc + Spark MLlib · BigQuery · Bigtable · Memorystore · Vertex AI