5x faster data transformations deliver better customer insights

Prophecy helps a leading FSI transform 2 billion daily transactions in minutes

5x faster data transformations
85% decrease in time to deploy

Company Size: 25K+ employees
Industry: Financial Services
Department: Data engineering and lines of business
Technology: Spark Data Lake

Challenges of Proprietary ETL

Proprietary ETL systems struggle with scale, speed, and adopting open-source innovation
Scale limits exceeded by business data

Transaction volumes were growing rapidly, and at more than 2 billion transactions a day they exceeded the capacity of the existing on-premises ETL solution. The team needed to catch up.

Locked out of Open Innovation

Much of today's innovation happens in open source, and proprietary ETL solutions locked the team out of it: they could not integrate with other internal or open-source systems.

Moving to the Data Lake

The move to the data lake was manual; hand-written code gave the team freedom, but productivity and standards suffered.
Code reduced standardization

Like most financial services companies handling sensitive customer data, the company has strict compliance requirements such as SOX. With every developer writing their own code, ensuring and verifying compliance and standards was hard.

Shipping robust data with SQL alone is hard

Apache Hive scripts lack first-class support for software engineering best practices such as test-driven development, code coverage, and unit testing of data transformations, making these practices harder to apply.
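
By contrast, a Spark transformation written as a plain function can be unit-tested with an ordinary test framework. The sketch below is illustrative only; the function, column names, and sample values are assumptions, not the company's actual code.

```python
# Minimal sketch: unit-testing a Spark transformation with pytest.
# Function, column names, and sample values are illustrative assumptions.
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql import functions as F


def approved_transactions(df: DataFrame) -> DataFrame:
    """Keep only approved transactions and normalize the amount column."""
    return (
        df.filter(F.col("status") == "APPROVED")
          .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
    )


def test_approved_transactions():
    spark = SparkSession.builder.master("local[1]").getOrCreate()
    source = spark.createDataFrame(
        [("t1", "APPROVED", "10.50"), ("t2", "DECLINED", "3.00")],
        ["txn_id", "status", "amount"],
    )
    result = approved_transactions(source).collect()
    assert len(result) == 1
    assert result[0]["txn_id"] == "t1"
```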

Prophecy accelerates and standardizes ETL modernization

Accelerated Migration

Migrating from Ab Initio by hand was slow and tedious. Prophecy automated the migration, enabling the team to modernize faster and meet deadlines.

Standardized Pipelines

Prophecy's transpiler generated standardized, high-quality code, and its visual development environment kept data developers productive post-migration.

“Prophecy is incredibly flexible and powerful. For a technical user like me, I can write code in Prophecy using whichever library and programming language of choice, meaning I’m not locked in at all. And it empowers non-developers to be productive with our data through templates and a framework builder that can extend the data engineering standards we create to the wider team. It’s incredible.”

Staff Data Engineer, Major Credit Card Company
Prophecy delivers on standards and productivity

Streamlined workflows lead to heightened performance

Since implementing Prophecy, developer timelines for new model features have shrunk from two weeks to just a few days, making iteration a breeze. Despite processing 2 billion approved transactions per day, the team also improved performance, cutting the time to run filters, joins, and aggregates from 2 hours and 45 minutes to just 8 minutes.
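
To make the workload shape concrete, here is a minimal PySpark sketch of a filter/join/aggregate pipeline of the kind described; the Delta Lake paths, tables, and column names are illustrative assumptions, not the company's actual schema.

```python
# Illustrative filter/join/aggregate workload in plain PySpark.
# Paths, tables, and columns are hypothetical; assumes Delta Lake is available.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("txn_aggregates").getOrCreate()

transactions = spark.read.format("delta").load("/lake/silver/transactions")
merchants = spark.read.format("delta").load("/lake/silver/merchants")

daily_spend = (
    transactions
    .filter(F.col("status") == "APPROVED")                   # filter
    .join(merchants, "merchant_id", "inner")                  # join
    .groupBy("merchant_category", F.to_date("txn_ts").alias("txn_date"))
    .agg(                                                     # aggregate
        F.count("*").alias("txn_count"),
        F.sum("amount").alias("total_amount"),
    )
)

daily_spend.write.format("delta").mode("overwrite").save("/lake/gold/daily_spend")
```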

Standardization across the company

Prophecy helped standardize ETL pipelines across the vast enterprise. In addition, the various personas - data engineers, data stewards, production teams, and business data consumers - all have a single source of truth they can understand and trust.

How it was done: implementation

Highlights

Multiple mission critical use cases

The data engineering pipelines include streaming and batch pipelines developed on Spark for numerous business use cases. They process data across the full lifecycle: ingesting billions of rows, running them through multiple layers of transformations for analytics, and finally passing data on to external downstream customers, such as banks, for whom trusted data is mission-critical.
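
As a hedged sketch of what such a streaming ingestion layer might look like, the snippet below lands raw events into a bronze Delta table with Spark Structured Streaming; the Kafka broker, topic, and lake paths are assumptions for illustration.

```python
# Minimal Structured Streaming ingestion sketch: Kafka -> bronze Delta table.
# Broker address, topic name, and lake paths are hypothetical; assumes the
# spark-sql-kafka and Delta Lake packages are on the classpath.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("txn_ingest").getOrCreate()

raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "transactions")
    .load()
)

# Keep the raw payload plus an ingestion timestamp for downstream layers.
bronze = raw.selectExpr("CAST(value AS STRING) AS payload", "timestamp AS ingest_ts")

(
    bronze.writeStream.format("delta")
    .option("checkpointLocation", "/lake/_checkpoints/transactions")
    .outputMode("append")
    .start("/lake/bronze/transactions")
)
```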

Integration with existing systems

Prophecy integrated seamlessly with existing systems, including an LDAP-based single sign-on system for existing authentication and authorization, a SCIM system, legacy systems such as IBM MQ, and reading and writing files from IBM mainframes.
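
Mainframe extracts often arrive as fixed-width text files. The sketch below shows one way such a file could be parsed in PySpark; the field offsets and names are invented for illustration (a real COBOL copybook would define the layout), and this is not necessarily how the team did it.

```python
# Hedged sketch: parsing a fixed-width mainframe extract with PySpark.
# Field positions and widths are invented; a real copybook would define them.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("mainframe_extract").getOrCreate()

lines = spark.read.text("/landing/mainframe/transactions.dat")

parsed = lines.select(
    F.substring("value", 1, 16).alias("txn_id"),    # cols 1-16
    F.substring("value", 17, 8).alias("txn_date"),  # cols 17-24
    F.trim(F.substring("value", 25, 12)).cast("decimal(18,2)").alias("amount"),
)

parsed.write.format("delta").mode("append").save("/lake/bronze/mainframe_txns")
```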

Integrating into existing Git and CI/CD

Integration with custom internal systems and processes included an active metadata system that pipelines read from and write to during development and execution, as well as the existing Git and CI/CD system and specific development practices such as Git forks.

Accelerated productivity

Building data pipelines visually from a set of standard components shared by all developers gave a large productivity boost. Code completion and hints reduced the need to wade through documentation for the underlying functions and expressions. Interactive, single-click runs with data visible at each step made iteration faster. Together, these cut development time drastically.
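
To illustrate the shared-component idea, here is a hedged sketch of a reusable, parameterized transformation that any developer could drop into a pipeline; the deduplication rule itself is an assumption, not the company's actual standard.

```python
# Sketch of a reusable "standard component": keep the latest record per key.
# The rule itself is illustrative, not the company's actual standard.
from pyspark.sql import DataFrame, Window
from pyspark.sql import functions as F


def deduplicate_latest(df: DataFrame, key: str, order_col: str) -> DataFrame:
    """Keep only the most recent record per key, ordered by order_col descending."""
    w = Window.partitionBy(key).orderBy(F.col(order_col).desc())
    return (
        df.withColumn("_rn", F.row_number().over(w))
          .filter(F.col("_rn") == 1)
          .drop("_rn")
    )

# Usage inside any pipeline, e.g.:
#   clean = deduplicate_latest(transactions, key="txn_id", order_col="ingest_ts")
```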

Multiple personas enabled

In addition to data engineers, operations engineers can quickly build ad hoc pipelines to restate data and fix errors, and they can follow lineage to find errors faster. Business data users can build reports and marts, and data stewards can see how data was computed.

Upgrades

Upgrade management for Prophecy is simple: because it runs on Kubernetes, upgrading to the next version merely requires pulling the new images.

Ready to try out Prophecy for free?

Generates Apache Spark code (Python or Scala)
No vendor lock-in
Seamless integration with Databricks
Code-based Git, testing and CI/CD
Available on AWS, Azure, and GCP