Use Spark Interims to Troubleshoot and Polish Low-Code Spark Pipelines: Part 1

Author:
Anya Bida

Let’s take advantage of Spark’s interim metadata to understand our Spark job behavior with low-code tooling. The Spark UI shows me some nice metrics: job completion time, number of rows read, number of rows written, and some related details...
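Those same numbers are also exposed programmatically through Spark's monitoring REST API, so I don't have to click through the UI to collect them. Here's a minimal sketch, assuming the driver UI is reachable at localhost:4040 (the default for a local application; your cluster's endpoint will differ):

```python
# Minimal sketch: read the same stage-level metrics the Spark UI shows,
# via Spark's monitoring REST API. Assumes the driver UI is reachable
# at localhost:4040; adjust host/port for your cluster.
import requests

BASE = "http://localhost:4040/api/v1"

# First (and, for a local run, only) application currently registered.
app_id = requests.get(f"{BASE}/applications").json()[0]["id"]

for stage in requests.get(f"{BASE}/applications/{app_id}/stages").json():
    print(
        f"stage {stage['stageId']:>4}  "
        f"status={stage['status']:<9}  "
        f"rows_read={stage['inputRecords']:>10}  "
        f"rows_written={stage['outputRecords']:>10}"
    )
```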

...but a single snapshot isn't enough: I want to know how my pipeline behaves over time.
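Prophecy keeps that history for me, but the underlying idea is simple enough to sketch by hand: append one summary row per run to a table, then chart it. Everything below (the table name, the columns, the Delta format) is illustrative rather than Prophecy's actual schema, and it assumes Delta Lake is available, as it is on Databricks:

```python
# Sketch: persist one row of run-level metadata per pipeline run so
# behavior can be tracked over time. All names here are illustrative.
from datetime import datetime, timezone
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

def record_run(pipeline, rows_read, rows_written, duration_s):
    row = [(pipeline, datetime.now(timezone.utc), spark.version,
            rows_read, rows_written, duration_s)]
    cols = ["pipeline", "run_ts", "spark_version",
            "rows_read", "rows_written", "duration_s"]
    (spark.createDataFrame(row, cols)
          .write.mode("append").format("delta")   # or "parquet" off Databricks
          .saveAsTable("pipeline_run_history"))
```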

Ok, but manually checking each run for pipeline success doesn't scale. I need testing and alerting!
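With run history in a table, a post-run check can stand in for the manual eyeball test. Continuing with the hypothetical pipeline_run_history table above, this sketch fails the run (which a scheduler such as Databricks Jobs then surfaces as an alert) when output volume drops well below the trailing average; the pipeline name and the 50% threshold are made up for illustration:

```python
# Sketch: a post-run sanity check that raises (and thereby alerts)
# when the latest run writes far fewer rows than usual.
from pyspark.sql import functions as F

hist = (spark.table("pipeline_run_history")
             .where(F.col("pipeline") == "daily_sales"))

latest = hist.orderBy(F.desc("run_ts")).first()
avg_written = hist.agg(F.avg("rows_written")).first()[0]

if avg_written and latest["rows_written"] < 0.5 * avg_written:
    raise AssertionError(
        f"daily_sales wrote {latest['rows_written']} rows; "
        f"trailing average is {avg_written:.0f}"
    )
```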

Historical metadata comes in super handy when I want to compare my pipeline runs across multiple Spark versions, as in the sketch below. And check out Part 2 of this blog, where we troubleshoot individual dataframes.
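Sticking with the same illustrative history table, a version-to-version comparison is a short aggregation:

```python
# Sketch: compare the same pipeline's runs across Spark versions,
# using the illustrative history table from the earlier sketches.
from pyspark.sql import functions as F

(spark.table("pipeline_run_history")
      .groupBy("pipeline", "spark_version")
      .agg(F.avg("duration_s").alias("avg_duration_s"),
           F.avg("rows_written").alias("avg_rows_written"))
      .orderBy("pipeline", "spark_version")
      .show())
```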

How can I try Prophecy?

Prophecy is available as a SaaS product: add your Databricks credentials and start using it with Databricks right away. Or kick the tires for a couple of weeks with an Enterprise Trial that runs against Prophecy's own Databricks account and comes with examples. We also support installing Prophecy in your own network (VPC or on-prem) on Kubernetes. Sign up for your 14-day free trial account here.