Tame Your Data Asset Sprawl Part I: Minimizing Costs with Single Origin

The Current State of Affairs

The one word that might best describe the current state of data assets at companies big and small is sprawl.

Companies now understand that the status quo is untenable. Every day, operators build new pipelines and run new queries, and every dashboard or pipeline added is a new layer of complexity. Are these queries optimized? Do they leverage existing infrastructure to deliver results faster, using fewer resources? The problem is hard to quantify: just how many data assets are being created, and are all your queries making the best use of your resources?

We are excited to introduce Cost Savings Reports in Single Origin. Get immediate insight into the state of your data asset sprawl and how much you can compress your data assets to realize significant savings in time and compute costs.

When you run an audit on a set of queries in Single Origin, we surface key statistics like:

  • How many queries share similar semantics, e.g., how many queries join the same tables in the same way
  • How costly it is to run a set of queries, as well as how much you could save by using pre-calculated tables for common metric requests

Figure: Cost Savings Report in Single Origin

Unique Insights with a Click

Single Origin takes a unique approach to generate insights and optimize existing and future queries.

  1. Query audits in Single Origin analyze multiple queries at once to generate insights. Instead of optimizing your queries one by one, you can audit thousands at a time.
  2. Since we analyze thousands of queries simultaneously, the savings you can get from an audit tend to be much higher than with similar tools. Our case studies have highlighted opportunities to reduce overhead and query costs by 50-90%.
  3. We automatically generate cost statistics for the original and optimized queries and validate that the data returned is the same. This way, you can act quickly and confidently.
  4. Once opportunities for optimization are identified, Single Origin generates common computation logic and rewrites your queries to take advantage of the new optimized computation layer.
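
To make the validation idea in step 3 concrete, here is a minimal sketch of checking that an original query and its rewritten version return the same data. It is not Single Origin's implementation: it uses Python's built-in sqlite3 module as a stand-in warehouse, and the table names (orders, revenue_by_country), the sample rows, and the helper results_match are all hypothetical.

```python
import sqlite3

# In-memory database standing in for a warehouse (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (country TEXT, amount REAL);
INSERT INTO orders VALUES ('US', 10.0), ('US', 5.0), ('DE', 7.5);
-- Hypothetical pre-calculated table holding the shared aggregation.
CREATE TABLE revenue_by_country AS
  SELECT country, SUM(amount) AS revenue FROM orders GROUP BY country;
""")


def results_match(conn, original_sql, optimized_sql):
    """Return True if both queries yield the same rows, ignoring row order."""
    original = sorted(conn.execute(original_sql).fetchall())
    optimized = sorted(conn.execute(optimized_sql).fetchall())
    return original == optimized


original_sql = "SELECT country, SUM(amount) FROM orders GROUP BY country"
optimized_sql = "SELECT country, revenue FROM revenue_by_country"
same = results_match(conn, original_sql, optimized_sql)
print("Results match:", same)
```

Comparing full result sets like this only works for small outputs; on large tables a row count plus a checksum per column is a common substitute.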

Optimizing your data assets varies based on where a set of queries is sourced from (pipelines or dashboards), so we cover both scenarios below.

Consolidate your pipelines

Data pipeline sprawl can be a significant problem for companies. Over time, a company may end up with thousands of running pipelines that must be maintained. Nobody has complete knowledge of every pipeline, which makes streamlining almost impossible.

Single Origin query audits group semantically similar pipelines and highlight where you can remove or consolidate models. Notably, the audit's summary quantifies how many pipelines are similar: is one version of a complex join running every day, or are dozens of versions spread across all your pipelines? This information helps you focus on the most duplicated logic first, so you can streamline your pipelines and save time and money.
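
As a rough illustration of what "semantically similar" can mean, the sketch below groups queries that join the same tables on the same keys, regardless of formatting, letter case, or the order of the join condition. It is a deliberately crude, regex-based approximation and not how Single Origin analyzes queries; a real audit would parse the SQL properly and resolve table aliases. The function names and sample queries are hypothetical.

```python
import re
from collections import defaultdict


def join_signature(sql):
    """Crude signature: the set of tables plus the set of join conditions.

    Assumes unaliased table names and simple `a.x = b.y` join conditions.
    """
    text = " ".join(sql.lower().split())  # normalize case and whitespace
    tables = re.findall(r"\b(?:from|join)\s+([a-z_][\w.]*)", text)
    conditions = re.findall(r"\bon\s+([\w.]+)\s*=\s*([\w.]+)", text)
    # Sort each pair so `a = b` and `b = a` produce the same signature.
    normalized = sorted(tuple(sorted(pair)) for pair in conditions)
    return (tuple(sorted(tables)), tuple(normalized))


def group_by_signature(queries):
    """Map each join signature to the names of the queries that share it."""
    groups = defaultdict(list)
    for name, sql in queries.items():
        groups[join_signature(sql)].append(name)
    return dict(groups)


# Hypothetical pipeline queries, invented for this example.
queries = {
    "daily_revenue": "SELECT users.country, SUM(orders.amount) FROM orders "
                     "JOIN users ON orders.user_id = users.id GROUP BY users.country",
    "weekly_revenue": "select users.plan, sum(orders.amount)\nfrom orders\n"
                      "join users on users.id = orders.user_id\ngroup by users.plan",
    "inventory": "SELECT sku, COUNT(*) FROM inventory GROUP BY sku",
}

groups = group_by_signature(queries)
redundant = [names for names in groups.values() if len(names) > 1]
print(redundant)
```

Here the two revenue queries land in the same group because they share one join, which makes them candidates for consolidation, while the inventory query stands alone.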

Audit your dashboards

Dashboard sprawl can also be a significant problem for companies. It can result in high costs, as users may unintentionally run numerous unoptimized and resource-intensive queries. A cost-savings report can help identify and address such issues.

When you audit your dashboard queries, Single Origin groups similar queries together, extracts the common metrics, and suggests a pre-calculated table that generates the same data. Instead of repeatedly running complex queries from a dashboard, you can run the complex query once and then have your dashboard reference the smaller, pre-calculated output. A Single Origin savings report quantifies how much you can save using such pre-calculated tables.
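
The arithmetic behind such an estimate can be sketched in a few lines. This is an assumed, simplified cost model (data scanned per day), not the formula Single Origin uses, and every number below is made up for illustration: compare what the dashboard queries scan today against the cost of building the pre-calculated table plus the cheaper reads against it.

```python
from dataclasses import dataclass


@dataclass
class QueryStats:
    name: str
    runs_per_day: int
    gb_scanned_per_run: float


def estimate_savings(queries, build_gb, builds_per_day, gb_per_read):
    """Compare daily GB scanned before and after using a pre-calculated table."""
    current = sum(q.runs_per_day * q.gb_scanned_per_run for q in queries)
    total_runs = sum(q.runs_per_day for q in queries)
    # After the rewrite: pay for each table build, then cheap reads per run.
    optimized = build_gb * builds_per_day + total_runs * gb_per_read
    saved = current - optimized
    return {
        "current_gb": current,
        "optimized_gb": optimized,
        "saved_gb": saved,
        "saved_fraction": saved / current if current else 0.0,
    }


# Hypothetical dashboard queries that all compute the same underlying metric.
dashboard_queries = [
    QueryStats("revenue_by_country", runs_per_day=40, gb_scanned_per_run=50.0),
    QueryStats("revenue_by_plan", runs_per_day=20, gb_scanned_per_run=50.0),
    QueryStats("revenue_by_week", runs_per_day=40, gb_scanned_per_run=25.0),
]

report = estimate_savings(dashboard_queries, build_gb=500.0,
                          builds_per_day=1, gb_per_read=5.0)
print(report)
```

With these invented inputs, the queries scan 4,000 GB per day today and 1,000 GB per day after the rewrite, a 75% reduction. The build cost matters: a pre-calculated table only pays off when the reads it replaces are frequent or expensive enough.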

Automation at every step

Addressing data sprawl can be time-consuming, but Single Origin automates the process. All you need to do is:

  • Import queries from your history logs, with no need to enter each query manually.
  • Run a query audit to find and group semantically similar queries.
  • Automatically extract the common logic in a group of similar queries and use a pre-calculated table to generate your metrics more efficiently.
  • Remove redundant pipelines and replace expensive queries with pre-calculated versions to achieve massive savings.

Reach out to generate your cost savings reports today! Connecting to your project, auditing your queries, and generating a report only takes a few minutes.

Kevin Penner

Product @ Single Origin