Bolstering Data Reliability with Fivetran
⇨ ELT pipelines are prone to breaking down, making it difficult for organizations to move their data and integrate their data sources with their targets.
⇨ Companies need a plan for bringing their data sources together and for managing the resulting data pipelines as a whole.
⇨ Companies should evaluate their data to see if it is cleansed and harmonized enough to be useful for AI, ML, and LLMs.
With an increased push from corporate leaders for AI and analytics, data infrastructure has become increasingly vital for business agility. Companies must have resilient extract, load, and transform (ELT) pipelines to meet their growing and evolving data needs.
Unfortunately, these pipelines are often problematic. ELT pipelines are prone to breaking down, making it difficult for organizations to move their data and integrate their data sources with their targets. In 2024, many SAP organizations will look to move to SAP S/4HANA and leverage their data for analytics, automation and AI/ML tasks. It is critical that these companies are able to move data out of, into, and across cloud data platforms quickly and smoothly.
Organizations are increasingly turning to vendors like Fivetran to ensure they can access their data when and where they need it. Fivetran’s data movement platform helps address some of the more common issues found in ELT pipelines.
“Data pipelines are often built in a bespoke manner that involves workflow orchestration tools as well as various ways of reading triggers and databases. Customers want a hands-off, automated solution that works as a managed service, alerting them when something goes wrong, alerting them when there have been schema changes or other fundamental changes to their pipelines,” said Garrett Kelly, Lead Technical Product Marketing Manager at Fivetran.
Even an ELT process that runs smoothly can be time-consuming. Companies may have data scattered across apps, databases, data warehouses, data lakes, and in other miscellaneous files. Trying to move these manually not only adds significant time to the project, but can also introduce the risk of errors, duplication, and compliance issues. These are some of the issues that Fivetran aims to address.
Reliability, Scalability and Security
When considering data movement options, companies have several key factors to consider:
- Reliability – All too often, data teams spend the bulk of their time maintaining existing pipelines, rather than upgrading them or adding capabilities. A high percentage of uptime is crucial.
- Scalability – Data teams must ensure that any data movement solution they select meets their needs not just today, but tomorrow as well. Companies should have a forward-looking solution that can handle large data volumes and work with a wide variety of connectors including numerous SAP applications. Their pipeline solution must also be able to easily integrate into complex cloud, on-prem and hybrid IT infrastructure.
- Security – SAP data is business-critical and can give enterprises a competitive advantage. To use this data, however, companies must follow stringent requirements on how they store, tag, and delete the data they collect. Automating governance in the pipeline protects data in-flight.
Fivetran’s platform offers end-to-end automation, allowing data movement projects to have a significant impact on business outcomes while also ensuring data quality.
“Fivetran’s data integration platform automates the trickier parts of ELT pipelines to enhance our customers’ ability to move their data and integrate a variety of data sources in their targets. It’s done in a low-code or even no-code environment and makes setup of new data pipelines easy with the click of a button. Fivetran can be deployed as SaaS, self-hosted or a hybrid to allow the customer to pick an offering that’s most beneficial for their use case and security requirements,” said Kelly.
Fivetran reports 99.9% uptime for its data integration platform, along with one-minute sync frequency for low latency. It can pull data from over 400 sources with pre-built connectors – including a number of SAP applications – and offers by-request, custom-built, and SDK connectors. Users can also choose among fully managed, self-hosted, and hybrid deployments.
“At Fivetran, we are trying to solve enterprises’ biggest problems of using a variety of different data sources. That always remains a challenge because the number of sources will only grow. Organizations want to bring those sources together in one experience. We see the uptake in RISE with SAP and also the public cloud and we want to bring those customers in with us and give them that same experience,” said Edwin Commandeur, Lead Product Manager at Fivetran.
While outcomes, reliability, scalability and security should be key considerations for data movement projects, the reality is that cost will always be a major factor. Data teams will have to make the business case for any solution they want to implement. Fivetran offers a consumption-based pricing model in which unit cost decreases as more unique data is synced. This means businesses won’t face steep increases as their data needs scale up. Fivetran charges for monthly active rows, which are the rows that are inserted, updated, or deleted by the source connector. Each row is only counted once per month, regardless of how many times it changes in that month. Historical syncs are free of charge.
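To make the counting rule concrete, the following is a minimal sketch of how monthly active rows could be tallied under the rule as described above: a row counts once per month however many times it changes, and rows loaded by a historical sync are not counted. The `ChangeEvent` type, its fields, and the `monthly_active_rows` function are hypothetical names invented for illustration. This is not Fivetran's billing implementation, which may apply additional rules.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ChangeEvent:
    """One change observed by a connector (hypothetical structure)."""
    table: str
    primary_key: str
    operation: str          # "insert", "update", or "delete"
    changed_on: date
    historical_sync: bool = False  # assumption: historical loads are flagged


def monthly_active_rows(events):
    """Count distinct changed rows per (year, month).

    A row is identified by (table, primary_key) and is counted once per
    month, no matter how many changes it received in that month.
    Events flagged as historical syncs are skipped.
    """
    rows_by_month = defaultdict(set)
    for event in events:
        if event.historical_sync:
            continue
        month = (event.changed_on.year, event.changed_on.month)
        rows_by_month[month].add((event.table, event.primary_key))
    return {month: len(rows) for month, rows in rows_by_month.items()}


events = [
    ChangeEvent("orders", "1001", "insert", date(2024, 1, 3)),
    ChangeEvent("orders", "1001", "update", date(2024, 1, 9)),   # same row, same month
    ChangeEvent("orders", "1002", "insert", date(2024, 1, 15)),
    ChangeEvent("orders", "1001", "delete", date(2024, 2, 2)),   # counted again in February
    ChangeEvent("customers", "1001", "insert", date(2024, 1, 5), historical_sync=True),
]

print(monthly_active_rows(events))  # {(2024, 1): 2, (2024, 2): 1}
```

In this example, order 1001 changes twice in January but contributes one active row, and the historical load of the customers table contributes none.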
Data for AI
AI is one of the hottest topics in the SAP space, as businesses now see that these solutions are mature enough to deliver real business value. Many organizations are now pushing to get their data prepared so that it can be used for AI, ML, or large language models (LLMs), using Fivetran to normalize, standardize and centralize data so they can maximize the value of their data in this context.
“Customers are finding ways to utilize Fivetran in their data pipelines such as moving data into a Snowflake or Databricks staging database or data lake environment to then run models. Fivetran is a strong intermediary due to our breadth of 400-plus sources and our ability to handle even the largest data loads. We are powering the AI/ML models for large GPT-based solutions that customers are using to power strategic decision making and their ongoing business analytics,” said Kelly.
Fivetran also noted a growing demand for targets like Databricks and Google BigQuery, as those targets are often linked with machine learning. Demand is also growing on the SAP source side, indicating that SAP data is starting to be used more in machine learning contexts.
“With many organizations having long-running historical data sets and back office data, they can often get added quality and value by integrating the data from their sources together. Better quality data leads to greater value in their machine learning and AI models. I see greater uptake of those target systems than traditional databases that we saw more in the past,” said Commandeur.
Global shipping and mailing company Pitney Bowes is one Fivetran customer working hard to modernize its technology stack to unlock the value of its data. Pitney Bowes manages roughly 400 million parcels per year, serving hundreds of thousands of businesses, yet for a long time the company did not have real-time data. This lack of information made spending inefficient and scalability nearly impossible.
By moving to a cloud-based platform on Snowflake and Fivetran, Pitney Bowes was able to remove custom batch scripts and ETL processes. Out-of-the-box connectors helped establish pipelines for SAP, as well as several other business-critical applications. Fivetran was able to reduce batch load times to under two hours – a 94% reduction. The company now has near real-time visibility and analytics for its global distribution centers.
Pitney Bowes can now track each of its parcels, allowing the company to predict any changes in mail volumes. It now has the agility to forecast the labor it will need to handle these changes, allowing the entire operation to run more efficiently.
Going forward, Fivetran is aiming to serve as many use cases and deployments as possible, making large investments in low latency database sources and in the cloud. The company is also pushing to expand the number of different systems, configurations, deployments, and use cases where it can be effective in the SAP ecosystem.
Fivetran is an SAP Silver partner. Its solutions can interact with SAP Datasphere, RISE with SAP, and other SAP applications. Many SAP users want an external solution to integrate SAP data with non-SAP data, which Fivetran provides. Fivetran is continuing to invest in and grow its SAP data movement capabilities.
“We want to provide the go-to solution for our customers. For most enterprises, that also means that you need to have a very extensive solution to source from SAP. A lot of SAP customers want to integrate SAP data with low latency and a minimal footprint as it often contains valuable information that is a cornerstone for their data integration environment. Fivetran wants to provide that to them with support for different systems, different configurations, and different deployments,” said Commandeur.
What this Means for SAPinsiders
Develop a data strategy. Companies need a plan for bringing their data sources together and for managing the resulting data pipelines as a whole. They should aim for a solution that brings together SAP and other data sources so the business can manage them in one place. A holistic end-to-end data approach enhances the value of a data movement solution, ensuring that all current and future data needs are met. Automated and scalable platforms like Fivetran can help save time when planning for significant moves or transformations.
Prepare for AI. Organizations are increasingly searching for solutions that prepare their data to be used in AI, ML, and LLMs. Companies should evaluate their data to see if it is cleansed and harmonized enough to be useful for these purposes. SAP source data can be an important addition to these use cases.
Build a business case. As more teams embrace managed service solutions for their data integration, they need to understand and accurately represent the value of the solution. With the automated capabilities of Fivetran, data engineers can garner enterprise-wide buy-in and bolster support for a new solution not just within their own team but across their organization as a whole.