Informatica Strengthens Databricks Partnership with Native GenAI Capabilities for Databricks Data Intelligence Platform

Meet the Authors

  • Mark Vigoroso

    CEO, ERP Today & Chief Content Officer, Wellesley Information Services

Key Takeaways

⇨ Informatica and Databricks are enhancing their partnership with deeper integration of Informatica’s Intelligent Data Management Cloud (IDMC) and Databricks Data Intelligence Platform, enabling no-code data pipelines directly on Databricks.

⇨ The new capabilities introduced, such as support for AI Functions in Native SQL ELT and over 250 native SQL functions, empower organizations to adopt GenAI functionalities and execute complex data operations with minimal coding, improving efficiency and performance.

⇨ Informatica's advancements in data governance, quality assurance, and automation through its CLAIRE engine, coupled with Databricks' scalable architecture, provide SAP customers with tools to efficiently manage data while ensuring compliance and leveraging advanced AI capabilities.

Informatica (NYSE: INFA), provider of enterprise AI-powered cloud data management, recently announced the latest advances in its continuing partnership with Databricks, the data and AI company. New areas of collaboration include deeper integration between Informatica’s Intelligent Data Management Cloud (IDMC) platform and the Databricks Data Intelligence Platform, including Informatica support for AI Functions on Databricks in Informatica’s Native SQL ELT that processes no-code data pipelines natively on Databricks. Informatica currently has customers in approximately 100 countries, including more than 80 of the Fortune 100.

“We are seeing phenomenal success with our Databricks-related business, with rapid growth and delivering impactful business outcomes for customers such as Takeda, KPMG and Point72 to name just a few,” said Amit Walia, Chief Executive Officer at Informatica. “One of our key priorities while partnering with Databricks is empowering customers to build enterprise-grade GenAI applications. These applications leverage high-quality, trusted enterprise data to provide high-impact Gen AI applications with rich business context and deep industry semantic understanding while adhering to enterprise data governance policies.”

“As demand for data intelligence increases, we want to help our customers build and deploy high-quality AI applications that deliver accurate, domain-specific outputs,” added Adam Conway, SVP of Products at Databricks. “As a leader in cloud-native, AI-powered data management, Informatica is a key partner of ours, supporting everything from data integration and transformation to data quality, governance and protection.”

On the product front, the recently introduced support for Databricks AI Functions in Informatica’s Native SQL ELT lets organizations on Databricks adopt the GenAI capabilities of the Databricks Data Intelligence Platform through no-code data pipelines. This opens Databricks GenAI capabilities to no-code users while keeping pipelines that run natively on Databricks consistent, maintainable and performant. AI Functions on Databricks let customers apply key GenAI capabilities, including sentiment analysis, similarity matching, summary generation, translation and grammar correction, to their data directly from SQL.
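To make "directly from SQL" concrete, here is a minimal sketch of the kind of query a pipeline step might hand to Databricks SQL. It assumes the Databricks SQL AI Functions `ai_analyze_sentiment`, `ai_summarize`, `ai_translate`, `ai_fix_grammar` and `ai_similarity`; the helper name, the table `support.tickets` and the column `comment_text` are hypothetical, and this is not the SQL that Informatica actually generates.

```python
def build_ai_enrichment_query(
    table: str,
    text_col: str,
    target_lang: str = "en",
    reference: str = "late delivery",
) -> str:
    """Return a Databricks SQL query that applies AI Functions to one text column.

    Hypothetical helper for illustration only. Table and column names are
    interpolated as-is, so pass trusted identifiers.
    """
    # Escape single quotes so the reference phrase is a valid SQL string literal.
    ref_literal = reference.replace("'", "''")
    return (
        "SELECT\n"
        f"  {text_col},\n"
        f"  ai_analyze_sentiment({text_col}) AS sentiment,\n"
        f"  ai_similarity({text_col}, '{ref_literal}') AS similarity_to_reference,\n"
        f"  ai_summarize({text_col}, 30) AS summary,\n"
        f"  ai_translate({text_col}, '{target_lang}') AS translated,\n"
        f"  ai_fix_grammar({text_col}) AS corrected\n"
        f"FROM {table}"
    )


# Assumed example table and column; replace with real identifiers.
query = build_ai_enrichment_query("support.tickets", "comment_text")
print(query)
```

The resulting string could be run in a Databricks SQL editor or submitted through a SQL client; each AI Function is applied per row, so limiting the input rows first is usually sensible.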

In addition, Informatica’s new Native SQL ELT for Databricks makes it possible to “push down” data pipelines with 50+ out-of-the-box transformations and support for 250+ native Databricks SQL functions to run natively and efficiently on Databricks via Databricks SQL. This enables customers to easily build data pipeline flows in Databricks by leveraging Informatica’s powerful no-code/low-code SQL ELT capabilities.
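The "push down" idea can be illustrated with a toy sketch: declarative pipeline steps are compiled into a single SQL statement that the warehouse executes, rather than pulling rows out for processing elsewhere. The `Pipeline` class, the table names and the column names below are hypothetical and are not Informatica's implementation; `upper`, `trim` and `date_trunc` are standard Databricks SQL functions.

```python
from dataclasses import dataclass, field


@dataclass
class Pipeline:
    """A toy no-code pipeline: a source table, row filters, and column expressions."""

    source: str
    filters: list[str] = field(default_factory=list)
    expressions: dict[str, str] = field(default_factory=dict)

    def to_sql(self, target: str) -> str:
        """Compile all steps into one INSERT ... SELECT that runs inside the warehouse."""
        cols = ",\n  ".join(
            f"{expr} AS {name}" for name, expr in self.expressions.items()
        ) or "*"
        sql = f"INSERT INTO {target}\nSELECT\n  {cols}\nFROM {self.source}"
        if self.filters:
            sql += "\nWHERE " + " AND ".join(f"({f})" for f in self.filters)
        return sql


# Hypothetical source and target tables for illustration.
pipeline = Pipeline(
    source="raw.orders",
    filters=["order_status = 'SHIPPED'"],
    expressions={
        "order_id": "order_id",
        "customer_name": "upper(trim(customer_name))",
        "order_day": "date_trunc('DAY', order_ts)",
    },
)
pushdown_sql = pipeline.to_sql("curated.orders")
print(pushdown_sql)
```

Because the whole flow becomes one statement, the data never leaves the platform; the design trade-off is that every transformation must be expressible in the target SQL dialect.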

Late last year, Informatica announced a GenAI Blueprint for Databricks and other enterprise-grade capabilities, including support for Databricks’ Unity Catalog in IDMC. The IDMC platform includes multiple features optimized for Databricks, such as 300-plus data connectors, the ability to create low-code/no-code data pipelines, data ingestion and replication, and GenAI-driven automation via Informatica’s CLAIRE GPT and CLAIRE Co-pilot.

Informatica was recognized by Databricks as its 2024 Data Integration Partner of the Year at the Data + AI Summit. Informatica and Databricks have worked closely over the past several years to deliver transformative data management and analytics solutions to many enterprise customers. Together, they are driving new ways to develop and manage GenAI and analytics workloads across their platforms.

What this means for SAPinsiders

Share your data management strategies. The focus on enterprise data management is intensifying with the proliferation of 5G, IoT, AI/ML and other transformative technologies. SAP customers are increasingly looking for new data management models for the storage, migration, integration, governance, protection, transformation, and processing of all kinds of data ranging from transactional to analytical. Balancing the risks, compliance needs, and costs of data management in SAP HANA on-premises and in the cloud while also providing reliable, secure data to the organization is increasingly important to the business. We will be releasing the 2025 Data Management Strategies research report in February 2025. Contribute to the research by completing this survey: https://www.research.net/r/DataMgt25.

SAP customers stand to gain from integration between Informatica’s IDMC and the Databricks Data Intelligence Platform. SAP customers can design and deploy data pipelines without extensive coding, simplifying the process of integrating SAP data with Databricks. This approach reduces development time and minimizes errors. Informatica’s Native SQL ELT allows data transformations to execute directly within the Databricks environment, enhancing performance and scalability. This native processing ensures efficient handling of the large datasets typical in SAP systems. The integration enables the application of AI functions, such as sentiment analysis, similarity matching, summary generation, translation, and grammar correction, directly within SQL queries. This lets SAP customers enrich their data analytics with advanced AI capabilities without leaving the Databricks platform. Informatica’s CLAIRE GPT and CLAIRE Co-pilot provide generative AI-driven automation, assisting in data management tasks and offering intelligent recommendations, thereby improving decision-making processes. The combined solution offers robust data governance, quality, and lineage tracking features, ensuring that SAP data remains accurate, consistent, and compliant throughout its lifecycle. Integration with Databricks Unity Catalog also allows for centralized metadata management, providing SAP customers with a unified view of their data assets and facilitating easier data discovery and compliance reporting.

Follow best practices for integrating SAP with Informatica and Databricks. Prioritize high-value SAP data sources, focusing on SAP ERP (SAP S/4HANA), SAP BW, and SAP Data Services to extract valuable data, and determine whether to build real-time (streaming) or batch-based data pipelines. Leverage SAP Operational Data Provisioning (ODP) for incremental (delta) extraction to minimize data movement, and utilize Informatica Cloud Mass Ingestion (CMI) for automated SAP data replication. Use Informatica’s SAP-certified connectors to extract structured and unstructured SAP data, and integrate with Databricks Delta Lake to create a unified, scalable data repository. Store SAP transactional and operational data in Databricks Delta Lake for scalability and fast querying, and apply data lakehouse architecture for structured and semi-structured SAP data. Use Databricks MLflow for AI-driven predictive analytics on SAP financial or supply chain data, and integrate Informatica’s AI-powered CLAIRE engine for automated metadata management and data quality assurance. Use Informatica Data Governance and Databricks Unity Catalog for centralized metadata and data governance, and ensure GDPR, CCPA, and industry-specific compliance for SAP data integration. Implement Informatica’s Native SQL ELT to process SAP data natively within Databricks, reducing latency, and automate SAP data workflows using Informatica’s no-code orchestration. Utilize Databricks Auto-Optimize and Delta Caching to enhance SAP data performance, and set up alerts and monitoring for SAP data pipeline failures or performance bottlenecks. Ensure IDMC and Databricks are optimized for hybrid SAP landscapes (on-prem, AWS, Azure, GCP), and enable real-time SAP data synchronization for mission-critical applications.
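As one illustration of the incremental (delta) pattern described above, changed SAP records landed in a staging table can be upserted into a Delta Lake table with a `MERGE INTO` statement. The sketch below only builds that statement. The helper name and the table names `sap.vbak` and `staging.vbak_delta` are hypothetical; the key and value columns follow the SAP sales document header table (VBAK) purely as an example, and nothing here reflects how Informatica's connectors work internally.

```python
def build_delta_merge(
    target: str,
    staging: str,
    key_cols: list[str],
    value_cols: list[str],
) -> str:
    """Build a Delta Lake MERGE that upserts staged change records into a target table.

    Hypothetical helper for illustration; identifiers are interpolated as-is.
    """
    if not key_cols:
        raise ValueError("at least one key column is required")
    on_clause = " AND ".join(f"t.{c} = s.{c}" for c in key_cols)
    set_clause = ", ".join(f"t.{c} = s.{c}" for c in value_cols)
    all_cols = key_cols + value_cols
    insert_cols = ", ".join(all_cols)
    insert_vals = ", ".join(f"s.{c}" for c in all_cols)
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {staging} AS s\n"
        f"ON {on_clause}\n"
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )


# Assumed example: sales document headers keyed by client and document number.
merge_sql = build_delta_merge(
    target="sap.vbak",
    staging="staging.vbak_delta",
    key_cols=["MANDT", "VBELN"],
    value_cols=["ERDAT", "NETWR"],
)
print(merge_sql)
```

Only the rows present in the staging table are touched, which is what keeps data movement small compared with reloading the full table on every run.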
