Databricks Data Intelligence Platform Helps Heathrow Improve Customer Satisfaction and Optimize Passenger Flow
Key Takeaways
⇨ Heathrow Airport has selected the Databricks Data Intelligence Platform to unify and govern its data and AI, enabling improved passenger flow forecasting and operational efficiency.
⇨ Databricks provides bi-directional integration with SAP Datasphere, allowing Heathrow to leverage its extensive data (over 26TB) for accurate real-time decision-making and advanced AI capabilities.
⇨ With the implementation of Databricks, Heathrow has significantly enhanced its forecasting model, reducing insight generation time from two weeks to four hours and decreasing forecast error margin from 30% to 10%, ultimately improving service and operational effectiveness.
Databricks, one of the four original launch partners for SAP Datasphere, recently announced that Heathrow, the largest airport in Europe, has chosen the Databricks Data Intelligence Platform. Heathrow will unify and govern its data and AI on Databricks to help predict and manage passenger flow.
Databricks and SAP deliver bi-directional integration between SAP Datasphere, which preserves the complete business context of SAP data, and the Databricks Data Intelligence Platform on any cloud.
As one of the world’s best-connected airports, welcoming over 82 million passengers and handling over 460,000 flights annually, Heathrow relies on over 26TB of data to make critical real-time decisions. This includes data on passenger numbers, flights, airlines, seasonal performance, and traffic and weather patterns, all of which are crucial to keeping daily flights running.
Prior to partnering with Databricks, Heathrow already had impressive forecasting capabilities. However, the airport wanted to do more to realize the full potential of forecasting by refining its approach to data and AI, with tools for user education, data governance and security, and machine learning (ML) model training.
Heathrow chose to centralize its data and AI on the Databricks Data Intelligence Platform on Microsoft Azure. Built on a lakehouse architecture, the Databricks Data Intelligence Platform provides an open, unified foundation for all data and governance and is powered by a Data Intelligence Engine that understands an organization’s unique data.
Databricks is addressing what recent SAPinsider research uncovered as the number one obstacle to AI adoption. In the report AI: State of Adoption 2024, 53% of organizations cited challenges with legacy data and applications as the leading obstacle to adopting AI. High-quality data ensures that AI models make accurate predictions and decisions; if the data is noisy, incomplete, or biased, the model’s performance will suffer, leading to poor outcomes. Also, AI models generally improve with larger datasets. The more data a model has access to, the more patterns it can learn and the better it performs in real-world applications.
With Databricks, Heathrow now has the capability required for advanced AI and ML use cases — starting with its passenger flow forecasting model. In just a year, Heathrow sped up forecast insights from two weeks to four hours while decreasing the margin of error from 30% to 10%, allowing the airport to become more efficient and accelerate productivity.
With its forecasting model in Databricks, the airport can schedule predictive maintenance, cleaning and service interruptions during downtime to reduce costs and better avoid impacts on travelers. As a result, its teams are better prepared for peak travel periods and can proactively manage efficient passenger flow.
“Databricks enables us to provide a more efficient and accurate forecast than we’ve ever been able to before. Now, passengers and stakeholders get a greater level of service in a much more efficient airport,” said Andrew Isenman, Head of Technology, Cloud and Data at Heathrow.
With the Databricks Data Intelligence Platform, Heathrow was able to simplify its data architecture, improve security, and run analytics and AI more efficiently, freeing its data teams to focus on opportunities to innovate on the business and customer experience. Other key success areas include:
- Data governance: The airport is implementing data governance with Databricks Unity Catalog to secure its data and AI assets under a single permission model, thereby furthering safe and consistent data use across the airport.
- Compute power: With improved, cloud-based compute, the airport can now scale quickly, allowing high-level forecasts to be distributed downstream as easy-to-digest, self-serve analytics. This new capability also drastically speeds up time to insight.
- Cross-team collaboration: Shareable code and insights via team-based notebooks foster cross-team collaboration. Furthermore, Heathrow integrated Power BI with Databricks to power dashboards and visualizations with the most up-to-date data for business reporting, improving decision making.
- Certification capabilities: To efficiently scale internal adoption and use, Heathrow’s users are capitalizing on the self-paced training and certification available through Databricks Academy.
- Model tracking and data sharing: Heathrow is also leveraging MLflow to track and promote its ML models from development to production. The team is also looking at Databricks Delta Sharing for data sharing with airlines, ground handlers and other Heathrow companies, and is in the early stages of establishing a center of excellence to support its ongoing data and AI strategy. A minimal sketch of this kind of model-tracking workflow follows this list.
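To make the model-tracking piece concrete, here is a hedged, minimal sketch of how a forecasting model might be logged with MLflow on Databricks. The model type, feature names, and synthetic data are illustrative assumptions for demonstration only, not Heathrow’s actual pipeline.

```python
# Illustrative sketch: tracking a passenger-flow forecasting model with MLflow.
# The features and synthetic data below are assumptions, not Heathrow's real dataset.
import mlflow
import mlflow.sklearn
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for historical passenger-flow features
rng = np.random.default_rng(42)
X = rng.normal(size=(1_000, 4))  # e.g. day-of-week, scheduled flights, lag features
y = 50_000 + 5_000 * X[:, 0] + rng.normal(scale=2_000, size=1_000)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

with mlflow.start_run(run_name="passenger-flow-forecast"):
    params = {"n_estimators": 200, "max_depth": 3, "learning_rate": 0.05}
    model = GradientBoostingRegressor(**params).fit(X_train, y_train)

    # Log parameters and the error metric so runs can be compared over time
    mape = mean_absolute_percentage_error(y_test, model.predict(X_test))
    mlflow.log_params(params)
    mlflow.log_metric("mape", mape)
    mlflow.sklearn.log_model(model, "model")
```

Logging parameters and an error metric on every run is what makes it possible to compare candidate models side by side and promote the strongest one from development to production.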
“Data-driven, accurate forecasting is a non-negotiable in the travel and hospitality industry — with thousands of daily flights totally dependent on this being done right,” said Michael Green, VP, Head of Northern Europe at Databricks. “With the Data Intelligence Platform, Heathrow is able to understand future passenger volumes and plan accordingly, resulting in better service for passengers and a much more efficient operation.”
What this means for SAPinsiders
Share your data management strategies. The focus on enterprise data management is intensifying with the proliferation of 5G, IoT, AI/ML and other transformative technologies. SAP customers are increasingly looking for new data management models for the storage, migration, integration, governance, protection, transformation, and processing of all kinds of data, ranging from transactional to analytical. Balancing the risks, compliance needs, and costs of data management in SAP HANA on-premises and in the cloud, while also providing reliable, secure data to the organization, is increasingly important to the business. We will be releasing the 2025 Data Management Strategies research report in February 2025. Contribute to the research by completing this survey: https://www.research.net/r/DataMgt25.
Build a strong data foundation for AI-enabled forecasting. Collect relevant historical data that is representative of the forecasting target. This data may include sales records, market data, customer data, inventory data, or external factors like weather or economic indicators. Ensure your data is accurate, consistent, and free from errors or outliers. Data cleaning and preprocessing steps include handling missing values, normalizing data, and removing irrelevant features. Create additional features that may improve the forecasting model. These could include time-based features (e.g., day of the week, seasonality), external variables (e.g., marketing spend, holidays), and lag variables (previous time steps), as illustrated in the sketch below.
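As a hedged illustration of these preprocessing and feature-engineering steps, the sketch below uses pandas to derive time-based, lag, and rolling features from a daily passenger time series. The column names, lag windows, and file handling are assumptions chosen for the example, not a prescribed implementation.

```python
# Illustrative feature-engineering sketch for a daily forecasting dataset.
# Column names ("date", "passengers") and lag windows are assumptions.
import pandas as pd

def build_forecast_features(df: pd.DataFrame) -> pd.DataFrame:
    """Expects a daily time series with 'date' and 'passengers' columns."""
    df = df.copy()
    df["date"] = pd.to_datetime(df["date"])
    df = df.sort_values("date")

    # Handle missing values before deriving features
    df["passengers"] = df["passengers"].interpolate(limit_direction="both")

    # Time-based features capturing weekly and seasonal patterns
    df["day_of_week"] = df["date"].dt.dayofweek
    df["month"] = df["date"].dt.month
    df["is_weekend"] = df["day_of_week"].isin([5, 6]).astype(int)

    # Lag variables: values from previous time steps
    for lag in (1, 7, 28):
        df[f"passengers_lag_{lag}"] = df["passengers"].shift(lag)

    # Rolling mean of the prior week smooths out noise and outliers
    df["passengers_roll_7"] = df["passengers"].shift(1).rolling(7).mean()

    # Drop early rows that lack complete lag history
    return df.dropna()
```

Dropping rows without complete lag history keeps the training set consistent; in production, the same function would be applied to the latest data before generating a forecast.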
Carefully evaluate data lakehouse providers. Databricks coined the term “lakehouse.” When selecting a data lakehouse provider, it’s essential to evaluate multiple factors to ensure the solution aligns with your organization’s data management, analytics, and scalability needs. A data lakehouse combines the benefits of a data warehouse (structured data and ACID transactions) with the flexibility of a data lake (unstructured data storage). Factors to consider include data storage and scalability; performance and speed; unified data management; data governance and security; integration with existing tools and ecosystem; support for advanced analytics; data lakehouse architecture; cost structure and flexibility; data sharing and collaboration; data lakehouse management and usability; support, documentation and community; and future-proofing.