Logistics Cyber Resiliency Lessons from NOTAM Disruption
⇨ As supply chain and logistics processes get more and more automated, the importance of cyber resiliency increases.
⇨ Cyber resiliency needs to be embedded in the design of end-to-end supply chain and logistics system architectures.
⇨ The recent brief disruption of NOTAM provides a few lessons that can be leveraged to design cyber resilient logistics systems.
If you are a Cold War-era movie buff, you may have watched the movie “Fail Safe.” The movie’s crux is that a failure in a system designed never to fail results in Moscow and New York getting nuked. The system was so robustly designed to prevent the former Soviet Union from interfering with it that when it sends an erroneous code ordering US bombers in the air to nuke Moscow, even the Pentagon cannot stop these bombers. Fortunately, the horror of the Cold War era is behind us, and the movie was fiction, but it did teach a lesson in system design: when designing system resiliency, every failure mode needs to be accounted for. As supply chains and logistics processes race to get more and more automated, designing true “Fail Safe” architectures becomes increasingly essential.
It is beyond doubt that logistics, specifically transportation, acts as the backbone of supply chains in many industries. And often, in the corporate world, we forget that some of the largest logistics and transportation networks are run by federal agencies. And in many cases, there are best practices that these public organizations have developed that can be leveraged by the corporate world. Similarly, lessons learned from any hiccups in that world can also be leveraged into the corporate world to build better supply chain processes and systems. The recent, very brief but disruptive failure of NOTAM is one such example.
NOTAM is a critical system for all domestic flight operations, and pilots must examine the data it generates before take-off. The system had a brief failure earlier this week, leaving thousands of flights stranded for hours across the U.S. In this article, we will review the critical cyber resiliency lessons from this failure that we can use to design more resilient logistics systems in the corporate world.
What is Cyber Resiliency?
While many different definitions of the term cyber resiliency exist, this definition from NIST captures all aspects eloquently:
“The ability to anticipate, withstand, recover from, and adapt to adverse conditions, stresses, attacks, or compromises on systems that use or are enabled by cyber resources. Cyber resiliency is intended to enable mission or business objectives that depend on cyber resources to be achieved in a contested cyber environment.”
What is NOTAM?
NOTAM is an abbreviation for Notice To Air Missions. It is a computer system owned by the Federal Aviation Administration (FAA) that captures and summarizes critical preflight data that pilots need before take-off, such as chances of bad weather on the flight path, runway and taxiway scheduling challenges, restricted airspace that needs to be avoided, etc. The initial version of the system was commissioned in 1947, modeled after a system used for maritime navigation. The official FAA definition is:
“A NOTAM is a notice containing information essential to personnel concerned with flight operations but not known far enough in advance to be publicized by other means. It states the abnormal status of a component of the National Airspace System (NAS) – not the normal status.
- NOTAMs indicate the real-time and abnormal status of the NAS impacting every user.
- NOTAMs concern the establishment, condition, or change of any facility, service, procedure, or hazard in the NAS.
- NOTAMs have a unique language using special contractions to make communication more efficient.”
The definition itself should be enough to show how critical the data generated by this system is. So when the system went down, the safest path was to ground all departing flights, and that is what the FAA did.
Cyber Resiliency Issues
But what exactly went wrong with the system? As per Fortune magazine, the system failed due to a bad data file generated by employees working for a contractor. The bad data file led to errors, and the system needed to be rebooted. This happens all the time in the IT world. The fact that this is the first time we have heard about this system failing speaks to the system’s robustness. However, in operations, every incident, big or small, imparts specific lessons. Leveraging this incident to extract these lessons in no way indicates that the system is not robust. As per some news articles, the current system is almost three decades old. But we need to keep in mind that when it was designed and introduced, it was one of a kind in the world, and it still works robustly. However, cyber resiliency is not just about the technology but also about the people and processes around the technology. Some cyber resiliency issues in this incident were:
- People: A system that is associated with national security was being accessed by contract workers.
- Process: System redundancy was not designed properly. If the safeguard was that an erroneous file would stop the system from generating the data (which makes sense), a parallel redundancy needed to be in place for such a critical system.
- Technology: For such a critical system, safeguards were not in place to ensure that bad data files would not get into the live system. Perhaps they were in place and were overridden by the employees (which then makes it a “people” issue).
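The process and technology points above can be illustrated with a small sketch. This is not how NOTAM works; it is a minimal Python example, under assumed names, of one common safeguard: validate a candidate data file in staging and promote it to the live system only if it passes, so that a bad file leaves the last known-good data in service. The `NoticeStore` class, the JSON format, the required fields, and the sample notice text are all hypothetical.

```python
import json
from dataclasses import dataclass, field


@dataclass
class NoticeStore:
    """Hypothetical in-memory stand-in for a live notice database."""
    live: list = field(default_factory=list)

    def promote(self, candidate_text: str) -> bool:
        """Validate a candidate file; replace live data only if it passes."""
        try:
            records = json.loads(candidate_text)
        except json.JSONDecodeError:
            return False  # corrupt file: keep serving last known-good data
        if not isinstance(records, list) or not records:
            return False  # wrong shape or empty: reject
        required = {"id", "location", "text"}  # assumed schema
        for record in records:
            if not isinstance(record, dict):
                return False
            if not required.issubset(record.keys()):
                return False  # a record is missing a required field
        self.live = records  # swap in the new data only after all checks
        return True


store = NoticeStore()
good_file = '[{"id": "N1", "location": "JFK", "text": "sample notice"}]'
bad_file = '[{"id": "N2", "location": "ORD"'  # truncated, invalid JSON

print(store.promote(good_file))  # True: file passes and goes live
print(store.promote(bad_file))   # False: file rejected
print(store.live[0]["id"])       # N1: last known-good data still served
```

The design choice here is that a rejected file never touches the live data, so the failure of one input does not become the failure of the whole system. A real deployment would pair this with a separate standby instance, which this sketch does not model.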
Lessons Learned for Logistics Systems Cyber Resiliency
While within corporate environments we may not run systems as mission-critical as NOTAM, some systems, like those in the warehousing and manufacturing world that pertain to operations, are critical in their world. And with the end goal of running almost autonomous supply chains, cyber resiliency becomes much more critical. Some lessons to consider are:
- Data, whether historical or captured in real-time, forms the foundation of most logistics systems. Poor data quality can lead not only to issues like unreliable output; in severe cases it can also lead to system failure, as in this case. A critical element of architecture hence needs to be evaluating, enhancing, and harmonizing the data, and also raising flags on data that may lead to catastrophic failures like the NOTAM example. Pre-processing steps need to be in place to ensure that critical systems are being fed quality data seamlessly. Most systems absorb data in a specific format. While this creates rigidity, it also means that you can relatively easily leverage machine learning algorithms to identify anomalies even before the data gets into the system.
- System redundancy is critical as well. This has long been a common practice in the IT world. Indian IT companies have typically built mirror sites for their clients. As an example, the Bangalore delivery hub of a company may have a mirror hub in Chennai, or vice versa. While this is an example of a cyber-physical setup (like a data center), the same can be set up for the cyber resiliency of logistics systems.
- This is a reminder that no matter how robust your processes and technology are, people make everything run seamlessly and in sync. So no matter how “autonomous” your systems are, eventually, it will be your people who will ensure that you extract the total value from your technology investments. This failure, as reported, was triggered by human error and propagated due to a lack of safeguards against such errors. Training and recruitment should be your most critical focus areas.
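The pre-processing idea in the first lesson can be sketched in a few lines of Python. The example below is illustrative only and rests on assumptions: the `shipment_id` and `weight_kg` fields are hypothetical, and a simple robust statistic (the modified z-score based on the median absolute deviation) stands in for a trained machine learning model. It screens a batch in two steps, a format check and then an outlier check, and quarantines anything suspicious before it reaches the live system.

```python
from statistics import median


def flag_anomalies(values, threshold=3.5):
    """Return indexes of values whose modified z-score exceeds threshold.

    Uses the median absolute deviation (MAD) as a simple stand-in
    for a trained anomaly-detection model.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread at all: anything different from the median is unusual.
        return [i for i, v in enumerate(values) if v != med]
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]


def screen_batch(rows):
    """Split rows into (accepted, quarantined) before ingestion."""
    well_formed, quarantined = [], []
    for row in rows:
        weight = row.get("weight_kg")
        is_number = (isinstance(weight, (int, float))
                     and not isinstance(weight, bool))
        if is_number and weight > 0 and row.get("shipment_id"):
            well_formed.append(row)
        else:
            quarantined.append(row)  # fails the format check
    flagged = set()
    if well_formed:
        flagged = set(flag_anomalies([r["weight_kg"] for r in well_formed]))
    accepted = [r for i, r in enumerate(well_formed) if i not in flagged]
    quarantined += [r for i, r in enumerate(well_formed) if i in flagged]
    return accepted, quarantined


# Made-up sample batch: one wrong-format row and one implausible weight.
rows = [
    {"shipment_id": "S1", "weight_kg": 120.0},
    {"shipment_id": "S2", "weight_kg": 118.5},
    {"shipment_id": "S3", "weight_kg": 121.2},
    {"shipment_id": "S4", "weight_kg": 119.0},
    {"shipment_id": "S5", "weight_kg": 12000.0},
    {"shipment_id": "S6", "weight_kg": "n/a"},
]
accepted, quarantined = screen_batch(rows)
print([r["shipment_id"] for r in accepted])     # ['S1', 'S2', 'S3', 'S4']
print([r["shipment_id"] for r in quarantined])  # ['S6', 'S5']
```

Quarantined rows are held for review rather than silently dropped, which keeps people in the loop while keeping bad data out of the system.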
NIST glossary of computer security resources: https://csrc.nist.gov/glossary/term/cyber_resiliency
FAA definition, What is a NOTAM?: https://www.faa.gov/about/initiatives/notam/what_is_a_notam
Fortune article on NOTAM failure: https://fortune.com/2023/01/13/faa-computer-failure-grounded-thousands-flights-caused-2-contractors-introduced-data-errors-notam-system/