Generate the most reliable and accurate sales data history for your statistical forecasting as part of your SAP Advanced Planning & Optimization Demand Planning project. See how to implement an efficient and effective history cleansing process.
Key Concept
The baseline history of a product is its normal historical demand without promotion, external stimulation, or any other abnormal situation. An outlier is a too-high or too-low sales figure in a product’s history that may occur under special conditions.
History cleansing is the process of cleansing the history of sales data, which is a prerequisite when a company is using the Statistical Forecasting functionality of SAP Advanced Planning & Optimization (SAP APO) Demand Planning (DP). The better the history is, the better the forecast results.
The purpose of history cleansing is to produce a baseline history (i.e., the sales data for a product during its normal lifespan when there are no promotions, shortages, or any other unexpected or abrupt market conditions or changes). Statistical forecasting tools attempt to identify trends that might repeat in the future. Because events such as shortages or product promotions cannot always be predicted, that related data needs to be eliminated from the history used in forecasting.
Though promotions are often regular occurrences for consumer goods companies, these events might not always happen in the same time frames (i.e., in the same weeks or month in the future) as they did in the past, or the durations for such promotions might also be different. Once the sales history is cleansed, forecasting uses a baseline history, called baseline forecast. After baseline forecast numbers have been obtained, you can add data from future promotions and other events to the baseline forecast to get final forecast numbers.
History cleansing is one of the most important activities for a DP project, but it can also be the most time-consuming activity. For organizations handling thousands of stock-keeping units (SKUs), cleansing history for periods of two to three years can consume between 25 percent to 30 percent of the overall time of a DP project. Beyond a one-time history cleansing during the implementation of SAP APO DP software, cleansing needs to an ongoing process.
Having led many DP projects that involve history cleansing for statistical forecasting, I have formulated 10 best practices that will help organizations to achieve the objective of having clean sales history data in a short span of time for effective DP.
Best Practice 1: Define the Scope of the History Cleansing Project
When an organization takes on a history cleansing project, the common practice is to try to cleanse the history of every possible product that the organization had sold in the past. However, if the purpose of history cleansing is to get the best statistical forecast data, then defining the scope of the project is critical. Ideally, history should be cleansed only for those products that can be statistically forecasted and not necessarily for all products.
There are certain products for which statistical forecasting is difficult, so organizations often adopt the practice of forecasting demand for these manually. For example:
New products for which very little history is available.
Products that the organization wants to discontinue in the next few months. If an organization plans to withdraw a product, its sales history will not be useful moving forward.
Promotional products that are sold for a very short period during the year, such as Christmas, Independence Day, or Thanksgiving products.
Seasonal products for which very little history is available. Ideally, a statistical forecasting tool needs 24 to 36 months of sales history to produce reliable statistical data.
Products for which there is a capacity constraint (i.e., the organization cannot produce these products per the market demand because of its limited production capacity). In this case, whatever the organization produces is sold and past sales cannot reflect the actual market demand. If the production capacity increases, sales will increase, although the statistical forecast might not show this increase because it is based on past data.
For these cases, history cleansing may not be very important from a statistical forecasting perspective, as these products will be out of scope for a statistical forecasting project.
Best Practice 2: Start With an 80/20 Analysis
Start your history cleansing project with a simple 80/20 analysis (i.e., identify the top 20 percent of your products that cumulatively contribute to 80 percent net of your total sales volume). You can use your last year’s or last three years’ sales history for this. It is always better to start with most recent history (i.e., the last full year) because there might be products in the previous three years’ history that have been discontinued. The advantage of starting your history cleansing project with the top 20 percent products is that these products will show the maximum improvement in forecast accuracy through statistical forecasting. Therefore, their history should be the cleanest.
Improvements of just a few percentage points in forecast accuracy for these products can help in achieving the overall project target. Once history for these products is cleansed thoroughly, the cleansing effort can be extended to other products. In some companies the 80/20 analysis is also known as ABC analysis where the top 20 percent of products that contribute to 80 percent of sales are defined as A-class products. The A class is most important from a cleansing perspective, while the C class needs the least attention.
Best Practice 3: Group Products for History Cleansing
This third best practice complements the first two with another approach that I call grouping of products, which can further focus history cleansing efforts on selected products instead of all of them.
Grouping of products means classifying all product lines of a company into selected categories for effective history cleansing. Each category is treated differently from a history cleansing perspective. Refer to the sidebar “Grouping Products for More Effective History Cleansing” for a real-life case study of how a large global food company adopted this practice for cleansing the sales history of its 2,000-plus products.
Grouping Projects for More Effective History Cleansing
I helped to implement a history cleansing process at a company that classified all its products into four different categories: Base, Promotional, Mix, and Erratic. Base products were those that did not see much trade and marketing activities during the year. Promotional product demand was heavily driven by sales and marketing activities. Mix products had a steady base, but demand was also driven by marketing activities. Finally, Erratic products showed a very irregular demand pattern with a demand of zero for certain weeks or months, and at other times the demand was very high.
History cleansing was not necessary for Promotional or Erratic products because they were not suitable candidates for forecasting. History cleansing for Base products was not critical because there was not much promotion for these products or much effect of increased sales in history during the promotion period. However, some history cleansing was needed for these products for cases when sales were too low during out-of-stock periods and history cleansing data needed to be adjusted, or when a new product replaced an old one and sales data history needed to be realigned.
Base products also suffered from the cannibalization effect because promotional products launched in the same category could negatively affect the demand for Base products. In such cases, the history of Base products needed to be cleansed, separating them from this negative effect. Mix products needed the maximum amount of cleansing because they experienced the most irregularities: regular promotions, cannibalization, realignment, or out-of-stock. The company gave maximum focus in history cleansing for this category.
Percentage distribution along these different product groups can vary depending on the nature of the industry. For example, for a commodity industry that deals with metals and minerals, there can be many products from a Base group for which there is not much promotion. Therefore, history cleansing is only needed for cases of production shortage or non-delivery due to, for example, a transport strike. A consumer goods or food company perhaps will see most products in a Mix category and a few in promotional categories that are launched during festive seasons or special events. In the sidebar example, the food company found that most of its products fell into the Mix category, except for a few ready-to-eat products that were classified as Promotional.
Best Practice 4: Identify What Is Part of the Baseline History and Which History to Take
If the purpose of history cleansing is to get a baseline history within a short span of time, then it is important to identify what should be part of the baseline history and what should not. Anything that is not part of the baseline should be cleansed from the history.
Typical examples of what should be considered part of baseline history include: permanent price promotions, permanent consumer communications (e.g., advertisements and TV spots), month- or quarter-end picks (that occur regularly), and known seasonal behavior of products (e.g., high sales of cold drinks or ice cream in summer). On the other hand, typical examples of what should not be part of baseline are: trade and marketing activities limited to a specific time period, short-term price changes, and special promotion for new products. For these cases, the related sales data history should be cleansed.
You need to clearly identify what should be included in the baseline history at the beginning of the project, so project team members do not have conflicting ideas or opinions about this. In my experience, this is an area that creates confusion among team members if it is not defined early in the project.
One challenge is deciding which history to use in the statistical forecasting: the history of invoiced quantities or shipped quantities. Ideally, there should not be much difference between the two because whatever is invoiced should be shipped instantly or after a few days. However, the reality is that they can be different, and organizations regularly monitor huge numbers of reports of stock that is invoiced but not yet shipped. This is a common occurrence during month- or quarter-ends when stock that had not been shipped is invoiced out to achieve the sales target and incentive of sales team. In these cases, the history of shipped quantities is a better history choice then invoiced quantity.
Best Practice 5: Decide On the Time Bucket and Product Hierarchy for History Cleansing
There is no strict guideline on time buckets for history cleansing (i.e., whether to do the cleansing at a weekly or monthly level). The decision is mostly influenced by the time bucket for forecasting. If the forecasting happens at a weekly level, the history also needs to be cleansed at a weekly level. However, if the forecasting is on a monthly level, then history can be cleansed either at a monthly level or at a weekly level. Therefore, the ground rule is the history cleansing time bucket needs to be the same or one level lower than the forecasting time bucket.
If a business needs to perform cleansing at a more granular level (e.g., a weekly level) or at more of an aggregated level (e.g., monthly), the decision needs be made after a careful evaluation of advantages and disadvantages of both approaches. The disadvantage of cleansing at a more granular level is that there are more data points, more effort in cleansing, and more abrupt week-to-week variations. The advantage of this approach is that if there is a promotion during a particular week within a month, the effect can be quickly captured in that week, and the supply plan can be adjusted accordingly.
In the case of an aggregated cleansing approach, the volume of a product spreads over the longer time period (e.g., an entire month). If the promotion occurs in the third week of the month, the supply can be increased during that week. However, in a monthly planning scenario, this weekly variation cannot be predicted, so the overall quantity for the month is high because of the promotion. The supply will be increased at the beginning of the month itself, rendering it idle for two weeks before being consumed during the third week. A weekly planning scenario can better predict week-to-week variations. There are also businesses that run on weekly cycles (e.g., planning for the next 13 business weeks) where monthly cleansing may not be effective. Monthly cleansing means fewer data points to clean, less effort, and less variation due to aggregation of data.
It is also important to decide on the product hierarchy for history cleansing (i.e., whether the cleansing needs to be done at the individual product level or at the product group level). Again, the decision is influenced by the level at which the organization is doing forecasting. The ground rule is that cleansing needs to happen at least at the same level or one level below the level at which forecasting is done. It is not possible to forecast at an individual product level while cleansing is happening at the product group level. Most companies generally perform the cleansing at a lower level. For example, if cleansing is done at the product level, then forecasting is done at the product group level for better forecast accuracy.
Best Practice 6: Determine How Much Data to Clean
“How much history to clean” is a common debate at the beginning of any history cleansing project. Although many companies choose a three-year history cleansing approach for products, it’s important to remember that the end purpose of this exercise is to generate the best statistical forecasts.
Determining how much history to cleanse is predominantly decided by which algorithms the company will use for its statistical forecasting. In an SAP APO DP system, different statistical forecasting algorithms need history for different durations. For example, a seasonal algorithm might need history for a longer duration (e.g., two to three years), whereas a trend algorithm with a high alpha value will calculate forecast values that are 95 percent dependent on history of the last eight time periods. Most statistical forecasting models give more weight to the most recent history, so it’s always important to focus on the most recent history and work back from there.
Remember this ground rule: Do not follow a one-size-fits-all approach. Don’t assume history cleansing for three years for all products is needed, when for some products just one year of cleansed history is enough to get a good forecast.
Best Practice 7: Prepare Data
With cleansing strategies in place, you then need to prepare the data before cleansing can begin. Typically, data preparation for history cleansing involves activities such as:
Extracting historical data from the SAP ERP Central Component (SAP ECC) or SAP NetWeaver BW system: At this stage, data is extracted from a source system into the cleansing tool. The cleansing tool varies across organizations. It can be a simple Microsoft Excel spreadsheet or Microsoft Access database, or it can be a company-specific application specially developed for data cleansing.
Realigning history: Mapping old products to new products, or mapping existing customers to new customers, needs to be done at this stage as well. The output of this mapping exercise is a realignment table with columns for earlier products, new products, and changeover dates. Realignment can be done in SAP APO or Excel. Wherever realignment is done, it’s important to check the data thoroughly post-realignment. Note that if data has been extracted to Access, the realigned data can be imported into Excel and then loaded in SAP APO cubes.
Building graphs of historical data for each product: Visualization of data can be very important for effective data cleansing. Once the data is extracted and realigned, this data can be used to create graphs that demonstrate a product’s sales pattern over years, trends, seasonality, and abnormal peaks and drops (also known as outliers). You can draw graphs in Excel or SAP APO. Graphs are very useful tool for analysis by sales, marketing, or demand planners, who actually sell or plan the products.
Best Practice 8: Use a Mix of Automated and Manual Tasks for Data Cleansing
Actual history cleansing is usually done by demand planners, sales teams, or marketing teams. Typically, the following tasks are involved:
Identifying the outliers in the history (i.e., the periods in history where sales are abnormally high or low). These need to be corrected. For example, an outlier can be when the history shows the last two years of a product’s average sales was 10,000 per month, but suddenly in one month, the sale dropped to zero because there was a strike in the factory. That one month’s zero figure can have an overall impact on the entire history and can bring down average monthly sales drastically. This is clearly an outlier for an extraordinary event. Therefore the history needs to be adjusted by correcting the history for this month to 10,000, which would have been the most probable sale during this month if the strike had not happened.
Identify periods in which past promotions happened, and determine the effect the promotion had on sales; then clearing the promotional effect from the history.
Identify cases of product shortages in the past, and neutralize this effect
Although some of these tasks can be automated by developing some mathematical logic and building a macro to cleanse the data, most of this cleansing exercise will still be manual.
Mathematical Automation
Although logics for mathematical automation or macros are often company-specific, there are a few areas where you will find automation works well:
Outlier correction: SAP APO provides the capability of automatic outlier correction (i.e., correction of past sales values that are too high or too low). Some companies also use their own logic for such outlier correction. For example, whenever any data points cross two or three standard deviation values more or less than the mean, the data point is automatically corrected and bought back to the mean. As a one-time exercise before an SAP APO DP project go-live, an outlier correction can be done in Excel as well by using Excel macros, but post go-live, this needs to be done on an ongoing basis in SAP APO only.
Promotional effect correction: Macros can be developed to calculate the approximate effect of promotions by averaging the weekly or monthly sales of the promotional period to that of the mean average sales for the entire year. Once the exact promotional periods of the past are known, such macros can be applied to take out the effect of such promotions and correct the data. However, the same logic often may not work for different types of promotions, and manual cleansing by planners may still be required post-auto cleansing.
In every data cleansing project, companies build Excel or Access macros for automated cleansing of historical data. For less important or C-category products of very low volume, history cleansed by these tools can be taken as it is. However, for A-category products, history cleansed by such automation tools needs to be validated by business users and might need a second level of manual cleansing effort.
Mathematical automation also comes in handy for ongoing data cleansing when the sales history of the last week or month moves from SAP ECC to SAP NetWeaver BW to SAP APO. This data is automatically cleansed by background jobs before it is available in an SAP APO planning book. Planners can then manually adjust the data if required.
Manual Cleansing
For manual cleansing, sales teams or planners handling the sales or planning of particular products go through the historical sales data and correct it manually. The effectiveness of the process depends on the person’s knowledge of the product’s behavior. There are also a number of accelerators can help perform this task better, such as:
Historical promotion calendars that provide information about the products that were promoted and when
Past promotion reports that provide data that quantifies the effect of particular past promotions
Past stock-out reports that provide information about deliveries that could not be made because of lack of product
Past minutes of monthly forecast review meetings that can provide critical details about promotional effects not readily available in the calendar or reports
In most cases, getting information on these accelerators becomes a challenge. So manual cleansing mostly depends on the business experience of the person who is doing the cleansing. The same data set might be cleansed by two different people in two different ways. Ideally, the best option is to give it to someone who had handled planning or sales of the product for the entire duration for which data is extracted. This person should know the product for the last three years if history cleansing is planned for last three year’s data.
Because manual cleansing depends on the expertise and judgment of the person, it’s important to develop clear guidelines and use cleansing methodologies that help to manage the bias or subjectivity of the cleaner. Although over the years companies have developed history cleansing guidelines, it’s difficult to apply history cleansing guidelines industry wide. Every organization treats challenges, such as how to handle month- or quarter-end picks, the cannibalization effect, or product shortages, in different ways and builds its own rules. However, the important point is that the rules be documented, and that the person who will do the actual cleansing be trained on them before starting the cleansing exercise.
Best Practice 9: Cleansing for Promotion
The history of promotions needs to be cleansed. If past sales promotions are not cleansed and this history is loaded for statistical forecasting, chances are high that the system will over-forecast and add promotions during the periods when they occurred in the past but are not sanctioned for the future.
However, how to cleanse promotion history is perhaps one of the most common challenges every organization faces during a history cleansing project because there can be a number of issues that need to be addressed beforehand:
Promotion calendar: If past promotion calendars are not available, then the time frames and duration of the promotions might not be known.
Promotion effect: Even if occurrences of past promotions are known, the results of the promotion on sales need to be evaluated.
Cannibalization effect: Estimating the decrease in sales of normal SKUs that were not part of a promotion needs to be correlated to the increased sales of the promotion SKUs during the same period to determine the drop in sales of the normal SKUs.
Approximating history for periods having only promotion SKUs active: You need to be able to calculate base sales figures during the periods when only promotion SKUs were sold. If the sale of promotion SKUs are taken out of the history, then the history becomes zero during those periods, which is not a true picture because Base products most likely continued to sell during these periods.
Normalization of post promotion sales dips: A business reality is that some customers will store up excess stock during promotional periods and then order less just after the promotion. If the promotion had not occurred, their ordering would have been normal, so these post promotion dips need to be normalized.
Every organization finds its own way to sort out these issues. If the promotion calendar is not maintained in the past for first time cleansing during go-live, there is not much option beyond using the judgment of the planner or sales person who was handling the product. However, going forward, the ongoing cleansing maintenance of promotion calendars is a must. This can be maintained in the system or as a separate Excel file with a few simple columns, such as promotion name, type of promotion, planned start and end dates, planned percentages, or absolute sales increases expected by the marketing team during the promotion period.
For the estimation of the promotion effect and cannibalization effect, analysis of sales during normal and promotion periods can be helpful. You can use some macro logics for such cleansing followed by a manual validation. Post-promotion sales dips can also be approximated using similar formulas.
Best Practice 10: Maintain History Cleansing Learning Logs and Reports
It is important to maintain a history cleansing learning log during an entire history cleansing exercise. You also need to have a method or tool that you can use to measure the data that has been cleansed.
First, a good history cleansing learning log should include information, such as:
Which product’s history has been cleansed
How much the percentage history is adjusted (increased/decreased) to accommodate the past effect of promotion
Why the history needs to be adjusted
What the reasons for the cleansing are
Date of cleansing
Cleansing decisions
Comments or observations
This log should be maintained religiously from the start of a DP project to actual go-live, which may take anywhere between nine to 12 months, to post go-live. This log becomes a readily available reference for the future, especially helpful when any clarification is required from someone or a department outside the sales and marketing teams, providing explanations for why sales data was adjusted in a particular way. These logs are also knowledge repositories for someone new joining the team who might be responsible for carrying on this cleansing exercise in the future. Without them, you have to rely on the experience and judgment of your sales and marketing teams and demand planners for effective history cleansing.
Additionally, because history cleansing is such a critical part of a DP project, the effectiveness of this activity needs to be measured. Although most companies use KPIs, such as forecast accuracy, to measure the effectiveness of the overall DP process, there is generally no common measure across organizations to evaluate the quality of historical data on which a forecast is based. Some companies monitor a few regular reports to track the effectiveness of history cleansing such as:
Report on obsolete products, locations, and characteristic value combinations (CVCs)
Report on cases of product/location realignment
Comparison report on history vs. cleansed history
Rajesh Ray
Rajesh Ray currently leads the SAP SCM product area at IBM Global Business Services. He has worked with SAP SE and SAP India prior to joining IBM. He is the author of two books on ERP and retail supply chain published by McGraw-Hill, and has contributed more than 52 articles in 16 international journals. Rajesh is a frequent speaker at different SCM forums and is an honorary member of the CII Logistics Council, APICS India chapter and the SCOR Society.
You may contact the author at rajesray@in.ibm.com.
If you have comments about this article or publication, or would like to submit an article idea, please contact the editor.
10 History Cleansing Best Practices for Reliable Statistical Forecasting
Reading time: 19 mins
Generate the most reliable and accurate sales data history for your statistical forecasting as part of your SAP Advanced Planning & Optimization Demand Planning project. See how to implement an efficient and effective history cleansing process. Key Concept The baseline history of a product is its normal historical demand without promotion, external stimulation, or any other abnormal…
Access exclusive SAP insights, expert marketing strategies, and high-value services including research reports, webinars, and buyers' guides, all designed to boost your campaign ROI by up to 50% within the SAP ecosystem.
Always have access to the latest insights with articles, Q&As, whitepapers, webinars, and podcasts. Gain
the
inside edge. The SAPinsider Weekly helps you stay SAP savvy. Access exclusive bonus materials, discounts,
and
more.
This website uses cookies. If you continue to use the site you consent to our use of cookies in accordance with our Cookie Policy.ACCEPTRead More
Privacy & Cookies Policy
Privacy Overview
This website uses cookies to improve your experience while you navigate through the website. Out of these cookies, the cookies that are categorized as necessary are stored on your browser as they are essential for the working of basic functionalities of the website. We also use third-party cookies that help us analyze and understand how you use this website. These cookies will be stored in your browser only with your consent. You also have the option to opt-out of these cookies. But opting out of some of these cookies may have an effect on your browsing experience.
Necessary cookies are absolutely essential for the website to function properly. This category only includes cookies that ensures basic functionalities and security features of the website. These cookies do not store any personal information.
Any cookies that may not be particularly necessary for the website to function and is used specifically to collect user personal data via analytics, ads, other embedded contents are termed as non-necessary cookies. It is mandatory to procure user consent prior to running these cookies on your website.