Profiling key figures and characteristics in ODS objects, PSA data, and BW queries can be difficult, time consuming, and inefficient. The updated Analysis Process Designer tool in BW 3.5 vastly improves your ability to profile data, giving you more options to understand your data. Here’s how you do it.
Key Concept
SAP BW Analysis Process Designer (APD) is a tool that helps move BW beyond reporting and analysis and into the realm of knowledge discovery, which along with data mining allows companies to glean a better understanding of the information stored in their massive data sets.
Hidden away in BW are a number of tools and functionalities that allow you to do much more than simply generate reports. One of the more interesting tools is Analysis Process Designer (APD). First available in BW 3.0, APD uses a graphical drag-and-drop interface to create sophisticated processes that prepare, transform, mine, display, and store data. It can provide you with quick answers to questions that would otherwise require more time-consuming BW configuration and lead to a proliferation of data marts.
APD is able to source data from InfoProviders such as InfoCubes, ODS objects, InfoObjects, database tables, BW queries, and flat files. Transformations include joins, sorts, transpositions, and with BW 3.5, integration with BW’s Data Mining Workbench (RSDMWB).
Data Mining Workbench is another well-kept BW secret and offers a range of data mining algorithms such as decision trees, clustering, and association analysis. Data targets for APD are InfoObject attributes, ODS objects, and SAP CRM system target groups. BW 3.5 also introduces the ability to transport analysis processes between systems, whereas before the only option was an XML export and import. The sidebar, “APD Scenarios,” shows two common ways you might put APD to work.
Note
An overview of APD is available online in the SAP Library (help.sap.com). Follow the menu path Documentation>SAP NetWeaver>(your release/language)>Information Integration>SAP Business Intelligence>BI Platform>Analysis Process Designer.
Data Profiling with APD
Let me share with you a very useful function within the APD — statistical data profiling. A client recently asked me to establish which fields were populated in a series of ODS objects that contained hundreds of InfoObjects. As they were running BW 3.1, the profiling function was not available. While I could have done this using BW queries and a counter key figure, it would have been necessary to select every InfoObject in turn and execute the query — clearly a time-consuming process. In this case, I had to use a third-party product. However, I could have used APD in BW 3.5 had the client been running it. So now I’ll show you how this could be accomplished using APD in BW 3.5.
First, I enter APD using transaction RSANWB (it is also available under Special Analysis Process in the SAP menu). I then create a new analysis process in one of the folders, or “applications,” as SAP calls them. This creates a blank workspace, into which I can drag the relevant data sources, transformations, and targets.
For my example, I need to select an appropriate source for my analysis process, in this case an ODS that contains data about the survival rate of passengers on the Titanic. I do this by dragging the InfoProvider icon onto the workspace and then selecting the appropriate ODS (Figure 1). Then, I right-click on the InfoProvider icon in the workspace and select Display Basic Statistics (Figure 2).

Figure 1
Drag the InfoProvider icon into the workspace and select the appropriate ODS

Figure 2
Right-click on the InfoProvider icon and select Display Basic Statistics
Next, I select the characteristics and key figures that I want to profile and then execute. A new window is then displayed, showing a profile of the data. The two screen captures in Figure 3 show sample results from the Titanic survival data that I mentioned earlier.


Figure 3
Sample Titanic survivor statistics
As you can see, it does not pick up any associated texts for the characteristic; it just operates on raw values. Figure 4 shows a summary Web query of the same data. So how could this be useful?

Figure 4
Web query summary
I mentioned data quality earlier. I often want to look at data to determine the spread of distinct values. How many null entries are there? You can do this via queries and database statistics, but using APD is the simplest and most visual way I have found.
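To make the idea of a value profile concrete, here is a minimal Python sketch of the kind of summary I am after: row count, blank entries, and the frequency of each distinct raw value. It runs outside BW and is not how APD computes its statistics. The function name, the rule that None and empty strings count as blank, and the sample values are all assumptions made for illustration.

```python
from collections import Counter


def profile_characteristic(values):
    """Count rows, blank entries, and the frequency of each distinct raw value."""

    def is_blank(value):
        # Assumption: None and empty or whitespace-only strings are "unpopulated".
        return value is None or str(value).strip() == ""

    frequencies = Counter(v for v in values if not is_blank(v))
    return {
        "rows": len(values),
        "blank": sum(1 for v in values if is_blank(v)),
        "distinct": len(frequencies),
        # Most frequent values first, as (value, count) pairs.
        "frequencies": frequencies.most_common(),
    }


# Hypothetical passenger-class values, not taken from the ODS in the figures.
sample = ["1", "3", "3", "", "2", "3", None, "1"]
result = profile_characteristic(sample)
print(result)
```

Like the APD display, this works on raw values only; it knows nothing about the texts associated with a characteristic.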
As APD can also source data from database tables, you can use it to examine data in a PSA table: for example, to work out exactly which InfoObjects from a DataSource are actually populated. This information could also be useful when it comes to data modeling and designing your InfoCubes. If a characteristic has a very large number of distinct values, then you might decide to create it as a line-item dimension or use the high cardinality flag for performance reasons.
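The two checks described here, which fields are populated and which have many distinct values, can be sketched in Python as follows. This is an illustration only, assuming the rows have already been extracted into dictionaries. The field names, the sample rows, and the 0.5 ratio used to flag a field as high cardinality are hypothetical and are not SAP guidance.

```python
def field_summary(rows, high_cardinality_ratio=0.5):
    """Report populated and distinct counts per field for a list of dict rows."""
    total = len(rows)
    fields = sorted({name for row in rows for name in row})
    summary = {}
    for name in fields:
        # Assumption: None and the empty string mean "not populated".
        populated = [row[name] for row in rows if row.get(name) not in (None, "")]
        distinct = len(set(populated))
        summary[name] = {
            "populated": len(populated),
            "distinct": distinct,
            # Illustrative flag: distinct values as a share of all rows.
            "high_cardinality": total > 0 and distinct / total > high_cardinality_ratio,
        }
    return summary


# Hypothetical PSA-style rows; the field names are made up for this sketch.
rows = [
    {"DOC_NUMBER": "1001", "CUSTOMER": "C1", "REGION": ""},
    {"DOC_NUMBER": "1002", "CUSTOMER": "C1", "REGION": ""},
    {"DOC_NUMBER": "1003", "CUSTOMER": "C2", "REGION": None},
    {"DOC_NUMBER": "1004", "CUSTOMER": "C1", "REGION": ""},
    {"DOC_NUMBER": "1005", "CUSTOMER": "C1", "REGION": ""},
]
summary = field_summary(rows)
for name, stats in summary.items():
    print(name, stats)
```

In this sample, REGION is never populated and DOC_NUMBER is unique on every row, which is the sort of field you would look at more closely when modeling dimensions.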
You can profile key figures, too, which is something that would be hard to do via a BW query. If I run Display Basic Statistics for the sales value figure, I get the result shown in the two screen captures in Figure 5. This tells me more than I probably ever wanted to know about my key figure and its values.


Figure 5
Display Basic Statistics results for the sales value figure
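For readers who want a feel for what a key figure profile involves, the following Python sketch computes a handful of common descriptive statistics with the standard library. The choice of measures is my assumption for illustration and is not a list of what APD displays; the sales values are invented.

```python
import statistics


def profile_key_figure(values):
    """Return basic descriptive statistics, ignoring missing (None) values."""
    populated = [v for v in values if v is not None]
    if not populated:
        return {"count": 0, "missing": len(values)}
    return {
        "count": len(populated),
        "missing": len(values) - len(populated),
        "min": min(populated),
        "max": max(populated),
        "sum": sum(populated),
        "mean": statistics.mean(populated),
        "median": statistics.median(populated),
        # Sample standard deviation needs at least two data points.
        "stdev": statistics.stdev(populated) if len(populated) > 1 else 0.0,
    }


# Hypothetical sales values with one missing entry.
sales = [120.0, 80.0, None, 100.0, 100.0]
profile = profile_key_figure(sales)
print(profile)
```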
APD Scenarios
What makes APD different from a typical ETL scenario is that it creates new information from your existing BW data and usually stores the results somewhere for future reuse. In the basic scenario shown in Figure A, the results of a customer sales query are passed to an ABC analysis (customer ranking) in BW’s Data Mining Workbench. The results are then written back to the 0ABC_CLASS attribute of 0CUSTOMER. This places your customers into predefined categories depending on sales value, volume, or any other categorization that you want. The populated attribute will then be available for use in BW queries.

Figure A
APD used with Data Mining Workbench to create a customer ranking
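The logic behind an ABC ranking can be sketched in a few lines of Python. This is a generic illustration, not the Data Mining Workbench algorithm: it ranks customers by sales value and classifies them by cumulative share of total sales. The 80 percent and 95 percent boundaries, the function name, and the customer figures are assumptions; in practice you would choose your own categories.

```python
def abc_classify(sales_by_customer, a_share=0.8, b_share=0.95):
    """Assign each customer to class A, B, or C by cumulative share of sales."""
    total = sum(sales_by_customer.values())
    ranked = sorted(sales_by_customer.items(), key=lambda item: item[1], reverse=True)
    classes = {}
    running = 0.0
    for customer, sales in ranked:
        # Classify on the share accumulated before this customer,
        # so the highest-ranked customer always lands in class A.
        share_before = running / total if total else 0.0
        if share_before < a_share:
            classes[customer] = "A"
        elif share_before < b_share:
            classes[customer] = "B"
        else:
            classes[customer] = "C"
        running += sales
    return classes


# Hypothetical sales totals per customer.
sales_by_customer = {"C1": 450, "C2": 300, "C3": 150, "C4": 60, "C5": 40}
classes = abc_classify(sales_by_customer)
print(classes)
```

The resulting class per customer corresponds to the value that the analysis process would write back to the customer attribute.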
Another APD scenario allows you to identify all customers who have not bought anything within a given time period. You can do this via BW configuration using a MultiProvider on top of customer master data and sales transaction data. However, a much faster and more flexible way would be to use the APD to join sales data with customer master data, identify customers with a zero sales value, and write a flag to a customer attribute, as Figure B shows.

Figure B
Use APD to find customers who have made no purchases
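The join described in this scenario amounts to a left outer join from customer master data to aggregated sales. The Python sketch below illustrates that logic under some assumptions: the sales records are already restricted to the time period of interest, the customer IDs and amounts are invented, and the "X" flag value is simply a convention I chose for the example.

```python
def flag_inactive_customers(customers, sales_records):
    """Return a flag per customer: "X" if total sales are zero or absent, else ""."""
    totals = {}
    for customer, amount in sales_records:
        totals[customer] = totals.get(customer, 0.0) + amount
    # Customers with no sales record at all default to a total of 0.0,
    # which mirrors a left outer join from master data to sales.
    return {c: "X" if totals.get(c, 0.0) == 0 else "" for c in customers}


# Hypothetical master data and sales records for the chosen period.
customers = ["C1", "C2", "C3"]
sales_records = [("C1", 100.0), ("C1", 50.0), ("C3", 0.0)]
flags = flag_inactive_customers(customers, sales_records)
print(flags)
```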
Mike Curl
Mike Curl is technical director and principal consultant for Bluefin Solutions, a UK-based SAP partner. Mike has more than 10 years’ consultancy experience, covering a wide range of SAP technologies and industries. Mike is a recognized BW expert who has been working with BW since its earliest versions. He helps organizations define and implement their BW strategies, and he also has a particular interest and track record in optimizing the performance of high-volume BW systems.
You may contact the author at mike.curl@bluefinsolutions.com.