Learn how to use data normalization in SAP NetWeaver 7.0 to reduce, if not eliminate, reporting problems with data.
Key Concept
Normalization seeks to organize data in a database. The objective is to look at the incoming data and then create new tables and relationships between these tables to protect the data and eliminate redundancy and inconsistency. Data normalization follows a strict set of rules. Each rule can be referred to as a “normal form.” You can reduce data to first normal form or second normal form and so on depending on which set of rules has been satisfied. The first normal form requires that there be no multi-valued attributes and no repeating groups. A multi-valued attribute would contain more than one value for that field in each row.
A poor database design can cause problems related to redundancy, accuracy, consistency, and concurrency of your data because data does not always come in the form best suited for reporting. A data modeler needs to analyze the reporting requirements and modify the incoming data format accordingly.
You need to reduce data containing multi-valued attributes or repeating groups—which may come in via a DataStore object (formerly operational data store [ODS])—to first normal form. This ensures proper aggregation at the InfoCube level so you can analyze the data for the characteristic that represents the repeating group or is multi-valued.
In BI versions prior to SAP NetWeaver 7.0, you could normalize data to first normal form by either of two methods:
- Write ABAP code and modify the records in the DATA_PACKAGE in the start routine
- Implement the Return Tables option on all the key figures in the update rules
In SAP NetWeaver 7.0, you can bring the data into first normal form by two different methods:
- Use ABAP code in the start routine and modify records in the SOURCE_PACKAGE
- Write ABAP in the expert routine
Note
SAP NetWeaver 7.0 was formerly known as SAP NetWeaver 2004s.
The code in the start routine in the new versions follows the object-oriented ABAP syntax.
I’ll use an example business scenario to explain normalization and to create both routines. Then I’ll look at each option—using an expert routine or a start routine—and discuss why you might choose one routine over another.
Normalization and First Normal Form
To understand the business need for reducing data to first normal form, consider the scenario in Table 1. Let’s say that company A’s purchasing department sends weekly requirements for materials including quantity and type that it will buy from multiple vendors via flat file.
Material | Material type | Total quantity | Vendor 1 | Vendor 2 | Vendor 3 | % Vendor 1 | % Vendor 2 | % Vendor 3
01111 | FERT | 2000 EA | 099991 | 099992 | 099993 | 20 | 40 | 40
01112 | ROH | 2200 EA | 099991 | 099994 | 099995 | 55 | 15 | 30
01113 | ROH | 1200 EA | 099993 | 099995 | 099996 | 15 | 45 | 40

Table 1
Unnormalized data sent to the BI system for reporting quantity bought from each material/vendor combination
This data represents an “unnormalized table.” You cannot report by material/vendor and quantity ordered without normalizing the data. Because each material has multiple vendors, the Vendor columns represent a multi-valued attribute (repeating group) and you need to reduce the data to the first normal form. From a reporting standpoint, you need to break down each row into three rows. You can see the first normalized form of the incoming data in Table 2.
Material | Material type | Total quantity | Vendor | % allocation
01111 | FERT | 2000 EA | 099991 | 20
01111 | FERT | 2000 EA | 099992 | 40
01111 | FERT | 2000 EA | 099993 | 40
01112 | ROH | 2200 EA | 099991 | 55
01112 | ROH | 2200 EA | 099994 | 15
01112 | ROH | 2200 EA | 099995 | 30
01113 | ROH | 1200 EA | 099993 | 15
01113 | ROH | 1200 EA | 099995 | 45
01113 | ROH | 1200 EA | 099996 | 40

Table 2
First normal form reporting quantity bought from each material/vendor combination
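The split from Table 1 to Table 2 can be sketched in Python (the real work happens in ABAP later in the article; the dictionary keys here are hypothetical names chosen to mirror the table columns):

```python
def to_first_normal_form(rows):
    """Split each row's repeating vendor/percent group into one row per vendor."""
    result = []
    for row in rows:
        for vendor, pct in zip(row["vendors"], row["percents"]):
            result.append({
                "material": row["material"],
                "mat_type": row["mat_type"],
                "quantity": row["quantity"],
                "vendor": vendor,
                "percent": pct,
            })
    return result

# First row of Table 1: three vendors, so it becomes three rows
table1 = [
    {"material": "01111", "mat_type": "FERT", "quantity": "2000 EA",
     "vendors": ["099991", "099992", "099993"], "percents": [20, 40, 40]},
]
normalized = to_first_normal_form(table1)
```

Each incoming row yields one output row per material/vendor combination, which is exactly the shape Table 2 shows.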
Because each material has a unique type, you can further normalize the data to the second normal form, which seeks to eliminate redundant data. In my example, Material type is repeated for each material and thus is redundant data. When reducing the data model to its second normal form, you can break Table 2 down into Tables 3 and 4.
Material | Total quantity | Vendor | % allocation
01111 | 2000 EA | 099991 | 20
01111 | 2000 EA | 099992 | 40
01111 | 2000 EA | 099993 | 40
01112 | 2200 EA | 099991 | 55
01112 | 2200 EA | 099994 | 15
01112 | 2200 EA | 099995 | 30
01113 | 1200 EA | 099993 | 15
01113 | 1200 EA | 099995 | 45
01113 | 1200 EA | 099996 | 40

Table 3
Second normal form to remove redundancy (material type)
Material | Material type
01111 | FERT
01112 | ROH
01113 | ROH

Table 4
Second normal form creating a separate table for Material type
Table 4 has Material number as the primary key with Material type as the data field. From a data modeling standpoint, Material type is an attribute of 0MATERIAL. You can make this attribute navigational so users can report on it.
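The decomposition into Tables 3 and 4 can be illustrated with a short Python sketch (field names are illustrative, not SAP object names): the material-type lookup plays the role of the 0MATERIAL attribute table.

```python
def to_second_normal_form(rows_1nf):
    """Split 1NF rows into a fact table (Table 3) and a material-type
    lookup keyed by material number (Table 4)."""
    facts, mat_types = [], {}
    for row in rows_1nf:
        mat_types[row["material"]] = row["mat_type"]   # one type per material
        facts.append({k: v for k, v in row.items() if k != "mat_type"})
    return facts, mat_types

rows_1nf = [
    {"material": "01111", "mat_type": "FERT", "quantity": "2000 EA",
     "vendor": "099991", "percent": 20},
    {"material": "01112", "mat_type": "ROH", "quantity": "2200 EA",
     "vendor": "099991", "percent": 55},
]
facts, mat_types = to_second_normal_form(rows_1nf)
```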
Example Scenario in the SAP System
In the sample data shown in Table 1, consider a DataStore object MATDSO with key field 0MATERIAL (Figure 1). Characteristics VEND1 (Vendor 1), VEND2 (Vendor 2), and VEND3 (Vendor 3) represent repeating groups (i.e., they are the same InfoObject [0VENDOR] and you need to apportion the key figures among these three vendors). Although each record row has a fixed total quantity (0QUANTITY) to be purchased, fields ZVEND1, ZVEND2, and ZVEND3 contain the percent allocation of this quantity from each of the three vendors.

Figure 1
Structure of DataStore object MATDSO containing unnormalized data
You can see the incoming data in the DataStore object MATDSO in Figure 2. Note that the records have more than one vendor and each vendor has a respective allocation percentage in each row. The goal is to normalize the data so that each row has one material, one vendor, and the corresponding purchase quantity. Business reporting requirements dictate that you make data available for reporting by vendor (e.g., 099991, 099992, 099993) in terms of material and quantity ordered, which you can do by reducing data to first normal form.

Figure 2
Unnormalized incoming data for which you need first form normalization to facilitate reporting
Users or process chains load data daily via the data transfer process (DTP) from the DataStore object MATDSO into InfoCube MATCUBE for reporting. You can see the structure of MATCUBE in Figure 3.

Figure 3
Structure of the InfoCube in which you perform first normalization
There are two dimensions in cube MATCUBE: Material and Vendor. These contain the characteristics 0MATERIAL and 0VENDOR, respectively. Material type is an attribute of 0MATERIAL in the BI capabilities of SAP NetWeaver 7.0, and you can report on it as a navigational attribute of the material in the query. The InfoCube also contains the key figure 0QUANTITY, which represents the quantity that company A ordered from each vendor. The system calculates this as the total material quantity multiplied by the vendor's distribution percentage.
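The key-figure calculation is simple arithmetic, shown here as a minimal Python sketch using the sample data from Table 1 (the function name is hypothetical):

```python
def allocated_quantity(total_qty, percent):
    """Quantity ordered from one vendor: total quantity times its allocation percentage."""
    return total_qty * percent / 100

# Material 01111, vendor 099991: 20% of 2000 EA
qty = allocated_quantity(2000, 20)
```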
There are two ways to accomplish first form normalization to meet the reporting requirement: using an expert routine and a start routine. I’ll go over each of these next.
Option 1: Use an Expert Routine
Create the transformation from the DataStore object to the InfoCube. In transaction RSA1, under the InfoProvider tab, locate your InfoCube (MATCUBE in this example) and right-click on it (Figure 4). Select Create Transformation… from the context menu to open the pop-up screen shown in Figure 4.

Figure 4
Select cube and enter DataStore object name
In the Object Type field, select DataStore Object and enter the DataStore name MATDSO in the Name field. Click on the green arrow and the system presents a screen with a transformation in which the 0MATERIAL and 0QUANTITY fields are already mapped. Select Edit > Expert Routine and choose Yes in the resulting pop-up screen to replace the created transformation with an expert routine (Figure 5).

Figure 5
Replace the transformation with an expert routine
Figure 6 appears, and you need to insert the ABAP code shown in Figure 7 to carry out the data normalization. Each row of the incoming record is split into three rows, with material, vendor, and the calculated quantity based on allocation percentages shown.

Figure 6
Insert ABAP code for expert routine
*----------------------------------------------------------------------*
* Method expert_routine
*----------------------------------------------------------------------*
* Calculation of result package via expert routine
*----------------------------------------------------------------------*
* -> package of source segments
* <- result package
*----------------------------------------------------------------------*
METHOD expert_routine.
*=== Segments ===
  FIELD-SYMBOLS:
    <SOURCE_FIELDS> TYPE _ty_s_SC_1.
  DATA:
    RESULT_FIELDS   TYPE _ty_s_TG_1.
*$*$ begin of routine - insert your code only below this line        *-*
  LOOP AT SOURCE_PACKAGE ASSIGNING <SOURCE_FIELDS>.
*   For each line, create 3 rows of data.
    MOVE-CORRESPONDING <SOURCE_FIELDS> TO RESULT_FIELDS.
    DO 3 TIMES.
      CASE SY-INDEX.
        WHEN '1'.
*         For first row, populate Vendor
          RESULT_FIELDS-VENDOR = <SOURCE_FIELDS>-/BIC/VEND1.
*         For first row, populate Quantity
          RESULT_FIELDS-QUANTITY = <SOURCE_FIELDS>-QUANTITY *
                                   <SOURCE_FIELDS>-/BIC/ZVEND1 / 100.
        WHEN '2'.
*         For second row, populate Vendor
          RESULT_FIELDS-VENDOR = <SOURCE_FIELDS>-/BIC/VEND2.
*         For second row, populate Quantity
          RESULT_FIELDS-QUANTITY = <SOURCE_FIELDS>-QUANTITY *
                                   <SOURCE_FIELDS>-/BIC/ZVEND2 / 100.
        WHEN '3'.
*         For third row, populate Vendor
          RESULT_FIELDS-VENDOR = <SOURCE_FIELDS>-/BIC/VEND3.
*         For third row, populate Quantity
          RESULT_FIELDS-QUANTITY = <SOURCE_FIELDS>-QUANTITY *
                                   <SOURCE_FIELDS>-/BIC/ZVEND3 / 100.
        WHEN OTHERS.
      ENDCASE.
*     Add row to Result Package
      APPEND RESULT_FIELDS TO RESULT_PACKAGE.
    ENDDO.
  ENDLOOP.
*$*$ end of routine - insert your code only before this line         *-*
ENDMETHOD.                    "expert_routine

Figure 7
ABAP code for expert routine
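To make the expert routine's logic easy to verify outside the SAP system, here is a Python analogue of its loop (records are modeled as plain dictionaries; the real routine operates on ABAP structures of the MATDSO source segment):

```python
def expert_routine(source_package):
    """For every source record, append three result records,
    one per VENDn/ZVENDn pair, with the allocated quantity."""
    result_package = []
    for rec in source_package:
        for n in (1, 2, 3):
            result_package.append({
                "MATERIAL": rec["MATERIAL"],
                "VENDOR":   rec["VEND%d" % n],
                "QUANTITY": rec["QUANTITY"] * rec["ZVEND%d" % n] / 100,
            })
    return result_package

# First record from Table 1
source_package = [{
    "MATERIAL": "01111", "QUANTITY": 2000,
    "VEND1": "099991", "VEND2": "099992", "VEND3": "099993",
    "ZVEND1": 20, "ZVEND2": 40, "ZVEND3": 40,
}]
result_package = expert_routine(source_package)
```

Note that the three allocated quantities always sum back to the original total, so no quantity is gained or lost by the normalization.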
Activate the transformation by clicking on the activate icon. If you go back into the transformation in display mode, you see the transformation displayed (Figure 8). You can double-click on the Expertenroutine text on the arrow in the middle of the screen to see the ABAP code you just entered.

Figure 8
Transformation with expert routine in display mode
Option 2: Use a Start Routine
Create the transformation from the DataStore object to the InfoCube using the same method I described in the previous section. The system creates two mappings by default (0MATERIAL and 0QUANTITY). However, the system has not mapped the vendor, which you can tell because there is no arrow to 0VENDOR. The quantity field is mapped but you need to calculate the quantity in the start routine based on percentages provided (Figure 9).

Figure 9
Transformation with default mapping
Map field VEND1 to 0VENDOR. Click on the Start Routine button. The system takes you to the screen in which you need to enter the ABAP code to normalize the data (Figure 10). Enter the code shown in Figure 11 in the start routine.

Figure 10
Screen to add code for start routine
* DECLARE A TEMPORARY STORAGE INTERNAL TABLE LIKE SOURCE_PACKAGE AND ITS
* CORRESPONDING WORK AREA.
DATA: SOURCE_PACKAGE_TEMP LIKE SOURCE_PACKAGE.
DATA: WA_SOURCE_PACKAGE TYPE _ty_s_SC_1.
* FOR EACH RECORD IN THE INCOMING SOURCE PACKAGE, CREATE RECORDS IN THE
* TEMPORARY TABLE WITH THE DE-AGGREGATION AS SHOWN.
LOOP AT SOURCE_PACKAGE ASSIGNING <SOURCE_FIELDS>.
  MOVE-CORRESPONDING <SOURCE_FIELDS> TO WA_SOURCE_PACKAGE.
  DO 3 TIMES.
    CLEAR: WA_SOURCE_PACKAGE-/BIC/ZVEND1,
           WA_SOURCE_PACKAGE-/BIC/ZVEND2,
           WA_SOURCE_PACKAGE-/BIC/ZVEND3.
    CASE SY-INDEX.
      WHEN '1'.
*       For first row, populate Vendor
        WA_SOURCE_PACKAGE-/BIC/VEND1 = <SOURCE_FIELDS>-/BIC/VEND1.
*       For first row, populate Distribution Amount
        WA_SOURCE_PACKAGE-QUANTITY = <SOURCE_FIELDS>-QUANTITY *
                                     <SOURCE_FIELDS>-/BIC/ZVEND1 / 100.
      WHEN '2'.
*       For second row, populate Vendor
        WA_SOURCE_PACKAGE-/BIC/VEND1 = <SOURCE_FIELDS>-/BIC/VEND2.
*       For second row, populate Distribution Amount
        WA_SOURCE_PACKAGE-QUANTITY = <SOURCE_FIELDS>-QUANTITY *
                                     <SOURCE_FIELDS>-/BIC/ZVEND2 / 100.
      WHEN '3'.
*       For third row, populate Vendor
        WA_SOURCE_PACKAGE-/BIC/VEND1 = <SOURCE_FIELDS>-/BIC/VEND3.
*       For third row, populate Distribution Amount
        WA_SOURCE_PACKAGE-QUANTITY = <SOURCE_FIELDS>-QUANTITY *
                                     <SOURCE_FIELDS>-/BIC/ZVEND3 / 100.
      WHEN OTHERS.
    ENDCASE.
    APPEND WA_SOURCE_PACKAGE TO SOURCE_PACKAGE_TEMP.
  ENDDO.
ENDLOOP.
REFRESH SOURCE_PACKAGE.
SOURCE_PACKAGE[] = SOURCE_PACKAGE_TEMP[].
FREE SOURCE_PACKAGE_TEMP.
*-- fill table "MONITOR" with values of structure "MONITOR_REC"
*-- to make monitor entries
... "to cancel the update process
*   raise exception type CX_RSROUT_ABORT.

Figure 11
ABAP code for start routine
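As with the expert routine, the start routine's record splitting can be mirrored in Python for clarity (dictionaries stand in for the ABAP work area; in the real routine the rebuilt table is assigned back to SOURCE_PACKAGE):

```python
def start_routine(source_package):
    """Rebuild the source package so each record carries one vendor in VEND1
    and a pre-calculated QUANTITY; the transformation then maps only VEND1."""
    temp = []
    for rec in source_package:
        for n in (1, 2, 3):
            new_rec = dict(rec)
            new_rec["VEND1"] = rec["VEND%d" % n]
            new_rec["QUANTITY"] = rec["QUANTITY"] * rec["ZVEND%d" % n] / 100
            for pct_field in ("ZVEND1", "ZVEND2", "ZVEND3"):
                new_rec[pct_field] = 0   # mirrors the CLEAR in the ABAP code
            temp.append(new_rec)
    return temp   # overwrites SOURCE_PACKAGE in the real routine

# Second record from Table 1
source_package = [{
    "MATERIAL": "01112", "QUANTITY": 2200,
    "VEND1": "099991", "VEND2": "099994", "VEND3": "099995",
    "ZVEND1": 55, "ZVEND2": 15, "ZVEND3": 30,
}]
source_package = start_routine(source_package)
```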
The code breaks each incoming record into three records. The first record keeps VEND1 and recalculates its quantity from allocation percentage ZVEND1. The code creates the two additional records by moving VEND2 and VEND3 into field VEND1 and applying their respective percentages, ZVEND2 and ZVEND3, to the quantity. In the transformation, the system maps only VEND1 to 0VENDOR, and the corresponding quantity is already correct because the start routine calculated it per vendor. Use transaction LISTCUBE to see the first normal form data in the InfoCube after loading the data into the InfoCube (Figure 12).

Figure 12
Normalized data in InfoCube available for reporting via queries
The system shows the incoming three rows of data as normalized to nine rows, with each quantity split by vendor. You can modify the code to account for additional key figures.
When Should You Use a Start or Expert Routine?
The difference in the two approaches is the point at which the records are broken down.
The expert routine option breaks down the records while bypassing the start routine, field-level transformations, and end routine. It acts as the sole creator of records for the RESULT_PACKAGE.
The start routine option creates additional rows in the start routine itself, effectively overwriting the incoming record set. As indicated earlier, implementing an expert routine calls for coding transformations for all characteristics and key figures in the cube. In case of a large number of characteristics that require code, you need to be very careful when putting code in the expert routine to avoid data errors.
The disadvantage of using the expert routine is that you need to do the mapping of all characteristics in the routine individually, which increases the complexity of the coding. If you use the start routine for data normalization, you can use the transformations routines for each characteristic as in any other general transformation.
Data normalization is one application of the use of start routines and expert routines, but you can use a similar procedure for other processes, such as to deaggregate incoming data. For example, each record might be a monthly record that you need to break down by week. The same logic would apply with additional code to determine calendar week as a key characteristic in that scenario.
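The monthly-to-weekly deaggregation mentioned above can be sketched the same way (this is a hedged illustration, assuming an even split across four calendar weeks; a production routine would use a factory calendar to determine the weeks):

```python
def deaggregate_monthly(rec, weeks):
    """Split one monthly record into one record per calendar week,
    dividing the quantity evenly across the given weeks."""
    share = rec["QUANTITY"] / len(weeks)
    return [dict(rec, CALWEEK=w, QUANTITY=share) for w in weeks]

monthly = {"MATERIAL": "01111", "QUANTITY": 2000}
weekly = deaggregate_monthly(monthly, ["200801", "200802", "200803", "200804"])
```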
You can use expert routines if the number of characteristics and key figures is low. Start routines are a better option when there are a lot of key figures and characteristics because the code in the expert routine would become very complex.
Rajiv Kalra
Rajiv Kalra is an independent SAP consultant with more than 11 years of SAP experience. With more than five years of BI experience in data modeling, design, configuration, and performance tuning of BI systems, he has helped develop many out-of-the-box solutions to resolve complicated reporting requirements at various client sites. Besides SAP NetWeaver BI, Rajiv has worked on the ERP side for more than seven years and is an SAP-certified ABAP consultant.
You may contact the author at kalrar_99@yahoo.com.