Pages

Thursday, March 21, 2013

How to Populate header values to each detail records from a Flat File Source


Many times it is possible that you get flat files with a header record, which has a different structure from the other detail record.  The header record gives some audit information like Number of total records in the file, File creation date, Unique file identifier etc... which aslo makes some business value. You can be asked to populate these header record column to the target along with each detail records from a Flat File Source.
Lets take an example to make our explanation easy. We are receiving a source file from the stock exchange with the daily closing price of shares. This file comes with a header record with the report name and the report generation date. When we load this data to the target table, report generation date has to be loaded into the target table along with the other detail records.

Here is our Sample File dailyStockPrices.csv 
Populate header values to each detail records from a Flat File Source
Before we start working on the mapping, we will prepare the source file. We are going to generate a new file from the source file which has just the header record using the command below.
  • head -1 $PMSourceFileDir\dailyStockPrices.csv >  $PMSourceFileDir\dailyStockPriceHeads.csv
We will leave the newly created file for now and go to the mapping building process. Lets start working the mappingand create an expression transformation EXP_HEADER_ID with an output port.
  • HEADER_ID as Output Port
Add the expression as follows.
  • HEADER_ID :- 'DAILY_STOCK_PRICE'
Below is the structure of the mapping now.

Populate header values to each detail records from a Flat File Source

Now lets add a LOOKUP transformation that looks into the header file, which we created in the beginning of the tutorial. When we do the look up, we will use the HEADER_ID column for the look up condition. This transformation will give you the REPORT_DATE from the header record. 

LookUp Condition
  • IN_HEADER_ID = HEADER_ID
LookUp Output
  • REPORT_DATE
Note : Configure the  LOOKUP transformation to use the Header file created in the first step, in the session configuration (Explained at the end of the tutorial).

Populate header values to each detail records from a Flat File Source

Now just map all columns to the target table from the expression and look up transformation. We are done with the mapping configuration.

Populate header values to each detail records from a Flat File Source

We are not done yet. Lets create the workflow, session and set some Pre-Session property as it is in the below image. We will have to use a Command as a Pre-Session Command, which we used to split the header to a different file in the beginning of the tutorial.
  • head -1 $PMSourceFileDir\dailyStockPrices.csv >  $PMSourceFileDir\dailyStockPriceHeads.csv
This Pre-Session Command will make sure that the source file is split into header file, before it is needed by the look up transformation.

Populate header values to each detail records from a Flat File Source

Now set the look up property to use the Header file created in the first step, in the session configuration as it is in the below image.
  • Lookup source file directory :  $PMSourceFileDir\
  • Lookup source filename :  dailyStockPriceHeads.csv
Populate header values to each detail records from a Flat File Source

Note : While the session properties are configured, Make sure that the Source file property is set to skip the 1st record, which is the header record.

2 comments: