Delta Loading a Table with SqlDeltaGenerator

In my previous Post I did describe what needs to be considered to implement a fast and stable delta load process. In this post I am going to introduce a tool that will help you to implement such a process within a couple of minutes.

The SqlDeltaGenerator is basically a wizard that will create all the necessary C# and SQL Merge statements required for delta loading a table from a source system into a Data Warehouse. In this quick tutorial I will show how to delta load the SalesHeaders table from Microsofts AdventureWorks2012 sample database.

First we will need to download the SqlDeltaGenerator here.

Extended editor in flat file source

After starting we connect to the source database and enter the table that we would like to delta load. For my example here settings are:

  • SourceServer: localhost
  • SourceDatabase: AdventureWorks2012
  • SourceSchema: Sales
  • SourceTable: SalesOrderDetail
  • DestinationServer: localhost
  • DestinationDatabase: StagingDb

The wizard will show all the available columns and we will need to select a delta handle and one (or multiple) key columns. Furthermore we can chose which columns should be transferred and which can be ignored. In this example we chose ModifiedDate as DeltaHandle and SalesOrderID and SalesOrderDetailID as primary key column.

Next we have to decide whether we want a Historizing or a MostRecent delta load for the table. Depending on what´s needed we click the corresponding button so that all required scripts are generated.

The SqlDeltaGenerator will generate the following scripts that can be added to your SSIS solution.

  • Source To Buffer: This is a C# script that will load all the changed /new records (based on the DeltaHandle) from the source system to the target system.
  • SSIS SourceToBuffer: basically the same script as above but with some smaller changes so it can easily be used as an SSIS script task
  • Create Buffer: this SQL script will generate the buffer table
  • Buffer to Staging: this SQL Merge statement will do either a Historizing or a MostRecent merge into the staging table
  • Create Staging: this SQL script will generate the staging table
  • Create HlpNextOffsets: this SQL script will create a table that is used to store the DeltaHandle timestamps
  • Create HlpRequestLog: this SQL script will create a table that is used for logging every delta load

Now we can start implementing the ETL process.

  • If not already done for a previous table we execute the Create HlpNextOffsets and Create HlpRequestLog in our staging database.
  • We execute the Create Buffer and Create Staging scripts to create the corresponding tables.
  • We add a Script Task to our SSIS solution
  • we paste the SSIS Source to Buffer script into the Script Task
  • We add an Execute SQL Task to our SSIS solution
  • we paste the Buffer to Staging SQL statement into this task
  • Finally we run both tasks

Extended editor in flat file source

We can check the results by looking at the staging and logging tables:

SELECT TOP 1000 [FromTo],[NextDeltaOffset] FROM [StagingDb].[dbo].[HlpNextOffsets]

Extended editor in flat file source

SELECT TOP 1000 * FROM [StagingDb].[dbo].[STAGING_SalesOrderDetail]

Extended editor in flat file source

Delta Test

Now we want to test whether delta loading works correctly. There we manually change one record in the source system and start loading another time.

update [AdventureWorks2012].sales.SalesOrderDetail set ProductID = 711, ModifiedDate='2015-07-09 00:00:00' where SalesOrderID = 43659 and SalesOrderDetailID = 1

Looking at the buffer table we see that the record has been loaded correctly

SELECT TOP 1000 * FROM [StagingDb].[dbo].[BUFFER_SalesOrderDetail]

Extended editor in flat file source

Furthermore the record has been updated correctly in the staging table.

So, happy delta loading!

Download

Source Code