Checksum Transformation
The Checksum Transformation computes a hash value, the checksum, across one or more columns, returning the result in the Checksum output column. The transformation provides functionality similar to the T-SQL CHECKSUM function, but is encapsulated within SQL Server Integration Services, for use within the pipeline without code or a SQL Server connection.
As featured in The Microsoft Data Warehouse Toolkit by Joy Mundy and Warren Thornthwaite from the Kimbal Group. Have a look at the book samples especially Sample package for custom SCD handling.
All input columns are passed through the transformation unaltered, those selected are used to generate the checksum which is passed out through a single output column, Checksum. This does not restrict the number of columns available downstream from the transformation, as columns will always flow through a transformation. The Checksum output column is in addition to all existing columns within the pipeline buffer.
The Checksum Transformation uses an algorithm based on the .Net framework GetHashCode method, it is not consistent with the T-SQL CHECKSUM() or BINARY_CHECKSUM() functions. The transformation does not support the following Integration Services data types, DT_NTEXT, DT_IMAGE and DT_BYTES.
ChecksumAlgorithm Property
There ChecksumAlgorithm property is defined with an enumeration. It was first added in v1.3.0, when the FrameworkChecksum was added. All previous algorithms are still supported for backward compatibility as ChecksumAlgorithm.Original (0).
Original - Orginal checksum function, with known issues around column separators and null columns. This was deprecated in the first SQL Server 2005 RTM release.
FrameworkChecksum - The hash function is based on the .NET Framework GetHash method for object types. This is based on the .NET Object.GetHashCode() method, which unfortunately differs between x86 and x64 systems. For that reason we now default to the CRC32 option.
CRC32 - Using a standard 32-bit cyclic redundancy check (CRC), this provides a more open implementation.
The component is provided as an MSI file, however to complete the installation, you will have to add the transformation to the Visual Studio toolbox by hand. This process has been described in detail in the related FAQ entry for How do I install a task or transform component?, just select Checksum from the SSIS Data Flow Items list in the Choose Toolbox Items window.
Downloads
The Checksum Transformation is available for SQL Server 2005, SQL Server 2008 (includes R2) and SQL Server 2012. Please choose the version to match your SQL Server version, or you can install multiple versions and use them side by side if you have more than one version of SQL Server installed.
Checksum Transformation for SQL Server 2005
Checksum Transformation for SQL Server 2008
Checksum Transformation for SQL Server 2012
Version History
SQL Server 2012
Version 3.0.0.27 – SQL Server 2012 release. Includes upgrade support for both 2005 and 2008 packages to 2012.
(5 Jun 2010)
SQL Server 2008
Version 2.0.0.27 – Fix for CRC-32 algorithm that inadvertently made it sort dependent. Fix for race condition which sometimes lead to the error Item has already been added. Key in dictionary: '79764919' . Fix for upgrade mappings between 2005 and 2008.
(19 Oct 2010)
Version 2.0.0.24 - SQL Server 2008 release. Introduces the new CRC-32 algorithm, which is consistent across x86 and x64.. The default algorithm is now CRC32.
(29 Oct 2008)
Version 2.0.0.6 - SQL Server 2008 pre-release. This version was released by mistake as part of the site migration, and had known issues.
(20 Oct 2008)
SQL Server 2005
Version 1.5.0.43 – Fix for CRC-32 algorithm that inadvertently made it sort dependent. Fix for race condition which sometimes lead to the error Item has already been added. Key in dictionary: '79764919' .
(19 Oct 2010)
Version 1.5.0.16 - Introduces the new CRC-32 algorithm, which is consistent across x86 and x64. The default algorithm is now CRC32.
(20 Oct 2008)
Version 1.4.0.0 - Installer refresh only.
(22 Dec 2007)
Version 1.4.0.0 - Refresh for minor UI enhancements.
(5 Mar 2006)
Version 1.3.0.0 - SQL Server 2005 RTM. The checksum algorithm has changed to improve cardinality when calculating multiple column checksums. The original algorithm is still available for backward compatibility. Fixed custom UI bug with Output column name not persisting.
(10 Nov 2005)
Version 1.2.0.1 - SQL Server 2005 IDW 15 June CTP. A user interface is provided, as well as the ability to change the checksum output column name.
(29 Aug 2005)
Version 1.0.0 - Public Release (Beta).
(30 Oct 2004)
Screenshot