Fastest way to compare multiple column values
I was recently working on a project with stored procedures that performed a significant number of column comparisons in a MERGE statement. This was turning into a real performance bottleneck (relatively speaking), as the entire row of data had to be updated if any one of the numerous fields had been modified.
Here’s the code in question. Pay particular attention to all the comparisons being performed:
MERGE [dbo].[The_Table] a
USING
(
    SELECT [Account_No]
        -- rest of columns here
    FROM [#The_TableStaging]
) b
-- the primary keys on the table
ON  b.[Account_No]     = a.[Account_No]
AND b.[Transaction_Id] = a.[Transaction_Id]
WHEN MATCHED AND
(
    -- Surely there must be a better, faster way to do this?
    COALESCE(b.[Account_Type],'')        != COALESCE(a.[Account_Type],'')
    OR COALESCE(b.[Transaction_Type],'') != COALESCE(a.[Transaction_Type],'')
    OR COALESCE(b.[Branch_Code],'')      != COALESCE(a.[Branch_Code],'')
    OR COALESCE(b.[Pay_Method_Cd],'')    != COALESCE(a.[Pay_Method_Cd],'')
    OR COALESCE(b.[Trans_Date],'')       != COALESCE(a.[Trans_Date],'')
    OR COALESCE(b.[Effective_Date],'')   != COALESCE(a.[Effective_Date],'')
    OR COALESCE(b.[Amount], CAST(0 AS DECIMAL(38, 12))) != COALESCE(a.[Amount], CAST(0 AS DECIMAL(38, 12)))
    OR COALESCE(b.[Fund_Code], 0)        != COALESCE(a.[Fund_Code], 0)
    OR COALESCE(b.[Batch_No], 0)         != COALESCE(a.[Batch_No], 0)
    OR COALESCE(b.[Updated_Date],'')     != COALESCE(a.[Updated_Date],'')
    OR COALESCE(b.[Updated_Time],'')     != COALESCE(a.[Updated_Time],'')
    OR COALESCE(b.[Updated_Userid],'')   != COALESCE(a.[Updated_Userid],'')
    OR COALESCE(b.[Archive_Ind],'')      != COALESCE(a.[Archive_Ind],'')
    OR COALESCE(b.[Posted_Tran], 0)      != COALESCE(a.[Posted_Tran], 0)
    OR COALESCE(b.[Updated_Session], 0)  != COALESCE(a.[Updated_Session], 0)
    -- ...the column list continues...
)
THEN UPDATE SET
    [Account_Type] = b.[Account_Type]
    -- etc etc etc
WHEN NOT MATCHED BY TARGET THEN
    INSERT
    (
        -- column list
    )
    VALUES
    (
        -- b.<column list>
    )
WHEN NOT MATCHED BY SOURCE AND @FullLoadInd = 1 THEN
    DELETE
-- so on and so forth
Seeing all those column comparisons, this curious consultant started wondering: what’s the fastest way to compare multiple column values?
The Alternatives
The research began. Besides the technique above (probably the most common, as it’s pretty straightforward), here are a few other ways to do the same thing:
1. Nothing. Leave the code as is.
2. Alter the tables to add a computed BINARY_CHECKSUM column that incorporates all the table’s columns, then compare the checksum columns. (The column definition to add to the tables is shown just below.)
3. Implement the STUFF method.
4. Use CONCAT to concatenate all the values into a single value for comparison.
5. Use HASHBYTES with CONCAT to hash all the values into a single value.
6. Perform an on-the-fly BINARY_CHECKSUM comparison. This is case-sensitive.
7. Perform an on-the-fly normal CHECKSUM comparison. This is case-insensitive.

The full code snippet for each technique is listed further down, after the test setup.
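For option 2, this is the computed column definition added to each table (the same definition appears, commented out, in the test setup script below):

ALTER TABLE [AAA_DAVE_TEST_Source]
ADD ChecksumCol AS BINARY_CHECKSUM([Account_No], [Transaction_Id], [Account_Type],
    [Transaction_Type], [Branch_Code], [Pay_Method_Cd], [Trans_Date], [Effective_Date],
    [Amount], [Fund_Code], [Batch_No], [Updated_Date], [Updated_Time], [Updated_Userid],
    [Archive_Ind], [Posted_Tran], [Updated_Session])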
Setting Things Up for the Speed Test
For testing purposes, the SQL code was executed on a Windows Server 2012 machine with 128 GB of memory and a 16-core CPU rated at 2.54 GHz, running Microsoft SQL Server 2014.
To ensure SQL Server didn’t keep any queries (or anything for that matter) cached, the following code was run before each test:
CHECKPOINT
GO
DBCC DROPCLEANBUFFERS
GO
DBCC FREESESSIONCACHE
GO
DBCC FREEPROCCACHE
GO
DBCC FREESYSTEMCACHE('ALL')
GO
Two tables are created and populated with 20 million rows each, using a subset of columns from an actual table that has over 100 million records. The subset of columns has the same schema as the original table, and the primary keys are the same.
Here is the code that’s common across every test:
IF NOT EXISTS (SELECT * FROM sys.objects
               WHERE object_id = OBJECT_ID(N'[AAA_DAVE_TEST_Source]') AND type IN (N'U'))
BEGIN
    CREATE TABLE [AAA_DAVE_TEST_Source](
        [Account_No] [int] NOT NULL,
        [Transaction_Id] [int] NOT NULL,
        [Account_Type] [char](2) NOT NULL,
        [Transaction_Type] [char](4) NOT NULL,
        [Branch_Code] [char](2) NOT NULL,
        [Pay_Method_Cd] [char](3) NULL,
        [Trans_Date] [datetime] NOT NULL,
        [Effective_Date] [datetime] NULL,
        [Amount] [decimal](18, 2) NOT NULL,
        [Fund_Code] [int] NOT NULL,
        [Batch_No] [int] NULL,
        [Updated_Date] [datetime] NOT NULL,
        [Updated_Time] [char](8) NULL,
        [Updated_Userid] [char](8) NOT NULL,
        [Archive_Ind] [char](1) NULL,
        [Posted_Tran] [int] NULL,
        [Updated_Session] [int] NULL,
        CONSTRAINT [PK_AAA_DAVE_TEST_Source] PRIMARY KEY CLUSTERED
        (
            [Account_No] ASC,
            [Transaction_Id] ASC
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
                ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

    -- Only used for the test with the checksum column added
    --ALTER TABLE [AAA_DAVE_TEST_Source]
    --ADD ChecksumCol AS BINARY_CHECKSUM([Account_No], [Transaction_Id], [Account_Type],
    --    [Transaction_Type], [Branch_Code], [Pay_Method_Cd], [Trans_Date], [Effective_Date],
    --    [Amount], [Fund_Code], [Batch_No], [Updated_Date], [Updated_Time], [Updated_Userid],
    --    [Archive_Ind], [Posted_Tran], [Updated_Session])
END
ELSE
BEGIN
    TRUNCATE TABLE AAA_DAVE_TEST_Source
END

IF NOT EXISTS (SELECT * FROM sys.objects
               WHERE object_id = OBJECT_ID(N'[AAA_DAVE_TEST_Target]') AND type IN (N'U'))
BEGIN
    CREATE TABLE [AAA_DAVE_TEST_Target](
        [Account_No] [int] NOT NULL,
        [Transaction_Id] [int] NOT NULL,
        [Account_Type] [char](2) NOT NULL,
        [Transaction_Type] [char](4) NOT NULL,
        [Branch_Code] [char](2) NOT NULL,
        [Pay_Method_Cd] [char](3) NULL,
        [Trans_Date] [datetime] NOT NULL,
        [Effective_Date] [datetime] NULL,
        [Amount] [decimal](18, 2) NOT NULL,
        [Fund_Code] [int] NOT NULL,
        [Batch_No] [int] NULL,
        [Updated_Date] [datetime] NOT NULL,
        [Updated_Time] [char](8) NULL,
        [Updated_Userid] [char](8) NOT NULL,
        [Archive_Ind] [char](1) NULL,
        [Posted_Tran] [int] NULL,
        [Updated_Session] [int] NULL,
        CONSTRAINT [PK_AAA_DAVE_TEST_Target] PRIMARY KEY CLUSTERED
        (
            [Account_No] ASC,
            [Transaction_Id] ASC
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
                ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

    -- Only used for the test with the checksum column added
    --ALTER TABLE [AAA_DAVE_TEST_Target]
    --ADD ChecksumCol AS BINARY_CHECKSUM([Account_No], [Transaction_Id], [Account_Type],
    --    [Transaction_Type], [Branch_Code], [Pay_Method_Cd], [Trans_Date], [Effective_Date],
    --    [Amount], [Fund_Code], [Batch_No], [Updated_Date], [Updated_Time], [Updated_Userid],
    --    [Archive_Ind], [Posted_Tran], [Updated_Session])
END
ELSE
BEGIN
    TRUNCATE TABLE AAA_DAVE_TEST_Target
END

-- insert 20,000,000 records into the first table
INSERT INTO AAA_DAVE_TEST_Source
SELECT TOP 20000000 * FROM dbo.The_Table

-- insert 20,000,000 records into the second table
INSERT INTO AAA_DAVE_TEST_Target
SELECT TOP 20000000 * FROM dbo.The_Table

-- ensure there are differences so an update has to be performed
UPDATE AAA_DAVE_TEST_Target
SET Fund_Code = 0

DECLARE @START_TIME datetime
DECLARE @END_TIME datetime

SELECT 'Starting Test'
SET @START_TIME = GETDATE()

UPDATE a
SET  [Account_Type]     = b.[Account_Type]
    ,[Transaction_Type] = b.[Transaction_Type]
    ,[Branch_Code]      = b.[Branch_Code]
    ,[Pay_Method_Cd]    = b.[Pay_Method_Cd]
    ,[Trans_Date]       = b.[Trans_Date]
    ,[Effective_Date]   = b.[Effective_Date]
    ,[Amount]           = b.[Amount]
    ,[Fund_Code]        = b.[Fund_Code]
    ,[Batch_No]         = b.[Batch_No]
    ,[Updated_Date]     = b.[Updated_Date]
    ,[Updated_Time]     = b.[Updated_Time]
    ,[Updated_Userid]   = b.[Updated_Userid]
    ,[Archive_Ind]      = b.[Archive_Ind]
    ,[Posted_Tran]      = b.[Posted_Tran]
    ,[Updated_Session]  = b.[Updated_Session]
FROM AAA_DAVE_TEST_Target a
INNER JOIN [AAA_DAVE_TEST_Source] b
    ON  b.[Account_No]     = a.[Account_No]
    AND b.[Transaction_Id] = a.[Transaction_Id]
WHERE
(
    -- [ PLACEHOLDER ]
    -- This is the code that varies between techniques.
    -- See below for the code snippets.
)

SET @END_TIME = GETDATE()

SELECT 'Finished', @END_TIME,
       'Time To Run: ' + CAST(DATEDIFF(SECOND, @START_TIME, @END_TIME) AS varchar) + ' seconds'

-- drop the tables to also ensure nothing's cached and
-- no statistics are kept which might affect results
DROP TABLE AAA_DAVE_TEST_Source
DROP TABLE AAA_DAVE_TEST_Target
Here are the full code snippets that go in the WHERE clause’s [ PLACEHOLDER ] section for each technique:
1: Original
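This is the same COALESCE chain used in the original MERGE, abbreviated here; the full version has one COALESCE pair per non-key column:

COALESCE(b.[Account_Type],'')        != COALESCE(a.[Account_Type],'')
OR COALESCE(b.[Transaction_Type],'') != COALESCE(a.[Transaction_Type],'')
OR COALESCE(b.[Branch_Code],'')      != COALESCE(a.[Branch_Code],'')
-- ...and so on for every remaining column, as in the MERGE above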
2: Checksum Column
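With the computed ChecksumCol added to both tables (via the ALTER TABLE statements shown commented out in the setup script), the entire comparison collapses to a single predicate:

b.ChecksumCol != a.ChecksumCol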
3: STUFF
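A sketch of the STUFF pattern: prefix each value with a delimiter, concatenate, then use STUFF to strip the leading delimiter before comparing. Non-character columns need an explicit CONVERT; the exact conversions and lengths here are assumptions:

STUFF(   COALESCE('|' + b.[Account_Type], '')
       + COALESCE('|' + b.[Transaction_Type], '')
       + COALESCE('|' + CONVERT(varchar(30), b.[Amount]), '')
       -- ...remaining columns...
       , 1, 1, '')
!=
STUFF(   COALESCE('|' + a.[Account_Type], '')
       + COALESCE('|' + a.[Transaction_Type], '')
       + COALESCE('|' + CONVERT(varchar(30), a.[Amount]), '')
       -- ...remaining columns...
       , 1, 1, '')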
4: CONCAT
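A sketch of the CONCAT comparison. CONCAT implicitly converts each argument to a string and treats NULLs as empty strings; the '|' delimiter (an assumption here) guards against adjacent values running together:

CONCAT(b.[Account_Type], '|', b.[Transaction_Type], '|', b.[Branch_Code], '|',
       b.[Pay_Method_Cd], '|', b.[Trans_Date], '|', b.[Amount] /* ...remaining columns... */)
!=
CONCAT(a.[Account_Type], '|', a.[Transaction_Type], '|', a.[Branch_Code], '|',
       a.[Pay_Method_Cd], '|', a.[Trans_Date], '|', a.[Amount] /* ...remaining columns... */)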
5: HASHBYTES
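A sketch of hashing the concatenated values. The SHA2_256 algorithm is an assumption, and note that HASHBYTES caps its input at 8,000 bytes prior to SQL Server 2016:

HASHBYTES('SHA2_256', CONCAT(b.[Account_Type], '|', b.[Transaction_Type], '|',
          b.[Branch_Code] /* ...remaining columns... */))
!=
HASHBYTES('SHA2_256', CONCAT(a.[Account_Type], '|', a.[Transaction_Type], '|',
          a.[Branch_Code] /* ...remaining columns... */))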
6: BINARY_CHECKSUM
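The on-the-fly version passes the column list straight to BINARY_CHECKSUM on each side of the comparison:

BINARY_CHECKSUM(b.[Account_Type], b.[Transaction_Type], b.[Branch_Code], b.[Pay_Method_Cd],
                b.[Trans_Date], b.[Effective_Date], b.[Amount], b.[Fund_Code], b.[Batch_No],
                b.[Updated_Date], b.[Updated_Time], b.[Updated_Userid], b.[Archive_Ind],
                b.[Posted_Tran], b.[Updated_Session])
!=
BINARY_CHECKSUM(a.[Account_Type], a.[Transaction_Type], a.[Branch_Code], a.[Pay_Method_Cd],
                a.[Trans_Date], a.[Effective_Date], a.[Amount], a.[Fund_Code], a.[Batch_No],
                a.[Updated_Date], a.[Updated_Time], a.[Updated_Userid], a.[Archive_Ind],
                a.[Posted_Tran], a.[Updated_Session])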
7: CHECKSUM
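Identical to the previous snippet, but using the case-insensitive CHECKSUM function:

CHECKSUM(b.[Account_Type], b.[Transaction_Type], b.[Branch_Code] /* ...remaining columns... */)
!=
CHECKSUM(a.[Account_Type], a.[Transaction_Type], a.[Branch_Code] /* ...remaining columns... */)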
The Results Are In!
Here are the results, in seconds, as to how long each technique took to execute.
The winners: the checksum column and the on-the-fly BINARY_CHECKSUM, tied at 84 seconds per run.
#                     Time to execute, in seconds, over 3 runs
1: Original           95, 96, 95
2: Checksum Column    84, 84, 84
3: STUFF              247, 240, 244
4: CONCAT             157, 157, 155
5: HASHBYTES          254, 253, 253
6: BINARY_CHECKSUM    84, 84, 84
7: CHECKSUM           88, 88, 87
The Way To Go
I didn’t know what to expect when running the tests, so I made no guesses as to which technique might run the fastest.
I was surprised by two things:
- The persisted checksum column and the on-the-fly BINARY_CHECKSUM took exactly the same time to complete
- Both of the above ran in the same amount of time across all 3 individual runs. There was no fluctuation.
For me, the way to go will be using BINARY_CHECKSUM in the stored procedures currently being developed, because we can’t modify the 500+ tables that would otherwise need a checksum column added.
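Applied back to the original MERGE, the whole WHEN MATCHED block shrinks to a single comparison. A sketch, with the column list abbreviated:

WHEN MATCHED AND
(
    BINARY_CHECKSUM(b.[Account_Type], b.[Transaction_Type] /* ...remaining columns... */)
    != BINARY_CHECKSUM(a.[Account_Type], a.[Transaction_Type] /* ...remaining columns... */)
)
THEN UPDATE SET
    [Account_Type] = b.[Account_Type]
    -- etc etc etc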
As for everyone else, I’ve left you the SQL code above, so feel free to use it as a basis for conducting your own performance benchmarks.
If anyone else has any tricks or techniques for comparing multiple columns quickly in a MERGE or WHERE clause, definitely leave a comment below and share the knowledge! 🙂