11

SQL: LIKE vs SUBSTRING vs LEFT/RIGHT vs CHARINDEX

Spread the love

LIKE vs SUBSTRING vs LEFT/RIGHT vs CHARINDEX

All too often developers need to query a database table for records which begin and/or end with a certain textual pattern. For instance, find all employees where their first name begins with “DAV”.

In my experience, there tends to be 4 common functions SQL developers rely upon for this.

That’s when this curious consultant decided to put them head-to-head: LIKE vs SUBSTRING vs LEFT / RIGHT vs CHARINDEX to see which is the fastest.

 

The Pre-Game Show

For testing purposes, the SQL code was executed on a Windows 2012 Server with 128GB memory, 16 core CPU rated at 2.54 Ghz, using Microsoft SQL Server 2014.

 

To make things a bit more interesting, the test is split into two parts:

  • will test the speed of each against table table
  • will test the speed of each against regular “string” data

 

The assumption here is there will be no race conditions or multi-threaded calls to this SQL code. This is just a straight up, head-to-head test.

To ensure SQL Server didn’t keep any queries (or anything for that matter) cached, the following code was run before each test:

Two tables are created, and populated. The primary keys are the same. Three columns in each table will hold the exact same data, but:

  1. one will have a clustered index
  2. another will have a non-clustered index
  3. the third will have no index

just to see if they have any effect on performance.

A query is done using each function to search for a particular string inserted within a UniqueIdentifier (guid). The results are inserted into the secondary table to make sure we have the same number of results for each test.

The complete source code can be found at the end of this article.

 

Are you a Betting Individual?

Here are the results, in milli-seconds, in our LIKE vs SUBSTRING vs LEFT/RIGHT vs CHARINDEX speed tests.

The winner is highlighted in green. There are no points for second place in this speed test.

Function

Time to execute against Clustered Indexed column, in milliseconds, over 3 runs:

# Records:

50,000

500,000

5,000,000

50,000,000

1: LIKE

46, 43, 40

406, 413, 403

4016, 3986, 3996

39940, 39756, 40423

2: SUBSTRING

46, 46, 46

440, 443, 443

2513, 2513, 2603

24760, 24873, 24270

3: LEFT/RIGHT

40, 43, 43

406, 406, 410

2523, 2526, 2516

24713, 24770, 24823

4: CHARINDEX

10, 10, 10

56, 56, 56

590, 593, 576

5713, 5683, 5730

 

Function

Time to execute against Non-Clustered Indexed column, in milliseconds, over 3 runs:

# Records:

50,000

500,000

5,000,000

50,000,000

1: LIKE

43, 40, 40

70, 63, 73

680, 666, 670

7203, 6756, 6716

2: SUBSTRING

103, 100, 103

256, 256, 260

2750, 2750, 2763

27076, 26940, 27053

3: LEFT/RIGHT

100, 86, 100

253, 246, 250

2730, 2733, 2730

27166, 26633, 27123

4: CHARINDEX

10, 10, 10

56, 53, 60

590, 590, 586

5810, 5763, 5690

 

Function

Time to execute against Non-Indexed column, in milliseconds, over 3 runs:

# Records:

50,000

500,000

5,000,000

50,000,000

1: LIKE

43, 43, 40

63, 66, 70

670, 676, 663

6790, 7153, 6726

2: SUBSTRING

126, 123, 123

396, 396, 396

3016, 2810, 2933

25463, 25643, 25420

3: LEFT/RIGHT

46, 46, 43

246, 250, 253

2800, 2496, 2546

24690, 24750, 24810

4: CHARINDEX

10, 6, 6

56, 56, 60

576, 590, 573

5790, 5836, 6276

 

Function

Time to execute against varchar string, in milliseconds, over 3 runs:

# Records:

50,000

500,000

5,000,000

50,000,000

1: LIKE

126, 87, 75

909, 882, 822

8358, 8607, 8667

88610, 87349, 85341

2: SUBSTRING

60, 45, 69

585, 582, 576

5571, 5673, 5670

57849, 54552, 56344

3: LEFT/RIGHT

45, 45, 24

474, 396, 351

3945, 4044, 3990

39969, 39135, 40919

4: CHARINDEX

60, 36, 66

618, 636, 564

5766, 5937, 5904

59773, 58412, 60198

 

Look Who Owned It!

CHARINDEX is clearly the undisputed king when it comes to querying a table column looking for a value. The 2 of the other 3 didn’t even come close in speed for me to acknowledge them as “competition”. I expected LIKE to do better, especially on index columns, but was quite surprised by CHARINDEX’s domination.

When it comes to searching a varchar/string variable, LEFT/RIGHT commanded the top spot.

 

In a nutshell, when you need to search for a substring at the beginning or end of data:

  • when performing a query against a table column, use CHARINDEX
  • when searching within a @varchar string variable, use LEFT/RIGHT

 

I’ve left you the SQL code below, so feel free to use it as a basis for conducting your own performance benchmarks.

Leave a comment below and share the knowledge if you have any suggestions or other ways of doing this!

 

The SQL Source Code

 


Spread the love

David Lozinski