9

SQL: LIKE vs SUBSTRING vs LEFT/RIGHT vs CHARINDEX

LIKE vs SUBSTRING vs LEFT/RIGHT vs CHARINDEX

All too often developers need to query a database table for records which begin and/or end with a certain textual pattern. For instance, find all employees where their first name begins with “DAV”.

In my experience, there tends to be 4 common functions SQL developers rely upon for this.

That’s when this curious consultant decided to put them head-to-head: LIKE vs SUBSTRING vs LEFT / RIGHT vs CHARINDEX to see which is the fastest.

The Pre-Game Show

For testing purposes, the SQL code was executed on a Windows 2012 Server with 128GB memory, 16 core CPU rated at 2.54 Ghz, using Microsoft SQL Server 2014.

To make things a bit more interesting, the test is split into two parts:

  • will test the speed of each against table table
  • will test the speed of each against regular “string” data

The assumption here is there will be no race conditions or multi-threaded calls to this SQL code. This is just a straight up, head-to-head test.

To ensure SQL Server didn’t keep any queries (or anything for that matter) cached, the following code was run before each test:

Two tables are created, and populated. The primary keys are the same. Three columns in each table will hold the exact same data, but:

  1. one will have a clustered index
  2. another will have a non-clustered index
  3. the third will have no index

just to see if they have any effect on performance.

A query is done using each function to search for a particular string inserted within a UniqueIdentifier (guid). The results are inserted into the secondary table to make sure we have the same number of results for each test.

The complete source code can be found at the end of this article.

Are you a Betting Individual?

Here are the results, in milli-seconds, in our LIKE vs SUBSTRING vs LEFT/RIGHT vs CHARINDEX speed tests.

The winner is highlighted in green. There are no points for second place in this speed test.

FunctionTime to execute against Clustered Indexed column, in milliseconds, over 3 runs:

# Records:

50,000500,0005,000,00050,000,000
1: LIKE46, 43, 40406, 413, 4034016, 3986, 399639940, 39756, 40423
2: SUBSTRING46, 46, 46440, 443, 4432513, 2513, 260324760, 24873, 24270
3: LEFT/RIGHT40, 43, 43406, 406, 4102523, 2526, 251624713, 24770, 24823
4: CHARINDEX10, 10, 1056, 56, 56590, 593, 5765713, 5683, 5730

 

FunctionTime to execute against Non-Clustered Indexed column, in milliseconds, over 3 runs:

# Records:

50,000500,0005,000,00050,000,000
1: LIKE43, 40, 4070, 63, 73680, 666, 6707203, 6756, 6716
2: SUBSTRING103, 100, 103256, 256, 2602750, 2750, 276327076, 26940, 27053
3: LEFT/RIGHT100, 86, 100253, 246, 2502730, 2733, 273027166, 26633, 27123
4: CHARINDEX10, 10, 1056, 53, 60590, 590, 5865810, 5763, 5690

 

FunctionTime to execute against Non-Indexed column, in milliseconds, over 3 runs:

# Records:

50,000500,0005,000,00050,000,000
1: LIKE43, 43, 4063, 66, 70670, 676, 6636790, 7153, 6726
2: SUBSTRING126, 123, 123396, 396, 3963016, 2810, 293325463, 25643, 25420
3: LEFT/RIGHT46, 46, 43246, 250, 2532800, 2496, 254624690, 24750, 24810
4: CHARINDEX10, 6, 656, 56, 60576, 590, 5735790, 5836, 6276

 

FunctionTime to execute against varchar string, in milliseconds, over 3 runs:

# Records:

50,000500,0005,000,00050,000,000
1: LIKE126, 87, 75909, 882, 8228358, 8607, 866788610, 87349, 85341
2: SUBSTRING60, 45, 69585, 582, 5765571, 5673, 567057849, 54552, 56344
3: LEFT/RIGHT45, 45, 24474, 396, 3513945, 4044, 399039969, 39135, 40919
4: CHARINDEX60, 36, 66618, 636, 5645766, 5937, 590459773, 58412, 60198

 

Look Who Owned It!

CHARINDEX is clearly the undisputed king when it comes to querying a table column looking for a value. The 2 of the other 3 didn’t even come close in speed for me to acknowledge them as “competition”. I expected LIKE to do better, especially on index columns, but was quite surprised by CHARINDEX’s domination.

When it comes to searching a varchar/string variable, LEFT/RIGHT commanded the top spot.

In a nutshell, when you need to search for a substring at the beginning or end of data:

  • when performing a query against a table column, use CHARINDEX
  • when searching within a @varchar string variable, use LEFT/RIGHT

I’ve left you the SQL code below, so feel free to use it as a basis for conducting your own performance benchmarks.

Leave a comment below and share the knowledge if you have any suggestions or other ways of doing this!

The SQL Source Code

 

(Visited 1,429 times, 3 visits today)
  • Urban Lukas

    * Sql Express 2008 R2 (SP2) x64

  • Urban Lukas

    On Sql Express x64 I get different output, LEFT/RIGHT is allways fastest.
    https://uploads.disquscdn.com/images/51b4bd47b763871da3d638d53ce5810275c6a39aa5a29b4c00d4ec3efe933976.png

    • Urban Lukas

      I tried it also on Sql Server 2008 R2 x64 with similar output.

  • Rachael Padman

    Thanks very much for this. I’ve just tried your tests on our SQL Server 2012 machine to see if we could explain its slowness on a LIKE search on about 13000 records in a non-clustered index field (takes around 400 ms to match the first character, 60 ms to match the first two).

    I had to fix two small errors in the posted code (a missing space, and a missing @ sign), and a single misreporting (of LEFT/RIGHT rather than CHARINDEX). But having done that, I got timings exactly the other way round — SUBSTR and LEFT/RIGHT factors of several *faster* than for CHARSTR and LIKE.

    I’m not an expert on MS-SQL, but find this puzzling. Any thoughts? In case it helps, the machine is a VM with 4 cores and 8GB non-dynamic memory.

    From the message:

    5 ——– NULL NULL NULL NULL ——–

    6 LIKE test against Nonclustered
    indexed column 50000 16666 2017-01-17
    14:41:39.367 2017-01-17
    14:41:39.670 303 milliseconds

    7 SUBSTRING test against
    Nonclustered indexed column 50000 16666 2017-01-17
    14:41:39.673 2017-01-17
    14:41:39.720 46 milliseconds

    8 LEFT/RIGHT test against
    Nonclustered indexed column 50000 16666 2017-01-17
    14:41:39.723 2017-01-17
    14:41:39.767 43 milliseconds

    9 CHARINDEX test against Nonclustered
    indexed column 50000 16666 2017-01-17
    14:41:39.770 2017-01-17
    14:41:40.017 246 milliseconds

  • Jon Culp

    Nice post! I ran a similar test for Like vs Left/Right and got some different results. I will go through your analysis to see what I missed in mine.

    jonculp.com/2015/12/performance-of-like-vs-left-for-indexed-and-non-indexed-columns/

    • FireMystdl

      Two things to try:
      1) my test was performed using SQL Server 2014. Microsoft altered the way it calculates its execution plan or something from versions 2012 & earlier to 2014 and later. They’ve definitely altered the cardinality estimator which affects query plan and performance. Your blog article currently doesn’t state what version of SQL Server you’re using, so that could have an impact.

      2) You have the source code I used. Try doing a blanket copy/paste into your SQL server and see what you get. 🙂