C# .Net: Fastest Way to clear Collections
In this article I’ll investigate different ways to clear out commonly used Collections such as Hashsets, Dictionaries, ConcurrentDictionaries, ArrayLists, and Arrays, benchmarking the results. This will determine in C# .Net: Fastest way to clear Collections.
In almost every major C# application, there’s at least one of the following Collection Types used:
- Array (A)
- ArrayList (AL)
- ConcurrentDictionary (CD)
- Dictionary (D)
- Hashset (H)
The majority of the programmers behind said applications tend to be “lazy”. That is, not explicitly cleaning up and clearing out their data when finished. Rather, they leave it to the internals of the C# runtime to do so.
That’s when this curious consultant started wondering in C# .Net: what is the fastest way to clear Collections while minimizing overall performance impact? Is it using their .Clear() methods? Iterating over each element setting it to null? Another way?
The Set Up:
I wrote a C# Console application to test numerous techniques.
The code is written in Visual Studio 2013 targeting .Net Framework version 4.5 x64. The source code is available at the end of this blog so you can benchmark it on your own system if you wish.
In a nutshell, here are the methods:
# |
Technique |
Code Snippet |
||
A1 |
Set each array element = null |
|
||
A2 |
Array.Clear |
|
||
A3 |
Set each array element = null using Parallel.For |
|
||
AL1 |
Remove each ArrayList item |
|
||
AL2 |
ArrayList.Clear() |
|
||
AL3 |
Remove each ArrayList item using Parallel.For |
|
||
CD1 |
Set each item in ConcurrentDictionary = null |
|
||
CD2 |
ConcurrentDictionary.Clear() |
|
||
CD3 |
Set each item in ConcurrentDictionary = null using Parallel.For |
|
||
D1 |
Set each item in Dictionary = null |
|
||
D2 |
Dictionary.Clear() |
|
||
D3 |
Set each item in Dictionary = null using Parallel.For |
|
||
H1 |
Remove each item from the Hashset |
|
||
H2 |
Hashset.Clear() |
|
||
H3 |
Remove each item from the Hashset using Parallel.For |
|
The exe file was run on Windows 7 64-bit with 16 GB memory.
The code assumes all numbers are positive integers.
The test was run for each technique for 10,737,418 and 2,147,483 integer “objects”.
Let the Benchmarks… BEGIN!
Before starting, my hypothesis was that I expected the .Clear() methods to be the fastest. They’re certainly the easiest to use, only needing one line of code.
Let’s see what happened on my machine over multiple runs.
All times are indicated in minutes:seconds.milliseconds format. Lower numbers indicate faster performance.
Winner marked in green; there are no points for second place.
BENCHMARK RUN #1 |
@ 10,737,418 |
@ 2,147,483 |
A1: o[x] = null |
00:00.0320018 |
00:00.0050003 |
A2: Array.Clear() |
00:00.0050003 |
00:00.0010000 |
A3: o[x] = null w/ Parallel.For |
00:00.0310017 |
00:00.0060004 |
|
||
AL1: al1.Remove() |
– too long – |
– too long – |
AL2: ArrayList.Clear() |
00:00.0070004 |
00:00.0020001 |
AL3: al1.Remove() w/ Parallel.For |
– too long – |
– too long – |
|
||
CD1: cd1[x] = null |
00:00.6440368 |
00:00.1270073 |
CD2: ConcurrentDictionary.Clear() |
00:00.2070118 |
00:00.0470027 |
CD3: cd1[x] = null w/ Parallel.For |
00:00.1880108 |
00:00.0440025 |
|
||
D1: d1[x] = null |
00:00.2160124 |
00:00.0420024 |
D2: Dictionary.Clear() |
00:00.0430025 |
00:00.0080004 |
D3: d1[x] = null w/ Parallel.For |
00:00.8700497 |
00:00.1870107 |
|
||
H1: h1.Remove() |
00:00.4250243 |
00:00.0870050 |
H2: Hashset.Clear() |
00:00.0240014 |
00:00.0050003 |
H3: h1.Remove() w/ Parallel.For |
00:01.8021030 |
00:00.2790159 |
BENCHMARK RUN #2 |
@ 10,737,418 |
@ 2,147,483 |
A1: o[x] = null |
00:00.0370013 |
00:00.0050002 |
A2: Array.Clear() |
00:00.0050001 |
00:00.0010000 |
A3: o[x] = null w/ Parallel.For |
00:00.0320010 |
00:00.0060009 |
|
||
AL1: al1.Remove() |
– too long – |
– too long – |
AL2: ArrayList.Clear() |
00:00.0070006 |
00:00.0020005 |
AL3: al1.Remove() w/ Parallel.For |
– too long – |
– too long – |
|
||
CD1: cd1[x] = null |
00:00.6451269 |
00:00.1260902 |
CD2: ConcurrentDictionary.Clear() |
00:00.2108122 |
00:00.0469031 |
CD3: cd1[x] = null w/ Parallel.For |
00:00.1854128 |
00:00.0420175 |
|
||
D1: d1[x] = null |
00:00.2160572 |
00:00.0430008 |
D2: Dictionary.Clear() |
00:00.0440257 |
00:00.0070009 |
D3: d1[x] = null w/ Parallel.For |
00:00.8809255 |
00:00.1902134 |
|
||
H1: h1.Remove() |
00:00.4260029 |
00:00.0830387 |
H2: Hashset.Clear() |
00:00.0240008 |
00:00.0050002 |
H3: h1.Remove() w/ Parallel.For |
00:01.7945270 |
00:00.2794756 |
Only One Surprise in the Results
Well, everything performed as expected except for the ConcurrentDictionary object. For whatever reason, the native .Clear() method ran slower than a Parallel.For loop with locking! Obviously in the underlying native code Microsoft didn’t implement parallelism. Even I could work for Microsoft by having such simple code run faster than their native methods. 😉
Otherwise, no tears were shed for any of the other results. Using the native .Clear() methods are not only the fastest, but certainly the simplest code wise.
Obviously results may vary, and you should test on your system before micro-optimizing this functionality in your C# .Net application.
To All the “Lazy” Programmers
Really… you have no excuse now to add one line of code to clear these collections when you’re done using it.
The code below so you can run your own benchmarks to see I’m not making this stuff up. 🙂
The Code:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 |
using System; namespace TestApplication { class Program { static void Main(string[] args) { DateTime end; DateTime start = DateTime.Now; Console.WriteLine("### Overall Start Time: " + start.ToLongTimeString()); Console.WriteLine(); FastestWayToClearDataStructures(1); FastestWayToClearDataStructures(2); FastestWayToClearDataStructures(3); end = DateTime.Now; Console.WriteLine(); Console.WriteLine("### Overall End Time: " + end.ToLongTimeString()); Console.WriteLine("### Overall Run Time: " + (end - start)); Console.WriteLine(); Console.WriteLine("Hit Enter to Exit"); Console.ReadLine(); } //Does a comparison of clearing out Hashsets, Dictionaries, ConcurrentDictionaries, Arrays, ArrayLists static void FastestWayToClearDataStructures(int action) { int MAX = (int)Math.Floor((double)(Int32.MaxValue / 200)); Console.WriteLine("######## " + System.Reflection.MethodBase.GetCurrentMethod().Name + " Action: " + action); Console.WriteLine("######## Creating " + MAX + " objects to add to the data structures."); Dictionary<int, object> d1 = new Dictionary<int, object>(MAX); ConcurrentDictionary<int, object> cd1 = new ConcurrentDictionary<int, object>(MAX, MAX); HashSet<object> h1 = new HashSet<object>(); object[] o1 = new object[MAX]; ArrayList al1 = new ArrayList(MAX); DateTime end; DateTime start = DateTime.Now; Console.WriteLine("Starting Add to everything: " + start); for (int x = 0; x < MAX; x++) { d1.Add(x, (object)x); cd1.TryAdd(x, (object)x); h1.Add((object)x); o1[x] = (object)x; al1.Add((object)x); } end = DateTime.Now; Console.WriteLine("Finished at: " + end); Console.WriteLine("Time: " + (end - start)); Console.WriteLine(); if (action == 1) // #### clear each element individually { Console.WriteLine("Starting Single Thread For Clearing from Dictionary: " + DateTime.Now); start = DateTime.Now; for (int x = 0; x < MAX; x++) { d1[x] = null; } end = DateTime.Now; GC.Collect(); Console.WriteLine("Finished at: " + end); Console.WriteLine("Time: " + (end - start)); Console.WriteLine(); Console.WriteLine("Starting Single Thread For Clearing from ConcurrentDictionary: " + DateTime.Now); start = DateTime.Now; for (int x = 0; x < MAX; x++) { cd1[x] = null; } end = DateTime.Now; GC.Collect(); Console.WriteLine("Finished at: " + end); Console.WriteLine("Time: " + (end - start)); Console.WriteLine(); Console.WriteLine("Starting Single Thread For Clearing from HashSet: " + DateTime.Now); start = DateTime.Now; for (int x = 0; x < MAX; x++) { h1.Remove((object)x); } end = DateTime.Now; GC.Collect(); Console.WriteLine("Finished at: " + end); Console.WriteLine("Time: " + (end - start)); Console.WriteLine(); Console.WriteLine("Starting Single Thread For Clearing from Array: " + DateTime.Now); start = DateTime.Now; for (int x = 0; x < MAX; x++) { o1[x] = null; } end = DateTime.Now; GC.Collect(); Console.WriteLine("Finished at: " + end); Console.WriteLine("Time: " + (end - start)); Console.WriteLine(); /***** * Commenting this out because it took over 30 minutes and still didn't complete Console.WriteLine("Starting Single Thread For Clearing from ArrayList: " + DateTime.Now); start = DateTime.Now; for (int x = 0; x < MAX; x++) { al1.Remove((object)x); } end = DateTime.Now; GC.Collect(); Console.WriteLine("Finished at: " + end); Console.WriteLine("Time: " + (end - start)); Console.WriteLine(); * * **/ } else if (action == 2) //#### using the .Clear method { Console.WriteLine("Starting Clearing from Dictionary using .Clear(): " + DateTime.Now); start = DateTime.Now; d1.Clear(); end = DateTime.Now; GC.Collect(); Console.WriteLine("Finished at: " + end); Console.WriteLine("Time: " + (end - start)); Console.WriteLine(); Console.WriteLine("Starting Clearing from ConcurrentDictionary using .Clear(): " + DateTime.Now); start = DateTime.Now; cd1.Clear(); end = DateTime.Now; GC.Collect(); Console.WriteLine("Finished at: " + end); Console.WriteLine("Time: " + (end - start)); Console.WriteLine(); Console.WriteLine("Starting Clearing from HashSet using .Clear(): " + DateTime.Now); start = DateTime.Now; h1.Clear(); end = DateTime.Now; GC.Collect(); Console.WriteLine("Finished at: " + end); Console.WriteLine("Time: " + (end - start)); Console.WriteLine(); Console.WriteLine("Starting Clearing from Array using ArrayClear: " + DateTime.Now); start = DateTime.Now; Array.Clear(o1, 0, o1.Length); end = DateTime.Now; GC.Collect(); Console.WriteLine("Finished at: " + end); Console.WriteLine("Time: " + (end - start)); Console.WriteLine(); Console.WriteLine("Starting Clearing from ArrayList using .Clear(): " + DateTime.Now); start = DateTime.Now; al1.Clear(); end = DateTime.Now; GC.Collect(); Console.WriteLine("Finished at: " + end); Console.WriteLine("Time: " + (end - start)); Console.WriteLine(); } else if (action == 3) // #### clear each element in parallel loops { object o = new object(); Console.WriteLine("Starting Parallel For Clearing from Dictionary: " + DateTime.Now); start = DateTime.Now; Parallel.For(0, MAX, x => { lock (o) //because it's not thread safe { d1[x] = null; } }); end = DateTime.Now; GC.Collect(); Console.WriteLine("Finished at: " + end); Console.WriteLine("Time: " + (end - start)); Console.WriteLine(); Console.WriteLine("Starting Parallel For Clearing from ConcurrentDictionary: " + DateTime.Now); start = DateTime.Now; Parallel.For(0, MAX, x => { cd1[x] = null; }); end = DateTime.Now; GC.Collect(); Console.WriteLine("Finished at: " + end); Console.WriteLine("Time: " + (end - start)); Console.WriteLine(); Console.WriteLine("Starting Parallel For Clearing from HashSet: " + DateTime.Now); start = DateTime.Now; Parallel.For(0, MAX, x => { lock (o) //because it's not thread safe { h1.Remove((object)x); } }); end = DateTime.Now; GC.Collect(); Console.WriteLine("Finished at: " + end); Console.WriteLine("Time: " + (end - start)); Console.WriteLine(); Console.WriteLine("Starting Parallel For Clearing from Array: " + DateTime.Now); start = DateTime.Now; Parallel.For(0, MAX, x => { //arrays are thread safe if only one thread is writing to an index o1[x] = null; }); end = DateTime.Now; GC.Collect(); Console.WriteLine("Finished at: " + end); Console.WriteLine("Time: " + (end - start)); Console.WriteLine(); /***** * Commenting this out because it took over 30 minutes and still didn't complete Console.WriteLine("Starting Parallel For Clearing from ArrayList: " + DateTime.Now); start = DateTime.Now; Parallel.For(0, MAX, x => { lock (o) { al1.Remove((object)x); } }); end = DateTime.Now; GC.Collect(); Console.WriteLine("Finished at: " + end); Console.WriteLine("Time: " + (end - start)); Console.WriteLine(); * * **/ } //Clean up d1 = null; cd1 = null; //cd2 = null; h1 = null; o1 = null; al1 = null; GC.Collect(); } } } |