High-Performance .NET: Async, Multithreading, and Parallel Programming
Parallel Loops in .NET
Created: 19 Jan 2026 | Updated: 19 Jan 2026

Eliminating Bottlenecks: Reducing Contention with ThreadLocal<T>

In the world of high-performance computing, the biggest enemy of speed isn't necessarily a slow algorithm—it's contention. Contention occurs when multiple threads try to access or modify the same shared resource simultaneously. This forces threads to wait for locks, effectively turning your parallel code back into slow, sequential code.

As discussed in Jeff McNamara’s Ultimate C# for High-Performance Applications, the ThreadLocal<T> class is a powerful tool to eliminate this overhead by giving every thread its own "private" workspace.

The Problem with Shared Data

When threads compete for a single variable, the synchronization overhead (locking) often costs more time than the actual calculation. To solve this, we can use Thread-Local Storage. Instead of sharing one variable, each thread gets its own instance. Once the work is done, we simply aggregate the results from all threads.
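For contrast, here is a minimal sketch of the lock-based approach described above (the class name SharedSumExample and the 1–5000 range are illustrative, not from the book): every addition must acquire the same lock, so the threads spend most of their time waiting rather than computing.

```csharp
using System;
using System.Threading.Tasks;

public class SharedSumExample
{
    public static long SumWithLock()
    {
        long total = 0;
        object gate = new object();

        Parallel.For(1, 5001, i =>
        {
            // Every thread must take the same lock for each addition,
            // so the "parallel" loop largely runs one thread at a time.
            lock (gate)
            {
                total += (long)i * i;
            }
        });

        return total;
    }
}
```

The result is correct, but the lock serializes the hot path, which is exactly the contention the thread-local strategy below avoids.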

Strategy 1: Aggregating Calculations with ThreadLocal<T>

In this example, we want to perform a complex calculation on a range of numbers. Instead of updating a global sum, each thread maintains its own running total.

Example: Parallel Squared Sums

using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public class ContentionReducer
{
    public void CalculateParallelSums()
    {
        // 'trackAllValues: true' allows us to access all thread values at the end
        using var threadLocalSum = new ThreadLocal<long>(trackAllValues: true);

        Parallel.For(1, 5001, number =>
        {
            // Each thread works on its own 'Value' without locking.
            // Integer multiplication squares the number without the
            // double round-trip of Math.Pow.
            threadLocalSum.Value += (long)number * number;
        });

        // Combine the totals from every thread used in the loop
        long grandTotal = threadLocalSum.Values.Sum();
        int threadsUsed = threadLocalSum.Values.Count;

        Console.WriteLine($"Grand Total: {grandTotal:N0}");
        Console.WriteLine($"Calculated using {threadsUsed} independent threads.");
    }
}
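Worth noting: Parallel.For also ships with an overload that builds this same pattern in, taking a localInit factory, a body that threads a worker-private subtotal through each iteration, and a localFinally step that merges each subtotal exactly once. A sketch under the same 1–5000 range (the class name BuiltInAggregation is illustrative; the overload itself is part of System.Threading.Tasks):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class BuiltInAggregation
{
    public static long SumOfSquares()
    {
        long grandTotal = 0;

        Parallel.For(1, 5001,
            // localInit: each worker thread starts with its own subtotal of 0
            () => 0L,
            // body: update the worker-private subtotal; no locking needed here
            (number, loopState, subtotal) => subtotal + (long)number * number,
            // localFinally: merge each worker's subtotal exactly once
            subtotal => Interlocked.Add(ref grandTotal, subtotal));

        return grandTotal;
    }
}
```

Because the merge runs once per worker thread rather than once per iteration, the single Interlocked.Add per thread is cheap even under heavy parallelism.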

Strategy 2: Per-Thread Object Instances

Sometimes, the contention isn't just about a number; it's about an expensive object that isn't thread-safe, like a Random generator or a StringBuilder. Creating a new instance inside every single iteration is too slow, but sharing one across threads causes errors. ThreadLocal<T> provides a perfect middle ground.
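The Random case can be sketched the same way. This is an illustrative example rather than the book's code, and the seeding scheme (using the managed thread ID) is only one assumed way to keep per-thread generators from ending up with identical seeds:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class PerThreadRandom
{
    // Each thread lazily creates its own Random the first time it reads
    // Rng.Value; seeding by thread ID keeps instances distinct even when
    // several threads start at the same tick.
    private static readonly ThreadLocal<Random> Rng =
        new ThreadLocal<Random>(() => new Random(Environment.CurrentManagedThreadId));

    public static int CountHeads(int flips)
    {
        int heads = 0;

        Parallel.For(0, flips, _ =>
        {
            // Rng.Value is private to the current thread, so Next() is safe
            if (Rng.Value.Next(2) == 1)
                Interlocked.Increment(ref heads);
        });

        return heads;
    }
}
```

On .NET 6 and later, Random.Shared offers a thread-safe shared instance as an alternative; ThreadLocal<T> remains the general-purpose pattern for any non-thread-safe type.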

Example: Parallel String Building

using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using System.Text;

public class StringProcessor
{
    public void BatchProcessStrings()
    {
        var words = new List<string> { "High", "Performance", "Parallel", "Programming", "C#" };
        // Each thread gets its own private StringBuilder
        using var localBuilder = new ThreadLocal<StringBuilder>(() => new StringBuilder(), trackAllValues: true);

        Parallel.ForEach(words, word =>
        {
            // No contention: each thread appends to its own builder
            localBuilder.Value.Append(word).Append(" processed; ");
        });

        Console.WriteLine("Consolidated Results from all threads:");
        foreach (var builder in localBuilder.Values)
        {
            Console.WriteLine($"Thread Result: {builder}");
        }
    }
}

Why Use ThreadLocal<T>?

Feature        | Shared Variable with lock               | ThreadLocal<T> Strategy
Performance    | Slow (threads wait for each other)      | Fast (no waiting during execution)
Safety         | High (synchronization ensures accuracy) | High (isolation ensures accuracy)
Complexity     | Simple, but prone to bottlenecks        | Slightly more complex aggregation
Resource Usage | Low (one variable)                      | Moderate (one variable per thread)

Best Practices to Remember

  1. Always Dispose: ThreadLocal<T> implements IDisposable. Always use a using statement or call Dispose() to free up memory once the loop is finished.
  2. Tracking Values: You must set trackAllValues: true in the constructor if you plan on accessing the .Values collection to sum or merge them after the loop.
  3. Aggregation Cost: While the loop itself becomes much faster, remember that you will need a final (usually sequential) step to combine the thread-local values. Ensure this final step doesn't become a new bottleneck.
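Point 2 above can be demonstrated directly: with the default constructor, trackAllValues is false, and reading the .Values property throws an InvalidOperationException. A small sketch (the class name is illustrative):

```csharp
using System;
using System.Threading;

public class TrackValuesDemo
{
    public static bool ValuesThrowsWithoutTracking()
    {
        // Default constructor: trackAllValues is false
        using var local = new ThreadLocal<int>();
        local.Value = 42;

        try
        {
            // Accessing .Values is only allowed when the instance was
            // constructed with trackAllValues: true
            var _ = local.Values;
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```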