Tag: semgrep

Catching common C# string performance fixes with Semgrep

Following up on my previous post, I’ve put together a new set of Semgrep rules focused specifically on string-related performance issues in C#.

These are the kinds of things that rarely show up in code reviews, and not everything is covered by Resharper or IDEs. The issues add up though, especially if they are in a hot path. String operations can also cause a lot of allocations, leading to plenty of work for the garbage collector. These rules are designed to be lightweight, easy to integrate into your workflow, and catch the kind of subtle inefficiencies that can quietly degrade performance over time. Some of them have autofixes, meaning you can apply the rules to your code base and Semgrep will rewrite the code for you. This is still a work in progress, but let’s go through the rules.

1. String Comparison

Calling ToLower() or ToUpper() just to compare strings is wasteful: it allocates a new string, converts every character, and only then compares. Use string.Equals(str1, str2, StringComparison.OrdinalIgnoreCase) to compare the strings without creating any temporary strings. Resharper does not flag this.

public bool ToLower_Different()
{
    // Here ToLower allocates a new string.
    return TestString1.ToLower().Equals(TestString2);
}

public bool StringEquals_OrdinalIgnoreCase_SameIgnoreCase()
{
    // Here we compare without allocating new strings.
    return string.Equals(TestString1, TestString2,
                         StringComparison.OrdinalIgnoreCase);
}

csharp-inefficient-string-comparison.yaml

rules:
  - id: csharp-inefficient-string-comparison
    patterns:
      - pattern-either:
          - pattern: $STR.ToLower().Equals($OTHER)
          - pattern: $STR.ToLowerInvariant().Equals($OTHER)
          - pattern: $STR.ToUpper().Equals($OTHER)
          - pattern: $STR.ToUpperInvariant().Equals($OTHER)
      - pattern-not: String.Equals($STR, $OTHER, StringComparison.OrdinalIgnoreCase)
    message: >
      Inefficient string comparison. Use String.Equals(s1, s2, StringComparison.OrdinalIgnoreCase) 
      instead of ToLower()/ToUpper().Equals() for better performance and clarity.
    fix: String.Equals($STR, $OTHER, StringComparison.OrdinalIgnoreCase)
    languages: [csharp]
    severity: WARNING
    metadata:
      category: performance
      subcategory:
      - easyfix
      - strings
      references:
      - "https://blog.smistad.me/semgrep-rules-for-c-performance/"
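
One caveat worth checking before applying the autofix: the two forms are not strictly equivalent. The sketch below uses hypothetical inputs (not taken from the benchmark project) to show that ToLower().Equals(...) only matches when the right-hand side is already lower case, while the OrdinalIgnoreCase comparison ignores case on both sides. ToLower() also uses the current culture, which the ordinal comparison does not.

```csharp
using System;

// Hypothetical inputs, chosen only to illustrate the difference.
string left = "HELLO";
string right = "Hello";

// Lowercases only the left side, then does a case-sensitive comparison.
bool viaToLower = left.ToLower().Equals(right);

// Ignores case on both sides and allocates no temporary string.
bool viaOrdinalIgnoreCase = string.Equals(left, right, StringComparison.OrdinalIgnoreCase);

Console.WriteLine($"ToLower().Equals: {viaToLower}");
Console.WriteLine($"OrdinalIgnoreCase: {viaOrdinalIgnoreCase}");
```

If the original code relied on the right-hand side being lower case already, the fix can change behaviour, so it is probably worth reviewing each match rather than applying the fix blindly.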

2. Avoid string.Format for cases where interpolation is enough

string.Format adds overhead and is harder to read. Interpolation ($"...") is faster and allocation-free in simple cases. For more complex formatting you should continue to use string.Format, but where it is used for basic string concatenation you should switch to string interpolation. Resharper suggests this fix when you use string.Format, but not when you use string.Concat.

public string Format()
{
    // Resharper suggests switching to interpolation
    return string.Format("{0} {1} {2}", Left, Right, Middle);
}

public string Interpolation()
{
    return $"{Left} {Right} {Middle}";
}

public string Concat()
{
    // No suggestion to fix this from Resharper
    return string.Concat(Left, " ", Right, " ", Middle);
}

Method	Mean	Ratio	Allocated
Interpolation	0.4472 ns	1.00	–
Concat	19.2632 ns	43.16	56 B
Format	44.2375 ns	99.12	56 B

Interpolation is much faster than the alternatives. The benchmark here is the code shown above, so I suspect the interpolation simply gets optimized away in the end. This will also apply to your actual code in situations where you use it for simple string concatenation. It also causes fewer allocations than the alternatives.

The reason is that interpolation avoids parsing the format string at run time when each placeholder just refers to a variable. So this improvement only really applies to simple use cases.

csharp-string-format-to-interpolation.yaml

The regex for detecting more complex format parameters is not quite working yet, so this rule currently picks up some false positives.

rules:
  - id: csharp-string-format-to-interpolation
    languages: [csharp]
    severity: WARNING
    message: "Use string interpolation ($\"...\") instead of string.Format for simple cases"
    metadata:
      description: "Detects simple string.Format calls that could be replaced with string interpolation"
      category: "performance"
      references:
        - "https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/tokens/interpolated"
        - "https://blog.smistad.me/semgrep-rules-for-c-performance/"
      technology:
        - csharp
      subcategory:
        - easyfix
        - strings
    pattern-either:
      - pattern: string.Format("$FMT", $A1)
      - pattern: string.Format("$FMT", $A1, $A2)
      - pattern: string.Format("$FMT", $A1, $A2, $A3)
      - pattern: string.Format("$FMT", $A1, $A2, $A3, $A4)
    pattern-not-regex: \{\d+:[^}]+\}
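
To make the "simple case" distinction concrete, here is a minimal sketch with hypothetical values. It checks that string.Format and interpolation produce the same text for plain positional placeholders, and runs the regex from the rule against two format strings to show which kind of placeholder it is meant to recognise. How Semgrep applies pattern-not-regex to the matched code is not modelled here; this only exercises the regex itself.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical values for illustration.
string left = "a", right = "b", middle = "c";

// Simple positional placeholders: both forms produce the same text.
string formatted = string.Format("{0} {1} {2}", left, right, middle);
string interpolated = $"{left} {right} {middle}";

// A placeholder with a format specifier, the kind the rule tries to leave alone.
string padded = string.Format("{0:D3}", 7);

// The same regex as in the rule, applied to two format strings.
var specifier = new Regex(@"\{\d+:[^}]+\}");
bool simpleHasSpecifier = specifier.IsMatch("{0} {1} {2}");
bool paddedHasSpecifier = specifier.IsMatch("{0:D3}");

Console.WriteLine(formatted == interpolated);
Console.WriteLine(padded);
Console.WriteLine($"{simpleHasSpecifier} {paddedHasSpecifier}");
```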

3. Use AsSpan() Instead of Substring()

In some cases we can avoid allocating a new string with string.Substring() and instead use .AsSpan().

Typical cases where we can avoid the allocation are inputs to the int/double/Guid.Parse() methods, or comparing a substring to a string literal.

// allocates new string
int.Parse(tName.Substring("VariantArray".Length));
if (s.Substring(i) == "INF")

// Using AsSpan()
int.Parse(tName.AsSpan("VariantArray".Length));
if (s.AsSpan(i).SequenceEqual("INF"))
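
As a quick sanity check, the sketch below runs both variants on hypothetical inputs (the values of tName, s and i are made up to resemble the snippets above) and confirms they give the same results. It assumes a runtime where int.Parse accepts a ReadOnlySpan<char>, which is the case on modern .NET.

```csharp
using System;

// Hypothetical inputs modelled on the snippets above.
string tName = "VariantArray42";
string s = "+INF";
int i = 1;

// Substring allocates a new string before parsing or comparing.
int parsedFromSubstring = int.Parse(tName.Substring("VariantArray".Length));
bool equalViaSubstring = s.Substring(i) == "INF";

// AsSpan creates a view over the same characters without copying them.
int parsedFromSpan = int.Parse(tName.AsSpan("VariantArray".Length));
bool equalViaSpan = s.AsSpan(i).SequenceEqual("INF");

Console.WriteLine(parsedFromSubstring == parsedFromSpan);
Console.WriteLine(equalViaSubstring == equalViaSpan);
```

Note that AsSpan returns a ReadOnlySpan<char>, not a string, so the swap only works where the consumer accepts a span.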

Here Resharper will suggest using a range index instead of Substring, but this is actually slower than calling Substring. It does not avoid any allocations either, unlike .AsSpan().

[Benchmark(Baseline = true)]
public string Substring()
{
    return "this is my wonderful string".Substring("this".Length);
}

[Benchmark]
public ReadOnlySpan<char> AsSpan()
{
    return "this is my wonderful string".AsSpan("this".Length);
}

[Benchmark]
public string RangeIndex()
{
    // Resharper will suggest changing your code to this
    return "this is my wonderful string"["this".Length..];
}

Method	Mean	Ratio	Allocated
AsSpan	0.2085 ns	0.05	–
Substring	4.5797 ns	1.00	72 B
RangeIndex	6.5919 ns	1.44	72 B

csharp-substring.yaml

rules:
  - id: csharp-avoid-substring-for-span-accepting-methods
    languages: [csharp]
    message: Use AsSpan instead of Substring to avoid string allocations when passing to methods accepting ReadOnlySpan<char>.
    severity: WARNING
    metadata:
      category: performance
      subcategory:
        - easyfix
        - strings
      likelihood: LOW
      impact: LOW
    patterns:
      - pattern: $METHOD($STR.Substring($IDX))
      - metavariable-regex:
          metavariable: $METHOD
          regex: >
            (int|float|double|decimal|uint|long|bool|Guid|DateTime|DateTimeOffset)\.(Parse(Exact)?|TryParse(Exact)?)
    fix: $METHOD($STR.AsSpan($IDX))

  - id: csharp-avoid-substring-for-suffix
    pattern: $STR.Substring($IDX)
    message: Use AsSpan instead of Substring to avoid string allocations.
    languages: [csharp]
    severity: INFO
    metadata:
      category: performance
      subcategory:
      - easyfix
      - strings
      likelihood: LOW
      impact: LOW
    fix: $STR.AsSpan($IDX)

  - id: csharp-avoid-substring-equals
    pattern: $STR.Substring($IDX) == "$SUFFIX"
    message: Use AsSpan(...).SequenceEqual("...") instead of Substring == "..." for performance.
    languages: [csharp]
    severity: WARNING
    metadata:
      category: performance
      subcategory:
      - easyfix
      - strings
      likelihood: LOW
      impact: LOW
    fix: $STR.AsSpan($IDX).SequenceEqual("$SUFFIX")

4. Optimize UTF-8 Transcoding

This one does not avoid any allocations, but it saves some CPU cycles. It is also caught by Resharper.

return Encoding.UTF8.GetBytes("ThIs A StRiNG");
// Can be shortened to this:
return "ThIs A StRiNG"u8.ToArray();
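
A small sketch to confirm that the two forms yield the same bytes. It assumes C# 11 or later, since that is where the u8 suffix was introduced; the variable names are only for illustration.

```csharp
using System;
using System.Text;

// Transcodes from UTF-16 to UTF-8 at run time.
byte[] encoded = Encoding.UTF8.GetBytes("ThIs A StRiNG");

// The u8 suffix makes the compiler emit the UTF-8 bytes directly;
// ToArray() still allocates the resulting array.
byte[] literal = "ThIs A StRiNG"u8.ToArray();

bool sameBytes = encoded.AsSpan().SequenceEqual(literal);
Console.WriteLine(sameBytes);
```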

csharp-avoid-transcoding.yaml

rules:
- id: csharp-avoid-transcoding
  patterns:
  - pattern-either:
    - pattern: Encoding.UTF8.GetBytes("$STR")
  message: Use a u8 string literal to avoid transcoding at run time.
  fix: '"$STR"u8.ToArray()'
  languages: [csharp]
  severity: WARNING
  metadata:
    category: performance
    subcategory:
    - easyfix
    - strings
    likelihood: LOW
    impact: LOW

Want to see all the rules and benchmarks? Check out csharp-semgrep-performance on GitHub.

2025-04-19

Semgrep rules for C# performance

Performance isn’t just about making users happy, though. It makes our lives as developers way better too. Think about your typical day: you’re constantly running your code, debugging, testing, and then doing it all over again. When your code runs faster, you spend less time waiting and more time actually coding. Nobody enjoys staring at a spinning wheel while the tests run or the debugger loads up. Those small delays mess with your flow and make development less fun.

I was thinking of using Semgrep to catch a lot of small, easy-to-fix performance issues. So I want to share the rules I make, so that maybe somebody else can use them too.

These rules will cover the small cases, but sometimes performance issues can be death by a thousand cuts. Garbage collection can be a real killer for performance, so a lot of the rules will try to cover things where there is some alternative that requires fewer or no allocations.

Stop Converting Strings Just to Compare Them

Strings are everywhere in our code, and the way we compare them can make a surprising difference in performance. Here’s our first rule that catches a really common mistake.

We’ve all done this at some point:
```
// The slow way
if (someString.ToLower().Equals(otherString))
{
    // Do something
}
```
Or maybe this version:
```
// Also slow
if (someString.ToUpper().Equals(otherString))
{
    // Do something
}
```
What’s wrong with this? A few things:
- It creates a whole new string just for the comparison
- It wastes memory for this temporary string
- It has to convert every character before it even starts comparing

The Better Way

There’s a much faster way to do the same thing in C#:
```
if (String.Equals(someString, otherString, StringComparison.OrdinalIgnoreCase))
{
    // Do something
}
```
This skips creating new strings completely and just does the comparison directly.

The Semgrep Rule

Here’s the rule I made to catch this in your code:
```
rules:
  - id: csharp-inefficient-string-comparison
    patterns:
      - pattern-either:
          - pattern: $STR.ToLower().Equals($OTHER)
          - pattern: $STR.ToLowerInvariant().Equals($OTHER)
          - pattern: $STR.ToUpper().Equals($OTHER)
          - pattern: $STR.ToUpperInvariant().Equals($OTHER)
      - pattern-not: String.Equals($STR, $OTHER, StringComparison.OrdinalIgnoreCase)
    message: >
      Inefficient string comparison. Use String.Equals(s1, s2, StringComparison.OrdinalIgnoreCase) 
      instead of ToLower()/ToUpper().Equals() for better performance and clarity.
    languages: [csharp]
    severity: WARNING
    metadata:
      category: performance
      subcategory:
      - easyfix
      - strings
```
This catches all four ways people typically do the slow comparison, but it won’t bug you if you’re already doing it the right way.

How Much Faster Is It Really?

I ran some benchmarks to see exactly how big the difference is:
```
| Method                                        | Mean       | Allocated |
|---------------------------------------------- |-----------:|----------:|
| StringEquals_OrdinalIgnoreCase_SameIgnoreCase |  0.0138 ns |         - |
| StringEquals_OrdinalIgnoreCase_Different      |  0.0327 ns |         - |
| ToUpperInvariant_Different                    | 16.4798 ns |      56 B |
| ToLowerInvariant_Different                    | 16.6340 ns |      56 B |
| ToLowerInvariant_SameIgnoreCase               | 17.4380 ns |      56 B |
| ToUpper_Different                             | 18.7223 ns |      56 B |
| ToLower_Different                             | 19.5512 ns |      56 B |
| ToLower_SameIgnoreCase                        | 21.0037 ns |      56 B |
| ToUpperInvariant_SameIgnoreCase               | 29.8586 ns |     112 B |
| ToUpper_SameIgnoreCase                        | 34.5027 ns |     112 B |
```
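If you want a rough feel for the allocation column without setting up a benchmark project, something like the sketch below can help. It is not the benchmark code behind the table: the inputs and the MeasureAllocatedBytes helper are hypothetical, and it only reads GC.GetAllocatedBytesForCurrentThread() before and after a single call. A proper benchmark harness gives far more reliable numbers, so treat the output as indicative only.

```csharp
using System;

// Hypothetical inputs; any mixed-case strings would do.
string someString = "Hello World";
string otherString = "hello world";

// Bytes allocated on this thread while running the given comparison once.
long MeasureAllocatedBytes(Func<bool> comparison)
{
    long before = GC.GetAllocatedBytesForCurrentThread();
    comparison();
    return GC.GetAllocatedBytesForCurrentThread() - before;
}

// Warm up so one-time JIT work is less likely to be counted.
MeasureAllocatedBytes(() => someString.ToLower().Equals(otherString));
MeasureAllocatedBytes(() => string.Equals(someString, otherString, StringComparison.OrdinalIgnoreCase));

long toLowerBytes = MeasureAllocatedBytes(() => someString.ToLower().Equals(otherString));
long ordinalBytes = MeasureAllocatedBytes(() => string.Equals(someString, otherString, StringComparison.OrdinalIgnoreCase));

Console.WriteLine($"ToLower().Equals allocated {toLowerBytes} bytes");
Console.WriteLine($"OrdinalIgnoreCase allocated {ordinalBytes} bytes");
```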

Why Should You Care?

“But it’s just nanoseconds,” you might say. True, but:
- In a busy app, you might do these comparisons millions of times
- Every little memory allocation makes the garbage collector work harder
- These tiny slowdowns add up across your whole codebase

This is just the first of several performance-boosting rules I’m working on. If you add these to your workflow, you’ll catch these speed bumps before they slow down your code.

Want to see all the rules and benchmarks? Check out csharp-semgrep-performance on GitHub.
2025-04-15
