Skip to content

[Analyzer Proposal] Code analyzer that detects unnecessary allocations made for dictionary lookups #51221

@Alex-Sob

Description

@Alex-Sob

Analyzer proposal

I would like to suggest an idea for a code analyzer that can detect if an allocation is made only to perform a dictionary or hash set lookup where the type of key is string. For example, in the following cases an allocation can be avoided:

var dictionary = new Dictionary<string, string>();
var key = "...";
var keySpan = key.AsSpan();

dictionary[key.Substring(1)] = value;
dictionary[key.Remove(1)] = value;
dictionary[key[..2]] = value;
dictionary[keySpan.ToString()] = value;

The analyzer could recommend to avoid allocation by creating a ReadOnlySpan<char> instead and using GetAlternateLookup<ReadOnlySpan<char>> method exposed by various dictionary types as well as HashSet<T>.

In case if ToLower{Invariant} or ToUpper{Invariant} is called on a key to perform a lookup, it might even be easier to specify StringComparer when dictionary is constructed.

Examples

For example, for a code like this:

var dictionary = new Dictionary<string, string>();
...
var key = smth.Substring(1);
if (dictionary.TryGetValue(key, out var value) { ... }

the analyzer could recommend using GetAlternateLookup:

var dictionary = new Dictionary<string, string>();
...
var key = smth.AsSpan(1);
var alternate = dictionary.GetAlternateLookup<ReadOnlySpan<char>>();
if (alternate.TryGetValue(key, out var value) { ... }

And in this case:

var dictionary = new Dictionary<string, string>();
...
var key = smth.ToLowerInvariant();
if (dictionary.TryGetValue(key, out var value) { ... }

it could recommend using StringComparer.InvariantCultureIgnoreCase if it's possible:

var dictionary = new Dictionary<string, string>(StringComparer.InvariantCultureIgnoreCase);
...
var key = smth;
if (dictionary.TryGetValue(key, out var value) { ... }

Analysis

The analyzer can look for the following calls on instances of Dictionary<string, V>, ConcurrentDictionary<string, V>, and FrozenDictionary<string, V> types:

  • Indexer access
  • Add, ContainsKey, Remove, TryAdd, TryGetValue calls

as well as for calls to Add, Contains, Remove or TryGetValue on instances of HashSet<string>.

Then it can look at expression used as a key for a lookup. If it's a variable, it may continue to analyze assignments to that variable. In case of a simple expression like a conditional expression or coalesce operation, it may also analyze parts of that expression recursively. The analyzer could report diagnostic if an inline expression passed to an indexer/method or an expression assigned to a variable contains one of these calls:

  • Substring, Remove, Replace, Trim, TrimStart, TrimEnd, ToLower{Invariant}, ToUpper{Invariant}
  • Range expression
  • ToString call on a {ReadOnly}Span<char>

While analyzing assignments to a variable, the analyzer should probably make sure that the variable is not used anywhere except in the lookup. For example, if the variable is passed to some API that takes a string, the analyzer cannot recommend using Span instead of string:

key = smth.Substring(1);
Use(key); // some API that takes a string
if (dictionary.TryGetValue(key, out var value) { ... }

Real examples

I prototyped such an analyzer and ran it as a standalone analysis tool against projects in dotnet/runtime repo. Here are some real examples:

Metadata

Metadata

Assignees

No one assigned

    Labels

    untriagedRequest triage from a team member

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions