Pavel Anikhouski

Benchmark results (using https://github.com/dotnet/BenchmarkDotNet)

Reading the whole file at once, or line by line, with the default methods allocates 30/60/100 MB of memory. Parsing from a reusable Span<T> buffer instead cuts allocations by a factor of hundreds and runs 2-2.5 times faster.
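
The baseline code behind the "default methods" figures is not shown here. As a rough idea of what a line-by-line version might look like, here is a minimal sketch. The names `Baseline` and `MostFrequent` are hypothetical, and the input is assumed to hold one non-negative integer per line. `File.ReadLines` streams the file lazily, but every line still becomes a new string, which is where most of the allocations would come from.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class Baseline
{
    // Hypothetical baseline, not the code that was benchmarked:
    // allocates one string per line before parsing it.
    public static int MostFrequent(string fileName)
    {
        var counts = new Dictionary<int, int>();
        foreach (var line in File.ReadLines(fileName))
        {
            if (line.Length == 0)
            {
                continue; // skip blank lines
            }

            var value = int.Parse(line);
            counts.TryGetValue(value, out var seen); // seen is 0 when the key is missing
            counts[value] = seen + 1;
        }

        // Pick the value with the highest count; the first one enumerated wins a tie.
        int best = 0;
        int bestCount = 0;
        foreach (var pair in counts)
        {
            if (pair.Value > bestCount)
            {
                bestCount = pair.Value;
                best = pair.Key;
            }
        }

        return best;
    }
}
```

Running both versions under the same `[MemoryDiagnoser]` setup would show the difference in the Allocated column.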

The most frequent numbers (for 100, 10k and 1M integers):

546 (several numbers are tied here, each appearing twice)

284

142
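
Because the 100-number file has several values tied for the highest count, a single returned number hides the others. A small sketch of how all tied values could be collected from a dictionary of counts is shown below. `Modes` and `AllMostFrequent` are hypothetical names and are not part of the benchmarked code.

```csharp
using System.Collections.Generic;

public static class Modes
{
    // Returns every key whose count equals the highest count in the dictionary.
    // An empty dictionary yields an empty list.
    public static List<int> AllMostFrequent(IReadOnlyDictionary<int, int> counts)
    {
        var result = new List<int>();
        int max = 0;
        foreach (var pair in counts)
        {
            if (pair.Value > max)
            {
                // New highest count: discard the earlier candidates.
                max = pair.Value;
                result.Clear();
                result.Add(pair.Key);
            }
            else if (pair.Value == max)
            {
                // Tie with the current highest count.
                result.Add(pair.Key);
            }
        }

        return result;
    }
}
```

The order of the returned values follows the dictionary's enumeration order, which should not be relied on.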

C#/.NET solution using Span<T> and fewer allocations

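using System;
using System.Collections.Generic;
using System.IO;
using BenchmarkDotNet.Attributes;

[MemoryDiagnoser(false)]
public class Benchmark
{
    [Benchmark]
    [Arguments("100_random_numbers.txt")]
    [Arguments("10000_random_numbers.txt")]
    [Arguments("1M_random_numbers.txt")]
    public int GetResult(string fileName)
    {
        // Maps each number to how many times it occurs.
        var dict = new Dictionary<int, int>();
        using var stream = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.Read);
        using var streamReader = new StreamReader(stream);

        int numberRead;
        // One reusable buffer for the whole file, instead of one string per line.
        Span<char> buffer = new char[4096];

        // Declared outside the loop so a number split across two blocks is parsed correctly.
        var parsedValue = 0;
        while ((numberRead = streamReader.ReadBlock(buffer)) > 0)
        {
            for (int i = 0; i < numberRead; i++)
            {
                var item = buffer[i];
                // Accumulate digits until the end of the line. This assumes non-negative
                // integers separated by '\n' only, with a trailing '\n' after the last one.
                if (item != '\n')
                {
                    parsedValue = parsedValue * 10 + (item - '0');
                    continue;
                }

                // End of line: count the parsed number.
                if (dict.TryGetValue(parsedValue, out int value))
                {
                    dict[parsedValue] = ++value;
                }
                else
                {
                    dict[parsedValue] = 1;
                }

                parsedValue = 0;
            }
        }

        // Find the number with the highest count; on a tie the first one enumerated wins.
        int max = 0;
        int index = 0;
        foreach (var pair in dict)
        {
            if (pair.Value > max)
            {
                max = pair.Value;
                index = pair.Key;
            }
        }

        return index;
    }
}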
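
The loop above treats every character other than '\n' as a digit, so a file with Windows line endings ("\r\n") or without a final newline would give wrong counts. The sketch below shows one way the same block-reading approach could tolerate both cases. `TolerantCounter`, `Count` and `Flush` are hypothetical names, and this variant was not part of the benchmark, so its timings may differ from the table that follows.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class TolerantCounter
{
    // Counts non-negative integers, one per line, reading fixed-size blocks.
    // Ignores '\r' and any other non-digit, and counts a last line without '\n'.
    public static Dictionary<int, int> Count(TextReader reader)
    {
        var counts = new Dictionary<int, int>();
        Span<char> buffer = new char[4096];
        int parsedValue = 0;
        bool hasDigits = false; // true once the current line contains a digit
        int read;

        while ((read = reader.ReadBlock(buffer)) > 0)
        {
            foreach (var c in buffer.Slice(0, read))
            {
                if (c >= '0' && c <= '9')
                {
                    parsedValue = parsedValue * 10 + (c - '0');
                    hasDigits = true;
                }
                else if (c == '\n')
                {
                    Flush(counts, ref parsedValue, ref hasDigits);
                }
                // Anything else, such as '\r', is skipped.
            }
        }

        // Handle a final number that is not followed by a newline.
        Flush(counts, ref parsedValue, ref hasDigits);
        return counts;
    }

    private static void Flush(Dictionary<int, int> counts, ref int value, ref bool hasDigits)
    {
        if (hasDigits)
        {
            counts.TryGetValue(value, out var seen); // seen is 0 when the key is missing
            counts[value] = seen + 1;
        }

        value = 0;
        hasDigits = false;
    }
}
```

Taking a `TextReader` rather than a file name keeps the sketch easy to try with a `StringReader`, for example `TolerantCounter.Count(new StringReader("5\r\n5\r\n9"))`.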

Benchmark results

// * Summary *

BenchmarkDotNet v0.15.3, Windows 10 (10.0.19045.6332/22H2/2022Update)
Intel Core i7-10875H CPU 2.30GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK 9.0.305
  [Host]     : .NET 8.0.20 (8.0.20, 8.0.2025.41914), X64 RyuJIT x86-64-v3
  DefaultJob : .NET 8.0.20 (8.0.20, 8.0.2025.41914), X64 RyuJIT x86-64-v3


| Method    | fileName             | Mean        | Error     | StdDev    | Allocated |
|---------- |--------------------- |------------:|----------:|----------:|----------:|
| GetResult | 100_r(...)s.txt [22] |    547.1 μs |  10.94 μs |  10.74 μs |  22.99 KB |
| GetResult | 10000(...)s.txt [24] |    755.0 μs |  14.95 μs |  25.38 μs |  87.23 KB |
| GetResult | 1M_ra(...)s.txt [21] | 14,770.8 μs | 287.13 μs | 294.86 μs |  87.22 KB |

// * Hints *
Outliers
  Benchmark.GetResult: Default -> 1 outlier  was  removed (584.24 μs)
  Benchmark.GetResult: Default -> 1 outlier  was  detected (686.13 μs)

// * Legends *
  fileName  : Value of the 'fileName' parameter
  Mean      : Arithmetic mean of all measurements
  Error     : Half of 99.9% confidence interval
  StdDev    : Standard deviation of all measurements
  Allocated : Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
  1 μs      : 1 Microsecond (0.000001 sec)
