
I have the following code:

char fname[255] = {0};
snprintf(fname, 255, "%s_test_no.%d.txt", baseLocation, i);

vs

std::string fname = baseLocation + "_test_no." + std::to_string(i) + ".txt";

Which one performs better? Does the second one involve temporary creation? Is there any better way to do this?

  • How do you measure the performance of something that happens once and takes zero time? Commented Feb 21, 2014 at 22:25
  • Unless you call that code several million times, you'll be hard pressed to notice the difference. Measure, yes, but notice, not so much. That said, there is a good chance that the second one will take longer because of the creation of temporary objects, but a good compiler is likely to optimize a lot of that away. Commented Feb 21, 2014 at 22:28

4 Answers

Let's run the numbers:

2024 edit:

Using QuickBench again

For gcc 13.2 with -O3 and -std=c++23, string is now over 3 times as fast:

[Figure: QuickBench comparison of char array vs. string]

For clang 17.0 with -O3 and -std=c++23, string is 2.3 times as fast:

[Figure: QuickBench comparison of char array vs. string]

Here's the benchmark code:

static void CharArray(benchmark::State& state) {
  const char*const  baseLocation = "baseLocation";
  for (auto _ : state) {
    char fname[255] = {};
    snprintf(fname, 255, "%s_test_no.%lu.txt", baseLocation, state.iterations());
    benchmark::DoNotOptimize(fname);
  }
}
BENCHMARK(CharArray);

static void String(benchmark::State& state) {
  const std::string baseLocation = "baseLocation";
  for (auto _ : state) {
    benchmark::DoNotOptimize(
    baseLocation + "_test_no." + std::to_string(state.iterations()) + ".txt"
    );
  }
}
BENCHMARK(String);

2022 edit:

Using Quick-Bench with GCC 10.3 and compiling with C++20 (with some minor changes for constness) demonstrates that std::string is now faster, by almost 3x:

[Figure: chart demonstrating that string is now faster]


Original answer (2014)

The code (I used PAPI Timers)

main.cpp

#include <iostream>
#include <string>
#include <stdio.h>
#include "papi.h"
#include <vector>
#include <cmath>
#define TRIALS 10000000

class Clock
{
  public:
    typedef long_long time;
    time start;
    Clock() : start(now()){}
    void restart(){ start = now(); }
    time usec() const{ return now() - start; }
    time now() const{ return PAPI_get_real_usec(); }
};


int main()
{
  int eventSet = PAPI_NULL;
  PAPI_library_init(PAPI_VER_CURRENT);
  if(PAPI_create_eventset(&eventSet)!=PAPI_OK) 
  {
    std::cerr << "Failed to initialize PAPI event" << std::endl;
    return 1;
  }

  Clock clock;
  std::vector<long_long> usecs;

  const char* baseLocation = "baseLocation";
  //std::string baseLocation = "baseLocation";
  char fname[255] = {};
  for (int i=0;i<TRIALS;++i)
  {
    clock.restart();
    snprintf(fname, 255, "%s_test_no.%d.txt", baseLocation, i);
    //std::string fname = baseLocation + "_test_no." + std::to_string(i) + ".txt";
    usecs.push_back(clock.usec());
  }

  long_long sum = 0;
  for(auto vecIter = usecs.begin(); vecIter != usecs.end(); ++vecIter)
  {
    sum+= *vecIter;
  }

  double average = static_cast<double>(sum)/static_cast<double>(TRIALS);
  std::cout << "Average: " << average << " microseconds" << std::endl;

  //compute variance
  double variance = 0;
  for(auto vecIter = usecs.begin(); vecIter != usecs.end(); ++vecIter)
  {
    variance += (*vecIter - average) * (*vecIter - average);
  }

  variance /= static_cast<double>(TRIALS);
  std::cout << "Variance: " << variance << " microseconds" << std::endl;
  std::cout << "Std. deviation: " << sqrt(variance) << " microseconds" << std::endl;
  double CI = 1.96 * sqrt(variance)/sqrt(static_cast<double>(TRIALS));
  std::cout << "95% CI: " << average-CI << " usecs to " << average+CI << " usecs" << std::endl;  
}

Toggle the commented-out lines to switch between the two approaches. I ran 10 million iterations of both methods on my machine with the compile line:

g++ main.cpp -lpapi -DUSE_PAPI -std=c++0x -O3

Using char array:

Average: 0.240861 microseconds
Variance: 0.196387 microseconds
Std. deviation: 0.443156 microseconds
95% CI: 0.240586 usecs to 0.241136 usecs

Using string approach:

Average: 0.365933 microseconds
Variance: 0.323581 microseconds
Std. deviation: 0.568842 microseconds
95% CI: 0.365581 usecs to 0.366286 usecs

So at least on MY machine with MY code and MY compiler settings, I saw that character arrays incur a 34% speedup over strings, using the following formula:

((time for string) - (time for char array) ) / (time for string)

This gives the difference in time between the approaches as a percentage of the time for string alone. My original percentage was also correct: it used the character array approach as the reference point instead, which shows a 52% slowdown when moving to string, but I found it misleading.
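As a quick check of the two percentages discussed above, here is a minimal sketch that computes the same time difference against each reference point. The function names are hypothetical and only illustrate the arithmetic; the inputs are the 2014 averages reported above.

```cpp
// Hypothetical helpers (not part of the original benchmark): the same
// time difference expressed against two different reference points.

// Difference as a fraction of the string time ("speedup" of char array).
double relativeToString(double stringTime, double charArrayTime) {
  return (stringTime - charArrayTime) / stringTime;
}

// Difference as a fraction of the char array time ("slowdown" of string).
double relativeToCharArray(double stringTime, double charArrayTime) {
  return (stringTime - charArrayTime) / charArrayTime;
}
```

With stringTime = 0.365933 and charArrayTime = 0.240861, the first helper gives roughly 0.34 and the second roughly 0.52, which is why the same measurement can be reported as either 34% or 52%.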

I'll take any and all comments for how I did this wrong :)


2015 Edit

Compiled with GCC 4.8.4:

string

Average: 0.338876 microseconds
Variance: 0.853823 microseconds
Std. deviation: 0.924026 microseconds
95% CI: 0.338303 usecs to 0.339449 usecs

character array

Average: 0.239083 microseconds
Variance: 0.193538 microseconds
Std. deviation: 0.439929 microseconds
95% CI: 0.238811 usecs to 0.239356 usecs

So the character array approach remains significantly faster although less so. In these tests, it was about 29% faster.


10 Comments

Cheers, I think I have the explanation to the behavior you observed, just take a look at my answer :-) +1 for actually performance testing this.
Make that baselocation 80 character string, declare char fname[255]={} also within cycle. Then make a third test and try std::string also declared outside cycle and use it inside with append or operator+= . I trust that one will win.
@ÖöTiib: I've addressed your comments in my post. I performed the timings the way I did originally because I felt it more purely captured what OP was doing. Also, I've updated the timings with the latest version of GCC that I have on Ubuntu. (not the latest overall, though. 5.2 is out as of this time of writing)
Why is std::string faster in C++20? Did they change the implementation in GCC?
What's the reason C++20 is faster?

The snprintf() version will almost certainly be quite a bit faster. Why? Simply because no memory allocation takes place. The new operator is surprisingly expensive, roughly 250ns on my system - snprintf() will have finished quite a bit of work in the meantime.

That is not to say that you should use the snprintf() approach: The price you pay is safety. It is just so easy to get things wrong with the fixed buffer size you are supplying to snprintf(), and you absolutely need to supply code for the case that the buffer is not large enough. So, only think about using snprintf() when you have identified this part of code to be really performance critical.

If your system provides asprintf() (a GNU/BSD extension that is not part of POSIX-2008), you may also think about trying it instead of snprintf(): it will malloc() the memory for you, giving you pretty much the same comfort as C++ strings. At least on my system, malloc() is quite a bit faster than the built-in new operator (don't ask me why, though).


Edit:
Just saw that you used filenames in your example. If filenames are your concern, forget about the performance of string operations! Your code will spend virtually no time in them. Unless you have on the order of 100000 such string operations per second, they are irrelevant to your performance.

1 Comment

Spot on. I even tried to pre-declare and allocate space in the string in my answer, but that actually just caused things to slow down.

If it's REALLY important, measure the two solutions. If not, use whichever you think makes most sense given the data you have, company/private coding style standards, etc. Make sure you use an optimised build [with the same optimisation you are going to use in the actual production build, not -O3 just because that is the highest, if your production build is using -O1].

I expect that either will be pretty close if you only do a few. If you have several millions, there may be a difference. Which is faster? I'd guess the second [1], but it depends on who wrote the implementation of snprintf and who wrote the std::string implementation. Both certainly have the potential to take a lot longer than you would expect from a naive approach to how the function works (and possibly also run faster than you'd expect).

[1] Because I have worked with printf, and it's not a simple function: it spends a lot of time messing about with grokking the format string. It's not very efficient (and I have looked at the ones in glibc and such too, and they are not noticeably better). On the other hand, std::string functions are often inlined since they are template implementations, which improves the efficiency. The joker in the pack is the memory allocation for std::string that is likely to happen. Of course, if somehow baseLocation turns out to be rather large, you probably don't want to store it as a fixed-size local array anyway, so that evens out in that case.
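Since the open question here is the memory allocation, one way to limit it is to size the result once and append in place instead of chaining operator+. This is only a sketch under that assumption: the helper name makeName is hypothetical, and whether it is actually faster has not been measured here.

```cpp
#include <string>

// Hypothetical helper (an assumption for illustration): builds the same
// file name in one result buffer that is reserved up front, so the
// appends below do not need to grow it.
std::string makeName(const std::string& baseLocation, int i) {
  const std::string number = std::to_string(i);
  std::string fname;
  // 13 = length of "_test_no." (9) + length of ".txt" (4).
  fname.reserve(baseLocation.size() + number.size() + 13);
  fname += baseLocation;
  fname += "_test_no.";
  fname += number;
  fname += ".txt";
  return fname;
}
```

For example, makeName("baseLocation", 7) yields "baseLocation_test_no.7.txt". As the answers above say, measure in your own optimised build before preferring this over the one-line operator+ version.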

2 Comments

You seem to know more about this than I do. Care to comment on my post to see how I could make the timings more fair?
@Mats Petersson .. all true, but you might already know there are a few optimised printf's available as open-source, where esoteric formatting is sacrificed for performance. And the pink elephant in the corner is the std::allocator design ... no improvement in that department, after 6 years that I know of... don't make me reveal my sources :)

I would recommend using strcat in that case. It is by far the fastest method: Benchmark
