0

I have a project where I transfer data between client and server using boost.asio sockets. Once one side of the connection receives data, it converts it into a std::vector of std::strings, which then gets passed on to the actual recipient object via previously defined "callback" functions. This works fine so far, but at this point I am using functions like atoi() and to_string to convert data types other than strings into a sendable format and back. This is of course a bit wasteful in terms of network usage (especially when transferring larger amounts of data than just single ints and floats). Therefore I'd like to serialize and deserialize the data. Since, effectively, any serialisation method will produce a byte array or buffer, it would be convenient for me to just use std::string instead. Is there any disadvantage to doing that? I don't see why there should be one, since strings should be nothing more than byte arrays.
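
To make the size difference concrete, here is a minimal sketch comparing the text form of an integer with its raw bytes stored in a std::string. The helper names `append_raw` and `read_raw` are hypothetical and not part of the project described above. The sketch copies bytes in host byte order, so it assumes both ends of the connection use the same endianness and integer width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: append the raw bytes of a 32-bit integer to a
// std::string used as a byte buffer (host byte order, an assumption).
void append_raw(std::string& out, std::int32_t value) {
    char bytes[sizeof(value)];
    std::memcpy(bytes, &value, sizeof(value));
    out.append(bytes, sizeof(bytes));  // explicit length, so zero bytes are kept
}

// Hypothetical helper: read a 32-bit integer back from `offset`.
// The caller must ensure offset + 4 <= in.size().
std::int32_t read_raw(const std::string& in, std::size_t offset) {
    std::int32_t value = 0;
    std::memcpy(&value, in.data() + offset, sizeof(value));
    return value;
}

void size_comparison_demo() {
    const std::int32_t n = -1234567890;

    const std::string as_text = std::to_string(n);  // "-1234567890"
    std::string as_binary;
    append_raw(as_binary, n);

    assert(as_text.size() == 11);         // one char per digit plus the sign
    assert(as_binary.size() == 4);        // always sizeof(std::int32_t)
    assert(read_raw(as_binary, 0) == n);  // round trip on the same machine
}
```

Note that the binary buffer must always be handled together with its length (`size()`), never as a C string, because it may contain zero bytes.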

5
  • 2
    "Is there any disadvantage to doing that?" No. Maybe a std::vector<uint8_t> might be semantically clearer. Commented May 24, 2017 at 18:41
  • std::string pretty much has to null-terminate its buffer as far as I can tell, whereas std::vector<char> wouldn't have to. Probably not enough of a performance impact to worry about, though, compared to the extra functionality std::string makes available. Commented May 24, 2017 at 18:41
  • 1
    @DanielSchepler I thought std::string isn't null terminated, only string::c_str and string::data gives you a null terminated sequence Commented May 25, 2017 at 3:28
  • But string::c_str is documented to be constant-time at least at cppreference.com, and I don't see how you would achieve that aside from maintaining the string data with a null terminator after it. Commented May 25, 2017 at 5:41
  • @DanielSchepler std::string is not necessarily null-terminated within its managed user data of the index range “[0, size()).” However, the specification ensures the presence of the hidden null terminator at the index size() so that .c_str() always returns a null-terminated C string in constant time. Commented Mar 3, 2024 at 6:51
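
The point made in these comments can be checked with a short sketch. It assumes a C++11 or later compiler; the function names are made up for illustration. It shows that zero bytes inside the managed range are preserved and counted by `size()`, while the extra terminator sits at index `size()`.

```cpp
#include <cassert>
#include <string>

// Build a std::string from raw bytes that include zero bytes.
std::string make_blob() {
    const char raw[] = {'a', '\0', 'b', '\0'};
    return std::string(raw, sizeof(raw));  // length is passed explicitly
}

void embedded_null_demo() {
    const std::string blob = make_blob();

    assert(blob.size() == 4);           // zero bytes count toward size()
    assert(blob[1] == '\0');            // embedded zero byte is preserved
    assert(blob[blob.size()] == '\0');  // terminator at index size()

    // Treating the buffer as a C string stops at the first zero byte.
    assert(std::string(blob.c_str()).size() == 1);
}
```

So a std::string can carry arbitrary bytes, as long as the code always uses the pointer-plus-length interfaces rather than the C-string ones.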

4 Answers

7

In terms of functionality, there's no real difference.

Both for performance reasons and for code clarity reasons, however, I would recommend using std::vector<uint8_t> instead, as it makes it far clearer to anyone maintaining the code that it's a sequence of bytes, not a string.
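
As a rough illustration of what a byte-oriented buffer looks like in practice, here is a minimal sketch that writes and reads a 32-bit value in big-endian (network) byte order. The helper names `append_u32` and `read_u32` are hypothetical, and the choice of big-endian is an assumption; any fixed byte order works as long as both sides agree on it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: append a 32-bit value in big-endian byte order.
void append_u32(std::vector<std::uint8_t>& out, std::uint32_t value) {
    for (int shift = 24; shift >= 0; shift -= 8) {
        out.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFFu));
    }
}

// Hypothetical helper: read a 32-bit big-endian value starting at `offset`.
// The caller must ensure offset + 4 <= in.size().
std::uint32_t read_u32(const std::vector<std::uint8_t>& in, std::size_t offset) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        value = (value << 8) | in[offset + i];
    }
    return value;
}

void round_trip_demo() {
    std::vector<std::uint8_t> buffer;
    append_u32(buffer, 123456789u);

    assert(buffer.size() == 4);                 // fixed size, unlike text
    assert(read_u32(buffer, 0) == 123456789u);  // value survives the round trip
}
```

Because the bytes are written one at a time with shifts, the result does not depend on the endianness of the machine running the code.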


4

You should use std::string when you work with strings; when you work with a binary blob, you are better off with std::vector<uint8_t>. There are many benefits:

  • your intention is clear, so the code is less error-prone

  • you would not pass your binary buffer by mistake to a function that expects a std::string

  • you can overload operator<< for std::ostream for this type to print the blob in a proper format (usually a hex dump). It is very unlikely that you would want to print a binary blob as a string.

There could be more. The only benefit of std::string that I can see is that you do not need a typedef.
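
The points in the list above can be sketched as follows. This is only one possible shape, not a definitive implementation: the `Blob` wrapper struct and the `to_hex` helper are hypothetical names. A small wrapper is used instead of a plain typedef because a distinct type cannot be passed where a std::string is expected, and an operator<< defined next to it is found reliably by argument-dependent lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical wrapper: a distinct type for binary data.
struct Blob {
    std::vector<std::uint8_t> bytes;
};

// Print the blob as space-separated two-digit hex values.
std::ostream& operator<<(std::ostream& os, const Blob& blob) {
    static const char digits[] = "0123456789abcdef";
    for (std::size_t i = 0; i < blob.bytes.size(); ++i) {
        if (i != 0) {
            os << ' ';
        }
        os << digits[blob.bytes[i] >> 4] << digits[blob.bytes[i] & 0x0F];
    }
    return os;
}

// Hypothetical convenience helper built on the stream operator.
std::string to_hex(const Blob& blob) {
    std::ostringstream oss;
    oss << blob;
    return oss.str();
}

void hex_dump_demo() {
    const Blob blob{{0x00, 0xff, 0x1a}};
    assert(to_hex(blob) == "00 ff 1a");
}
```

The hex digits are looked up manually so that the stream's formatting flags are left untouched after printing.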

1

You're right. Strings are nothing more than byte arrays. std::string is just a convenient way to manage the buffer array that represents the string. That's it!

There's no disadvantage to using std::string unless you are working on something REALLY REALLY performance critical, like a kernel, for example. In that case, working with std::string would have a considerable overhead. Besides that, feel free to use it.

--

Behind the scenes, a std::string needs to do a number of checks on the state of the string in order to decide whether it will use the small-string optimization or not. Today pretty much all standard library implementations use small-string optimizations. They all use different techniques, but basically each one needs to test a flag that tells whether the string data lives inside the object itself or on the heap. This overhead doesn't exist if you use char[] directly. But again, unless you are working on something REALLY critical, like a kernel, you won't notice anything, and std::string is much more convenient.

Again, this is just ONE of the things that happen under the hood, given as an example to show the difference between them.

6 Comments

yes, if you use std::string in the kernel level the overhead is very considerable. Here is an example... but there are many more out there: stackoverflow.com/questions/21946447/…
@Ðаn I don't personally know the details, but there is a small amount of extra overhead in std::string because it has several constraints it needs to conform to, including but not limited to the fact that it needs to always have an extra byte allocated to null-terminate the string. At the same time though, std::string objects can be subject to "Small String Optimizations", which can improve the memory footprint. The critical point to take away is that std::string can do things under-the-hood that you might not expect.
@Xirema, about the null-terminating char, both a C-string and std::string have one, so this is not the issue. The overhead is associated with the code necessary to construct and destroy the string. For example, it needs to handle the case for small string optimizations, etc., and this makes std::string a little heavy! I will update the answer with details.
@WagnerPatriota We're not comparing std::string and "C-Strings" though, we're comparing std::string and char[] or std::vector<char>. char[] and std::vector<char> do not allocate and manage the null terminating character automatically; it needs to be manually added by the user (or, more likely, ignored, since no good String use depends on it).
@Ðаn Well, again, I don't know all the details. I only know that there are various things that affect std::string that std::vector is happy to ignore, that have impacts on performance.
-2

Depending on how often you're firing network messages, std::string should be fine. It's a convenience class that handles a lot of char work for you. If you have a lot of data to push though, it might be worth using a char array straight and converting it to bytes, just to minimise the extra overhead std::string has.

Edit: if someone could comment and point out why you think my answer is bad, that'd be great and help me learn too.
