22

Maybe it's just the lack of coffee, but I'm trying to create a std::string from a null-terminated char array with a known maximum length and I don't know how to do it.

auto s = std::string(buffer, sizeof(buffer));

.. was my favorite candidate, but since a std::string is length-counted and may contain embedded '\0' characters, this constructor copies sizeof(buffer) bytes regardless of any '\0' in the buffer.

auto s = std::string(buffer);

.. copies from buffer until \0 is found. This is almost what I want but I can't trust the receive buffer so I'd like to provide a maximum length.

Of course, I can now integrate strnlen() like this:

auto s = std::string(buffer, strnlen(buffer, sizeof(buffer)));

But that seems dirty - it traverses the buffer twice and I have to deal with C-artifacts like string.h and strnlen() (and it's ugly).

How would I do this in modern C++?

9
  • 4
    It's either null-terminating or has length of exactly sizeof(buffer), it cannot be both at the same time. Commented May 4, 2016 at 8:42
  • 1
    That's not true (and I'm surprised about the upvote). A C-buffer provided to any sort of writing function can be pre-allocated with a fixed size. The sequence written to the buffer can still be null-terminated. Commented May 4, 2016 at 9:22
  • 1
    Anyway this is just a minor nitpick. The correct term would be "possibly null-terminated array of maximal length N" or something similar. Commented May 4, 2016 at 9:46
  • 4
    "A C-buffer provided to any sort of writing function": for this situation I'd pre-allocate an extra char, char buffer[n+1]; buffer[n] = 0;, and then use std::string(buffer), as the string is always null-terminated. Commented May 4, 2016 at 10:14
  • 2
    Or just stick null in the last byte of the buffer. Worst case, you lose one character of a string which is already likely to be truncated. Commented May 4, 2016 at 11:10

3 Answers

26
const char* end = std::find(buffer, buffer + sizeof(buffer), '\0');
std::string s(buffer, end);
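
For reference, the two lines above can be wrapped in a small reusable function. This is only a minimal sketch: the name bounded_to_string and its parameters are hypothetical and not part of the answer, and it assumes the buffer is accessed through a const char* so that both iterators passed to std::string have the same type.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (name not from the answer): builds a std::string from
// at most max_len chars of buffer, stopping at the first '\0' if present.
inline std::string bounded_to_string(const char* buffer, std::size_t max_len) {
    assert(buffer != nullptr);
    // std::find returns buffer + max_len when no terminator is found,
    // so the range [buffer, end) never runs past the buffer.
    const char* end = std::find(buffer, buffer + max_len, '\0');
    // Iterator-range constructor: allocates once, then copies.
    return std::string(buffer, end);
}
```

With a receive buffer declared as char buffer[N], the call would be bounded_to_string(buffer, sizeof(buffer)).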

7 Comments

Doesn't it still traverse the buffer twice?
@songyuanyao: Depending on how you look at it, yes (one traversal to find the length, then one to copy the bytes). But this is unavoidable because you simply cannot know the correct size to allocate until you check the length, and you cannot copy the bytes before you allocate. A true single-pass solution would always allocate sizeof(buffer) regardless of how short the actual content is, which may increase memory consumption significantly.
Or to put it another way, std::string s(buffer); traverses the buffer twice already, and the additional size limit turns out not to be useful unless we don't mind over-allocation. In principle you could make a series of allocations of exponentially increasing size (like a vector), but by the time you add up all the intermediate copies you've still traversed the data approximately twice.
@joeeey: Here's a demo of the code compiling cleanly as written with all compiler warnings enabled: godbolt.org/z/r56j5x - if you have a problem with some other version feel free to post your code.
@joeeey That's because buffer and end need to be the same type. Either use auto end or remove the const like this: godbolt.org/z/3TzTaK
0

Something like this could work in a single pass:

auto eos = false;
std::string s;
std::copy_if(buffer, buffer + sizeof(buffer), std::back_inserter(s),
  [&eos](auto v) {
    if (!eos) {
      if (v) {
        return true;
      }
      eos = true;
    }
    return false;
  });
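
If reallocation during the copy is a concern, one option is to reserve the worst-case size up front, at the cost of possibly over-allocating. The following is a rough sketch of that variant, not the answer's exact code: the name copy_until_nul is hypothetical, and it relies on std::copy_if applying the predicate to the elements in order, which common standard library implementations do.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical helper (name not from the answer): copy_if-based copy that
// stops accepting characters once the first '\0' has been seen.
inline std::string copy_until_nul(const char* buffer, std::size_t max_len) {
    assert(buffer != nullptr);
    std::string s;
    s.reserve(max_len);  // a single worst-case allocation; may over-allocate
    bool eos = false;    // set once the terminator has been encountered
    std::copy_if(buffer, buffer + max_len, std::back_inserter(s),
                 [&eos](char v) {
                     if (eos || v == '\0') {
                         eos = true;
                         return false;  // skip the terminator and everything after it
                     }
                     return true;
                 });
    return s;
}
```

Note that std::copy_if still visits every element up to max_len, even after the terminator has been found; only the copying stops.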

2 Comments

Except for the extra passes needed to copy the characters on reallocation?
@T.C without knowing how big buffer is, it's possibly difficult to make a guess as to whether there will be any reallocations?
0

If you want a single-pass solution, start with this:

template<class CharT>
struct smart_c_string_iterator {
  using self=smart_c_string_iterator;
  std::size_t index = 0;
  bool is_end = true;
  CharT* ptr = nullptr;
  smart_c_string_iterator(CharT* pin):is_end(!pin || !*pin), ptr(pin) {}
  smart_c_string_iterator(std::size_t end):index(end) {}
};

Now, gussy it up and make it a full-on random-access iterator. Most of the operations are really simple (++ etc. should advance both ptr and index), except == and !=.

friend bool operator==(self lhs, self rhs) {
  if (lhs.is_end&&rhs.is_end) return true;
  if (lhs.index==rhs.index) return true;
  if (lhs.ptr==rhs.ptr) return true;
  if (lhs.is_end && rhs.ptr && !*rhs.ptr) return true;
  if (rhs.is_end && lhs.ptr && !*lhs.ptr) return true;
  return false;
}
friend bool operator!=(self lhs, self rhs) {
  return !(lhs==rhs);
}

we also need:

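template<class CharT>
std::pair<smart_c_string_iterator<CharT>, smart_c_string_iterator<CharT>>
smart_range( CharT* ptr, std::size_t max_length ) {
  // first: iterator at the start of the buffer; second: end marker at index max_length
  return {ptr, max_length};
}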

now we do this:

auto r = smart_range(buffer, sizeof(buffer));
auto s = std::string(r.first, r.second);

and at each step we check for both buffer length and null termination when doing the copy.

Now, Ranges v3 brings about the concept of a sentinel, which lets you do something like the above with reduced runtime cost. Or you can hand-craft the equivalent solution.
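
As an illustration of the hand-crafted route, a plain loop can check both the length limit and the terminator at each step. This is a minimal sketch under assumptions: the name read_bounded is hypothetical, and the string may still reallocate as it grows, so the characters can be copied more than once internally.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (name not from the answer): reads the buffer once,
// testing the length limit and the '\0' terminator on every step.
inline std::string read_bounded(const char* buffer, std::size_t max_len) {
    assert(buffer != nullptr);
    std::string s;
    // The bound is tested first, so buffer[i] is never read out of range.
    for (std::size_t i = 0; i < max_len && buffer[i] != '\0'; ++i) {
        s.push_back(buffer[i]);  // may trigger reallocation as s grows
    }
    return s;
}
```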
