
I'm trying to use modern string-handling approaches (like std::string_view or GSL's string_span) to interact with a C API (DBus) that takes strings as null-terminated const char*s, e.g.

DBusMessage* dbus_message_new_method_call(
    const char* destination,
    const char* path,
    const char* iface,
    const char* method 
    )

string_view and string_span don't guarantee that their contents are null-terminated - since spans are (char* start, ptrdiff_t length) pairs, that's largely the point. But GSL also provides a zstring_span, which is guaranteed to be null-terminated. The comments around zstring_span suggest that it's designed exactly for working with legacy and C APIs, but I ran into several sticking points as soon as I started using it:

  1. Representing a string literal as a string_span is trivial:

    cstring_span<> bar = "easy peasy";
    

    but representing one as a zstring_span requires you to wrap the literal in a helper function:

    czstring_span<> foo = ensure_z("odd");
    

    This makes declarations noisier, and it also seems odd that a literal (which is guaranteed to be null-terminated) isn't implicitly convertible to a zstring_span. ensure_z() also isn't constexpr, unlike constructors and conversions for string_span.

  2. There's a similar oddity with std::string, which is implicitly convertible to string_span, but not zstring_span, even though std::string::data() has been guaranteed to return a null-terminated sequence since C++11. Again, you have to call ensure_z():

    zstring_span<> to_zspan(std::string& s) { return ensure_z(s); }
    
  3. There seem to be some const-correctness issues. The above works, but

    czstring_span<> to_czspan(const std::string& s) { return ensure_z(s); }
    

    fails to compile, with errors about being unable to convert from span<char, ...> to span<const char, ...>.

  4. This is a smaller point than the others, but the member function that returns a char* (which you would feed to a C API like DBus) is called assume_z(). What's being assumed when the constructor of zstring_span expects a null-terminated range?

If zstring_span is designed "to convert zero-terminated spans to legacy strings", why does its use here seem so cumbersome? Am I misusing it? Is there something I'm overlooking?

  • Any reason you aren't just using std::strings and calling c_str() at the call site, or providing a wrapper for the C function that takes std::strings and forwards the c_str() inside? Commented Jul 3, 2019 at 16:31
  • That's what I'm doing currently, but the entire purpose of string_span and friends is to avoid copies when all you need is a non-owning view into a string. See isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rstr-view and the related Core Guidelines advice on string handling. Commented Jul 3, 2019 at 16:37
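The wrapper suggested in the first comment could look roughly like the sketch below. Both names are hypothetical: c_api_name_length stands in for a C function that expects a null-terminated const char* (a real DBus call would go in its place), and name_length is the C++-facing wrapper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C API that reads a NUL-terminated string.
inline std::size_t c_api_name_length(const char* name) {
    return std::strlen(name);
}

// Wrapper: callers pass a std::string and the c_str() call lives in one place.
inline std::size_t name_length(const std::string& name) {
    return c_api_name_length(name.c_str());
}

// Usage: an existing std::string is passed by reference without copying.
inline void check_wrapper() {
    const std::string service = "org.example";
    assert(name_length(service) == 11);
}
```

Note that calling name_length with a string literal or a string_view would first construct a temporary std::string, which is the copy the second comment wants to avoid.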

2 Answers

3
  1. it also seems odd that a literal (which is guaranteed to be null-terminated) isn't implicitly convertible to a zstring_span

A string literal is of type const char[...]. Nothing in that type says that the const char array is a null-terminated string. Here is some other code with the same type but without null termination, where ensure_z will fail fast.

const char foo_arr[4]{ 'o', 'd', 'd', '-' };
ensure_z(foo_arr);

Both "foo" and foo_arr are of type const char[4], but only the string literal is null terminated while foo_arr is not.
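A minimal sketch of that point, using only the standard library (no GSL). The static_asserts check that the literal "odd" and the array have the same type. The helper ends_with_nul is hypothetical, not a GSL function, and only shows that the difference lies in the contents, which have to be inspected at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Same values as the example above: four chars, no terminating '\0'.
const char foo_arr[4]{ 'o', 'd', 'd', '-' };

// decltype("odd") is const char(&)[4]; strip the reference to compare array types.
static_assert(std::is_same<std::remove_reference<decltype("odd")>::type,
                           const char[4]>::value,
              "the literal \"odd\" has type const char[4]");
static_assert(std::is_same<decltype(foo_arr), const char[4]>::value,
              "foo_arr has the same type");

// Hypothetical helper: looks at the last element, because the type alone
// cannot say whether the array is NUL-terminated.
template <std::size_t N>
bool ends_with_nul(const char (&arr)[N]) {
    return arr[N - 1] == '\0';
}

// Usage: only the literal passes the check.
inline void check_array_examples() {
    assert(ends_with_nul("odd"));
    assert(!ends_with_nul(foo_arr));
}
```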

Please note that your combination of ensure_z and czstring_span<> compiles, but it does not work. ensure_z returns only the string without the terminating null byte. When you pass that to the czstring_span<> constructor, then the constructor will fail searching for the null byte (which was cut off by ensure_z).

You need to convert the string literal to a span and pass that to the constructor:

czstring_span<> foo = ensure_span("odd");

  2. There's a similar oddity with std::string, which is implicitly convertible to string_span, but not zstring_span

Good point. There is a constructor for string_span that takes a std::string, but for zstring_span there is only a constructor taking the internal implementation type, a span<char>. For span there is a constructor taking a "container" having .data() and .size() - which std::string implements. Even worse: the following code compiles but will not work:

zstring_span<> to_zspan(std::string& s) { return zstring_span<>{s}; }

You should consider filing an issue in the GSL repo to get the classes aligned. I am not sure if the implicit conversions are a good idea, so I prefer how it is done in zstring_span over how string_span does it.

  3. There seem to be some const-correctness issues.

Here too, my first idea, czstring_span<> to_czspan(const std::string& s) { return czstring_span<>{s}; }, compiles but does not work. Another solution would be a new function ensure_cz that returns a span<const char, ...>. You should consider filing an issue.

  4. assume_z()

The existence of empty() and the code in as_string_span() suggest that the class was meant to be able to handle empty string spans. In that case as_string_span would always return the string without the terminating null byte, ensure_z would return the string with the terminating null byte, failing if empty, and assume_z would assume that !empty() and return the string with the terminating null byte.

But the one and only constructor takes a non-empty span of characters, so empty() can never be true. I just created a PR to address these inconsistencies. Please consider filing an issue if you think that more should be changed.

If zstring_span is designed "to convert zero-terminated spans to legacy strings", why does its use here seem so cumbersome? Am I misusing it? Is there something I'm overlooking?

In pure C++ code I prefer std::string_view; zstring_span is only for C interop, which limits its use. And of course you must know the guidelines and the guideline support library. Given that, I bet that zstring_span is rarely used and that you are one of the very few people taking a deep look into it.

1

It's "cumbersome" in part because it's intended to be.

This:

zstring_span<> to_zspan(std::string& s) { return ensure_z(s); }

Is not a safe operation. Why? Because while it is true that s is NUL terminated, it is entirely possible that the actual s contains internal NUL characters. That's a legitimate thing you can do with std::string, but zstring_span and whoever takes it can't handle that. They'll truncate the string.

By contrast, string_span/view conversions are safe from this perspective. Consumers of such strings take a sized string and thus can handle embedded NULs.
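The truncation can be reproduced with the standard library alone. This is a sketch, not GSL code: with_embedded_nul and fits_in_c_string are hypothetical helper names, and std::strlen stands in for any consumer that stops at the first NUL.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// A std::string holding five characters: 'a' 'b' '\0' 'c' 'd'.
inline std::string with_embedded_nul() {
    return std::string("ab\0cd", 5);
}

// Hypothetical helper: true when a consumer that stops at the first NUL
// would still see the whole string.
inline bool fits_in_c_string(const std::string& s) {
    return std::strlen(s.c_str()) == s.size();
}

inline void check_embedded_nul() {
    const std::string s = with_embedded_nul();
    assert(s.size() == 5);                // the sized string keeps all five chars
    assert(std::strlen(s.c_str()) == 2);  // a NUL-terminated reader stops after "ab"
    assert(!fits_in_c_string(s));
    assert(fits_in_c_string("plain"));
}
```

A conversion helper could run a check like fits_in_c_string before handing the pointer to a C API, so that the lossy case is at least detected.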

Because the zstring_span conversion is unsafe, there should be some explicit notation that something potentially unsafe is being done. ensure_z represents that explicit notation.

Another problem is that C++ has no mechanism to tell the difference between a literal string argument and any old const char* or const char[] parameter. Since a bare const char* may or may not be a string literal, you have to assume that it isn't and therefore use a more verbose conversion.

Also, C++ string literals can contain embedded NUL characters, so the above reasoning applies.

The const issue seems like a code bug, and you should probably file it as such.

4 Comments

This seems to be more of a semantic issue than one of safety. A similar argument could be made against using str.c_str().
@johv: Nobody is saying that you can't use it. Only that the conversions should be explicit. Converting a std::string to a char const* is always explicit; in this case, you can clearly see a call to c_str in the code. Making a zstring_span not implicitly convertible from a std::string means that you have to explicitly convert it.
It seems perfectly reasonable to me that the zstring_span could contain NULs, just like std::string can. I'd expect a NUL-containing zstring_span to behave just like a str.c_str(), which is safe (there is at least one NUL) to pass to C functions, which would stop processing the string at the first NUL. Simply put, one cannot assume that str.length() == strlen(str.c_str()), and I would assume the same is true for zstring_span.
@johv: You seem to be off the subject of the topic. This question is about implicit conversions. std::string is not implicitly convertible to a char const* because that conversion is lossy. You lose information precisely because str.length() and strlen(str.c_str()) may not be equal. If a conversion potentially loses information, then the conversion ought not be implicit. And that's all this answer is speaking to. Explicit conversions like c_str or ensure_z are explicit and therefore OK.
