4

I have referred to the posts below before asking here:

std::string, wstring, u16/32string clarification
std::u16string, std::u32string, std::string, length(), size(), codepoints and characters

But they don't answer my question. Look at the simple code below:

#include<iostream>
#include<string>
using namespace std;

int main ()
{
  char16_t x[] = { 'a', 'b', 'c', 0 };
  u16string arr = x;

  cout << "arr.length = " << arr.length() << endl;
  for(auto i : arr)
    cout << i << "\n";
}

The output is:

arr.length = 3  // a + b + c
97
98
99

Given that std::u16string consists of char16_t and not char, shouldn't the output be:

arr.length = 2  // ab + c(\0)
<combining 'a' and 'b'>
99

Please excuse the novice question. I want to get a clear understanding of the new C++11 string types.

Edit:

From @Jonathan's answer, I can see the flaw in my question. What I really want to know is how to initialize the char16_t array so that the length of arr becomes 2 (i.e. ab, c\0).
FYI, the code below gives a different result:

  char x[] = { 'a', 'b', 'c', 0 };
  u16string arr = (char16_t*)x;  // probably undefined behavior

Output:

arr.length = 3
25185
99
32767
  • You have an array of char16_t elements. You initialize it with 3 elements... Commented Jul 25, 2014 at 9:11
  • @JonathanWakely, Yes indeed, had a bit of a conflict between my typing and my thinking :). Point being - initialized with a fixed number of elements. Commented Jul 25, 2014 at 9:18
  • +1 @downvoters: Why downvote this? This is not exactly thrilling, but it is a pitfall well worth mentioning. Commented Jul 25, 2014 at 9:50

4 Answers

4

No, you have created an array of four elements: the first element is 'a' converted to char16_t, the second is 'b' converted to char16_t, and so on. The last element is the null terminator.

Then you create a u16string from that array (converted to a pointer), which reads each element up to the null terminator.


3

When you do:

char16_t x[] = { 'a', 'b', 'c', 0 };

It is similar to doing this (endianness notwithstanding):

char x[] = { '\0', 'a', '\0', 'b', '\0', 'c', '\0', '\0' };

Each character occupies two bytes in memory.

So when you ask for the length of a u16string, every two bytes count as one character. They are, after all, two-byte (16-bit) characters.

EDIT:

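The code in your additional question creates a string without a null terminator: the four bytes make up only two char16_t elements, neither of which is zero, so the constructor keeps reading past the end of the array.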

Try this:

char x[] = { 'a', 'b', 'c', 0 , 0, 0};
u16string arr = (char16_t*)x;

Now the first character is {'a', 'b'}, the second character is {'c', 0}, and you also have a null terminator character {0, 0}.

1

shouldn't the output be:

arr.length = 2  // ab + c(\0)
99

No. The elements of x are char16_t, regardless of the fact that you provide char literals for initialization:

#include<iostream>

int main () {
    char16_t x[] = { 'a', 'b', 'c', 0 };
    std::cout << sizeof(x[0]) << std::endl;
}

output:

2 

Addendum, referring to the EDIT of the question

I'd not exactly recommend casting the termination away from strings. ;)

#include<iostream>
#include<string>

int main () {
    char x[] = { 'a', 'b', 'c', 0, 0, 0, 0, 0};

    std::wstring   ws   = reinterpret_cast<wchar_t*>(x);
    std::u16string u16s = reinterpret_cast<char16_t*>(x);

    std::cout << "sizeof(wchar_t):  "       << sizeof(wchar_t)
              << "\twide string length: "   << ws.length()   
              << std::endl;

    std::cout << "sizeof(char16_t): "       << sizeof(char16_t)
               << "\tu16string length:  "   << u16s.length()   
               << std::endl;
}

output (compiled with g++)

sizeof(wchar_t):  4 wide string length: 1
sizeof(char16_t): 2 u16string length:   2

As expected, isn't it?

1 Comment

I think you have nailed it with an example! Thanks. Can you look at the edited question?
-1

C++ supports the following way to build 16-bit integers from 8-bit integers:

char16_t ab = (static_cast<unsigned char>('a') << 8) | 'b';
// (Note: the cast to unsigned char keeps a negative char value from being sign-extended before the shift)
