Can someone clarify what is actually happening during this comparison?

In a C++ program if I have:

string name1 = "Mary";

and I do:

name1 < "Mary Jane" // true

Why is this true? If C++ compares the strings character by character, and the first mismatched character is the single double quote mark at the end of name1 = "Mary" versus the space in "Mary Jane", then by ASCII value a space is smaller than a quote mark...

8
  • 1
    Start here and follow down the rabbit hole ... Commented Aug 15, 2018 at 16:10
  • 1
    The quotes are not part of the comparison. Commented Aug 15, 2018 at 16:13
  • 2
    "Single quotation at the end of name1" - I don't see a single quote. Commented Aug 15, 2018 at 16:13
  • 1
    @drescherjm - because they are not part of the string. Commented Aug 15, 2018 at 16:14
  • 1
    yeah, no, that's not part of the string. It's the syntax introducing the string literal you initialized the string with, but the " are not part of the string itself Commented Aug 15, 2018 at 16:14

3 Answers

5
string name1 = "Mary";

Let's unpick this; there are several things going on.

The token

"Mary"

taken alone is a string literal which roughly evaluates to the array

const char literal_array[5] = { 'M', 'a', 'r', 'y', 0 };

You can see why it's worth having some syntactic sugar - writing that out for every string would be awful.

Anyway, there are no " characters in there - they're used to tell the compiler to emit that string literal, but they're not part of the string itself.
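As a quick check of that claim, here is a minimal sketch. The names literal_array_size, contains_double_quote, stored_length and demo_literal are made up for illustration and are not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Size of the array behind the literal "Mary": four letters plus the
// terminating 0, so five chars in total.
constexpr std::size_t literal_array_size = sizeof("Mary");

// Returns true if a std::string built from "Mary" contains a '"' character.
bool contains_double_quote() {
    const std::string name1 = "Mary";
    return name1.find('"') != std::string::npos;
}

// Number of characters the std::string stores (the terminator is not counted).
std::size_t stored_length() {
    const std::string name1 = "Mary";
    return name1.size();
}

// Hypothetical helper that bundles the checks; call it from main() to run them.
void demo_literal() {
    assert(literal_array_size == 5);
    assert(!contains_double_quote());
    assert(stored_length() == 4);
}
```

Calling demo_literal() from main() should pass silently: the quotes never make it into the array or into the std::string.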

Then, once we know what the right-hand side of the expression is, we can look at the left:

string name1 = "Mary";

is really

string name1(literal_array);

using the constructor

basic_string<char>::basic_string(const char *)

I'm paraphrasing slightly, but it's item 5 in the cppreference list of basic_string constructors.


name1 < "Mary Jane"

Now that we finally know what the left-hand side is, we can look at this expression, which expands to

const char literal_array2[10] = { 'M', 'a', 'r', 'y', ' ', 'J', 'a', 'n', 'e', 0 };
operator< (name1, literal_array2)

which is the 9th overload here (at the time of writing), and which calls compare as

name1.compare(literal_array2)

which is described as doing the following:

4) Compares this string to the null-terminated character sequence beginning at the character pointed to by s, as if by compare(basic_string(s))

which takes us back to the first overload:

1) First, calculates the number of characters to compare, as if by

size_type rlen = std::min(size(), str.size()).

Then compares by calling

Traits::compare(data(), str.data(), rlen).

For standard strings this function performs character-by-character lexicographical comparison.

If the result is zero (the strings are equal so far),

note that this is the case when we've just compared "Mary" with "Mary" so far

then their sizes are compared as follows:

size(data) < size(arg)    => data is less than arg    => result <0

where "result <0" means operator< will return true.
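The two steps quoted above can be sketched roughly as follows. This is only an illustration of the described behaviour, not the library's actual code; compare_sketch, less_sketch and demo_compare are hypothetical names, and the sketch assumes plain std::string (char with the default traits).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper mirroring the description: compare the common prefix
// first, then fall back to comparing the sizes.
int compare_sketch(const std::string& lhs, const std::string& rhs) {
    // Step 1: number of characters both strings have.
    const std::size_t rlen = std::min(lhs.size(), rhs.size());
    // Step 2: character-by-character comparison of those rlen characters.
    const int result =
        std::char_traits<char>::compare(lhs.data(), rhs.data(), rlen);
    if (result != 0) {
        return result;  // a mismatch decided the ordering
    }
    // Step 3: equal so far, so the shorter string is the smaller one.
    if (lhs.size() < rhs.size()) return -1;
    if (lhs.size() > rhs.size()) return 1;
    return 0;
}

// operator< is true exactly when the comparison result is negative.
bool less_sketch(const std::string& lhs, const std::string& rhs) {
    return compare_sketch(lhs, rhs) < 0;
}

// Hypothetical driver; call it from main() to run the checks.
void demo_compare() {
    const std::string name1 = "Mary";
    assert(less_sketch(name1, "Mary Jane"));
    assert(less_sketch(name1, "Mary Jane") == (name1 < "Mary Jane"));
    assert(compare_sketch(name1, "Mary") == 0);
}
```

For "Mary" against "Mary Jane", rlen is 4, the first four characters match, and the size check (4 < 9) gives the negative result.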

5 Comments

How do I learn to read the information you provided from the C++ reference? I am currently reading C++ From Control Structures through Objects, 8th Edition, and I have no clue how to make sense of the source you've given.
Which specific bits don't you understand? I don't have that book, don't know where you're up to in it or what it covers, and sadly can't read your mind. If you quote something you can't make sense of, I can at least try.
I cannot read the information on this page: en.cppreference.com/w/cpp/string/basic_string/basic_string. How do I learn to read those lines?
It's a bit noisy because basic_string is a template: e.g. you can simplify basic_string(const CharT*...) to string(const char* ...) for most purposes. The top of the page is just a list of constructor overloads, which should be familiar if your book deals with Objects. You can ignore the Allocator params till later, because the default is fine.
"I have no clue how to make sense ...". Some clues are usually provided in the examples at the bottom of the cppreference pages.
0

The overloaded < operator does not compare the " characters. The quotes are used to denote a string literal so that the compiler will parse it as a string. They are not part of the string.

0

Although we write string literals in C++ surrounded by quotes, those quotes aren’t actually a part of the string. That means that the strings that you’re actually comparing here are Mary and Mary Jane, without quotes.

C++ compares strings lexicographically, which means that it goes one character at a time until a mismatch is found or one of the strings ends. If no mismatch is found before one of the strings ends, the shorter string compares smaller, hence the result you’re seeing here.
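A small sketch of that rule, using std::lexicographical_compare from <algorithm> as a stand-in for what operator< does on std::string. The helper names lexicographically_less and demo_lexicographic are made up, and the equivalence is only assumed here for plain ASCII text like these names.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: walks both strings one character at a time and stops
// at the first mismatch or when one of them ends.
bool lexicographically_less(const std::string& a, const std::string& b) {
    return std::lexicographical_compare(a.begin(), a.end(),
                                        b.begin(), b.end());
}

// Hypothetical driver; call it from main() to run the checks.
void demo_lexicographic() {
    // No mismatch before "Mary" ends, so the shorter string is smaller.
    assert(lexicographically_less("Mary", "Mary Jane"));
    // First mismatch at index 3: 'k' sorts before 'y'.
    assert(lexicographically_less("Mark", "Mary"));
    // Equal strings are not less than each other.
    assert(!lexicographically_less("Mary", "Mary"));
}
```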

2 Comments

Yes, because the shorter string can be padded out with null characters, all of which would be smaller than the longer string's character values.
With C++ strings that’s not actually necessarily true. C++ std::string objects can contain null characters, so the padding argument might incorrectly cause you to conclude that two different strings would compare equal. For example, a std::string made of 5 null characters will compare less than a std::string made of 137 null characters, though if you padded them with nulls they’d be the same string.
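The counterexample from that comment can be checked with a short sketch. The function names are hypothetical; the std::string(count, ch) constructor is the standard fill constructor.

```cpp
#include <cassert>
#include <string>

// True if a string of 5 null characters compares less than one of 137.
bool shorter_null_string_is_less() {
    const std::string five(5, '\0');
    const std::string many(137, '\0');
    return five < many;
}

// True if the two all-null strings compare equal (they should not, because
// std::string tracks its length rather than stopping at the first null).
bool null_strings_compare_equal() {
    const std::string five(5, '\0');
    const std::string many(137, '\0');
    return five == many;
}

// Hypothetical driver; call it from main() to run the checks.
void demo_embedded_nulls() {
    assert(shorter_null_string_is_less());
    assert(!null_strings_compare_equal());
}
```

This is why the size comparison is a separate step rather than a padding trick: the lengths 5 and 137 decide the ordering even though every character is identical.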
