3

So, as we know, objects (in the current example, strings) are compared by their reference in the heap. So, if:

string a = "something"; 
string b = "something"; 
bool isEqual = (a == b);

the runtime will put the value of a in the string pool and, on finding while searching the pool that b has the same value as a, will assign the same reference to the variable b. Okay, that's clear. But what happens with:

string a = "somethingNew";
bool isEqual = (a == "somethingNew");

How are such comparison literals represented in memory (if at all), and how is the whole comparison carried out in this case?

4 Answers

12

Objects can be compared by their reference in the heap. Most objects wouldn't be human-friendly to compare if that were the only option, so types like string overload the equality operators to behave more intuitively. string's equality operator (==) first checks whether both operands are the same reference (the equivalent of object.ReferenceEquals(object, object)); if they are not, it falls back to comparing the characters of the two strings, regardless of memory location.
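
To make the difference between the two kinds of comparison concrete, here is a minimal sketch. The class name EqualityDemo and the helper BuildAtRuntime are hypothetical names chosen for illustration; the helper builds the text from a char array so that the result is a separate heap object rather than the interned literal.

```csharp
using System;

public static class EqualityDemo
{
    // Hypothetical helper: builds "deer" at run time, so the result is a
    // fresh string instance and not the interned literal.
    public static string BuildAtRuntime() => new string(new[] { 'd', 'e', 'e', 'r' });

    public static void Main()
    {
        string literal = "deer";
        string built = BuildAtRuntime();

        // string's overloaded == compares contents.
        Console.WriteLine(literal == built);                       // True

        // Casting to object selects the reference comparison instead.
        Console.WriteLine((object)literal == (object)built);       // False
        Console.WriteLine(object.ReferenceEquals(literal, built)); // False
    }
}
```

The casts to object matter because operator overloads are chosen at compile time from the static types of the operands.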

String literals such as "somethingNew" are compiled to a reference to that string value in what .NET calls the intern pool. The pool lets all literals with the same value (same characters, same case) point to a single string instance rather than each having its own allocation for an identical value. This saves memory at the cost of a lookup in the intern pool. It works because strings are immutable (read-only): changing a string via concatenation with the + or += operators, or otherwise, actually creates an entirely new string. Strings built at run time are not interned by default; only literals are.

The comparison in your example succeeds on the initial reference check and returns true without any further analysis. This happens because both operands come from string literals and are therefore interned (they have the same memory address). If they were not interned, the comparison would fall back to character comparison, again regardless of memory location.

You can intern non-literal strings manually by calling string.Intern(string).
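
As a rough illustration of manual interning, the sketch below creates two equal strings at run time and then interns both. InternDemo and Build are hypothetical names used only for this example; string.Intern is the real framework method.

```csharp
using System;

public static class InternDemo
{
    // Hypothetical helper: each call allocates a new string with the same text.
    public static string Build() => new string(new[] { 'd', 'e', 'e', 'r' });

    public static void Main()
    {
        string x = Build();
        string y = Build();

        // Two separate allocations, so the references differ.
        Console.WriteLine(object.ReferenceEquals(x, y));   // False

        // string.Intern returns the pool's single instance for this value.
        string ix = string.Intern(x);
        string iy = string.Intern(y);
        Console.WriteLine(object.ReferenceEquals(ix, iy)); // True
    }
}
```

Note that string.Intern does not change x or y themselves; it returns the pooled reference, which you have to store and use.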


7 Comments

I think this does not directly answer what the OP asks (the last sentence).
I realize that all strings with the same value will point to the same reference (including string constants), which means the OP's second example will check the reference only and should return true. To create a new reference with the same value, I think we can use string.Copy; the comparison will then fall back to comparing each character, as you said.
@DavidHaney personally, I'm surprised that it (string.Copy) actually does something other than return this. Contrast Clone() (which is return this;). Interesting!
@DavidHaney I'm stumped; I can't think of any valid use for that method; reflector does show it being used 3 times, though; frankly, I think all 3 are errors
KingKing actually defined the question better for me. The second part of the question was about comparison with a constant, not a literal, and how such a case is represented in memory (mainly, how the constant is represented in memory and whether its value is looked up in the pool when doing a comparison). So, thank you both!
5

So, as we know, objects (in the current example, strings) are compared by their reference in the heap.

Incorrect; the == operator can be overloaded, and indeed is overloaded for string.

But what happens if:

String (value) comparison is used. However, even if it weren't, the data comes from a literal (ldstr), so the same string instance would result in this case due to "interning"; a reference comparison would still work.
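
A small sketch of that point, assuming both literals live in the same assembly (the C# language specification states that equivalent string literals in the same assembly refer to the same instance). LiteralDemo and SameInstance are hypothetical names for this example.

```csharp
using System;

public static class LiteralDemo
{
    // Compares a variable initialized from a literal with an identical
    // literal, using reference comparison only.
    public static bool SameInstance()
    {
        string a = "somethingNew";
        return object.ReferenceEquals(a, "somethingNew");
    }

    public static void Main()
    {
        string a = "somethingNew";

        // Value comparison through string's overloaded ==.
        Console.WriteLine(a == "somethingNew"); // True

        // Reference comparison also succeeds, because both literals
        // resolve to one interned instance.
        Console.WriteLine(SameInstance());      // True
    }
}
```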


1

This is still the exact same case; you don't need a variable name for a string literal. Do keep in mind that string overloads operator==, so you get a comparison of the string content, not just a plain object comparison. So this works just as well:

 string tail = "New";
 bool isEqual = (a == "something" + tail);

3 Comments

The object reference is not immaterial for a string because it first attempts to "short circuit" the comparison with an object.ReferenceEquals() check.
You are splitting hairs, of course it is allowed to optimize the comparison.
I disagree that I'm splitting hairs. Saying that the reference is immaterial is incorrect, and could lead to some very inefficient comparisons if it were true (for one, the string intern pool would be a lot less useful).
0

Uh, this got me pretty confused because the information I got about this topic was structured in such a dumb way: it first explained that comparison between reference types is done only by their addresses when using the '==' operator (all of this was in bold, with an explanation about 4 pages long). On top of that, all the examples used strings, but there was not a single word about value equality between them. So, after posting here, I decided to finish the whole chapter, and 3 pages later (actually in the last sentence on the last page) it stated that '==' behaves differently when comparing strings. Absolutely idiotic. So, just for a final check, to be sure that I have got the info right:

'==' used on strings first checks whether both variables refer to the same object. If not, it does an actual value comparison of the content itself.

Using a string constant, as in bool isEqual = (a == "somethingNew");, for comparison will actually take the constant value, search for it in the so-called pool and, if there is a match, use a reference to the same object? So, is it actually treated like a variable? Sorry, this is still a little unclear to me.

And the last one (an example from the given article):

string firstString = "deer";
string secondString = firstString;
string thirdString = "de" + 'e' + 'r';
cw(firstString == secondString); // True - same object
cw(firstString == thirdString); // True - equal objects
cw((object)firstString == (object)secondString); // True
cw((object)firstString == (object)thirdString); // False

Shouldn't the value of thirdString be searched for in the pool in this case, so that the variable receives a reference to the same object as firstString and secondString?
