I'm just asking why Rust decided to use &str for string literals instead of String. Isn't it possible for Rust to just automatically convert a string literal to a String and put it on the heap instead of putting it into the stack?
- Theoretically, sure. But it would be waaaaay slower and what would be the advantage? – trent, Aug 25, 2020 at 5:05
- "put it on the heap instead of putting it into the stack", I guess string literals are placed to the ro.data section. – MaxV, Aug 25, 2020 at 5:54
- Re: the close vote, I don't agree that this question leads to opinion-based answers. There are very clear reasons for the design. – Peter Hall, Aug 25, 2020 at 10:36
2 Answers
To understand the reasoning, consider that Rust wants to be a systems programming language. In general, this means that it needs to (among other things) (a) be as efficient as possible and (b) give the programmer full control over allocations and deallocations of heap memory. One use case for Rust is embedded programming, where memory is very limited.
Therefore, Rust does not want to allocate heap memory where this is not strictly necessary. String literals are known at compile time and can be written into the read-only data section (.rodata) of an executable/library, so they don't consume stack or heap space.
Now, given that Rust does not want to allocate the values on the heap, it is basically forced to treat string literals as &str: Strings own their values and can be moved and dropped, but how do you drop a value that lives in .rodata? You can't really do that, so &str is the perfect fit.
Furthermore, treating string literals as &str (or, more accurately, &'static str) has clear advantages and no real downside. They can be used in multiple places, can be shared without using any heap memory, and never have to be freed. Also, they can be converted to owned Strings at will, so having them available as String is always possible, but you only pay the cost when you need to.
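As a rough illustration of "you only pay the cost when you need to", here is a minimal sketch. The function names greeting and shout are made up for this example and are not part of any library.

```rust
/// Returns a string literal. The data lives in the binary for the whole
/// run of the program, so the reference can be `'static`.
/// (Hypothetical helper, for illustration only.)
fn greeting() -> &'static str {
    "hello"
}

/// Converts to an owned `String` only because we want to modify the text.
/// (Hypothetical helper, for illustration only.)
fn shout(s: &str) -> String {
    let mut owned = s.to_owned(); // the heap allocation and copy happen here
    owned.push('!');
    owned
}

fn main() {
    let a = greeting();
    let b = a; // `&str` is `Copy`: sharing it does not allocate anything
    assert_eq!(a, b);

    // The allocation is explicit and happens only where it is needed.
    println!("{}", shout(a));
}
```

Nothing in main touches the heap until shout is called.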
6 Comments
- ro.data cannot be dropped? Could Rust just pretend it is dropped and carry on or would that cause problems? (Edit: I'm actually wondering why str exists at all, and string literals seem to be an important part of the answer.)
- ro.data (though it might also be difficult cross platform) and then avoid the drop, but its making the implementation much more complicated. The types String and str have their equivalent in Vec<T> and [T]. With rust's model of ownership and shared references you really need something like &str, not just because of string literals.
- String owns and can modify its data, not really something you want to do with a string literal / thing in ro.data.
- String? I don't really know much about reverse engineering too, but I always want the program to be harder to reverse engineer.
To create a String, you have to:
- reserve a place on the heap (allocate), and
- copy the desired content from a read-only location to the freshly allocated area.
If a string literal like "foo" did both, every string literal would effectively be stored twice: once inside the executable as the read-only string, and once more on the heap. There would be no way to refer directly to the original read-only data stored in the executable.
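The two steps can be sketched by hand. The helper name literal_to_string is hypothetical. It only approximates what "foo".to_owned() has to do and is not how the standard library spells it.

```rust
/// Roughly what turning a literal into a `String` involves.
/// (Hypothetical helper, for illustration only.)
fn literal_to_string(literal: &'static str) -> String {
    // Step 1: reserve a place on the heap.
    let mut s = String::with_capacity(literal.len());
    // Step 2: copy the bytes from the read-only literal into that area.
    s.push_str(literal);
    s
}

fn main() {
    let literal: &'static str = "foo";
    let owned = literal_to_string(literal);

    // Same contents...
    assert_eq!(owned, literal);
    // ...but stored at two different addresses: the data now exists twice.
    assert_ne!(owned.as_ptr(), literal.as_ptr());
}
```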
&str literals give you access to the most efficient string data: the one present in the executable image on startup, put there by the compiler along with the instructions that make up the program. The data it points to is not stored on the stack. What is stack-allocated is just the pointer/size pair, as is the case with any Rust slice.
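A quick way to see the pointer/size pair is to check the size of the reference itself. This is a small sketch, and str_ref_size is a made-up helper name.

```rust
use std::mem::{size_of, size_of_val};

/// Size in bytes of a `&str` reference itself, not of the text it points to.
/// (Hypothetical helper, for illustration only.)
fn str_ref_size() -> usize {
    size_of::<&str>()
}

fn main() {
    // A `&str` is a pointer plus a length: two machine words.
    assert_eq!(str_ref_size(), 2 * size_of::<usize>());

    let short: &'static str = "foo";
    let long: &'static str = "a much longer string literal";

    // The length travels with the reference...
    assert_eq!(short.len(), 3);
    // ...and a longer literal does not make the reference any bigger.
    assert_eq!(size_of_val(&short), size_of_val(&long));
}
```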
Making "foo" desugar into what is now spelled "foo".to_owned() would make it slower and less space-efficient, and would likely require another syntax to get a non-allocating &str. After all, you don't want x == "foo" to allocate a string just to throw it away immediately. Languages like Python alleviate this by making their strings immutable, which allows them to cache strings mentioned in the source code. In Rust, mutating a String is often the whole point of creating it, so that strategy wouldn't work.
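To make the comparison case concrete, here is a sketch with made-up function names. is_foo is what Rust does today. is_foo_allocating spells out what the same comparison would amount to if literals were Strings.

```rust
/// Compares against a literal without allocating anything.
/// (Hypothetical helper, for illustration only.)
fn is_foo(x: &str) -> bool {
    x == "foo"
}

/// The same check if "foo" had to become a `String` first:
/// allocate, copy, compare, then free the temporary right away.
/// (Hypothetical helper, for illustration only.)
fn is_foo_allocating(x: &str) -> bool {
    x == "foo".to_owned()
}

/// Mutation is the usual reason to want a `String` in the first place.
/// (Hypothetical helper, for illustration only.)
fn build() -> String {
    let mut s = String::from("foo");
    s.push_str("bar");
    s
}

fn main() {
    assert!(is_foo("foo"));
    assert!(is_foo_allocating("foo"));
    println!("{}", build());
}
```

Both comparison functions return the same result. The second one just does extra work for it.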
7 Comments
- String is guaranteed to refer to heap-allocated data. You can even convert it to Box<str> and Vec<u8> without reallocation.
- I would assume that to be possible as long as the String is not mut. - there is no such thing as a non-mut String - as long as you own it, you can always make it mut.
- let mut s = s reuses the same memory location, so it indeed effectively makes the original String mut (instead of e.g. copying the data). So things will break if the non-mut String could refer to read-only data, something I didn't expect, thanks again.
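The point about let mut s = s can be checked with a small sketch. The helper name rebind_as_mut is hypothetical. It records the buffer address before and after the rebinding.

```rust
/// Takes a `String` through an immutable binding, rebinds it as mutable,
/// and reports the buffer address before and after the rebinding.
/// (Hypothetical helper, for illustration only.)
fn rebind_as_mut(s: String) -> (*const u8, *const u8, String) {
    let before = s.as_ptr();
    let mut s = s; // a move: the heap buffer is reused, not copied
    let after = s.as_ptr();
    s.make_ascii_uppercase(); // mutates the bytes in place
    (before, after, s)
}

fn main() {
    let (before, after, s) = rebind_as_mut(String::from("foo"));
    // Same buffer before and after becoming `mut`...
    assert_eq!(before, after);
    // ...and that buffer was written to, which read-only data would not allow.
    assert_eq!(s, "FOO");
}
```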