44

It's the reverse of this question: Why can't strings be mutable in Java and .NET?

Was this choice made in Ruby only because operations (appends and such) are efficient on mutable strings, or was there some other reason?

(If it's only efficiency, that would seem peculiar, since the design of Ruby otherwise seems not to put a high premium on facilitating efficient implementation.)

3
  • Note: I know very little about Ruby. But I'm interested in language design. Commented Apr 9, 2010 at 15:03
  • 4
    Why don't you ask Matz? He probably has a better answer than we do Commented Apr 9, 2010 at 15:07
  • 3
    @Sam: haha, good point. I'm hoping he's written about it somewhere and someone could point me to it or summarize. Commented Apr 9, 2010 at 17:37

2 Answers

34

This is in line with Ruby's design, as you note. Immutable strings can be more efficient than mutable strings - less copying, as strings are re-used - but they make work harder for the programmer. It is intuitive to see strings as mutable - you can concatenate them together. To deal with this, the Java compiler silently translates concatenation (via +) of two strings into the use of a StringBuffer or StringBuilder object, and I'm sure there are other such hacks. Ruby chooses instead to make strings mutable by default at the expense of performance.

Ruby also has a number of destructive methods such as String#upcase! that rely on strings being mutable.
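To make the difference concrete, here is a minimal sketch of a non-destructive method next to its destructive counterpart. The variable names are only illustrative, and the `.dup` calls are an assumption on my part so that the example still works in files where string literals are frozen by default.

```ruby
# .dup yields a mutable copy even if string literals are frozen in this file.
greeting  = "hello".dup
alias_ref = greeting          # a second reference to the same object

shouted = greeting.upcase     # non-destructive: returns a new String
unchanged = greeting.dup      # snapshot showing greeting was not touched

greeting << ", world"         # String#<< appends in place
greeting.upcase!              # destructive: modifies the receiver itself

# Both names see the change because they refer to one object.
same_object = alias_ref.equal?(greeting)
```

Note that the in-place change is visible through every reference to the string, which is exactly the property that immutable-string languages are trying to rule out.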

Another possible reason is that Ruby is inspired by Perl, and Perl happens to use mutable strings.

Ruby has Symbols and frozen Strings, both of which are immutable. As an added bonus, symbols are guaranteed to be unique per possible string value.
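A short sketch of both points, with illustrative variable names. It assumes Ruby 2.5 or later, where mutating a frozen object raises `FrozenError` (older versions raise a plain `RuntimeError`).

```ruby
frozen = "config".dup.freeze
is_frozen = frozen.frozen?

# Any attempt to mutate a frozen string raises instead of changing it.
begin
  frozen << "!"
rescue FrozenError => e
  error_class = e.class
end

# Two equal strings built separately are distinct objects...
a = "name".dup
b = "name".dup
distinct_strings = !a.equal?(b)

# ...whereas a given symbol is always one and the same object.
same_symbol = :name.equal?("name".to_sym)
```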


20 Comments

It's possible to make a string immutable by calling .freeze on it, but you can't really make an immutable string mutable - it would violate the principle. For example, if I pass an immutable string to a function, I wouldn't expect the function to make it mutable and start changing it.
I'm fascinated by the idea that immutability makes extra work for me as a programmer. My idea of extra work is that I have to be very careful who I show a string to, because anybody might mutate it! I like my immutable strings!
@rjh: these days my psychiatrist is allowing me nothing but integers and floats---and she's not so sure about the floats.
How exactly do immutable strings "make work harder for the programmer"?
Actually, when it comes to the inner workings of the + operator in Java, it doesn't have to be StringBuffer; StringBuilder can be used as well. The specification mentions StringBuffer or a similar technique.
6

These are my opinions, not Matz's. For purposes of this answer, when I say that a language has "immutable strings", that means all its strings are immutable, i.e. there is no way to create a string that is mutable.

  1. The "immutable string" design sees strings as both identifiers (e.g. as hash keys and other VM-internal uses) and data-storage structures. The idea is that it's dangerous for identifiers to be mutable. To me, this sounds like a violation of single-responsibility. In Ruby, we have symbols for identifiers, so strings are free to act as data stores. It's true that Ruby allows strings as hash keys, but I think it's rare for a programmer to store a string into a variable, use it as a hash key, then modify the string. In the programmer's mind, there is (or should be) a separation of the two usages of strings. Often a string used as a hash key is a literal string, so there is little chance of it being mutated. Using a string as a hash key is not much different from using an array of two strings as a hash key. As long as your mind has a good grasp on what you're using as a key, there's no problem.

  2. Having a string as a data store is useful from a viewpoint of cognitive simplicity. Just consider Java and its StringBuffer. It's an extra data structure (in an already large and often unintuitive standard library) that you have to manage if you're trying to do string operations like inserting one string at a certain index of another string. So on the one hand, Java recognizes the need to do these kinds of operations, but because immutable strings are exposed to the programmer, its designers had to introduce another structure so the operations are still possible without making us reinvent the wheel. This puts extra cognitive load on the programmer.

  3. In Python, it seems like the easiest way to insert is to grab the substrings before and after the insertion point, then concatenate them around the to-be-inserted string. I suppose they could easily add a method to the standard library that inserts and returns a new string. However, if the method is called insert, beginners may think it mutates the string; to be descriptive it would have to be called new_with_inserted or something odd like that. In everyday usage, "inserting" means you change the contents of the thing inserted into (e.g. inserting an envelope into a mailbox changes the contents of the mailbox). Again, this raises the question, "why can't I change my data store?"

  4. Ruby provides freezing of objects, so they can be safely passed around without introducing subtle bugs. The nice thing is that Ruby treats strings just like any other data structure (arrays, hashes, class instances); they can all be frozen. Consistency is programmer-friendly. Immutable strings make strings stand out as a "special" data structure, when they're not really, if you use them as data stores.
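The points above can be sketched in a few lines of Ruby. This is only an illustration with made-up variable names, assuming Ruby 2.5 or later for `FrozenError`. One detail worth knowing for point 1: `Hash#[]=` stores a frozen copy of an unfrozen String key, so mutating the original string afterwards does not disturb the hash.

```ruby
# Point 3: String#insert changes the receiver in place.
text = "helloworld".dup
text.insert(5, ", ")

# The immutable-style equivalent: slice around the insertion point
# and build a new string, leaving the source untouched.
source = "helloworld".freeze
rebuilt = source[0, 5] + ", " + source[5..-1]

# Point 1: a String key is copied and frozen when stored via Hash#[]=.
key = "user".dup
table = {}
table[key] = 1
key << "_id"                    # mutates only the original string
stored_key = table.keys.first

# Point 4: freeze works the same way on other data structures.
# (freeze is shallow: elements of a frozen array are not frozen by it.)
list = [1, 2, 3].freeze
begin
  list << 4
rescue FrozenError
  list_rejected = true
end
```

So the hash-key case that immutable strings are meant to protect is already covered for plain String keys, while everything else stays mutable until you choose to freeze it.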
