
I'm using strings.Split to split a string.

I would like my program to retain one of the elements of the resulting slice and release the large string that the element points into.

Unfortunately I can't figure out how to convert a slice of a string into a string that doesn't refer to the underlying string.

Am I supposed to do something like this:

func unslice(s string) string {
  return string([]byte(s))
}

The background is:

  1. the underlying string is very large
  2. the slice I want to retain is very small
  3. the slice I want to retain will be retained for a long time
  4. the program will run for a long time - weeks or more
  5. during the lifetime of the program it will split many of these strings (millions)

Here is an example in response to the comment.

func takesBigStringOften(big string) {
    parts := strings.Split(big, " ")

    saveTinyStringForALongTime(parts[0])
}
  • What is the goal? Give an example. Commented Mar 13, 2016 at 20:41
  • @MuffinTop I am not sure if it does what I want. I'm also not sure if there is a standard way of doing this. I imagine that this is a very common thing so I was surprised I couldn't find anything by googling. Commented Mar 13, 2016 at 20:48
  • @XXXX I added an example, although I'm not sure what you're looking for. Commented Mar 13, 2016 at 20:52

2 Answers

Some further information: benchmark code and memory profiling show that, as of Go 1.5.3, both methods allocate the same amount of heap memory, i.e. a new copy is made either way. When building a string from a byte slice, the compiler calls a runtime routine that makes a unique copy of the bytes, since strings are immutable and byte slices are not.

$ go tool pprof -alloc_space so002.test cprof0
Entering interactive mode (type "help" for commands)
(pprof) list copy
Total: 9.66MB
    9.62MB     9.62MB (flat, cum) 99.55% of Total
         .          .     15:
         .          .     16:var global string
         .          .     17:
         .          .     18:func benchmarkcopy(b *testing.B, c int) {
         .          .     19:   big := "This is a long string"
         .       240B     20:   parts := strings.Split(big, " ")
         .          .     21:   old := parts[0]
         .          .     22:   jlimit := 100
         .          .     23:   for i := 0; i < b.N; i++ {
         .          .     24:       for j := 0; j < jlimit; j++ {
    3.21MB     3.21MB     25:           global = string([]byte(old))
         .          .     26:       }
         .          .     27:       for j := 0; j < jlimit; j++ {
         .          .     28:           b := []byte(old)
    3.21MB     3.21MB     29:           global = string(b)
         .          .     30:       }
         .          .     31:       for j := 0; j < jlimit; j++ {
    3.21MB     3.21MB     32:           new := make([]byte, len(old))
         .          .     33:           copy(new, old)
         .          .     34:           global = string(old)
         .          .     35:       }
         .          .     36:   }
         .          .     37:}

2 Comments

Nice! That makes sense. Very thorough answer. I had not thought about profiling this.
So many things to admire about Go. The easy CPU and memory profiling is one of them. I'm still trying to streamline my own profiling setup so it feels easy to profile things that I care about. I see others who are better at it and seem to take it for granted. I'm still trying to get to that level of comfort myself, so the question here seemed perfect for a little experimentation.

To ensure that Go doesn't keep the underlying string in memory you will have to explicitly copy it to a new location:

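func unslice(old string) string {
    // Copy the bytes into a fresh buffer so the result does not
    // reference the memory backing old.
    buf := make([]byte, len(old))
    copy(buf, old)
    return string(buf)
}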

SmallString := unslice(BigString[0:7])
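
Since Go 1.18 the standard library also provides strings.Clone, which returns a fresh copy of its argument and so serves the same purpose as unslice. The following is a minimal sketch that assumes Go 1.18 or later; the retained variable and the body of saveTinyStringForALongTime are hypothetical stand-ins for whatever long-lived storage the program actually uses.

```go
package main

import (
	"fmt"
	"strings"
)

// retained stands in for long-lived storage (hypothetical).
var retained []string

// saveTinyStringForALongTime keeps s reachable for the life of the program.
func saveTinyStringForALongTime(s string) {
	retained = append(retained, s)
}

func takesBigStringOften(big string) {
	parts := strings.Split(big, " ")
	// Clone copies the bytes, so the saved string no longer keeps big alive.
	saveTinyStringForALongTime(strings.Clone(parts[0]))
}

func main() {
	takesBigStringOften("tiny followed by a very large amount of text")
	fmt.Println(retained[0])
}
```

On older Go versions the explicit copy in unslice remains the way to get the same effect.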

4 Comments

Thank you, does that mean in my attempted solution []byte(s) returns a slice?
@Coder yes, []byte(s) returns a byte slice where each byte is one character from the string s.
I think it does, but I am not sure. Either way, this shows very explicitly what you are trying to do, making it easier for future maintainers.
The byte slice isn't exactly "each character" if there are multibyte characters (runes) in the string. It is "each byte" of the string. You can iterate over the runes instead.
