How do you remove a sub-string according to index from a Rust String or &str type?

Question

I have an &str type string from which I want to remove a sub-string. I have an algorithm to calculate the start and end positions of the part to be removed. How can I now remove the sub-string?

To illustrate this more clearly, if I were using C++, I would do this:

#include <iostream>
#include <string>

int main() {
    std::string foo = "Hello";
    int start = 2, stop = 4;
    std::cout << foo;
    foo.erase(start, stop - start);
    std::cout << std::endl << foo << std::endl;
}

My code in Rust:

fn main(){
    let mut foo: &str = "hello";
    let start: i32 = 0;
    let stop: i32 = 4;
    //what goes here?
}

You may need to adjust whether the range bounds are inclusive, but this is the idea: (&foo[..start]).to_string() + &foo[stop..]
– Ömer Erden, Commented Jun 21, 2020 at 11:32
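
The idea in this comment can be sketched as a small function. This is only an illustration under a few assumptions: the name `remove_byte_range` is hypothetical, the indices are byte offsets, and they are `usize` rather than the `i32` used in the question, because string slices are indexed with `usize` ranges.

```rust
/// Hypothetical helper: returns a copy of `s` without the bytes in `start..stop`.
/// Panics if `start > stop`, if `stop > s.len()`, or if either offset
/// does not fall on a char boundary.
fn remove_byte_range(s: &str, start: usize, stop: usize) -> String {
    assert!(start <= stop, "start must not exceed stop");
    // Slice out the parts to keep; slicing checks bounds and char boundaries.
    let head = &s[..start];
    let tail = &s[stop..];
    let mut out = String::with_capacity(head.len() + tail.len());
    out.push_str(head);
    out.push_str(tail);
    out
}
```

For the question's example, `remove_byte_range("hello", 0, 4)` would leave only the final `o`.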

Kitsu · Accepted Answer · 2020-06-21 11:48:26Z

11

&str is an immutable slice, somewhat similar to C++'s std::string_view, so you cannot modify it. Instead, you can use iterators and collect a new String:

let removed: String = foo
    .chars()
    .take(start)
    .chain(foo.chars().skip(stop))
    .collect();

The other way is to modify a String in place:

let mut foo: String = "hello".to_string();

// ...

foo.replace_range(start..stop, "");

Keep in mind, however, that the last example is semantically different, because it operates on byte indices rather than char indices. It may therefore panic when misused (e.g. when the start offset lies in the middle of a multi-byte char).
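
To make the difference concrete, here is a minimal sketch that wraps both approaches in functions. The names `remove_by_chars` and `remove_by_bytes` are hypothetical, and both assume `start <= stop`.

```rust
/// Char-index version: `start` and `stop` count Unicode scalar values.
/// Assumes `start <= stop`; otherwise some chars would appear twice.
fn remove_by_chars(s: &str, start: usize, stop: usize) -> String {
    s.chars().take(start).chain(s.chars().skip(stop)).collect()
}

/// Byte-index version: `start` and `stop` are byte offsets.
/// Panics if an offset is out of range or not on a char boundary.
fn remove_by_bytes(s: &str, start: usize, stop: usize) -> String {
    let mut owned = s.to_string();
    owned.replace_range(start..stop, "");
    owned
}
```

On ASCII input such as `"hello"` the two agree. On `"héllo"`, where `é` occupies two bytes, the range `1..3` removes `é` and the first `l` in the char version, but only `é` in the byte version.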

edited Jun 21, 2020 at 11:48

answered Jun 21, 2020 at 11:30

Kitsu

4 Comments

mcarton Over a year ago

The replace_range way is likely faster. Also note that with chars the indexes are unlikely-to-be-useful codepoint indexes, while with replace_range they are byte-indexes, so the two examples are not completely equivalent.

Kitsu Over a year ago

Thanks, extended with a notice.

Arun Parolikkal Over a year ago

replace_range was exactly what I was looking for. Thanks so much! Could you clarify one more thing? As I understand it, replace_range will only work for ASCII characters (those that have a fixed 1-byte size), and not with emojis or other multi-byte Unicode characters. Is this the only implication of using replace_range as opposed to chars?

Kitsu Over a year ago

String is a UTF-8 string, which can still be used as a byte array while preserving its invariant (it must remain valid UTF-8). So you can get a panic quite easily: "я".to_string().replace_range(1.., ""), though this works: "я".to_string().replace_range(0.., ""). Therefore you can use replace_range if you know the proper char boundaries.
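
If the offsets come from untrusted input, a non-panicking variant can be sketched with `str::get`, which returns `None` instead of panicking when an offset is out of range or not on a char boundary. The function name `try_remove_range` is hypothetical; it is not part of the standard library.

```rust
/// Hypothetical checked variant: returns `None` instead of panicking when
/// `start..stop` is not a valid byte range on char boundaries.
fn try_remove_range(s: &str, start: usize, stop: usize) -> Option<String> {
    if start > stop {
        return None;
    }
    // `get` yields None for out-of-range or mid-char offsets.
    let head = s.get(..start)?;
    let tail = s.get(stop..)?;
    Some([head, tail].concat())
}
```

With `"я"` (two bytes), `try_remove_range("я", 1, 2)` yields `None`, while `try_remove_range("я", 0, 2)` yields an empty string.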

Kaplan · Accepted Answer · 2023-07-08 22:46:54Z

1

A solution that uses the character indexes of the beginning and the end of the substring to be removed with splice():

fn remove(start: usize, stop: usize, s: &str) -> String {
    let mut v: Vec<char> = s.chars().collect();
    v.splice(start..stop, vec![]);
    v.iter().collect()
}
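
Note that `splice` panics when the range is out of bounds or reversed. A hedged variant that clamps the indices first might look like the sketch below; the name `remove_clamped` and the clamping policy are assumptions, and it uses `Vec::drain` instead of `splice` since nothing is inserted.

```rust
/// Hypothetical variant of the function above: clamps the char range to the
/// string instead of panicking on out-of-range or reversed indices.
fn remove_clamped(start: usize, stop: usize, s: &str) -> String {
    let mut v: Vec<char> = s.chars().collect();
    // Clamp so that start <= stop <= v.len() always holds.
    let stop = stop.min(v.len());
    let start = start.min(stop);
    // Dropping the returned iterator removes the drained range.
    v.drain(start..stop);
    v.into_iter().collect()
}
```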

Kitsu's solution w/o lambda

fn remove(start: usize, stop: usize, s: &str) -> String {
    let mut rslt = String::new();
    for (i, c) in s.chars().enumerate() {
        // Keep only the characters outside the half-open range start..stop.
        if i < start || i >= stop {
            rslt.push(c);
        }
    }
    rslt
}

…as fast as replace_range, but it can handle Unicode characters without character boundary calculations.

edited Jul 8, 2023 at 22:46

answered Aug 4, 2022 at 13:47

Kaplan

7 Comments

Chayim Friedman Over a year ago

I don't think it's as fast as replace_range() (did you benchmark?), and replace_range() can certainly handle Unicode.

Kaplan Over a year ago

@Friedman Apart from the fact that you first have to figure out the byte indices of these Unicode characters, yes, replace_range() works with Unicode.

Chayim Friedman Over a year ago

Even then, I'm not sure this will be faster than finding the indices then using replace_range().

Kaplan Over a year ago

@Friedman There is no "even then"; finding the indices always has to be done. And since there is no try_replace_range(), your program will panic if you make a mistake in finding the byte indices that replace_range() needs in order to work with Unicode characters. Finding them takes additional time, which must be taken into account.

Chayim Friedman Over a year ago

Finding the byte indices is as simple as let mut c = s.char_indices(); let start = c.nth(start).unwrap().0; let end = c.nth(end - start).unwrap().0;. And I know this will take some time, but I still think this + replace_range() will be faster than your approach.
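
A fuller sketch of this approach follows, under some assumptions. In the one-liner above, the second `nth` call continues from the element after `start`, and `start` has by then been rebound to a byte offset, so this version uses `stop - start - 1` with the original char indices. It also treats the end of the string as a valid stop position. The name `remove_char_range` and the `bool` return value are hypothetical choices.

```rust
/// Hypothetical helper: removes the chars in `start..stop` (char indices)
/// in place. Returns false, leaving `s` untouched, if the range is invalid.
fn remove_char_range(s: &mut String, start: usize, stop: usize) -> bool {
    if start > stop {
        return false;
    }
    let len = s.len();
    // Byte offset of every char, followed by the end-of-string offset.
    let mut offsets = s.char_indices().map(|(b, _)| b).chain(std::iter::once(len));
    let start_b = match offsets.nth(start) {
        Some(b) => b,
        None => return false,
    };
    // `nth` has already consumed `start + 1` items, hence the `- 1`.
    let stop_b = if stop == start {
        start_b
    } else {
        match offsets.nth(stop - start - 1) {
            Some(b) => b,
            None => return false,
        }
    };
    s.replace_range(start_b..stop_b, "");
    true
}
```

The string is walked once to find both offsets, and the removal itself happens in the existing buffer. Whether this beats the other approaches would need to be benchmarked for a given workload.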

Ollegn · Accepted Answer · 2023-07-08 17:13:27Z

0

If you prefer a slightly more performant way, compared to replace_range, that works on Unicode, you can use this:

fn remove_range(text: &str, start: usize, end: usize) -> String {
    let start = text.floor_char_boundary(start);
    let end = text.ceil_char_boundary(end);
    [&text[..start], &text[end..]].concat()
}

Just a note: since working with variable-width characters can be troublesome, this solution uses an experimental feature, #![feature(round_char_boundary)], to allow for "safe" string manipulation. The results might differ from what you expect with complex multi-char arrangements, but it will not panic.
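
Where the experimental feature is not available, similar rounding can be approximated with the stable `str::is_char_boundary`. This is a sketch, not the standard library implementation: `floor_boundary`, `ceil_boundary`, and `remove_range_rounded` are hypothetical names, and clamping out-of-range indices to the string length is an assumption of the sketch.

```rust
/// Hypothetical stand-in: largest char boundary <= i (clamped to s.len()).
fn floor_boundary(s: &str, mut i: usize) -> usize {
    if i >= s.len() {
        return s.len();
    }
    // Offset 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}

/// Hypothetical stand-in: smallest char boundary >= i (clamped to s.len()).
fn ceil_boundary(s: &str, mut i: usize) -> usize {
    if i >= s.len() {
        return s.len();
    }
    // s.len() is always a boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i += 1;
    }
    i
}

/// Removes the byte range after widening it to char boundaries.
fn remove_range_rounded(text: &str, start: usize, end: usize) -> String {
    let start = floor_boundary(text, start);
    // `max(start)` keeps a reversed range from producing an invalid slice.
    let end = ceil_boundary(text, end).max(start);
    [&text[..start], &text[end..]].concat()
}
```

The widening is where results can surprise: in `"aяb"`, the empty byte range `2..2` sits inside `я`, so rounding outward removes the whole character.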

answered Jul 8, 2023 at 17:13

Ollegn

3 Comments

Ollegn Over a year ago

If not using Unicode, just remove the xxx_char_boundary calls; it will work fine.

Chayim Friedman Over a year ago

Why do you think it's more performant than replace_range()? I think the opposite.

Ollegn Over a year ago

I did not post any profiling because it "depends", but I would advise profiling for your specific architecture and instruction set. For me, yes, it is slightly faster. The worst-case scenario is the same performance.
