
Just had this error pop up while messing around with some graphics for a terminal interface...

thread 'main' panicked at 'byte index 2 is not a char boundary; it is inside '░' (bytes 1..4) of ░▒▓█', src/main.rs:38:6

Can I not use these characters, or do I need to work some magic to support what I thought were default ASCII characters?

(Here's the related code for those wondering.)

// Example call with the same parameters that led to this issue.
charlist(" ░▒▓█".to_string(), 0.66);

// Returns the n-th character in a string.
// (Where N is a float value from 0 to 1,
// 0 being the start of the string and 1 the end.)
fn charlist<'a>(chars: &'a String, amount: f64) -> &'a str {
    let chl: f64 = chars.chars().count() as f64;  // Length of the string
    let chpos = -((amount*chl)%chl) as i32;  // Scalar converted to integer position in the string
    &chars[chpos as usize..chpos as usize+1]  // Slice the single requested character
}
  • Please show your code, that resulted in this error. You probably indexed a slice in the middle of a character (that's what the error message says), and it's not allowed in rust. Commented Oct 14, 2022 at 7:13
  • @AleksanderKrauze additional context has been provided. The input string was " ░▒▓█" Commented Oct 14, 2022 at 7:21
  • What should the second line of the function do? Are you trying to return n-th character? Commented Oct 14, 2022 at 7:27
  • @AleksanderKrauze Ah yeah, probably should have specified in a comment above the function. Yes, the purpose is to return the nth character in a given string. Commented Oct 14, 2022 at 7:34

1 Answer


You seem to have a couple of misconceptions, so let me address them in order.

  1. ░, ▒, ▓, and █ are not ASCII characters! They are Unicode code points outside the ASCII range. You can check this with the following simple experiment:
fn main() {
    let slice = " ░▒▓█";
    for c in slice.chars() {
        println!("{}, {}", c, c.len_utf8());
    } 
}

This code has output:

 , 1
░, 3
▒, 3
▓, 3
█, 3

As you can see, these "boxy" characters are 3 bytes long each! Rust uses UTF-8 encoding for all of its strings. This leads to another misconception.

  2. In this line, &chars[chpos as usize..chpos as usize+1], you are trying to get a slice one byte long. String slices in Rust are indexed by byte offset, but you tried to slice in the middle of a character that is 3 bytes long. In general, a character in UTF-8 takes from one to four bytes. To get a char's length in bytes, use the len_utf8 method.
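
To make the byte-boundary rule concrete, here is a minimal sketch. The helper name safe_slice is made up for illustration; str::get and str::is_char_boundary are standard library methods. Unlike the indexing operator, get returns None when a range would split a character, so nothing panics.

```rust
/// Hypothetical helper: slices by byte range, returning None instead of
/// panicking when the range is out of bounds or splits a character.
fn safe_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}

fn main() {
    let s = "░▒▓█";

    // Four characters of 3 bytes each: valid boundaries are 0, 3, 6, 9, 12.
    println!("length in bytes: {}", s.len());
    println!("is 2 a boundary? {}", s.is_char_boundary(2));
    println!("is 3 a boundary? {}", s.is_char_boundary(3));

    // A one-byte range starting at 2 lands inside the first character.
    println!("{:?}", safe_slice(s, 2, 3));
    // A range that covers the whole first character is fine.
    println!("{:?}", safe_slice(s, 0, 3));
}
```

Indexing with &s[2..3] on the same string would panic with a "not a char boundary" message like the one in the question.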

  3. You can get an iterator over the characters of a string slice with the chars method. Getting the n-th character is then as easy as calling the iterator's nth method (which is zero-based). So the following is true:

// Index 0 is the leading space, so index 3 is '▓'.
assert_eq!(" ░▒▓█".chars().nth(3).unwrap(), '▓');

If you also want the byte index at which each char starts, you can use the char_indices method.
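
As a rough sketch of how char_indices helps here: it yields each character together with the byte offset where it starts, which is exactly what you need to cut out a valid &str. The helper name nth_char_slice is hypothetical, not something from the standard library.

```rust
/// Hypothetical helper: returns the n-th character (zero-based) as a
/// string slice, or None if the string has fewer than n + 1 characters.
fn nth_char_slice(s: &str, n: usize) -> Option<&str> {
    // `start` is the byte offset of the character, `c` the character itself.
    let (start, c) = s.char_indices().nth(n)?;
    // The slice ends where the character ends, so both ends are boundaries.
    s.get(start..start + c.len_utf8())
}

fn main() {
    let s = " ░▒▓█";

    for (offset, c) in s.char_indices() {
        println!("{:?} starts at byte {}", c, offset);
    }

    println!("{:?}", nth_char_slice(s, 3));
    println!("{:?}", nth_char_slice(s, 10));
}
```

For this string the offsets are 0, 1, 4, 7 and 10: the space takes one byte and each block character takes three.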

  4. Using an f64 value to pick the n-th character is odd, and I would encourage you to rethink whether you really want to do this. If you do, remember that because characters have variable length, the str method len returns the slice's length in bytes, not its number of characters. To know how many characters a string has, you have no option other than iterating over it. So if, for example, you want the middle character, you must first count them. I can think of two ways to do this.

    • You can collect the characters into a Vec<char> (or something similar). Then you know how many characters there are and can index the n-th one in O(1). However, this costs an additional memory allocation.

    • You can first count the characters with slice.chars().count() (chars() has no len method, because the count is not known without iterating). Then calculate the position of the n-th one and get it by iterating over chars again with nth (as I showed above). This needs no additional memory allocation, but it walks the string twice. That is still O(n), just with roughly twice the work.

Which one you pick is up to you. You will have to make a compromise.
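
Here is a minimal sketch of both options side by side. The function names and the mapping from a fraction to an index (clamp to [0, 1], multiply by the character count, cap at the last index) are my own assumptions for illustration; the question's code uses a modulo instead, so adjust to taste.

```rust
/// Assumed mapping from a fraction in [0, 1] to a character index.
/// `count` must be greater than zero; both callers below check that.
fn index_for(amount: f64, count: usize) -> usize {
    let idx = (amount.clamp(0.0, 1.0) * count as f64) as usize;
    // amount == 1.0 would give `count`, so cap at the last valid index.
    idx.min(count - 1)
}

/// Option 1: collect into a Vec<char>, then index in O(1). Allocates.
fn pick_collect(s: &str, amount: f64) -> Option<char> {
    let chars: Vec<char> = s.chars().collect();
    if chars.is_empty() {
        return None;
    }
    Some(chars[index_for(amount, chars.len())])
}

/// Option 2: count the characters, then walk again with nth.
/// No allocation, but two passes over the string.
fn pick_two_pass(s: &str, amount: f64) -> Option<char> {
    let count = s.chars().count();
    if count == 0 {
        return None;
    }
    s.chars().nth(index_for(amount, count))
}

fn main() {
    let shades = " ░▒▓█";
    println!("{:?}", pick_collect(shades, 0.66));
    println!("{:?}", pick_two_pass(shades, 0.66));
}
```

If the same character list is queried many times, collecting once and reusing the Vec<char> is probably the better trade; for a one-off lookup on a short string the two-pass version avoids the allocation.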

  5. This isn't really a correctness problem, but prefer &str over &String in function arguments (it gives your callers more flexibility). You also don't have to spell out the lifetime when there is only one reference in the arguments and the other one is in the return type: Rust's lifetime elision rules give them the same lifetime.
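
Putting the points together, one possible rewrite of charlist could look like the sketch below. It takes &str, relies on lifetime elision, and slices on character boundaries. The clamping of amount and the empty-string fallback are assumptions on my part, not behaviour taken from the question.

```rust
// Returns the character at the given fraction of the string as a &str.
// No explicit lifetime is needed: with a single reference parameter,
// the returned reference gets the same lifetime by elision.
fn charlist(chars: &str, amount: f64) -> &str {
    let count = chars.chars().count();
    if count == 0 {
        return ""; // assumed fallback for empty input
    }
    // Assumed mapping: clamp to [0, 1], scale, cap at the last character.
    let idx = ((amount.clamp(0.0, 1.0) * count as f64) as usize).min(count - 1);
    match chars.char_indices().nth(idx) {
        // Slice from the character's first byte to one past its last byte.
        Some((start, c)) => &chars[start..start + c.len_utf8()],
        None => "",
    }
}

fn main() {
    // A &String coerces to &str, so owned strings still work.
    let owned = String::from(" ░▒▓█");
    println!("{}", charlist(&owned, 0.66));

    // String literals can now be passed directly as well.
    println!("{}", charlist("abc", 0.0));
}
```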