I have a function capable of translating Unicode characters into a custom 8-bit encoding:
```rust
const fn char_to_custom_encoding(c: char) -> u8;
```
I'd like to apply it to a string literal at compile time, so that only the translated array is stored in the binary and can be read at runtime without any conversion cost, with a signature along the lines of
```rust
const fn literal_to_custom_encoding(input: &'static str) -> &'static [u8]
```
The source string is expected to contain multi-byte UTF-8 characters, so naively processing it byte by byte as if it were ASCII, as done in this blog post, is not an option, and `str::chars()` cannot be used at compile time because it is not a `const` method.
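To illustrate why (this only restates what the standard library guarantees about UTF-8):

```rust
// A single non-ASCII character already breaks the byte == character assumption:
assert_eq!("é".len(), 2);           // "é" occupies two UTF-8 bytes...
assert_eq!("é".chars().count(), 1); // ...but is a single character
```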
So far, my best bet seems to be to follow the general approach described in the aforementioned blog post and manually detect UTF-8 sequences during iteration. However, aside from requiring a manual reimplementation of decoding that the standard library already provides, this leaves the resulting array padded with trailing zeros: its length has to be derived from the input's byte length, but every multi-byte character collapses into a single output byte, so the array is never filled completely. This is further complicated by the crate being `no_std`, which makes returning a `Vec` nonviable even at runtime.
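Concretely, here is a rough sketch of that manual implementation; `char_to_custom_encoding` is stubbed out and stands in for the real conversion function, and `N` has to be the input's byte length (or at least its character count), which is where the zero padding comes from:

```rust
const fn char_to_custom_encoding(c: char) -> u8 {
    // Placeholder standing in for the real conversion table.
    c as u8
}

const fn literal_to_custom_encoding<const N: usize>(input: &str) -> [u8; N] {
    let bytes = input.as_bytes();
    let mut out = [0u8; N];
    let mut i = 0; // position in the input bytes
    let mut o = 0; // position in the output array
    while i < bytes.len() {
        let b = bytes[i];
        // The leading byte determines the length of the UTF-8 sequence.
        let len = if b < 0x80 { 1 } else if b < 0xE0 { 2 } else if b < 0xF0 { 3 } else { 4 };
        // Decode the scalar value by hand, since str::chars() is not const.
        let code = match len {
            1 => b as u32,
            2 => ((b as u32 & 0x1F) << 6) | (bytes[i + 1] as u32 & 0x3F),
            3 => ((b as u32 & 0x0F) << 12)
                | ((bytes[i + 1] as u32 & 0x3F) << 6)
                | (bytes[i + 2] as u32 & 0x3F),
            _ => ((b as u32 & 0x07) << 18)
                | ((bytes[i + 1] as u32 & 0x3F) << 12)
                | ((bytes[i + 2] as u32 & 0x3F) << 6)
                | (bytes[i + 3] as u32 & 0x3F),
        };
        out[o] = match char::from_u32(code) {
            Some(c) => char_to_custom_encoding(c),
            // Unreachable: a &str is guaranteed to be valid UTF-8.
            None => panic!("invalid UTF-8"),
        };
        o += 1;
        i += len;
    }
    out
}

// "héllo" is 6 bytes but only 5 characters, so the last byte stays zero.
const ENCODED: [u8; 6] = literal_to_custom_encoding("héllo");
```

(`char::from_u32` is callable in const context on reasonably recent compilers.)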
Is there a better approach to this problem that still ensures the transformation happens at compile time? I've considered macros, but that seems to leave the problem of iterating over the UTF-8 characters largely unsolved (the `const_format` crate does something similar internally, but I'm having a hard time understanding its code), and I'd have to reimplement the conversion function as a macro as well, making it completely impossible to use at runtime even if desired.
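For reference, the only macro shape I could come up with that avoids reimplementing the conversion (`encode_literal!` is a made-up name) merely forwards the literal's byte length to a const fn like the sketch above, so the UTF-8 iteration and the padding problem remain:

```rust
// Hypothetical wrapper: $s.len() is the literal's byte length, so the
// resulting array is still zero-padded for multi-byte input.
macro_rules! encode_literal {
    ($s:literal) => {{
        const OUT: [u8; $s.len()] = literal_to_custom_encoding($s);
        &OUT // &'static [u8; N], which coerces to &'static [u8]
    }};
}
```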
`&'static [u8]` doesn't make much sense as a return type, especially without `alloc`. What memory is it supposed to reference?