No, it is not safe -- but if it compiles it will almost certainly work as expected.
unsigned char buff[]="someutf8string;separated;with;";
This is fine; the standard specifically permits arrays of character type (including unsigned char) to be initialized with a string literal. Successive bytes of the string literal initialize the elements of the array.
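As a rough check of that initialization rule, here is a small sketch. The function name literal_initializes_bytes is made up for illustration; it compares the unsigned char array against the same literal stored in a plain char array.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the unsigned char array ends up with
   exactly the bytes of the string literal, terminator included. */
int literal_initializes_bytes(void)
{
    unsigned char buff[] = "someutf8string;separated;with;";
    static const char literal[] = "someutf8string;separated;with;";

    /* Both arrays are sized from the literal, so both count the '\0'. */
    assert(sizeof buff == sizeof literal);

    /* memcmp compares the raw bytes, so the signedness of char is irrelevant. */
    return memcmp(buff, literal, sizeof buff) == 0;
}
```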
strtok(buff, ";")
This is a constraint violation, requiring a compile-time diagnostic. (That's about as close as the C standard gets to saying that something is illegal.)
The first parameter of strtok is of type char*, but you're passing an argument of type unsigned char*. These two pointer types are not compatible, and there is no implicit conversion between them. A conforming compiler may reject your program if it contains a call like this (and, for example, gcc -std=c99 -pedantic-errors does reject it).
Many C compilers are somewhat lax about strict enforcement of the standard's requirements. In many cases, compilers issue warnings for code that contains constraint violations -- which is perfectly valid. But once a compiler has diagnosed a constraint violation and proceeded to generate an executable, the behavior of that executable is not defined by the C standard.
As far as I know, any actual compiler that doesn't reject this call will generate code that behaves just as you expect it to. The pointer types char* and unsigned char* almost certainly have the same representation and are passed the same way as arguments, and the types char and unsigned char are explicitly required to have the same representation for non-negative values. Even for values exceeding CHAR_MAX, like the ones you're using, a compiler would have to go out of its way to generate misbehaving code. You could have problems on a system that doesn't use two's complement for signed integers, but you're not likely to encounter such a system.
Adding an explicit cast:
strtok((char*)buff, ";")
removes the constraint violation and will probably silence any warning -- but the behavior is still strictly undefined.
In practice, though, most compilers try to treat char, signed char, and unsigned char almost interchangeably, partly to cater to code like yours, and partly because they'd have to go out of their way to do anything else.
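Here is a minimal sketch of the cast in use, with the same caveats as above. The helper name split_semicolons and its parameters are my own invention, not anything from your code. It tokenizes the buffer in place and hands the pieces back as unsigned char pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits buff in place on ';' and stores up to
   max_tokens token pointers in tokens. Returns the number stored.
   strtok overwrites delimiters with '\0', so buff must be writable. */
size_t split_semicolons(unsigned char *buff, unsigned char **tokens,
                        size_t max_tokens)
{
    size_t count = 0;
    char *tok;

    assert(buff != NULL && tokens != NULL);

    /* The cast removes the constraint violation discussed above. */
    tok = strtok((char *)buff, ";");
    while (tok != NULL && count < max_tokens) {
        tokens[count++] = (unsigned char *)tok;
        tok = strtok(NULL, ";");
    }
    return count;
}
```

With the buffer from the question this should yield three tokens, since strtok skips the empty field after the trailing semicolon. Keep in mind that strtok uses hidden static state, so this sketch is not reentrant.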
strtok is locale independent, so it shouldn't have any trouble with what you want it for. strcpy looks for the null terminator, and UTF-8 encoding doesn't have a 0 character, AFAIK. So it shouldn't be an issue either. But if you want, wait a bit and someone else will come along and either confirm or disprove what I said. Some info here: java-samples.com/showtutorial.php?tutorialid=806
UTF-8 does have a 0 character: U+0000, encoded as the byte 0x00 just like in ASCII.
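To illustrate the UTF-8 point: every byte of a multibyte UTF-8 sequence has its high bit set (0x80 or above), so neither 0x00 nor an ASCII delimiter such as ';' (0x3B) can show up inside an encoded non-ASCII character. The sketch below assumes valid UTF-8 input; utf8_token_lengths is a hypothetical name, and it records the byte length (not the character count) of each token.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: tokenizes a writable UTF-8 buffer on ';' and stores
   the byte length of each token in lengths (at most max entries).
   Returns the number of tokens recorded. */
size_t utf8_token_lengths(unsigned char *buff, size_t *lengths, size_t max)
{
    size_t count = 0;
    char *tok;

    assert(buff != NULL && lengths != NULL);

    /* Multibyte sequences only contain bytes >= 0x80, so strtok cannot
       mistake any of them for ';' or for the terminator. */
    tok = strtok((char *)buff, ";");
    while (tok != NULL && count < max) {
        lengths[count++] = strlen(tok);
        tok = strtok(NULL, ";");
    }
    return count;
}
```

For example, the bytes "na\xC3\xAF" "ve;caf\xC3\xA9" ";x" (the words "naïve" and "café" in UTF-8, followed by "x") split into three tokens of 6, 5, and 1 bytes; the two-byte characters stay intact.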