I'm having trouble removing a \u000-style sequence from my string.
First, I read a byte[] from my file into a string with String str = new String(bytes, "UTF8");. The resulting str equals \u0004Word, which means 4Word, where 4 is the length of the word Word. Now I need to convert it to a plain 4Word. replaceAll("\u000", ""), replaceAll("\\\\u000", ""), etc. don't work. How can I do that?
void FillingStorage() throws Exception {
    Path path = Paths.get(System.getProperty("db.file")); // that's my file
    byte[] data = Files.readAllBytes(path);
    String str = new String(data, "UTF8");
    System.out.println(str);
    String res = str.replaceAll("I don't know what to write here cos nothing I've tried works");
    return;
}
UPDATE!
First, I fill my HashMap with Key -> Value and Key1 -> Value1. Then I write it to a file as bytes.
So when I convert it back to a string and print it, I see Key Value Key1 Value1 instead of 3Key 5Value 4Key1 6Value1. But surprisingly, if you inspect the string that I print, you will see something like \u0003Key \u0005Value etc., so it looks like my string contains these numbers but Java can't print them.
This is how I write my bytes to the file:
DataOutputStream stream = new DataOutputStream(new FileOutputStream(System.getProperty("db.file"), true));
for (Map.Entry<String, String> entry : storage.entrySet()) {
    byte[] bytesKey = entry.getKey().getBytes(StandardCharsets.UTF_8);
    stream.write((int) bytesKey.length); // it disappears!
    stream.write(bytesKey);
    byte[] bytesVal = entry.getValue().getBytes(StandardCharsets.UTF_8);
    stream.write((Integer) bytesVal.length); // disappears too!
    stream.write(bytesVal);
}
stream.close();
What exactly is in str? I am asking because I doubt that there is \u000 in it, since you claim that replaceAll("\\\\u000", "") doesn't work. Or maybe you forgot to store the result of replaceAll in the str reference (strings are immutable, so the original string is not changed by the replaceAll method; a new string is created and returned).
Use new String(data, StandardCharsets.UTF_8) instead to avoid the UnsupportedEncodingException, which can't actually happen with UTF-8.
If there were a \u0080 or greater at the beginning of the string, it would cause problems interpreting the data as UTF-8. You need to remove the length before you convert it to a string.
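The advice to take the length out before decoding could look roughly like the sketch below. It assumes each record is exactly one length byte followed by that many UTF-8 bytes, which is what the writing loop in the question produces. The class name LengthPrefixedReader and the helper toReadable are hypothetical names made up for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class LengthPrefixedReader {

    // Hypothetical helper: walks records laid out as [1 length byte][UTF-8 bytes]
    // and renders each one as "<length><text>", separated by spaces.
    static String toReadable(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        StringBuilder sb = new StringBuilder();
        while (buf.hasRemaining()) {
            // Read the length as an unsigned byte (0..255) instead of a character.
            int length = buf.get() & 0xFF;
            byte[] payload = new byte[length];
            // Throws BufferUnderflowException if the data is shorter than announced.
            buf.get(payload);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Decode only the payload bytes as UTF-8; the length byte is never decoded.
            sb.append(length).append(new String(payload, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample bytes in the assumed layout: 3 "Key", then 5 "Value".
        byte[] sample = {3, 'K', 'e', 'y', 5, 'V', 'a', 'l', 'u', 'e'};
        System.out.println(toReadable(sample)); // prints 3Key 5Value
    }
}
```

The length bytes were never lost: \u0003 and \u0005 are control characters, so they are in the string but have no visible glyph when printed. Note also that DataOutputStream.write(int) writes only the low eight bits of its argument, so this layout can only represent lengths up to 255; writeInt on the writing side with a matching four-byte read would be one way around that.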