
I have the following simple function that appends text to a text file. The text in the code is 1,024 characters long:

void AppendToFile(String filename)
{
    String text = "0,1,0,0,1,0,1,1,1,0,1,0,1,0,0,1,0,1,1,1,0,1,1,0,1,0,1,1,1,1,0,1,0,1,0,0,1,1,1,1,0,1,0,0,0,0,0,1,1,1,0,1,1,0,1,0,1,0,1,1,0,1,1,1,0,0,0,0,1,0,0,1,1,0,0,0,1,0,1,0,0,0,1,0,1,1,0,0,1,0,0,1,0,1,1,0,1,0,1,0,0,0,0,1,0,0,1,1,0,1,0,0,0,1,0,0,0,0,1,1,0,0,1,0,1,1,1,1,0,1,1,1,0,0,0,0,0,0,1,0,0,0,0,1,1,0,0,0,1,0,1,0,1,1,0,0,1,0,1,0,1,0,1,1,0,0,1,1,0,0,0,0,1,1,1,1,0,1,0,1,1,1,1,0,0,0,1,1,0,1,1,0,0,0,1,0,1,1,1,0,1,1,1,1,0,1,1,0,0,0,0,1,0,0,0,0,1,1,1,1,1,1,0,0,1,0,1,1,1,1,1,1,0,1,0,0,0,0,1,1,0,1,0,1,1,1,1,1,0,0,1,0,1,1,1,1,1,1,0,1,0,0,0,0,0,0,0,1,1,1,0,0,0,0,1,0,0,1,0,1,1,0,1,0,1,0,1,0,1,1,0,1,1,1,1,0,0,1,1,1,1,1,0,1,0,0,0,0,1,0,0,0,1,1,0,0,0,1,1,1,1,0,1,1,1,0,1,1,1,1,0,0,1,0,0,0,0,1,0,1,0,1,1,0,1,0,0,1,1,0,1,0,0,1,0,0,1,0,1,0,1,1,0,1,1,1,1,1,1,1,0,1,0,0,1,1,0,0,1,0,1,1,1,0,1,1,1,1,0,0,1,0,1,1,0,0,0,1,1,1,1,1,0,1,0,0,1,0,1,0,0,1,1,0,0,0,0,1,0,1,1,1,1,1,0,1,0,1,1,0,1,1,0,1,1,1,0,0,0,1,1,0,1,0,1,1,0,1,0,0,1,1,0,1,1,1,1,0,0,1,1,1,0,1,1,1,1,1,0,0,1,0,1,1,0,1,0,0,1,0,1,0,1,0,0,1,0,0,1,1,1,1,0,0,1,1,1,1,0,0,1,1,0,1,0,1,1,1,0,0,1,1,";
    System.out.println(text);

    PrintWriter out = null;
    try {
        out = new PrintWriter(new BufferedWriter(new FileWriter(filename, true)));
        out.println(text);
    } catch (IOException e) {
        e.printStackTrace();   // don't silently swallow the exception
    } finally {
        if (out != null) {     // guard in case the FileWriter constructor threw
            out.close();
        }
    }
}

Printing to the console works fine. However, when I open the file, it looks like this:

ⰰⰱⰰⰰⰱⰰⰱⰱⰱⰰⰱⰰⰱⰰⰰⰱⰰⰱⰱⰱⰰⰱⰱⰰⰱⰰⰱⰱⰱⰱⰰⰱⰰⰱⰰⰰⰱⰱⰱⰱⰰⰱⰰⰰⰰⰰⰰⰱⰱⰱⰰⰱⰱⰰⰱⰰⰱⰰⰱⰱⰰⰱⰱⰱⰰⰰⰰⰰⰱⰰⰰⰱⰱⰰⰰⰰⰱⰰⰱⰰⰰⰰⰱⰰⰱⰱⰰⰰⰱⰰⰰⰱⰰⰱⰱⰰⰱⰰⰱⰰⰰⰰⰰⰱⰰⰰⰱⰱⰰⰱⰰⰰⰰⰱⰰⰰⰰⰰⰱⰱⰰⰰⰱⰰⰱⰱⰱⰱⰰⰱⰱⰱⰰⰰⰰⰰⰰⰰⰱⰰⰰⰰⰰⰱⰱⰰⰰⰰⰱⰰⰱⰰⰱⰱⰰⰰⰱⰰⰱⰰⰱⰰⰱⰱⰰⰰⰱⰱⰰⰰⰰⰰⰱⰱⰱⰱⰰⰱⰰⰱⰱⰱⰱⰰⰰⰰⰱⰱⰰⰱⰱⰰⰰⰰⰱⰰⰱⰱⰱⰰⰱⰱⰱⰱⰰⰱⰱⰰⰰⰰⰰⰱⰰⰰⰰⰰⰱⰱⰱⰱⰱⰱⰰⰰⰱⰰⰱⰱⰱⰱⰱⰱⰰⰱⰰⰰⰰⰰⰱⰱⰰⰱⰰⰱⰱⰱⰱⰱⰰⰰⰱⰰⰱⰱⰱⰱⰱⰱⰰⰱⰰⰰⰰⰰⰰⰰⰰⰱⰱⰱⰰⰰⰰⰰⰱⰰⰰⰱⰰⰱⰱⰰⰱⰰⰱⰰⰱⰰⰱⰱⰰⰱⰱⰱⰱⰰⰰⰱⰱⰱⰱⰱⰰⰱⰰⰰⰰⰰⰱⰰⰰⰰⰱⰱⰰⰰⰰⰱⰱⰱⰱⰰⰱⰱⰱⰰⰱⰱⰱⰱⰰⰰⰱⰰⰰⰰⰰⰱⰰⰱⰰⰱⰱⰰⰱⰰⰰⰱⰱⰰⰱⰰⰰⰱⰰⰰⰱⰰⰱⰰⰱⰱⰰⰱⰱⰱⰱⰱⰱⰱⰰⰱⰰⰰⰱⰱⰰⰰⰱⰰⰱⰱⰱⰰⰱⰱⰱⰱⰰⰰⰱⰰⰱⰱⰰⰰⰰⰱⰱⰱⰱⰱⰰⰱⰰⰰⰱⰰⰱⰰⰰⰱⰱⰰⰰⰰⰰⰱⰰⰱⰱⰱⰱⰱⰰⰱⰰⰱⰱⰰⰱⰱⰰⰱⰱⰱⰰⰰⰰⰱⰱⰰⰱⰰⰱⰱⰰⰱⰰⰰⰱⰱⰰⰱⰱⰱⰱⰰⰰⰱⰱⰱⰰⰱⰱⰱⰱⰱⰰⰰⰱⰰⰱⰱⰰⰱⰰⰰⰱⰰⰱⰰⰱⰰⰰⰱⰰⰰⰱⰱⰱⰱⰰⰰⰱⰱⰱⰱⰰⰰⰱⰱⰰⰱⰰⰱⰱⰱⰰⰰⰱⰱ਍

For a shorter string, like:

String text = "0,1,0,0,1,0,1,1,1";

or for another very long string, e.g. 1,024 copies of 'a', it works fine (so the cause is not the length of the string).

I can't understand this. Do you have any explanation?

4 Comments
  • It may be an encoding issue: 1. You should specify an encoding (e.g. UTF-8) 2. In what reader do you open the file? Commented Jan 3, 2014 at 9:41
  • Notepad. If this is an encoding issue, why can I read the short string (it is just a prefix of the long text)? Commented Jan 3, 2014 at 9:43
  • Works OK on Ubuntu, using Gedit to read the file. Commented Jan 3, 2014 at 9:47
  • I run it on Windows 7 (64 bit) with Eclipse. Commented Jan 3, 2014 at 10:17

3 Answers


The problem is with Notepad. I believe it is still incorrectly detecting the encoding, although Wikipedia claims this is fixed in Windows 7.

In all my tests I compiled and ran with Java 1.6.0_45 on Windows 7 64-bit, and the system property file.encoding was Cp1252.

With your original code, the file produced is detected by Sublime Text as UTF-8, but (importantly) the Byte Order Mark (BOM) is missing. Opening the same file in Notepad shows character placeholder squares. Re-saving the file in Sublime Text with the BOM and then opening it in Notepad gives the expected characters.

Replacing the 0s and ,s with a's and opening in Notepad, I see Chinese (I think) characters, which fits the Wikipedia information, as I presumably have the right font installed. So the encoding is being detected incorrectly. Attempting a Save As on the file in Notepad, the encoding listed is Unicode, which really means UTF-16 Little Endian (UTF-16LE) - see Setting the default Java character encoding?
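To see why those exact characters appear, here is a small standalone sketch (not from the original answer; the class name is just for illustration, and it assumes the Java 6 Charset overloads). It takes the single-byte data that was actually written and decodes it as UTF-16LE, which is what the misdetection amounts to:

import java.nio.charset.Charset;

public class MisreadDemo {
    public static void main(String[] args) {
        // The bytes actually written: one byte per character in the ANSI code page
        byte[] bytes = "0,1,0,0,1,".getBytes(Charset.forName("windows-1252"));
        // What Notepad shows when it guesses UTF-16LE for those same bytes:
        // '0' (0x30) followed by ',' (0x2C) becomes the single code unit U+2C30, and so on
        System.out.println(new String(bytes, Charset.forName("UTF-16LE")));
    }
}

Each pair of bytes collapses into one code unit around U+2C30, which is exactly the run of Glagolitic letters shown in the question.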

Replacing just the 0s with a's and opening in Notepad, I see squares again, since the incorrectly detected encoding has not matched valid characters.

Replacing all the characters with a's works because the detected encoding is then ANSI. You can see this by trying Save As in Notepad and looking at the Encoding drop-down.

Following How to add a UTF-8 BOM in Java, I added out.write('\ufeff'); before the out.println(text); to write the BOM, but with my default encoding the result in Notepad started with a ?, since again Notepad failed to detect the encoding correctly. It was again detected as ANSI, although at least the rest of the characters displayed as expected.

Adding -Dfile.encoding=UTF-8 and out.write('\ufeff'); finally produced a file that Notepad could decode and display as expected.
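Putting that together, a minimal sketch of the combination that worked (assuming the JVM is launched with -Dfile.encoding=UTF-8, otherwise the writer cannot encode U+FEFF and you get the ? described above):

PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter(filename, true)));
out.write('\ufeff');   // UTF-8 BOM (bytes EF BB BF), so Notepad stops guessing the encoding
out.println(text);
out.close();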




FileWriter uses the system default encoding, which in your case is probably NOT set to UTF-8. One way of fixing this problem is to set the system property

-Dfile.encoding=UTF-8

which will make FileWriter use UTF-8.
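If you would rather not change the JVM-wide default, a sketch of the alternative is to fix the encoding on the writer itself; OutputStreamWriter with an explicit charset name is available on Java 6 as well:

// Append in UTF-8 regardless of the platform default encoding.
// UnsupportedEncodingException is an IOException, so the question's existing catch block still applies.
PrintWriter out = new PrintWriter(new BufferedWriter(
        new OutputStreamWriter(new FileOutputStream(filename, true), "UTF-8")));
out.println(text);
out.close();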

3 Comments

@radimpe The shorter example strings mentioned do not depend on UTF-8 encoding.
So "0,1,0,1,0...." for 1024 characters is UTF-8 while "0,1,0,0,1,0,1,1,1" is not UTF-8? Quite puzzled.
I followed the instructions in stackoverflow.com/questions/361975/… to change the settings. It still doesn't work. (I get the 'Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8' as expected in the console).

This is more than likely an issue with Notepad.

Notepad (at least on Windows 7, where I've replicated your issue) has a maximum line length of 1024 characters. If you add another 0 to the end of your string, it prints fine, although the last character wraps onto a new line.

It is also unlikely to be an encoding issue, since replacing all 0s with As and all 1s with Bs gives a similar error:

 ⱡⱢⱡⱡⱢⱡⱢⱢⱢⱡⱢⱡⱢⱡⱡⱢⱡⱢⱢⱢⱡⱢⱢⱡⱢⱡⱢⱢⱢⱢⱡⱢⱡⱢⱡⱡⱢⱢⱢⱢⱡⱢⱡⱡⱡⱡⱡⱢⱢⱢⱡⱢⱢⱡⱢⱡⱢⱡⱢⱢⱡ

Again, adding or removing a character makes it print fine.
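If the line length really is what Notepad is choking on, one possible workaround (a sketch under that assumption only, reusing out and text from the question's method) is to break the output into shorter lines:

// Hypothetical workaround: write the text in chunks of at most 512 characters per line
for (int i = 0; i < text.length(); i += 512) {
    out.println(text.substring(i, Math.min(i + 512, text.length())));
}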

