
I have the following simple function that appends text to a text file. The text in the code is 1,024 characters long:

void AppendToFile(String filename)
{
    String text = "0,1,0,0,1,0,1,1,1,0,1,0,1,0,0,1,0,1,1,1,0,1,1,0,1,0,1,1,1,1,0,1,0,1,0,0,1,1,1,1,0,1,0,0,0,0,0,1,1,1,0,1,1,0,1,0,1,0,1,1,0,1,1,1,0,0,0,0,1,0,0,1,1,0,0,0,1,0,1,0,0,0,1,0,1,1,0,0,1,0,0,1,0,1,1,0,1,0,1,0,0,0,0,1,0,0,1,1,0,1,0,0,0,1,0,0,0,0,1,1,0,0,1,0,1,1,1,1,0,1,1,1,0,0,0,0,0,0,1,0,0,0,0,1,1,0,0,0,1,0,1,0,1,1,0,0,1,0,1,0,1,0,1,1,0,0,1,1,0,0,0,0,1,1,1,1,0,1,0,1,1,1,1,0,0,0,1,1,0,1,1,0,0,0,1,0,1,1,1,0,1,1,1,1,0,1,1,0,0,0,0,1,0,0,0,0,1,1,1,1,1,1,0,0,1,0,1,1,1,1,1,1,0,1,0,0,0,0,1,1,0,1,0,1,1,1,1,1,0,0,1,0,1,1,1,1,1,1,0,1,0,0,0,0,0,0,0,1,1,1,0,0,0,0,1,0,0,1,0,1,1,0,1,0,1,0,1,0,1,1,0,1,1,1,1,0,0,1,1,1,1,1,0,1,0,0,0,0,1,0,0,0,1,1,0,0,0,1,1,1,1,0,1,1,1,0,1,1,1,1,0,0,1,0,0,0,0,1,0,1,0,1,1,0,1,0,0,1,1,0,1,0,0,1,0,0,1,0,1,0,1,1,0,1,1,1,1,1,1,1,0,1,0,0,1,1,0,0,1,0,1,1,1,0,1,1,1,1,0,0,1,0,1,1,0,0,0,1,1,1,1,1,0,1,0,0,1,0,1,0,0,1,1,0,0,0,0,1,0,1,1,1,1,1,0,1,0,1,1,0,1,1,0,1,1,1,0,0,0,1,1,0,1,0,1,1,0,1,0,0,1,1,0,1,1,1,1,0,0,1,1,1,0,1,1,1,1,1,0,0,1,0,1,1,0,1,0,0,1,0,1,0,1,0,0,1,0,0,1,1,1,1,0,0,1,1,1,1,0,0,1,1,0,1,0,1,1,1,0,0,1,1,";
    System.out.println(text);

    PrintWriter out = null;
    try {
        out = new PrintWriter(new BufferedWriter(new FileWriter(filename, true)));
        out.println(text);
    } catch (IOException e) {
        e.printStackTrace();   // don't silently swallow the exception
    } finally {
        if (out != null) {     // guard in case the FileWriter constructor threw
            out.close();
        }
    }
}

Printing to the console works fine. However, when I open the file, it looks like this:

ⰰⰱⰰⰰⰱⰰⰱⰱⰱⰰⰱⰰⰱⰰⰰⰱⰰⰱⰱⰱⰰⰱⰱⰰⰱⰰⰱⰱⰱⰱⰰⰱⰰⰱⰰⰰⰱⰱⰱⰱⰰⰱⰰⰰⰰⰰⰰⰱⰱⰱⰰⰱⰱⰰⰱⰰⰱⰰⰱⰱⰰⰱⰱⰱⰰⰰⰰⰰⰱⰰⰰⰱⰱⰰⰰⰰⰱⰰⰱⰰⰰⰰⰱⰰⰱⰱⰰⰰⰱⰰⰰⰱⰰⰱⰱⰰⰱⰰⰱⰰⰰⰰⰰⰱⰰⰰⰱⰱⰰⰱⰰⰰⰰⰱⰰⰰⰰⰰⰱⰱⰰⰰⰱⰰⰱⰱⰱⰱⰰⰱⰱⰱⰰⰰⰰⰰⰰⰰⰱⰰⰰⰰⰰⰱⰱⰰⰰⰰⰱⰰⰱⰰⰱⰱⰰⰰⰱⰰⰱⰰⰱⰰⰱⰱⰰⰰⰱⰱⰰⰰⰰⰰⰱⰱⰱⰱⰰⰱⰰⰱⰱⰱⰱⰰⰰⰰⰱⰱⰰⰱⰱⰰⰰⰰⰱⰰⰱⰱⰱⰰⰱⰱⰱⰱⰰⰱⰱⰰⰰⰰⰰⰱⰰⰰⰰⰰⰱⰱⰱⰱⰱⰱⰰⰰⰱⰰⰱⰱⰱⰱⰱⰱⰰⰱⰰⰰⰰⰰⰱⰱⰰⰱⰰⰱⰱⰱⰱⰱⰰⰰⰱⰰⰱⰱⰱⰱⰱⰱⰰⰱⰰⰰⰰⰰⰰⰰⰰⰱⰱⰱⰰⰰⰰⰰⰱⰰⰰⰱⰰⰱⰱⰰⰱⰰⰱⰰⰱⰰⰱⰱⰰⰱⰱⰱⰱⰰⰰⰱⰱⰱⰱⰱⰰⰱⰰⰰⰰⰰⰱⰰⰰⰰⰱⰱⰰⰰⰰⰱⰱⰱⰱⰰⰱⰱⰱⰰⰱⰱⰱⰱⰰⰰⰱⰰⰰⰰⰰⰱⰰⰱⰰⰱⰱⰰⰱⰰⰰⰱⰱⰰⰱⰰⰰⰱⰰⰰⰱⰰⰱⰰⰱⰱⰰⰱⰱⰱⰱⰱⰱⰱⰰⰱⰰⰰⰱⰱⰰⰰⰱⰰⰱⰱⰱⰰⰱⰱⰱⰱⰰⰰⰱⰰⰱⰱⰰⰰⰰⰱⰱⰱⰱⰱⰰⰱⰰⰰⰱⰰⰱⰰⰰⰱⰱⰰⰰⰰⰰⰱⰰⰱⰱⰱⰱⰱⰰⰱⰰⰱⰱⰰⰱⰱⰰⰱⰱⰱⰰⰰⰰⰱⰱⰰⰱⰰⰱⰱⰰⰱⰰⰰⰱⰱⰰⰱⰱⰱⰱⰰⰰⰱⰱⰱⰰⰱⰱⰱⰱⰱⰰⰰⰱⰰⰱⰱⰰⰱⰰⰰⰱⰰⰱⰰⰱⰰⰰⰱⰰⰰⰱⰱⰱⰱⰰⰰⰱⰱⰱⰱⰰⰰⰱⰱⰰⰱⰰⰱⰱⰱⰰⰰⰱⰱ਍

For a shorter string, like:

String text = "0,1,0,0,1,0,1,1,1";

or for another very long string, e.g. 1,024 copies of 'a', it works fine (so the cause is not the length of the string).

I can't understand this. Do you have any explanation?

4 Comments
  • It may be an encoding issue: 1. You should specify an encoding (e.g. UTF-8) 2. In what reader do you open the file? Commented Jan 3, 2014 at 9:41
  • Notepad. If this is an encoding issue, why can I read the short string (it is just a prefix of the long text)? Commented Jan 3, 2014 at 9:43
  • Works OK on Ubuntu, using Gedit to read the file. Commented Jan 3, 2014 at 9:47
  • I run it on Windows 7 (64 bit) with Eclipse. Commented Jan 3, 2014 at 10:17

3 Answers


The problem is with Notepad. I believe it is still incorrectly detecting the encoding, although Wikipedia claims this is fixed in Windows 7.

In all my tests I compiled and ran with Java 1.6.0_45 on Windows 7 64-bit, and the system property file.encoding was Cp1252.

With your original code, the file produced is detected by Sublime Text as UTF-8, but (importantly) the Byte Order Mark (BOM) is missing. Opening the same file in Notepad shows character placeholder squares. Re-saving the file in Sublime Text with the BOM and then opening it in Notepad gives the expected characters.

Replacing the 0s and ,s with a's and opening in Notepad, I see Chinese (I think) characters, which fits the Wikipedia information, as I presumably have the right font installed. So the encoding is being detected incorrectly. Attempting a Save As on the file in Notepad, the encoding listed is Unicode, which really means UTF-16 Little Endian (UTF-16LE) - see Setting the default Java character encoding?
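To see why those exact characters appear, here is a small standalone sketch (not from the original answer; the class name is just for illustration, and it assumes the Java 6 Charset overloads). It takes the single-byte data that was actually written and decodes it as UTF-16LE, which is what the misdetection amounts to:

import java.nio.charset.Charset;

public class MisreadDemo {
    public static void main(String[] args) {
        // The bytes actually written: one byte per character in the ANSI code page
        byte[] bytes = "0,1,0,0,1,".getBytes(Charset.forName("windows-1252"));
        // What Notepad shows when it guesses UTF-16LE for those same bytes:
        // '0' (0x30) followed by ',' (0x2C) becomes the single code unit U+2C30, and so on
        System.out.println(new String(bytes, Charset.forName("UTF-16LE")));
    }
}

Each pair of bytes collapses into one code unit around U+2C30, which is exactly the run of Glagolitic letters shown in the question.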

Replacing just the 0s with a's and opening in Notepad, I see squares again, since the incorrectly detected encoding has not matched valid characters.

Replacing all the characters with a's works because the detected encoding is then ANSI. You can see this by trying Save As in Notepad and looking at the Encoding drop-down.

Following How to add a UTF-8 BOM in Java, I added out.write('\ufeff'); before the out.println(text); to write the BOM, but with my default encoding the result in Notepad started with a ?, since again Notepad failed to detect the encoding correctly. It was again detected as ANSI, although at least the rest of the characters displayed as expected.

Adding -Dfile.encoding=UTF-8 and out.write('\ufeff'); finally produced a file that Notepad could decode and display as expected.
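Putting that together, a minimal sketch of the combination that worked (assuming the JVM is launched with -Dfile.encoding=UTF-8, otherwise the writer cannot encode U+FEFF and you get the ? described above):

PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter(filename, true)));
out.write('\ufeff');   // UTF-8 BOM (bytes EF BB BF), so Notepad stops guessing the encoding
out.println(text);
out.close();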




FileWriter uses the system default encoding, which in your case is probably NOT set to UTF-8. One way of fixing this problem is to set the system property

-Dfile.encoding=UTF-8

which will make FileWriter use UTF-8.
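If you would rather not change the JVM-wide default, a sketch of the alternative is to fix the encoding on the writer itself; OutputStreamWriter with an explicit charset name is available on Java 6 as well:

// Append in UTF-8 regardless of the platform default encoding.
// UnsupportedEncodingException is an IOException, so the question's existing catch block still applies.
PrintWriter out = new PrintWriter(new BufferedWriter(
        new OutputStreamWriter(new FileOutputStream(filename, true), "UTF-8")));
out.println(text);
out.close();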

3 Comments

@radimpe The shorter example strings mentioned do not depend on UTF-8 encoding.
So "0,1,0,1,0...." for 1024 characters is UTF-8 while "0,1,0,0,1,0,1,1,1" is not UTF-8? Quite puzzled.
I followed the instructions in stackoverflow.com/questions/361975/… to change the settings. It still doesn't work. (I get the 'Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8' as expected in the console).

This is more than likely an issue with Notepad.

Notepad (at least on Windows 7, where I've replicated your issue) has a maximum line length of 1024 characters. If you add another 0 to the end of your string, it prints fine, although the last character wraps onto a new line.

It is also unlikely to be an encoding issue, since replacing all 0s with As and all 1s with Bs gives a similar error:

 ⱡⱢⱡⱡⱢⱡⱢⱢⱢⱡⱢⱡⱢⱡⱡⱢⱡⱢⱢⱢⱡⱢⱢⱡⱢⱡⱢⱢⱢⱢⱡⱢⱡⱢⱡⱡⱢⱢⱢⱢⱡⱢⱡⱡⱡⱡⱡⱢⱢⱢⱡⱢⱢⱡⱢⱡⱢⱡⱢⱢⱡ

Again, adding or removing a character makes it print fine.
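If the line length really is what Notepad is choking on, one possible workaround (a sketch under that assumption only, reusing out and text from the question's method) is to break the output into shorter lines:

// Hypothetical workaround: write the text in chunks of at most 512 characters per line
for (int i = 0; i < text.length(); i += 512) {
    out.println(text.substring(i, Math.min(i + 512, text.length())));
}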

