Usually, when I read text files, I do it like this:

 File file = new File("some_text_file.txt");
 Scanner scanner = new Scanner(new FileInputStream(file));
 StringBuilder builder = new StringBuilder();
 while(scanner.hasNextLine()) {
     builder.append(scanner.nextLine());
     builder.append('\n');
 }
 scanner.close();
 String text = builder.toString();

There may be better ways, but this method has always worked for me perfectly.

For what I am working on right now, I need to read a large text file (over 700 kilobytes in size). Here is a sample of the text when opened in Notepad (the one that comes standard with any Windows operating system):

"lang"
{
    "Language"      "English"
    "Tokens"
    {
        "DOTA_WearableType_Daggers"     "Daggers"
        "DOTA_WearableType_Glaive"      "Glaive"
        "DOTA_WearableType_Weapon"      "Weapon"
        "DOTA_WearableType_Armor"       "Armor"

However, when I read the text from the file using the method that I provided above, the output is:

Sample output

I could not paste the output for some reason. I have also tried to read the file like so:

 File file = new File("some_text_file.txt");
 Path path = file.toPath();
 String text = new String(Files.readAllBytes(path));

... with no change in result.

How come the output is not as expected? I also tried reading a text file that I wrote and it worked perfectly fine.

  • Specify an encoding. The data is UTF-16. Commented May 30, 2013 at 6:43
  • Your method worked for me. Commented May 30, 2013 at 7:06

2 Answers

It looks like an encoding problem. Open the file in a tool that can detect encodings (such as Notepad++) to find out how it is encoded. Then use the Scanner constructor that takes a charset name:

Scanner scanner = new Scanner(new FileInputStream(file), encoding);

Or you can simply experiment with it, trying different encodings. It looks like UTF-16 to me.
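To make that concrete, here is a minimal sketch of both reading styles from the question with the charset passed explicitly. The class and method names (ReadWithCharset, readWithScanner, readAllAtOnce) are made up for illustration, and the demo in main writes its own small UTF-16 temp file rather than using the real one. It assumes the file really is UTF-16 with a byte-order mark, which is what Notepad's "Unicode" option normally produces.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Scanner;

public class ReadWithCharset {

    // Same loop as in the question, but the Scanner is told which charset to use
    // instead of silently falling back to the platform default.
    public static String readWithScanner(Path path, Charset charset) throws IOException {
        StringBuilder builder = new StringBuilder();
        try (Scanner scanner = new Scanner(Files.newInputStream(path), charset.name())) {
            while (scanner.hasNextLine()) {
                builder.append(scanner.nextLine()).append('\n');
            }
        }
        return builder.toString();
    }

    // The readAllBytes variant: new String(bytes) without a charset has the same
    // platform-default problem, so the charset is passed here as well.
    public static String readAllAtOnce(Path path, Charset charset) throws IOException {
        return new String(Files.readAllBytes(path), charset);
    }

    public static void main(String[] args) throws IOException {
        // Demo only: write a tiny UTF-16 file (Java adds a byte-order mark when
        // encoding with UTF_16), then read it back both ways.
        Path sample = Files.createTempFile("utf16-sample", ".txt");
        String content = "\"Language\"\t\"English\"\n";
        Files.write(sample, content.getBytes(StandardCharsets.UTF_16));

        System.out.print(readWithScanner(sample, StandardCharsets.UTF_16));
        System.out.print(readAllAtOnce(sample, StandardCharsets.UTF_16));

        Files.delete(sample);
    }
}
```

With "UTF-16", Java's decoder uses the byte-order mark at the start of the file to pick little- or big-endian and does not include the mark in the resulting text. If the file has no byte-order mark, the decoder assumes big-endian, so "UTF-16LE" may be the one to try next.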

1 Comment

Yep. It was an encoding problem. As soon as I initialized the Scanner to use UTF-16 it worked without a problem. Thanks to everyone who mentioned that.

final Scanner scanner = new Scanner(new FileInputStream(file), "UTF-16");
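If you would rather check the encoding from code than guess, one rough option is to look at the first few bytes of the file for a byte-order mark. The sketch below is only that: BomSniffer and guessCharsetFromBom are hypothetical names, it recognises just the UTF-8 and UTF-16 marks, and a file without a mark tells you nothing.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BomSniffer {

    // Returns a label for the byte-order mark found at the start of the file,
    // or null if none of the marks checked here is present.
    // UTF-32 marks are deliberately not handled in this sketch.
    public static String guessCharsetFromBom(Path path) throws IOException {
        byte[] head = new byte[3];
        int n = 0;
        try (InputStream in = Files.newInputStream(path)) {
            // read() may return fewer bytes than requested, so loop until
            // the buffer is full or the stream ends.
            while (n < head.length) {
                int r = in.read(head, n, head.length - n);
                if (r < 0) {
                    break;
                }
                n += r;
            }
        }
        if (n >= 3 && (head[0] & 0xFF) == 0xEF && (head[1] & 0xFF) == 0xBB && (head[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (n >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        if (n >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java BomSniffer <file>");
            return;
        }
        System.out.println(guessCharsetFromBom(Paths.get(args[0])));
    }
}
```

Either UTF-16 result means that passing "UTF-16" to the Scanner, as above, should work, since that decoder reads the mark itself. Decoding with "UTF-16LE" or "UTF-16BE" directly leaves the mark in the text as a leading U+FEFF character.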
