I have a text file with accented characters like é. According to File Encoding Checker, the file is encoded as windows-1252. I read the file with the following Node.js code:

fs.readFile('frenchVerbsList.txt','utf-8', function(err, data) {
    if (err) {
        return console.log("ERROR here!: " + err);
    }
    frenchWords = data.split('\r\n');
    console.log(frenchWords);
});

The output from the console.log statement shows a question mark instead of the accented characters. What has happened?

  • The file probably isn't UTF-8 encoded. Commented May 31, 2016 at 9:27
  • Can you provide the file? Commented May 31, 2016 at 9:28

1 Answer

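Node only supports a handful of encodings natively (utf8, latin1, ascii, utf16le and a few others), and windows-1252 is not one of them. When the file is read as 'utf-8', the byte for é (0xE9 in windows-1252) is not a valid UTF-8 sequence, so it is decoded as the replacement character U+FFFD, which is what shows up as a question mark. You need to convert the file's contents to utf-8, for example with the encoding module.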
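To see the effect without the original file, here is a minimal sketch that decodes the same bytes two ways. The sample bytes are my own (the word "préférer" as it would be stored in windows-1252), not taken from the asker's file. Note that the built-in latin1 decoding only agrees with windows-1252 outside the 0x80-0x9F range, so it handles é but would not handle a character such as œ.

```javascript
// "préférer" as windows-1252 bytes: é is the single byte 0xE9.
const bytes = Buffer.from([0x70, 0x72, 0xe9, 0x66, 0xe9, 0x72, 0x65, 0x72]);

// Decoded as UTF-8, each lone 0xE9 is invalid and becomes U+FFFD.
const asUtf8 = bytes.toString('utf8');

// Decoded as latin1, 0xE9 maps directly to U+00E9 ("é").
const asLatin1 = bytes.toString('latin1');

console.log(asUtf8);
console.log(asLatin1);
```

The first line printed contains replacement characters where the accents should be, which matches the symptom in the question.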

Something like this (untested):

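var fs = require("fs");
var encoding = require("encoding"); // npm install encoding

// No encoding argument, so readFile passes the raw bytes as a Buffer
fs.readFile('frenchVerbsList.txt', function(err, text) {
    if (err) return console.log("ERROR here!: " + err);

    // convert(input, toCharset, fromCharset) returns a Buffer in the target charset
    var resultBuffer = encoding.convert(text, 'utf8', 'windows1252');
    frenchWords = resultBuffer.toString().split('\r\n');

    console.log(frenchWords);
});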
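As an alternative that avoids an extra module, newer Node.js versions expose the WHATWG TextDecoder as a global. The sketch below is also untested against the asker's file and assumes a Node.js build with full ICU data; without it, constructing a 'windows-1252' decoder throws. The helper names decodeWindows1252 and readWindows1252Lines are made up for illustration, and the sample bytes are my own.

```javascript
const fs = require('fs');

// Hypothetical helper: decode windows-1252 bytes into a JavaScript string.
// Assumes a Node.js build with full ICU; otherwise the constructor throws.
function decodeWindows1252(bytes) {
    return new TextDecoder('windows-1252').decode(bytes);
}

// Hypothetical helper: read a windows-1252 file and split it into lines.
function readWindows1252Lines(path) {
    const raw = fs.readFileSync(path); // no encoding argument, so this is a Buffer
    return decodeWindows1252(raw).split(/\r?\n/);
}

// Sample bytes for "œuf\r\nété" as stored in windows-1252.
const sample = Buffer.from([0x9c, 0x75, 0x66, 0x0d, 0x0a, 0xe9, 0x74, 0xe9]);
const sampleLines = decodeWindows1252(sample).split(/\r?\n/);
console.log(sampleLines);
```

With the file from the question, the call would be readWindows1252Lines('frenchVerbsList.txt'). Splitting on /\r?\n/ rather than '\r\n' is a small design choice so the same code also works if the file has Unix line endings.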

1 Comment

It certainly looks good, and I duly installed the "encoding" module in this project's directory. However, it didn't change the output. I then used Notepad++ to convert the French file to UTF-8, and that produced the correct output. Thanks for the info about the "encoding" module though, it led me in the right direction :-)
