How do I write and encode my file in UTF-8 style in python?

Question

import io

def write_ngrams(table, filename):

    with io.open(filename, "w") as file:
        for i in table:
            outputstring=(('%d %s\n' % (table[i], i)))
            encoded = outputstring.encode("utf-8")
            file.write(encoded)

tabel = ngram_table('hiep, hiep, hoera!', 3, 0)  # these are not really interesting for now

write_ngrams(tabel, "testfile3.txt")

I am getting an error at the file.write(encoded) line that states the following:

TypeError: write() argument must be str, not bytes.

However, my assignment says: the output must use the UTF-8 encoding.

I understand this to mean that the output should be in the form of b'....'

With everything I have tried, I either get the plain string without the encoding or I get the error. When I use print(encoded) I do see the output in UTF-8 encoding, but when I write it to a file the encoding is gone or I get the error.

Any tips would be appreciated.

Change io.open(filename, "w") to io.open(filename, "wb"). Notice the b next to the w.
– Anwarvic, Commented May 17, 2020 at 12:45
Please don't make more work for other people by vandalizing your posts. By posting on Stack Overflow, you've granted a non-revocable right, under the CC BY-SA 4.0 license, for Stack Overflow to distribute that content. By Stack Overflow policy, any vandalism will be reverted. If you want to know how to delete your post, take a look at How does deleting work?
– Vickel, Commented May 26, 2020 at 0:33
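
The binary-mode suggestion in the first comment above could look roughly like the sketch below. It is only an illustration: the function name write_ngrams_binary, the sample dictionary and the temporary file path are assumptions made here, since ngram_table is not shown in the question.

```python
import os
import tempfile

def write_ngrams_binary(table, filename):
    # "wb" opens the file in binary mode, so write() expects bytes, not str.
    with open(filename, "wb") as file:
        for ngram in table:
            line = "%d %s\n" % (table[ngram], ngram)
            # Encode each line explicitly before writing it.
            file.write(line.encode("utf-8"))

# Hypothetical stand-in for whatever ngram_table returns.
sample_table = {"hie": 2, "é!": 1}

path = os.path.join(tempfile.mkdtemp(), "testfile_binary.txt")
write_ngrams_binary(sample_table, path)
```

In binary mode the encoding step is your responsibility, which is why the original encoded = outputstring.encode("utf-8") line works here but fails with "w".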

rdas · Accepted Answer · 2020-05-17 12:45:32Z

3

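You can pass the string directly to write() and open the file with the encoding set to UTF-8. In text mode the file object encodes each string for you, so you don't need to call .encode() yourself.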

import io

def write_ngrams(table, filename):
    # Text mode with an explicit encoding: write() takes str and the
    # file object encodes it as UTF-8.
    with io.open(filename, "w", encoding='utf-8') as file:
        for i in table:
            outputstring = '%d %s\n' % (table[i], i)
            file.write(outputstring)

tabel = ngram_table('hiep, hiep, hoera!', 3, 0)  # these are not really interesting for now

write_ngrams(tabel, "testfile3.txt")
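
Since ngram_table is not shown, here is a self-contained variant of the code above that can be run as-is. The sample dictionary is a hypothetical stand-in for the real table, and the temporary path is only there to avoid writing into the current directory.

```python
import os
import tempfile

def write_ngrams(table, filename):
    # Text mode with an explicit encoding: write() takes str,
    # and the file object encodes it as UTF-8 on the way out.
    with open(filename, "w", encoding="utf-8") as file:
        for ngram in table:
            file.write("%d %s\n" % (table[ngram], ngram))

# Hypothetical stand-in for whatever ngram_table returns.
sample_table = {"hie": 2, "iep": 2, "oer": 1}

path = os.path.join(tempfile.mkdtemp(), "testfile3.txt")
write_ngrams(sample_table, path)

# Read the file back with the same encoding to check the round trip.
with open(path, encoding="utf-8") as file:
    contents = file.read()

print(contents)
```

In Python 3, io.open is the same function as the built-in open, so either spelling works.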

2 Comments

Douwe Over a year ago

Thank you sir, this did indeed fix my problem. However, when I open my .txt file I don't see the encoding myself. Is this supposed to be the case? (this is my first semester using Python so I am not really experienced with it yet)

rdas Over a year ago

An encoding is how characters are represented as bytes (ultimately, everything is bytes). Any modern text editor handles UTF-8 seamlessly, so it is not something you can "see" in the text itself. You can check the encoding in the editor's metadata instead; for example, VS Code shows it in the bottom-right corner.
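
If you do want to look at the encoded bytes, one way is to read the file back in binary mode. This is a small sketch with a made-up file name and sample line, not part of the original assignment:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "encoding_demo.txt")

# Write one line containing a non-ASCII character as UTF-8 text.
# newline="\n" keeps the line ending identical on every platform.
with open(path, "w", encoding="utf-8", newline="\n") as file:
    file.write("1 café\n")

# Binary mode returns the raw bytes stored on disk.
with open(path, "rb") as file:
    raw = file.read()

print(raw)                  # b'1 caf\xc3\xa9\n'
print(raw.decode("utf-8"))  # decodes back to the original text
```

The b'...' form is just how Python displays a bytes object. The file itself holds those bytes either way; here the é is stored as the two bytes \xc3\xa9.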
