import io

def write_ngrams(table, filename):

    with io.open(filename, "w") as file:
        for i in table:
            outputstring=(('%d %s\n' % (table[i], i)))
            encoded = outputstring.encode("utf-8")
            file.write(encoded)

tabel = ngram_table('hiep, hiep, hoera!', 3, 0)  # these are not really interesting for now

write_ngrams(tabel, "testfile3.txt")

I am getting an error at the file.write(encoded) line that states the following:

TypeError: write() argument must be str, not bytes.

However, my assignment says: the output must use the UTF-8 encoding.

I took that to mean the output should be in the form b'....'

With everything I have tried, I either get the plain string without the encoding or the error above. When I use print(encoded) I do see the UTF-8 encoded output, but when I write it to a file the encoding is gone or I get the error.

Any tips would be appreciated.

  • Change io.open(filename, "w") to io.open(filename, "wb"). Notice the b next to the w. Commented May 17, 2020 at 12:45

1 Answer


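You can pass the string directly to write() and open the file with the encoding set to utf-8. In text mode the file object encodes each string for you, so no call to encode() is needed: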

import io

def write_ngrams(table, filename):

    with io.open(filename, "w", encoding='utf-8') as file:
        for i in table:
            outputstring=(('%d %s\n' % (table[i], i)))
            file.write(outputstring)

tabel = ngram_table('hiep, hiep, hoera!', 3, 0)  # these are not really interesting for now

write_ngrams(tabel, "testfile3.txt")
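
For comparison, the binary-mode route suggested in the comments should put the same bytes on disk. The following is only a sketch: `sample_table` is a hand-built stand-in for whatever `ngram_table` returns (its real output is not shown here), the two function names are hypothetical, and `newline="\n"` is added so that text mode does not translate line endings on Windows.

```python
import io
import os
import tempfile

def write_ngrams_text(table, filename):
    # Text mode: pass str to write(); the file object encodes it as UTF-8.
    # newline="\n" stops "\n" from being translated to "\r\n" on Windows.
    with io.open(filename, "w", encoding="utf-8", newline="\n") as file:
        for ngram in table:
            file.write("%d %s\n" % (table[ngram], ngram))

def write_ngrams_binary(table, filename):
    # Binary mode: encode each line yourself and pass bytes to write().
    with io.open(filename, "wb") as file:
        for ngram in table:
            file.write(("%d %s\n" % (table[ngram], ngram)).encode("utf-8"))

# Hypothetical stand-in for the ngram_table(...) result (assumption).
sample_table = {"hie": 2, "é, ": 1}

with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "text.txt")
    binary_path = os.path.join(tmp, "binary.txt")
    write_ngrams_text(sample_table, text_path)
    write_ngrams_binary(sample_table, binary_path)
    # Read both files back as raw bytes to compare what was stored.
    with open(text_path, "rb") as f:
        text_bytes = f.read()
    with open(binary_path, "rb") as f:
        binary_bytes = f.read()

same_bytes = text_bytes == binary_bytes
print(same_bytes)
```

Either approach satisfies a "must use UTF-8" requirement; the text-mode version is usually the simpler one because the encoding is stated once, in the open() call.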

2 Comments

Thank you sir, this did indeed fix my problem. However, when I open my .txt file I don't see the encoding myself. Is this supposed to be the case? (This is my first semester using Python, so I am not really experienced with it yet.)
The encoding is how characters are represented in bytes (ultimately everything is bytes). Any modern text editor will handle UTF-8 seamlessly. It's not something you can "see". You can check the encoding in the text editor's metadata. For example, VS Code shows the encoding in the bottom right.
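
To make the point about bytes concrete: the b'...' shown by print(encoded) is only how Python displays a bytes object, not something that gets stored in the file. A small sketch, using a sample string chosen purely for illustration:

```python
text = "1 é\n"                   # a str: a sequence of 4 characters
encoded = text.encode("utf-8")   # a bytes object: the UTF-8 representation

print(encoded)                   # b'1 \xc3\xa9\n'
print(len(text), len(encoded))   # 4 5 (the é takes two bytes in UTF-8)

# Decoding the bytes as UTF-8 gives back the original text, which is
# what a text editor does when it opens the file.
round_trip = encoded.decode("utf-8")
print(round_trip == text)        # True
```

A file written with encoding='utf-8' contains exactly those bytes, which is why the editor shows plain text rather than anything starting with b'.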
