I am trying to fetch the tweets of an account and write them to a file in a specific format. The account sometimes uses emojis and other characters outside the codec, so when Python gets to such a tweet it raises the error below. (The specific character it trips on here is the Greek letter "χ", but I need a fix that works for any character the codec can't encode.)
UnicodeEncodeError: 'charmap' codec can't encode character '\u03c7' in position 4: character maps to <undefined>
I tried adding .encode("utf-8") to the end of the string, but that ends up writing the raw encoded data to the file, when I actually need the words written on separate lines. Here's the code I have so far. (The code itself works: it reads the data and puts it into the format I need. I only need help with the part that writes to the file.)
    with open("LSData.txt", "a") as file:
        for status in tl:
            wordList = status.full_text.split(" ")
            for word in wordList:
                try:
                    if("http" not in word):
                        if(word == wordList[0] or
                           wordList[wordNum-1][len(wordList[wordNum-1])-1] == "." or
                           wordList[wordNum-1][len(wordList[wordNum-1])-1] == "!" or
                           wordList[wordNum-1][len(wordList[wordNum-1])-1] == "?"):
                            wordsToAdd = "-" + word + " " + wordList[wordNum+1] + "\n"
                            file.write(wordsToAdd)
                        else:
                            wordsToAdd = word + " " + wordList[wordNum+1] + "\n"
                            file.write(wordsToAdd)
                except(IndexError):
                    pass
                wordNum += 1
If I need to provide more info, let me know. Thanks in advance!
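For reference, here is a minimal sketch of the direction that usually resolves this kind of error. The 'charmap' in the message suggests the file was opened with a platform default codec (commonly a Windows code page such as cp1252) that has no mapping for "χ". Passing encoding="utf-8" to open() keeps the file in text mode, so "\n" still produces real line breaks, while letting any Unicode character be written. The helper names append_lines and read_lines and the file name LSData_demo.txt are hypothetical and only there for illustration; none of this has been run against the Twitter data above.

```python
import os
import tempfile


def append_lines(path, lines):
    """Append each string in `lines` to `path`, one per line, as UTF-8 text."""
    # encoding="utf-8" overrides the platform default codec, which is
    # what raises UnicodeEncodeError for characters it cannot map.
    with open(path, "a", encoding="utf-8") as file:
        for line in lines:
            file.write(line + "\n")


def read_lines(path):
    """Read the file back with the same encoding and return its lines."""
    with open(path, "r", encoding="utf-8") as file:
        return file.read().splitlines()


if __name__ == "__main__":
    # Hypothetical demo file in a temporary directory.
    demo_path = os.path.join(tempfile.mkdtemp(), "LSData_demo.txt")
    append_lines(demo_path, ["-\u03c7 squared", "party \U0001F389 time"])
    print(len(read_lines(demo_path)), "lines written")
```

In the original snippet, the equivalent change would be to the open("LSData.txt", "a") call only; the loop that builds wordsToAdd would stay as it is. Whatever reads the file later would need to open it as UTF-8 too.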