I am trying to write a list of Russian strings to a txt file. (I get the list with unique1 = np.unique(df['search_term']); it is a numpy.ndarray.)

thefile = open('search_term.txt', 'w')
for item in unique1:
    thefile.write("%s\n" % item)

In the list the strings look correct, but after writing, the file contents look like this:

 предметов berger bg bg045-14 отзывы
 звезд 
 воронеж

Why do I get that?

  • @Keiwan my list is numpy.ndarray and I can't use this Commented Jul 8, 2016 at 10:29
  • What is the encoding of the data? Commented Jul 8, 2016 at 10:32
  • utf-8 @PadraicCunningham Commented Jul 8, 2016 at 10:37
  • Does the data appear correct when you view it in your dataframe? Commented Jul 8, 2016 at 14:38

1 Answer

Try writing to the file like this:

import codecs

# Open the file with an explicit UTF-8 encoding; the with-block closes it afterwards.
with codecs.open('search_term.txt', 'w', encoding='utf-8') as thefile:
    for item in unique1:
        thefile.write("%s\n" % item)

The problem is that the file is likely not encoded correctly, which is why the characters display incorrectly.
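
To check whether the garbling comes from an encoding mismatch, here is a minimal sketch. It assumes the items are already Unicode text; the `terms` list is a hypothetical stand-in for `unique1`, the file goes to a temporary directory, and `io.open` is used instead of `codecs.open` only because it behaves the same on Python 2 and 3. It writes the strings as UTF-8, reads them back as UTF-8, and then reads the same bytes as Windows-1252 (cp1252) to show how that kind of garbled text can arise.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Hypothetical stand-in for the question's `unique1` array of search terms.
terms = [u"воронеж", u"звезд"]

# Write into a temporary directory so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "search_term.txt")

# Write text with an explicit encoding.
with io.open(path, "w", encoding="utf-8") as f:
    for item in terms:
        f.write(u"%s\n" % item)

# Reading back with the same encoding recovers the original strings.
with io.open(path, "r", encoding="utf-8") as f:
    round_trip = f.read().splitlines()

# Reading the same bytes with a different encoding yields garbled text.
with io.open(path, "r", encoding="cp1252") as f:
    garbled = f.read().splitlines()

print(round_trip == terms)
```

If `garbled` resembles what your editor shows, the file itself may be valid UTF-8 and the program displaying it may simply be assuming a different encoding.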

1 Comment

UnicodeDecodeError: 'utf8' codec can't decode byte 0xd7 in position 2: invalid continuation byte
