I have a Python scraper which scrapes a web site and inserts the data into a MySQL database. All of a sudden I got this error:

UnicodeEncodeError: 'latin-1' codec can't encode character u'\u20ac' in position 39: ordinal not in range(256)

It happened when I parsed a string that contains the euro sign, e.g. €1.

I saw some articles describing how to solve this issue but didn't understand how to apply them to my issue. I just scrape the data using BeautifulSoup, I don't encode/decode it manually.

I use the MySQLdb module (import MySQLdb) to work with MySQL.

So how do I get rid of this issue?

    What character set did you use when you created the database? Commented May 5, 2013 at 4:35

2 Answers

I had the same problem before. I think it happens because Python works with Unicode strings, while MySQL uses latin-1 as its default encoding. If your MySQL setup is not already using UTF-8, try the following.

In the MySQL configuration file, add default-character-set = utf8 under [client] and character-set-server = utf8 under [mysqld]. On Linux the configuration file is /etc/my.cnf; I don't know its location on Windows, so you will have to find that out yourself.

At the same time, connect to MySQL with sql_con = MySQLdb.connect(host=MYSQL_ADDR, user=MYSQL_USER, passwd=MYSQL_PWD, db=MYSQL_DB, charset="utf8").

To be safe, you can also add #coding: utf8 at the top of your Python file. Note that this line is a comment, not a statement. By the way, in my experience you do not have to set the encoding in MySQL 5.6.
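As a rough sketch of the connection step described above, the snippet below keeps the UTF-8 settings in one place. The helper names utf8_connect_kwargs and connect_utf8 are made up for illustration; charset and use_unicode are keyword arguments accepted by MySQLdb.connect. It assumes the tables themselves are stored as utf8, which this answer does not verify.

```python
def utf8_connect_kwargs(host, user, passwd, db):
    """Return keyword arguments for MySQLdb.connect() that request UTF-8.

    Hypothetical helper for illustration, not part of MySQLdb itself.
    """
    return {
        "host": host,
        "user": user,
        "passwd": passwd,
        "db": db,
        "charset": "utf8",     # talk to the server in UTF-8 instead of latin-1
        "use_unicode": True,   # exchange text columns as Unicode strings
    }


def connect_utf8(host, user, passwd, db):
    """Open a MySQLdb connection using the UTF-8 settings above."""
    # Imported here so utf8_connect_kwargs() can be used without the driver.
    import MySQLdb
    return MySQLdb.connect(**utf8_connect_kwargs(host, user, passwd, db))
```

With a connection opened this way, a value such as u"\u20ac1" can be passed as a query parameter, for example cursor.execute("INSERT INTO prices (label) VALUES (%s)", (u"\u20ac1",)), where the table and column names are placeholders.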

If your table uses the 'latin-1' encoding (you can check the charset with SHOW CREATE TABLE <table-name>;), then you can replace every character that latin-1 cannot represent with its numeric HTML entity:

u'EURO -- €1'.encode('latin-1', 'xmlcharrefreplace')
# result is 'EURO -- &#8364;1'
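To make the effect of the error handler concrete, here is a minimal sketch that first reproduces the failure and then shows the entity round trip. It assumes Python 3.4 or later (where html.unescape exists and encode returns bytes); the variable names are illustrative only.

```python
import html

# Same sample string as above, with the euro sign written as an escape.
text = u"EURO -- \u20ac1"

# A plain latin-1 encode fails, because U+20AC is outside the 0-255 range.
try:
    text.encode("latin-1")
    plain_encode_failed = False
except UnicodeEncodeError:
    plain_encode_failed = True

# With xmlcharrefreplace, the euro sign becomes a numeric character reference.
stored = text.encode("latin-1", "xmlcharrefreplace")

# When reading the value back, turn the references into characters again.
restored = html.unescape(stored.decode("latin-1"))
```

Keep in mind that html.unescape also converts entity-like sequences that were already present in the scraped text, such as a literal &amp;, so the round trip is only exact when the original text contains none.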

If your table uses a Unicode encoding such as UTF-8, just create a Unicode string with u'' and pass it to the DB.
