I think I am following the right approach but I am still getting an encoding error:
from xml.dom.minidom import Document
import codecs
doc = Document()
wml = doc.createElement("wml")
doc.appendChild(wml)
property = doc.createElement("property")
wml.appendChild(property)
descriptionNode = doc.createElement("description")
property.appendChild(descriptionNode)
descriptionText = doc.createTextNode(description.decode('ISO-8859-1'))
descriptionNode.appendChild(descriptionText)
file = codecs.open('contentFinal.xml', 'w', encoding='ISO-8859-1')
file.write(doc.toprettyxml())
file.close()
The description node contains some characters in ISO-8859-1 encoding, which is the encoding the site itself specifies in its meta tag. But when doc.toprettyxml() starts writing to the file, I get the following error:
Traceback (most recent call last):
  File "main.py", line 467, in <module>
    file.write(doc.toprettyxml())
  File "C:\Python27\lib\xml\dom\minidom.py", line 60, in toprettyxml
    return writer.getvalue()
  File "C:\Python27\lib\StringIO.py", line 271, in getvalue
    self.buf += ''.join(self.buflist)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 10: ordinal not in range(128)
Why am I getting this error when I am decoding and encoding with the same standard?
Edited
I have the following declaration in my script file:
#!/usr/bin/python
# -*- coding: utf-8 -*-
Maybe this is conflicting?
We get the error on content containing Orientación. The site is in Spanish. I don't know about the rest of the content, as it fails before writing it to the file. If I do not use any encoding at all, it does write to the file, but the browser does not accept the result and keeps saying "error on line blah on column blah".
The description object is a byte string, not a character string.
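For reference, here is a minimal sketch of what the byte-string remark seems to imply. I have not confirmed that this is the cause of the traceback above. It assumes the description arrives as ISO-8859-1 bytes; the sample value raw_description and the helper name build_xml are made up for illustration. The idea is to decode once into a character string, let minidom do the encoding via the encoding argument of toprettyxml(), and write the resulting bytes in binary mode instead of going through codecs.open.

```python
# -*- coding: utf-8 -*-
from xml.dom.minidom import Document

# Hypothetical sample input: bytes as they might arrive from the site (ISO-8859-1).
raw_description = b"Orientaci\xf3n"


def build_xml(raw, encoding="ISO-8859-1"):
    """Return the serialized document as bytes in the given encoding."""
    doc = Document()
    wml = doc.createElement("wml")
    doc.appendChild(wml)
    prop = doc.createElement("property")
    wml.appendChild(prop)
    description_node = doc.createElement("description")
    prop.appendChild(description_node)
    # Decode exactly once so the text node holds characters, not bytes.
    description_node.appendChild(doc.createTextNode(raw.decode(encoding)))
    # With an encoding argument, toprettyxml() returns encoded bytes and
    # writes the matching encoding declaration into the XML header.
    return doc.toprettyxml(encoding=encoding)


xml_bytes = build_xml(raw_description)

if __name__ == "__main__":
    # Binary mode: the data is already encoded, so no codecs wrapper is needed.
    with open("contentFinal.xml", "wb") as out:
        out.write(xml_bytes)
```

The `# -*- coding: utf-8 -*-` line only tells Python how to read string literals in the script file itself, so it should not affect data decoded at runtime.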