Sometimes I get the following message:
    in process_item
        item['external_link_rel'] = dict_["rel"]
    KeyError: 'rel'
This must be because the key 'rel' doesn't exist in the dictionary. I tried to handle that case but failed.
    from lxml import etree

    class CleanItem():
        def process_item(self, item, spider):
            try:
                root = etree.fromstring(str(item['external_link_body']).split("'")[1])
                dict_ = {}
                dict_.update(root.attrib)
                dict_.update({'text': root.text})
                item['external_link_rel'] = dict_["rel"]
                return item
            except KeyError as EmptyVar:
                if str(EmptyVar) == 'rel':
                    dict_["rel"] = "null"
                    item['external_link_rel'] = dict_["rel"]
                    return item
Most likely the problem is in this line: if str(EmptyVar) == 'rel'
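A small check of that suspicion, as a sketch rather than a diagnosis of the whole pipeline: converting a KeyError to a string gives the repr of the missing key, quotes included, so comparing it with the bare string 'rel' is never true. The key itself is available as the first element of the exception's args. The attrs dictionary and the variable names below are made up for illustration.

```python
# Hypothetical attribute dictionary with no 'rel' entry
attrs = {'href': 'https://example.com'}

try:
    attrs['rel']
except KeyError as err:
    as_text = str(err)         # repr of the key, i.e. the text 'rel' with quotes
    missing_key = err.args[0]  # the key itself, without quotes

# The comparison from the question never matches ...
check_from_question = (as_text == 'rel')
# ... while comparing against the key stored in args does
check_with_args = (missing_key == 'rel')

print(repr(as_text), check_from_question)
print(repr(missing_key), check_with_args)
```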
How can I make the fallback run only when this specific error occurs? Thanks in advance for any guidance.
Before asking, I did a lot of research but could not find a solution.
For information, the code above is in the pipelines.py file of a Scrapy project.
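One possible way to get the behavior described above, sketched under a few assumptions: instead of catching the KeyError, the lookup can use dict.get with a default, so "null" is stored only when 'rel' is absent. The try/except around the import is only there so the sketch also runs where lxml is not installed, and the sample item in the usage lines is invented for illustration.

```python
try:
    from lxml import etree  # the parser used in the question
except ImportError:
    # Assumption: fall back to the standard library so the sketch still runs
    import xml.etree.ElementTree as etree


class CleanItem:
    def process_item(self, item, spider):
        # Same extraction as in the question: take the markup between the
        # first pair of single quotes in the stringified field.
        markup = str(item['external_link_body']).split("'")[1]
        root = etree.fromstring(markup)

        # Copy the element's attributes and add its text
        dict_ = dict(root.attrib)
        dict_['text'] = root.text

        # dict.get returns the default instead of raising KeyError
        item['external_link_rel'] = dict_.get('rel', 'null')
        return item


# Usage with a made-up item; a plain dict stands in for a Scrapy item here
pipeline = CleanItem()
without_rel = pipeline.process_item(
    {'external_link_body': ['<a href="https://example.com">Example</a>']}, None)
with_rel = pipeline.process_item(
    {'external_link_body': ['<a href="https://example.com" rel="nofollow">Example</a>']}, None)
print(without_rel['external_link_rel'], with_rel['external_link_rel'])
```

With this shape the item is returned on every path, and no exception handling is needed for the missing attribute.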