What am I missing?
I'm trying to scrape some JSON, but I keep receiving the JSON wrapped in HTML markup:
response.data['html'] returns:
2021-02-18 10:35:57 [bcb] DEBUG: b'<html><head></head><body><pre style="word-wrap: break-word; white-space: pre-wrap;">{"TotalRows":132,"RowCount":15,"Rows":[{"tit`....
Here is the code:
yield scrapy.Request(address_pesquisa, self.parse, meta={
    'splash': {
        'args': {
            # set rendering arguments here
            'html': 1,
            'png': 0,
        },
        # optional parameters
        'endpoint': 'render.json',  # optional; default is render.json
        'splash_url': 'http://192.168.15.100:8050',  # optional; overrides SPLASH_URL
        'slot_policy': scrapy_splash.SlotPolicy.PER_DOMAIN,
        'splash_headers': {},  # optional; a dict with headers sent to Splash
        'dont_process_response': False,  # optional, default is False
        'dont_send_headers': True,  # optional, default is False
        'magic_response': True,  # optional, default is True
    }
})
Do I have to strip this HTML wrapper myself with a regex, or is my Scrapy setup misconfigured?
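
For reference, this is a minimal sketch of the manual workaround I have in mind. It assumes the rendered body always wraps the JSON in a single <pre> element, as in the log output shown earlier. The helper name extract_json_from_pre is hypothetical, and the sample payload is shortened by hand (the empty "Rows" list is made up, because the real output is truncated).

```python
import html
import json
import re

# Matches the <pre> element that the rendered page wraps around the JSON body.
PRE_RE = re.compile(r"<pre\b[^>]*>(.*?)</pre>", re.DOTALL | re.IGNORECASE)


def extract_json_from_pre(rendered_html):
    """Return the parsed JSON found inside the first <pre> element.

    Hypothetical helper: assumes the JSON payload sits in one <pre> block.
    """
    if isinstance(rendered_html, bytes):
        rendered_html = rendered_html.decode("utf-8")
    match = PRE_RE.search(rendered_html)
    if match is None:
        raise ValueError("no <pre> element found in rendered HTML")
    # Rendered text may be HTML-escaped (e.g. &amp;), so undo that first.
    return json.loads(html.unescape(match.group(1)))


# Shortened sample based on the logged output; "Rows" is left empty here.
sample = (
    b'<html><head></head><body>'
    b'<pre style="word-wrap: break-word; white-space: pre-wrap;">'
    b'{"TotalRows":132,"RowCount":15,"Rows":[]}'
    b'</pre></body></html>'
)

data = extract_json_from_pre(sample)
print(data["TotalRows"])  # prints 132
```

In the spider callback this would be called as extract_json_from_pre(response.data['html']). It works on the sample, but it feels like a hack, which is why I'm asking whether there is a cleaner way.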