I am trying to create an automation tool for scraping a site. As part of that, I am making a Python script that utilizes the Remote Debugging protocol through this library: https://github.com/jpramosi/geckordp
My problem is that the elements I am interested in reside inside an iframe. Elements outside the iframe can be selected easily with query_selector, but that does not work for elements inside it. As a workaround, I have resorted to reaching these elements manually by repeatedly traversing child nodes:
val = walker.query_selector(val["actor"], ".last-non-iframe-class")['node']
# print(val)
children = walker.children(val["actor"])
val = children[1]
children = walker.children(val["actor"])
val = children[0]
children = walker.children(val["actor"])
val = children[1]
children = walker.children(val["actor"])
val = children[0]
.
.
.
But this is too slow for my needs. Is there any way to make query_selector work with elements inside an iframe, and so cut down on the number of requests I have to make to the debugger server?
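For reference, the manual traversal boils down to the loop below. This is only a sketch: it assumes walker.children(actor) returns a list of node dicts that each have an "actor" key, as in my snippet. The names descend and StubWalker are hypothetical, made up for illustration, and are not part of geckordp. It still issues one request per level, so it tidies the code but does not address the speed problem.

```python
from typing import Any, Dict, List, Sequence

Node = Dict[str, Any]


def descend(walker: Any, node: Node, path: Sequence[int]) -> Node:
    """Follow a fixed sequence of child indexes starting from `node`.

    Assumption: walker.children(actor) returns a list of node dicts,
    each carrying an "actor" key (as in the snippet above).
    """
    for index in path:
        children = walker.children(node["actor"])  # one request per level
        node = children[index]
    return node


class StubWalker:
    """Hypothetical in-memory stand-in for the real walker, for a dry run."""

    def __init__(self, tree: Dict[str, List[Node]]) -> None:
        self._tree = tree

    def children(self, actor: str) -> List[Node]:
        # Unknown actors are treated as leaves with no children.
        return self._tree.get(actor, [])


if __name__ == "__main__":
    stub = StubWalker({
        "root": [{"actor": "a"}, {"actor": "b"}],
        "b": [{"actor": "c"}],
        "c": [{"actor": "d"}, {"actor": "e"}],
    })
    target = descend(stub, {"actor": "root"}, [1, 0, 1])
    print(target["actor"])
```

With the real walker, the call would look like descend(walker, val, [1, 0, 1, 0]), where val is the node returned by query_selector.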
Both the Firefox instance and the Python code run on a Linux machine. Firefox is running normally (not headless).
If that is not easy for Firefox, is there a more feasible way with Chrome Remote Debugging?
In Selenium it would need to use driver.switch_to.frame(iframe), and after that it allows queries inside this iframe. In geckordp I found switch_to_frame, but I don't know if it is what you need.
I cannot use Selenium in my case, hence the use of the lowest-level RDP. I will get back to this thread after I test the function above.