I'm working on a scraper project, and one of the goals is to get every image link from the HTML and CSS of a website. I was using BeautifulSoup and TinyCSS for that, but now I'd like to move everything to Selenium so that I can load the JavaScript.

I can't find a way in the docs to target specific CSS properties without knowing the tag, id, or class. I can get the images from the HTML easily, but I need to find every "background-image" property in the CSS in order to get the URL from it.

Example: background-image: url("paper.gif");

Is there a way to do this, or should I loop over each element and check its CSS (which would be time-consuming)?

  • "get every image from HTML & CSS..." What do you mean by getting images from CSS? What is your expected result? Do you need the image files or the links to the images? Commented Sep 28, 2018 at 15:26
  • Sorry, I just edited the question. I only need the links to every image on the website, not just the ones from the HTML "img" tags. Commented Sep 28, 2018 at 15:33

1 Answer

You can grab all the <style> tags and parse their contents, searching for what you need.

You can also download each CSS file from its resource URL and parse it.
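
For both of these options the parsing step is the same: pull the url(...) targets out of a block of CSS text. Here is a minimal sketch of that step, not a full CSS parser. The names extract_css_urls and style_tag_contents are hypothetical helpers made up for this example, and driver is assumed to be an already-started Selenium WebDriver with the page loaded.

```python
import re
from urllib.parse import urljoin

# Matches url(...) with optional single or double quotes around the target.
CSS_URL_RE = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""", re.IGNORECASE)


def extract_css_urls(css_text, base_url=""):
    """Return the url(...) targets found in css_text, de-duplicated, in order.

    base_url is the address of the stylesheet (or of the page, for <style>
    tags); relative targets are resolved against it when it is given.
    """
    found = []
    for match in CSS_URL_RE.finditer(css_text):
        target = match.group(2).strip()
        # Skip empty targets and embedded data: URIs, which are not links.
        if not target or target.lower().startswith("data:"):
            continue
        absolute = urljoin(base_url, target) if base_url else target
        if absolute not in found:
            found.append(absolute)
    return found


def style_tag_contents(driver):
    """Return the text of every <style> tag in the page loaded in driver."""
    script = (
        "return Array.from(document.querySelectorAll('style'))"
        ".map(function (s) { return s.textContent; });"
    )
    return driver.execute_script(script)
```

Note that this returns every url(...) in the text, so it will also pick up fonts and @import targets; filtering by file extension afterwards is one way to keep only images. For a downloaded CSS file, pass the file's own URL as base_url so that relative paths such as ../img/paper.gif resolve correctly.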

You can also write an XPath or CSS selector that finds nodes containing the property you're looking for.
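
One caveat on the selector option: a selector such as [style*="background-image"] only matches elements whose inline style attribute mentions the property; it cannot see rules that come from a stylesheet. If you need those too, one possible workaround (a swapped-in technique, not a selector) is to let the browser do the loop in a single execute_script call and read the computed style of each element. The sketch below assumes driver is a Selenium WebDriver with the page already loaded; background_image_urls is a hypothetical helper name.

```python
import re

# JavaScript executed inside the page: a single pass over every element,
# collecting the computed background-image values that are not "none".
COLLECT_BACKGROUNDS_JS = """
const values = new Set();
for (const el of document.querySelectorAll('*')) {
  const bg = window.getComputedStyle(el).backgroundImage;
  if (bg && bg !== 'none') { values.add(bg); }
}
return Array.from(values);
"""

# Matches url(...) with optional single or double quotes around the target.
URL_RE = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""")


def background_image_urls(driver):
    """Return the image URLs used as computed backgrounds in the loaded page."""
    values = driver.execute_script(COLLECT_BACKGROUNDS_JS)
    urls = []
    for value in values:
        # A value may hold several layers, e.g. a gradient plus a url(...).
        for match in URL_RE.finditer(value):
            url = match.group(2)
            if url and url not in urls:
                urls.append(url)
    return urls
```

Because the loop runs in the browser, there is only one round trip between Python and the driver, which should be much cheaper than querying each element from Python. This sketch does not look at pseudo-elements such as ::before, and it only sees styles that apply to the page in its current state (no hover rules, no media queries that don't match).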

1 Comment

I'm already downloading the CSS files and parsing them, but it takes a bit too much processing time. Could you explain more about the XPath/CSS rule? I tried that, and I thought I would need the class/id/name of the element in order to get its CSS. Also, could you enlighten me about your first option, "grab all the Style tags"?
