Extract images with Selenium (Python)

Question

I'm learning about web scraping, and I want to know whether it is possible to extract an image from a website and put it into an Excel file.

I'm working with this website: https://www.browniespain.com/es/novedades/

Here is my code:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import os
import openpyxl
from openpyxl import Workbook
import time


browser=webdriver.Safari()
browser.get("https://www.browniespain.com/es/novedades/")

primera = "//*[@id='center_column']/div[6]/div["
segunda ="]/div/div[2]/div[1]/h5/a"

productos = len(browser.find_elements_by_xpath('//*. [@id="center_column"]/div[6]/div'))

print(productos)

for n in range(1,productos+1):
  direccion = primera+str(n)+segunda
  nombre_producto = browser.find_element_by_xpath(direccion).text
  file_name = 'NovedadesBrownie.xlsx'

  if(os.path.exists(file_name)):
    workbook = openpyxl.load_workbook(file_name)
    worksheet = workbook.get_sheet_by_name('Sheet')
  else:
    workbook = Workbook()
    worksheet = workbook.active
  worksheet.cell(row=n,column=1).value = nombre_producto
  workbook.save(file_name)



  print(nombre_producto)

primera = "//*[@id='center_column']/div[6]/div["
segunda ="]/div/div[2]/div[1]/div[2]/span"

productos = len(browser.find_elements_by_xpath('//*[@id="center_column"]/div[6]/div'))

print(productos)

for n in range(1,productos+1):
  direccion = primera+str(n)+segunda
  precio_producto = browser.find_element_by_xpath(direccion).text

  if(os.path.exists(file_name)):
    workbook = openpyxl.load_workbook(file_name)
    worksheet = workbook.get_sheet_by_name('Sheet')
  else:
    workbook = Workbook()
    worksheet = workbook.active
  worksheet.cell(row=n,column=2).value = precio_producto
  workbook.save(file_name)

  print(precio_producto)

browser.close()

Do you have any idea how to extract the images and put them in that Excel file?

wp78de · Accepted Answer · 2019-01-07 04:12:03Z

3

Your XPath syntax is not correct. Try it like this:

browser.find_elements_by_xpath('//*[@id="center_column"]/div[6]/div')

The rest of the code seems to work as intended.

However, to get the images you could use an XPath like this:

//div/a/img[contains(@class,'imgcat')]

then use get_attribute to retrieve the src URLs:

# Selenium 4 and later: browser.find_elements(By.XPATH, ...)
images = browser.find_elements_by_xpath("//div/a/img[contains(@class,'imgcat')]")
for image in images:
    img_src = image.get_attribute("src")
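Since the find_elements call returns a list, get_attribute has to be called once per element. A minimal sketch of that step as a helper follows. The name collect_image_urls is hypothetical, and the only assumption is that each element offers Selenium's get_attribute method. It returns one entry per element, using None where no src is set, so the result stays aligned with the product rows.

```python
def collect_image_urls(elements):
    """Return the src of each element, or None where src is missing or empty.

    The result has the same length and order as `elements`, so index n
    still refers to the n-th product on the page.
    """
    urls = []
    for element in elements:
        src = element.get_attribute("src")  # Selenium WebElement method
        urls.append(src or None)
    return urls
```

It could be called as collect_image_urls(browser.find_elements_by_xpath("//div/a/img[contains(@class,'imgcat')]")), reusing the XPath shown above.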

Next, I recommend downloading the files to your local disk

import urllib.request
urllib.request.urlretrieve("http://www.example.com/news/media/test.jpg", "local-filename.jpg")

and add them to your worksheet.

import openpyxl
from openpyxl.drawing.image import Image  # requires Pillow

wb = openpyxl.Workbook()
ws = wb.worksheets[0]
img = Image('local-filename.jpg')
ws.add_image(img, 'A1')  # anchor the image at cell A1
wb.save('NovedadesBrownie.xlsx')
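The download and worksheet steps can be combined so that each product gets its name, price and image on one row. The following is only a sketch under several assumptions: the function names local_name_for and save_products, the images folder, the thumbnail size and the choice of column C are all hypothetical, and openpyxl needs Pillow installed to handle images. Unlike the code in the question, it creates a fresh workbook and saves it once at the end instead of reopening the file for every row.

```python
import os
import urllib.request
from urllib.parse import urlparse


def local_name_for(img_src, index, folder="images"):
    """Build a local path from the URL's last path segment.

    The row index is used as a prefix so that two products whose image
    files share a name do not overwrite each other.
    """
    name = os.path.basename(urlparse(img_src).path) or "image.jpg"
    return os.path.join(folder, f"{index:03d}_{name}")


def save_products(products, file_name="NovedadesBrownie.xlsx", folder="images"):
    """Write (name, price, img_src) tuples to a workbook, one product per row."""
    # Imported here so local_name_for() stays usable without openpyxl/Pillow.
    from openpyxl import Workbook
    from openpyxl.drawing.image import Image

    os.makedirs(folder, exist_ok=True)
    workbook = Workbook()
    worksheet = workbook.active
    worksheet.column_dimensions["C"].width = 20  # assumed width for the image column

    for row, (name, price, img_src) in enumerate(products, start=1):
        worksheet.cell(row=row, column=1).value = name
        worksheet.cell(row=row, column=2).value = price

        # Download the image, then anchor it in column C of the same row.
        path = local_name_for(img_src, row, folder)
        urllib.request.urlretrieve(img_src, path)

        img = Image(path)
        img.width, img.height = 100, 100  # assumed thumbnail size, in pixels
        worksheet.row_dimensions[row].height = 80  # points; leaves room for 100 px
        worksheet.add_image(img, f"C{row}")

    workbook.save(file_name)  # save once, after all rows are written
```

With three lists of equal length collected from the page (names, prices and src URLs), it could be called as save_products(list(zip(names, prices, img_srcs))).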

edited Jan 7, 2019 at 4:12

answered Jan 5, 2019 at 22:39


3 Comments

rogarui Over a year ago

I know, it was a problem with posting the code. Do you know how I can get the images from a web page and put them in an Excel file?

rogarui Over a year ago

It's working, bro, but what I need to know is how to download the images and put them in an Excel file.

wp78de Over a year ago

@rogarui I've added the necessary bits. It should be straightforward from there.
