I'm learning about web scraping, and I want to know whether it is possible to extract an image from a website and put it into an Excel file.

I'm working with this website: https://www.browniespain.com/es/novedades/

Here is my code:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import os
import openpyxl
from openpyxl import Workbook
import time


browser=webdriver.Safari()
browser.get("https://www.browniespain.com/es/novedades/")

primera = "//*[@id='center_column']/div[6]/div["
segunda ="]/div/div[2]/div[1]/h5/a"

productos = len(browser.find_elements_by_xpath('//*. [@id="center_column"]/div[6]/div'))

print(productos)

for n in range(1,productos+1):
  direccion = primera+str(n)+segunda
  nombre_producto = browser.find_element_by_xpath(direccion).text
  file_name = 'NovedadesBrownie.xlsx'

  if(os.path.exists(file_name)):
    workbook = openpyxl.load_workbook(file_name)
    worksheet = workbook.get_sheet_by_name('Sheet')
  else:
    workbook = Workbook()
    worksheet = workbook.active
  worksheet.cell(row=n,column=1).value = nombre_producto
  workbook.save(file_name)



  print(nombre_producto)

primera = "//*[@id='center_column']/div[6]/div["
segunda ="]/div/div[2]/div[1]/div[2]/span"

productos = len(browser.find_elements_by_xpath('//*[@id="center_column"]/div[6]/div'))

print(productos)

for n in range(1,productos+1):
  direccion = primera+str(n)+segunda
  precio_producto = browser.find_element_by_xpath(direccion).text

  if(os.path.exists(file_name)):
    workbook = openpyxl.load_workbook(file_name)
    worksheet = workbook.get_sheet_by_name('Sheet')
  else:
    workbook = Workbook()
    worksheet = workbook.active
  worksheet.cell(row=n,column=2).value = precio_producto
  workbook.save(file_name)

  print(precio_producto)

browser.close()

Any idea how to extract the images and put them in that Excel file?

1 Answer

Your XPath syntax is not correct: there is a stray ". " between //* and [@id=...]. Try it like this:

browser.find_elements_by_xpath('//*[@id="center_column"]/div[6]/div')

The rest of the code seems to work as intended.

To get the images, however, you can use an XPath like this:

//div/a/img[contains(@class,'imgcat')]

Then use get_attribute to retrieve the src URLs:

for i in elements:
    # elements: the product containers found earlier; ".//" keeps the search inside each one
    image = i.find_element_by_xpath(".//div/a/img[contains(@class,'imgcat')]")
    img_src = image.get_attribute("src")

Next, I recommend downloading the files to your local disk

import urllib.request
urllib.request.urlretrieve("http://www.example.com/news/media/test.jpg", "local-filename.jpg")

and add them to your worksheet.

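import openpyxl
from openpyxl.drawing.image import Image  # needs Pillow installed

wb = openpyxl.Workbook()
ws = wb.worksheets[0]
img = Image('local-filename.jpg')
ws.add_image(img, 'A1')  # anchor the image's top-left corner at cell A1
wb.save('NovedadesBrownie.xlsx')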
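
With more than one product, each image needs its own local filename and its own row. Here is a rough sketch of how the download and insert steps could be combined, not a definitive implementation. It assumes img_urls is the list of src values collected earlier, in product order, that column C of the existing workbook is free, and that Pillow is installed. The helper names filename_from_url, download_images and add_images are made up for this example and are not part of any library.

```python
import os
import urllib.request
from urllib.parse import urlparse


def filename_from_url(url, index):
    """Build a local filename such as '003_test.jpg' from an image URL."""
    # Keep only the last path segment; any query string is ignored.
    name = os.path.basename(urlparse(url).path) or "image.jpg"
    return f"{index:03d}_{name}"


def download_images(img_urls, folder="images"):
    """Download every URL into `folder` and return the local paths in order."""
    os.makedirs(folder, exist_ok=True)
    paths = []
    for index, url in enumerate(img_urls, start=1):
        path = os.path.join(folder, filename_from_url(url, index))
        urllib.request.urlretrieve(url, path)
        paths.append(path)
    return paths


def add_images(file_name, paths, column="C", size=80):
    """Insert one image per row into an existing workbook and save it."""
    # Imported lazily: only this step needs openpyxl and Pillow.
    import openpyxl
    from openpyxl.drawing.image import Image

    workbook = openpyxl.load_workbook(file_name)
    worksheet = workbook.active
    for row, path in enumerate(paths, start=1):
        img = Image(path)
        img.width = img.height = size  # force a square thumbnail, in pixels
        worksheet.row_dimensions[row].height = size * 0.75  # row height is in points
        worksheet.add_image(img, f"{column}{row}")
    workbook.save(file_name)
```

Usage would look like add_images('NovedadesBrownie.xlsx', download_images(img_urls)), run after the names and prices have been written. Prefixing each filename with its index keeps the files in product order and avoids clashes when two products share an image name.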

3 Comments

I know, that was a problem with posting the code. Do you know how I can get the images from a website and put them in an Excel file?
It's working, but what I need is how to download the images and put them in an Excel file.
@rogarui I've added the necessary bits. It should be straightforward from there.
