I found some Python 2 code that extracts images from Excel files.

I have a very basic question: where do I specify the path to my target Excel file?

Or does it only work with an Excel file that is already open?

import win32com.client       # Requires pywin32 (pip install pywin32)
from PIL import ImageGrab    # Requires Pillow (pip install Pillow)
import os

excel = win32com.client.Dispatch("Excel.Application")
workbook = excel.ActiveWorkbook

wb_folder = workbook.Path
wb_name = workbook.Name
wb_path = os.path.join(wb_folder, wb_name)

#print "Extracting images from %s" % wb_path
print("Extracting images from", wb_path)

image_no = 0

for sheet in workbook.Worksheets:
    for n, shape in enumerate(sheet.Shapes):
        if shape.Name.startswith("Picture"):
            # Some debug output for console
            image_no += 1
            print("---- Image No. %07i ----" % image_no)

            # Sequence number the pictures, if there's more than one
            num = "" if n == 0 else "_%03i" % n

            filename = sheet.Name + num + ".jpg"
            file_path = os.path.join(wb_folder, filename)

            #print "Saving as %s" % file_path    # Debug output
            print('Saving as ', file_path)

            shape.Copy() # Copies from Excel to Windows clipboard

            # Use PIL (python imaging library) to save from Windows clipboard
            # to a file
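            image = ImageGrab.grabclipboard()
            if image is None:
                continue  # Clipboard did not hold an image
            image.convert('RGB').save(file_path, 'jpeg')  # JPEG has no alpha channel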

3 Answers

12

You can grab images from an existing Excel file like this:

from PIL import ImageGrab
import win32com.client as win32

excel = win32.gencache.EnsureDispatch('Excel.Application')
workbook = excel.Workbooks.Open(r'C:\Users\file.xlsx')

for sheet in workbook.Worksheets:
    for i, shape in enumerate(sheet.Shapes):
        if shape.Name.startswith('Picture'):  # or try 'Image'
            shape.Copy()
            image = ImageGrab.grabclipboard()
            # Include the sheet name so images on different sheets don't overwrite each other
            image.save('{}_{}.jpg'.format(sheet.Name, i + 1), 'jpeg')

1 Comment

It seems that many image names do not start with 'Picture'. I guess we could get better control over which images to extract by examining shape.Name closely. Thanks a lot for the hack.
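One way to avoid depending on shape names, sketched here as a possible alternative rather than as part of the answer above, is to test the shape's Type property against the Office MsoShapeType values for pictures (13 for msoPicture, 11 for msoLinkedPicture). The helper name is_picture is made up for illustration, and the SimpleNamespace objects only stand in for real COM shapes so the snippet runs without Excel:

```python
from types import SimpleNamespace

# MsoShapeType values from the Office object model
MSO_LINKED_PICTURE = 11
MSO_PICTURE = 13


def is_picture(shape):
    """Return True if the shape is an embedded or linked picture (hypothetical helper)."""
    return shape.Type in (MSO_PICTURE, MSO_LINKED_PICTURE)


# Stand-in objects for illustration only; real shapes come from sheet.Shapes
shapes = [
    SimpleNamespace(Name="Image 1", Type=MSO_PICTURE),
    SimpleNamespace(Name="Rectangle 2", Type=1),  # 1 is msoAutoShape
]
pictures = [s.Name for s in shapes if is_picture(s)]
print(pictures)
```

In the loop above, the check `shape.Name.startswith('Picture')` could then be replaced by `is_picture(shape)`.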
9

An xlsx file is actually a zip archive. You can read the images directly from the xl/media folder inside it, using Python's ZipFile class from the zipfile module. You don't need MS Excel, or even Windows!
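A minimal sketch of that approach, using only the standard library. The function name extract_images and its parameters are made up for illustration. It assumes a .xlsx workbook (the older binary .xls format is not a zip archive), and it does not tell you which sheet each image belongs to:

```python
import os
import zipfile


def extract_images(xlsx_path, out_dir):
    """Copy every file under xl/media/ in the workbook into out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    with zipfile.ZipFile(xlsx_path) as archive:
        for member in archive.namelist():
            # Embedded pictures are stored as ordinary files under xl/media/
            if not member.startswith("xl/media/") or member.endswith("/"):
                continue
            target = os.path.join(out_dir, os.path.basename(member))
            with archive.open(member) as src, open(target, "wb") as dst:
                dst.write(src.read())
            saved.append(target)
    return saved
```

Calling `extract_images("file.xlsx", "images")` (both names are placeholders) returns the list of files written, which keep the names and formats they have inside the archive.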

1 Comment

Can you share an example with us?
1

The file path and file name are defined in these variables:

wb_folder = workbook.Path
wb_name = workbook.Name
wb_path = os.path.join(wb_folder, wb_name)

In this particular case, the script uses the active workbook, which it obtains on the preceding line:

workbook = excel.ActiveWorkbook

In principle, though, you can specify the path yourself through the wb_folder and wb_name variables, as long as you open that file through the Excel COM object instead of using ActiveWorkbook (see: Python: Open Excel Workbook using Win32 COM Api).
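As a rough sketch of that idea: the helper below builds the path from a folder and a file name and passes it to Workbooks.Open. The function name open_workbook is hypothetical, and the folder and file name in the usage comment are placeholders. The excel argument is assumed to be the COM application object from the question's script:

```python
import os


def open_workbook(excel, wb_folder, wb_name):
    """Open wb_name from wb_folder in the given Excel COM application."""
    # Pass an absolute path; Excel may not resolve a relative one
    # against Python's working directory
    wb_path = os.path.abspath(os.path.join(wb_folder, wb_name))
    return excel.Workbooks.Open(wb_path)


# Usage on Windows with Excel and pywin32 installed:
# import win32com.client
# excel = win32com.client.Dispatch("Excel.Application")
# workbook = open_workbook(excel, r"C:\Users", "file.xlsx")
```

The returned workbook can then take the place of `excel.ActiveWorkbook` in the original script.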
