
I have 40 Excel workbooks that are source files and 40 corresponding Excel workbooks that are destination files. Every week I open the 40 source files, manually copy the data from a specific worksheet in each one, and paste it into the corresponding destination file. I want to automate this task with Python and openpyxl.


So far, I am able to copy data from one Excel workbook and paste it into another, but I don't know how to extend this to copy from multiple input files and paste into multiple destination files.

import openpyxl

# opening the source excel file
wbo = openpyxl.load_workbook('ABC_Export.xlsx')
#attach the ranges to the sheet
wso = wbo["Report Data"]["A9":"B100000"]

# opening the destination excel file 
wbd = openpyxl.load_workbook("ABC_2023.xlsm", keep_vba=True)
#attach the ranges to the sheet
wsd = wbd["Sheet1"]["A2":"B100000"]

# step 1: pair the rows
for row1, row2 in zip(wso, wsd):
    # within the row pair, pair the cells
    for cell1, cell2 in zip(row1, row2):
        cell2.value = cell1.value
# save document
wbd.save('ABC_2023.xlsm')

This is an example of what I want to copy from a source file:

and where to paste it into the corresponding destination file:

  • If you are able to do it for one file, then I don't see what's stopping you from doing it for multiple files. Please show us your code so far and provide an example of this multiple-file copy-paste. Do you intend to copy data only between files with the same prefix (e.g., ADC_)? Commented Oct 30, 2022 at 20:57
  • @CreepyRaccoon I have edited my question and added the code. The source files have different names but the same suffix (e.g., ABC_Report, DEF_Report); the destination files have the same names as the source files but with a different suffix (e.g., ABC_2023, DEF_2023). Commented Oct 30, 2022 at 23:56
  • Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking. Commented Oct 31, 2022 at 5:03

1 Answer


I don't think your code works as posted, but maybe this can guide you a bit:

import os
from glob import glob
from openpyxl import load_workbook


def copy_data(src_file: str, dst_file: str) -> None:
    # open files
    ws_src = load_workbook(src_file)["Report Data"]
    wb_dst = load_workbook(dst_file, keep_vba=True)
    ws_dst = wb_dst["Sheet1"]

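    # configuration (per the question: data starts at A9 in the source
    # and is pasted starting at A2 in the destination)
    start_row_src = 9       # A9
    start_row_dst = 2       # A2
    rows2copy = 100000

    # copy columns A and B from src_file to dst_file, row by row;
    # i is the destination row, i + row_offset the matching source row
    row_offset = start_row_src - start_row_dst
    for i in range(start_row_dst, start_row_dst + rows2copy):
        ws_dst[f"A{i}"].value = ws_src[f"A{i + row_offset}"].value
        ws_dst[f"B{i}"].value = ws_src[f"B{i + row_offset}"].value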

    # save the modifications
    wb_dst.save(dst_file)


# files directories
src_dir_path = "your/source/files/directory"
dst_dir_path = "your/destination/files/directory"

# iterate over all excel files found in source path
workbooks = glob(f"{src_dir_path}/*.xlsx")
for src in workbooks:
    dst = dst_dir_path + '/' + os.path.basename(src).replace("_Report.", "_2023.")
    copy_data(src, dst)

The idea is to scan for all input files and then call the copy_data function for each one. You will have to tweak it a bit to your needs.
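One thing worth checking when tweaking the loop above is the destination file name. The `replace("_Report.", "_2023.")` call keeps the source extension, so `ABC_Report.xlsx` maps to `ABC_2023.xlsx`, while the question's destination is `ABC_2023.xlsm`. Below is a minimal sketch of a helper that makes the mapping explicit. The name `build_dst_path` and its parameters are hypothetical (not part of openpyxl or of the code above), and the `_Report` / `_2023` suffixes are taken from the comments under the question.

```python
from pathlib import Path


def build_dst_path(src_file, dst_dir, src_suffix="_Report",
                   dst_suffix="_2023", dst_ext=None):
    """Map e.g. 'ABC_Report.xlsx' to '<dst_dir>/ABC_2023.xlsx'.

    Hypothetical helper: the suffixes are assumptions based on the
    naming pattern described in the question's comments.
    """
    src = Path(src_file)
    if not src.stem.endswith(src_suffix):
        raise ValueError(f"{src.name} does not end with {src_suffix!r}")
    # strip the source suffix to get the shared prefix, e.g. 'ABC'
    prefix = src.stem[: len(src.stem) - len(src_suffix)]
    # keep the source extension unless a different one is requested
    ext = dst_ext if dst_ext is not None else src.suffix
    return Path(dst_dir) / f"{prefix}{dst_suffix}{ext}"
```

If the destination workbooks are macro-enabled, passing `dst_ext=".xlsm"` would produce `ABC_2023.xlsm`; the result can be converted with `str(...)` before handing it to `copy_data`.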


10 Comments

Thank you @CreepyRaccoon, my code works for copying from just one file to another. Trying yours now.
Thank you again @CreepyRaccoon, everything works, but I am having trouble with the last part of the code: dst = dst_dir_path + '/' + os.path.basename(src).replace("_Report.", "_2023."). The code doesn't rename all the files in the source path; it renames just one, so only that one file is copied and pasted into its destination file. How do I get the code to rename all the files in the source path so the copy and paste works for all of them?
@ohmandy, I assumed that all your files are within your source path and that all of them have the same naming pattern "xxx_Report.xlsx" (with the same file extension). If you have subdirectories in your source path, you may try glob(f"{src_dir_path}/**/*.xlsx", recursive=True) instead. Do you see all your files listed in the variable workbooks?
@CreepyRaccoon Your assumption is correct: all my source files are within the same source path and have the same naming pattern "xxx_Report.xlsx". Yes, I see all the files listed in the variable workbooks, but only one file is listed in the variable src and only one corresponding destination file is listed in the variable dst.
What do you mean, @ohmandy? This sounds good and expected. If all your files are listed in workbooks, then the for loop will iterate over all those files and process them one by one; that's why you see only one src and one dst at a time. However, at the end of the loop, your destination path should have all the updated Excel files.
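One way to confirm that the loop really visits every file is a dry run that lists each source/destination pair without copying anything. The sketch below is only an illustration: `list_pairs` is a hypothetical name, the directory strings are the placeholders from the answer, and the renaming expression is the one from the answer.

```python
import os
from glob import glob


def list_pairs(src_dir_path, dst_dir_path):
    """Return (src, dst, dst_exists) for every .xlsx file in src_dir_path.

    Nothing is opened or copied; this only shows what the loop would process.
    """
    pairs = []
    for src in sorted(glob(os.path.join(src_dir_path, "*.xlsx"))):
        # same renaming rule as in the answer: xxx_Report.xlsx -> xxx_2023.xlsx
        name = os.path.basename(src).replace("_Report.", "_2023.")
        dst = os.path.join(dst_dir_path, name)
        pairs.append((src, dst, os.path.exists(dst)))
    return pairs


if __name__ == "__main__":
    # placeholder directories, to be replaced with real paths
    for src, dst, exists in list_pairs("your/source/files/directory",
                                       "your/destination/files/directory"):
        note = "" if exists else "  (destination missing)"
        print(f"{src} -> {dst}{note}")
```

Seeing one `src` and one `dst` in a debugger at any given moment is consistent with this: each iteration rebinds both variables, and the full list of pairs only appears once the loop has run to the end.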