0

Using Python, I want to merge all .xls files in a directory into one data frame and save that as a new concatenated .xls file. The .xls files will have an unknown number of columns and no consistent headers.

I've used other suggestions on this forum and ended up with this:

import os
import pandas as pd

path = os.getcwd()
files = os.listdir(path)

files_xls = [f for f in files if f[-3:] == 'xls']

df = pd.DataFrame()

for f in files_xls:
    data = pd.read_excel(f for f in files_xls) # I dont understand what to add 
# in the parentheses here.
    df = df.append(data)
    df

I'm getting these errors:

File "<ipython-input-17-bb67a423cf40>", line 14, in <module>
  data = pd.read_excel(f for f in files_xls)

File "C:\Users\xxxx\Anaconda2\lib\site-packages\pandas\io\excel.py", line 170, in read_excel
  io = ExcelFile(io, engine=engine)

File "C:\Users\xxxx\Anaconda2\lib\site-packages\pandas\io\excel.py", line 229, in __init__
  raise ValueError('Must explicitly set engine if not passing in'

ValueError: Must explicitly set engine if not passing in buffer or path for io.

1 Answer 1

1

try this brother

df = []

for f in files_xls:
    data = pd.read_excel(f) 
    df = df.append(data)

mydf = pd.concat(df, axis = 0)
Sign up to request clarification or add additional context in comments.

2 Comments

you need to add your path, like read_excel('P://myfolder' + f)
Okay, I think the problem might be with the files I'm trying to work with. I made some test .xlsx files to try this out (with your filepath suggestion), and now I'm getting this error: File "<ipython-input-10-457ba263105e>", line 15, in <module> df = df.append(data) AttributeError: 'NoneType' object has no attribute 'append'

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.