6

I have a very large CSV data set (several million records). I have already filtered and massaged and split this list to a clients specification. This was all done in Python3.3

The last requirement is that these split lists be saved in Excel format. They have a utility that imports an Excel spreadsheet (in a specific format) into their database after doing some calculations and checking for existing duplicates in the DB. My problem is that their utility only works on Excel 2003 .xls files... I didn't know this ahead of time.

So I can already write the data in the correct format for Excel 2007 using OpenPyXl, but these files won't work. I can write CSV files but those don't work either, their importer needs xls files. Maybe there is a way to batch convert all the files from Excel 2007 xlsx format to xls format, or from csv format to xls format? There are thousands of files so it can't be done by hand.

The best thing to do would be output them in the correct format, but I can't seem to find a python 3 compatible way that will work with Excel 2003 format. xlwt is python 2.x only.

Does anyone have suggestions how I can finish this?

EDIT: This is what the solution looked like.

EDIT2: Added the workbook close as suggested by stenci.

import os
import errno
import glob 
import time 
import win32com.client    

def xlsx_to_xls(path):
    xlsx_files = glob.glob(path+'\\*.xlsx') 

    if len(xlsx_files) == 0: 
        raise RuntimeError('No XLSX files to convert.') 

    xlApp = win32com.client.Dispatch('Excel.Application') 

    for file in xlsx_files: 
        xlWb = xlApp.Workbooks.Open(os.path.join(os.getcwd(), file)) 
        xlWb.SaveAs(os.path.join(os.getcwd(), file.split('.xlsx')[0] + '.xls'), FileFormat=1) 
        xlWb.Close()

    xlApp.Quit() 

    time.sleep(2) # give Excel time to quit, otherwise files may be locked 
    for file in xlsx_files: 
        os.unlink(file) 
3
  • 1
    Careful: your solution doesn't seem to close the workbooks after saving them. If you really have thousands of files your function keeps fattening up Excel, and it will eventually crash. Commented Jul 24, 2013 at 21:42
  • @stenci Nice catch. I had only tested it on a small subset so far. Thanks! Commented Jul 24, 2013 at 22:19
  • It's no longer the case that xlwt is only for Python 2. Commented May 7, 2019 at 16:56

2 Answers 2

5

Open them with Excel 2007 and save them as Excel 2003. You can do it with a simple VBA macro, or from Python, without even showing the Excel application to the user. The only problem is that you need Excel in your computer.

Here is the VBA code:

Sub ConvertTo2003(FileName As String)
  Dim WB As Workbook
  Set WB = Workbooks.Open(FileName, ReadOnly:=True)
  WB.SaveAs Replace(FileName, ".xlsx", ".xls"), FileFormat:=xlExcel8
  WB.Close
End Sub

Here is the Python code:

xlApp = Excel.ExcelApp(False)
xlApp.convertTo2003('FileName.xlsx')

class ExcelApp(object):
    def __init__(self, visible):
        self.app = win32com.client.Dispatch('Excel.Application')
        if visible:
            self.app.Visible = True

    def __exit__(self):
        self.app.Quit()

    def __del__(self):
        self.app.Quit()

    def convertTo2003(self, fileName):
        if self.app:
            wb = self.app.WorkBooks.Open(fileName, ReadOnly = True)
            wb.SaveAs(fileName[:-1], FileFormat = 56)
            wb.Close()

    def quit(self):
        if self.app:
            self.app.Quit()
Sign up to request clarification or add additional context in comments.

4 Comments

Yes this is what i actually did, in python. Thanks.
If you think this (or any other) is the correct answer please mark it as answer :)
I marked it as answered. I apparently don't have enough reputation yet to mark your answer as useful or else I would.
@trashrobber: Now you do :)
2

The situation has changed since the question was first asked (and answered). As of version 1.0.0, xlwt does work with Python 3. As such, it is arguably the most straightforward option for outputting Excel 2003 workbooks, and definitely the preferred way if you don't have Excel handy.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.