1

I am a newbie to Python. Basically, I want to write a program that reads columns D and E from an Excel file and calculates the total Incoming and Outgoing duration.

Which Python module can be used to read Excel files, and how do I process the data inside them?

Excel file:

D            E
Incoming    18
Outgoing    99
Incoming    20
Outgoing    59
Incoming    30
Incoming    40
2
  • You can check xlrd or openpyxl for reading .xls or .xlsx files in Python. Or, you can convert your excel workbook to .csv file and read it using Python's csv module or combine open() and str.split(). Commented Jul 15, 2015 at 2:42
  • I tried numpy; I was able to read the data but unable to process it. It is working now with xlrd. Thanks. Commented Jul 17, 2015 at 4:58
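
The CSV route mentioned in the first comment could look roughly like the sketch below. It assumes the sheet was saved as CSV with the label in the fourth column (D) and the duration in the fifth (E); the sample text stands in for the exported file, and sum_durations is just an illustrative helper name.

```python
import csv
import io

# Stand-in for the exported CSV file: columns A-C are empty,
# column D holds the label and column E the duration (assumed layout).
sample_csv = """,,,Incoming,18
,,,Outgoing,99
,,,Incoming,20
,,,Outgoing,59
,,,Incoming,30
,,,Incoming,40
"""

def sum_durations(lines):
    """Return {'Incoming': total, 'Outgoing': total} from CSV rows."""
    totals = {"Incoming": 0, "Outgoing": 0}
    for row in csv.reader(lines):
        if len(row) < 5:
            continue  # skip short or blank rows
        label, value = row[3].strip(), row[4].strip()
        if label in totals and value:
            totals[label] += float(value)
    return totals

totals = sum_durations(io.StringIO(sample_csv))
print("Total incoming:", totals["Incoming"])
print("Total outgoing:", totals["Outgoing"])
```

With a real export, the same function should accept a file opened with open(path, newline='') in place of the StringIO object.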

3 Answers

5

There are a couple of options, depending on the Excel file format you are working with.
openpyxl - reads Excel 2010 files (i.e. .xlsx)
xlrd - reads older Excel files (i.e. .xls)

I have only used xlrd, with which you could do something like the following:
** Note ** code not tested

import xlrd


# index of the worksheet holding the data (0 is the first sheet)
sheet_num = 0
input_total = 0
output_total = 0

# path to the file you want to extract data from
src = r'c:\temp\excel sheet.xls'

book = xlrd.open_workbook(src)

# select the sheet where the data resides
work_sheet = book.sheet_by_index(sheet_num)

# walk every row; column indexes are zero-based, so D is 3 and E is 4
for current_row in range(work_sheet.nrows):
    row_header = work_sheet.cell_value(current_row, 3)

    if row_header == 'Outgoing':
        output_total += work_sheet.cell_value(current_row, 4)
    elif row_header == 'Incoming':
        input_total += work_sheet.cell_value(current_row, 4)

print(output_total)
print(input_total)
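
For an .xlsx file, roughly the same loop could be written with openpyxl. This is only a sketch under assumptions: it builds a small workbook in memory from the question's sample values so it can run on its own, and with a real file you would pass the path to load_workbook() instead.

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook

# Build a tiny workbook in memory so the sketch is self-contained.
sample = [("Incoming", 18), ("Outgoing", 99), ("Incoming", 20),
          ("Outgoing", 59), ("Incoming", 30), ("Incoming", 40)]
wb = Workbook()
ws = wb.active
for row_num, (label, duration) in enumerate(sample, start=1):
    ws.cell(row=row_num, column=4, value=label)      # column D
    ws.cell(row=row_num, column=5, value=duration)   # column E
buffer = BytesIO()
wb.save(buffer)
buffer.seek(0)

# Read it back and total column E by the label in column D.
# openpyxl columns are 1-based, so D is 4 and E is 5.
sheet = load_workbook(buffer).active
input_total = 0
output_total = 0
for label, duration in sheet.iter_rows(min_col=4, max_col=5, values_only=True):
    if label == 'Incoming':
        input_total += duration
    elif label == 'Outgoing':
        output_total += duration

print(output_total)
print(input_total)
```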

1 Comment

Moura, thanks for the extended help. It really helped me achieve my task :)
2

It seems like simply using Excel's =SUMIF() function would be sufficient. However, you're asking for a Python solution, so here's a Python solution!

Pandas is a library that provides a DataFrame data structure very similar to an Excel spreadsheet. It provides a read_excel() function, which is described in the pandas documentation. Once you have a DataFrame, you could do something like this:

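import pandas as pd
# read_excel() takes column names from the first row by default,
# so this assumes the sheet has a header row with columns named D and E
table = pd.read_excel('path-to-spreadsheet.xlsx')
# keep only the matching rows of column E, then add them up
incoming_sum = table.E[table.D == 'Incoming'].sum()
outgoing_sum = table.E[table.D == 'Outgoing'].sum()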
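
As a variation, both totals can be computed in one step with groupby(). The sketch below uses an in-memory DataFrame built from the question's sample values as a stand-in for the spreadsheet, so the column names D and E are assumptions; with a real file the frame would come from read_excel() instead.

```python
import pandas as pd

# In-memory stand-in for the spreadsheet (sample values from the question).
table = pd.DataFrame({
    'D': ['Incoming', 'Outgoing', 'Incoming', 'Outgoing', 'Incoming', 'Incoming'],
    'E': [18, 99, 20, 59, 30, 40],
})

# Total of column E for each distinct label in column D.
totals = table.groupby('D')['E'].sum()
print(totals['Incoming'])
print(totals['Outgoing'])
```

This also picks up any other labels that appear in column D without needing a separate line for each one.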

You can get Pandas for Python on Windows, but it's a bit difficult. The easiest way is a Scientific Python distribution for Windows, like Anaconda. On Linux, installing pandas is as simple as sudo pip install pandas.

1 Comment

Thanks for your response, but I tried xlrd and it's working now.
1

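Using xlrd 0.9.3 in Python 3.4.1 (note that xlrd 2.0 and later read only .xls files, so the .xlsx example below needs an older xlrd):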

It puts all the values from columns D and E into two separate lists.

It then pairs up the parallel elements of these lists (the elements with the same index) into tuples using zip().

Then, these generated tuples are combined to a list. Using sum() and list comprehension, incoming_sum and outgoing_sum are calculated.

import xlrd

with xlrd.open_workbook('z.xlsx') as book:

    # 0 corresponds to the 1st worksheet, usually named 'Sheet1'
    sheet = book.sheet_by_index(0)

    # gets col D values (col_values() already returns a list)
    D = sheet.col_values(3)

    # gets col E values
    E = sheet.col_values(4)

    # combines D and E elements to tuples, combines tuples to list
    # ex. [ ('Incoming', 18), ('Outgoing', 99), ... ]
    data = list( zip(D, E) )

    # gets sum
    incoming_sum = sum( tup[1] for tup in data if tup[0] == 'Incoming' )
    outgoing_sum = sum( tup[1] for tup in data if tup[0] == 'Outgoing' )

    print('Total incoming:', incoming_sum)
    print('Total outgoing:', outgoing_sum)

Output:

Total incoming: 108.0
Total outgoing: 158.0
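
The pairing-and-summing step can also be tried without a workbook. The sketch below is a variation, not the code above: it uses hard-coded lists as stand-ins for the two col_values() results and keeps one running total per label in a single pass with a defaultdict.

```python
from collections import defaultdict

# Stand-ins for sheet.col_values(3) and sheet.col_values(4)
D = ['Incoming', 'Outgoing', 'Incoming', 'Outgoing', 'Incoming', 'Incoming']
E = [18.0, 99.0, 20.0, 59.0, 30.0, 40.0]

# Accumulate one running total per label in a single pass
totals = defaultdict(float)
for label, duration in zip(D, E):
    totals[label] += duration

print('Total incoming:', totals['Incoming'])
print('Total outgoing:', totals['Outgoing'])
```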

To install xlrd: (Windows)

  1. Download here: https://pypi.python.org/pypi/xlrd
  2. Extract to any directory, then change cmd's current directory ( chdir ) to the directory where you extracted, then type in cmd python setup.py install

    • Take note that you will extract xlrd-0.9.3.tar.gz two times, first to remove .gz, second to remove .tar.

    • The extracted directory (the one you change into in cmd) is the one that contains setup.py.

1 Comment

Thanks for the response, it was really helpful. Since I am new, your sample program helped me a lot. One more thing: could you suggest some good sources for learning network programming in Python?
