Read certain column in excel to dataframe

Question

I want to read a particular column from an Excel file into a dataframe, but I want to specify the column by its header name.

For example, I have an Excel file with two columns in Sheet 2: "number" in column A and "ForeignKey" in column B. I want to import the "ForeignKey" column into a dataframe. I did this with the following script:

xl_file = pd.read_excel('D:/SnapPython/TestDF.xlsx', sheet_name='Sheet 2', usecols=[0,1])

It shows the following in my xl_file:

       number ForeignKey
0       1        abc
1       2        def
2       3        ghi

With a small number of columns, I can get "ForeignKey" by specifying usecols=[1]. However, if I have many columns and know the column name pattern, it would be easier to specify the column name. I tried the following code, but it gives an empty dataframe.

xl_file = pd.read_excel('D:/SnapPython/TestDF.xlsx', sheet_name='Sheet 2', usecols=['ForeignKey'])

According to the discussion in the following link, the code above works, but only for read_csv.

How to drop a specific column of csv file while reading it using pandas?

Is there a way to do this when reading an Excel file?

Thank you in advance.

meW · Accepted Answer · 2019-01-09 09:22:39Z

3

You need to pass the Excel column letter rather than the header name, and in range format, e.g. colname:colname.

For instance, if ForeignKey appears in column B of Sheet 2 in your Excel file, then do:

xl_file = pd.read_excel('D:/SnapPython/TestDF.xlsx', sheet_name='Sheet 2', usecols='B:B')
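If only the header name is known, one possible workaround is to translate the header's position into an Excel column letter and build the usecols string from that. The sketch below is only an illustration: `column_letter`, `usecols_for_headers` and `read_columns_by_header` are hypothetical helper names, not part of pandas, and reading the header row with `nrows=0` is an assumption that should be checked against the installed pandas version.

```python
def column_letter(index):
    """Convert a 0-based column position to an Excel column letter (0 -> 'A', 26 -> 'AA')."""
    if index < 0:
        raise ValueError("index must be non-negative")
    letters = ""
    n = index + 1  # Excel letters behave like a 1-based, base-26 numbering
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def usecols_for_headers(headers, wanted):
    """Build a comma separated usecols string such as 'A,C' from header names."""
    return ",".join(column_letter(headers.index(name)) for name in wanted)


def read_columns_by_header(path, sheet_name, wanted):
    """Hypothetical two-pass read: header row first, then only the wanted columns."""
    import pandas as pd  # imported here so the helpers above run without pandas

    # Assumption: nrows=0 returns an empty frame that still carries the header names.
    headers = list(pd.read_excel(path, sheet_name=sheet_name, nrows=0).columns)
    return pd.read_excel(path, sheet_name=sheet_name,
                         usecols=usecols_for_headers(headers, wanted))
```

With the sheet from the question, `usecols_for_headers(['number', 'ForeignKey'], ['ForeignKey'])` gives 'B', which is the letter used in the call above. The file is opened twice, so this only pays off when the sheet is wide.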

Refer to the GitHub issue and the solution prescribed there.

answered Jan 9, 2019 at 9:22

meW


3 Comments

meW Over a year ago

@anky_91 I checked with usecols='ForeignKey' also but I received an empty dataframe.

Fadri Over a year ago

That is the case. I have an Excel file that contains hundreds of columns, named by date and time. Because I know which date and time I want, it would be more efficient to specify the column header name rather than the Excel column letter. I cannot use this column header name directly with read_excel the way read_csv allows.

meW Over a year ago

@anky_91 I don't think OP knows which column will have the foreign key, and making such a dictionary of 100 pairs doesn't seem practical.

Community · Accepted Answer · 2020-06-20 09:12:55Z

2

There is a solution, but read_csv and read_excel do not treat usecols the same way.

From the documentation, for csv:

usecols : list-like or callable, default None

For example, a valid list-like usecols parameter would be [0, 1, 2] or ['foo', 'bar', 'baz'].

for excel:

usecols : int or list, default None

If None then parse all columns,

If int then indicates last column to be parsed

If list of ints then indicates list of column numbers to be parsed

If string then indicates comma separated list of Excel column letters and column ranges (e.g. "A:E" or "A,C,E:F"). Ranges are inclusive of both sides

So my first suggestion was to call it like this (it does not work as written; see the EDIT below):

xl_file = pd.read_excel('D:/SnapPython/TestDF.xlsx', sheet_name='Sheet 2', usecols='ForeignKey')

and if you also need 'number':

xl_file = pd.read_excel('D:/SnapPython/TestDF.xlsx', sheet_name='Sheet 2', usecols='number,ForeignKey')

EDIT: you need to pass the Excel column letter, not the header name of the data. The other answer solves this. However, you don't need 'B:B'; 'B' will do the trick. That said, it is no improvement over using usecols with numbers.
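For comparison, here is a small sketch of the name-based selection that read_csv accepts according to the documentation quoted above, using an in-memory CSV that mirrors the sheet from the question. The variable names are illustrative only. To my knowledge, pandas 0.24 and later document read_excel as accepting a list of header names or a callable as well; treat that as an assumption to check against the installed version.

```python
import io

import pandas as pd

# In-memory stand-in for a CSV export of the sheet in the question.
csv_text = "number,ForeignKey\n1,abc\n2,def\n3,ghi\n"

# Select a column by its header name.
by_name = pd.read_csv(io.StringIO(csv_text), usecols=["ForeignKey"])

# Select columns by a name pattern, using a callable that is applied to each header.
by_pattern = pd.read_csv(
    io.StringIO(csv_text),
    usecols=lambda name: name.startswith("Foreign"),
)
```

The callable form is the closest match to "I know the column name pattern" in the question.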

If you can load all the data in no time, maybe the best way to solve this is to parse all columns and then select the desired column:

xl_file = pd.read_excel('D:/SnapPython/TestDF.xlsx', sheet_name='Sheet 2')['ForeignKey']
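A minimal sketch of that select-after-loading step, using an in-memory dataframe as a stand-in for the read_excel result. The date-and-time column names are invented for illustration, loosely following the comment about columns named by date and time.

```python
import pandas as pd

# Stand-in for the dataframe returned by pd.read_excel(...) on a wide sheet.
df = pd.DataFrame({
    "number": [1, 2, 3],
    "ForeignKey": ["abc", "def", "ghi"],
    "2019-01-09 09:00": [0.1, 0.2, 0.3],
    "2019-01-09 10:00": [0.4, 0.5, 0.6],
    "2019-01-10 09:00": [0.7, 0.8, 0.9],
})

# Single brackets return a Series; double brackets keep a one-column DataFrame.
as_series = df["ForeignKey"]
as_frame = df[["ForeignKey"]]

# Select every column whose header starts with a known pattern.
jan_9 = df.loc[:, df.columns.str.startswith("2019-01-09")]
```

Note that indexing with a single name, as in the line above, yields a Series; use the double-bracket form if a dataframe is needed.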

edited Jun 20, 2020 at 9:12

CommunityBot


answered Jan 9, 2019 at 9:18

Frayal


2 Comments

meW Over a year ago

Alexis It's not the right solution. Did you verify it?

Fadri Over a year ago

@Alexis, your last suggestion works for me. I'll accept it for this question. Thank you.
