Python function passing parameters

Question

I am trying to write a simple function that gives me a count of each unique value in a specific column of a pandas DataFrame. I would like to pass the column name as the function parameter. However, the parameter is not treated as a string inside the function.

Here is what I am trying to convert to a function where c_type is a column name.

c_type_count = data.groupby('c_type').c_type.count()

Here is the function. I use parameter column to pass the column name:

def uniques(column):
    count = data.groupby(column).column.count()
    print(count)

The groupby(column) part works as intended, but the second reference, .column, stays as the literal name .column, and I get an error because there is no column by that name in the df.

I understand what is happening there, but since I am new to Python I don't know how to change the syntax.

sacuL · Accepted Answer · 2018-05-29 17:08:32Z

2

I think you're simply looking for value_counts()

data['c_type'].value_counts()

Gives exactly what you describe you're looking for.

Example:

>>> data
  b_type c_type
0      d      b
1      d      a
2      d      a
3      c      a
4      c      a
5      d      b
6      c      a
7      d      b
8      c      b
9      c      a

>>> data['c_type'].value_counts()
a    6
b    4

How to fix your custom function

If you want to keep using your custom function, use standard indexing rather than attribute access. In other words, use square brackets instead of dot notation to select your column. See the pandas documentation on indexing for more information.

def uniques(column):
    count = data.groupby(column)[column].count()
    # Alternatively:
    # count = data.groupby(column).size()
    print(count)

This works as you want:

>>> uniques('c_type')
c_type
a    6
b    4
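For reference, here is a self-contained sketch of the same fix that can be run as-is. It rebuilds the sample frame from the example above and, as an assumption not in the original function, passes the DataFrame in as a parameter (named df here) and returns the result instead of printing it inside the function.

```python
import pandas as pd

# Sample frame mirroring the example data shown above.
data = pd.DataFrame({
    "b_type": list("dddccdcdcc"),
    "c_type": list("baaaababba"),
})

def uniques(df, column):
    """Return the number of rows for each distinct value in `column`.

    `df` is an extra parameter added for this sketch; the original
    function reads the global `data` instead.
    """
    # Square brackets accept the string held in `column`;
    # `.column` would look for an attribute literally named "column".
    return df.groupby(column)[column].count()

counts = uniques(data, "c_type")
print(counts)
```

Returning the Series rather than printing it lets the caller reuse the counts, for example counts["a"].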

edited May 29, 2018 at 17:08

answered May 29, 2018 at 16:52

sacuL


5 Comments

user3088202 Over a year ago

Thank you. I am curious to know, though, how I can make sure that the parameter gets picked up as a string for .column. Is there a different syntax that I need to use?

user3088202 Over a year ago

Awesome Thank you again.

BENY Over a year ago

You can use size rather than count here

sacuL Over a year ago

@Wen, true, I'll add that. Is there an advantage to size in this case, though?

BENY Over a year ago

Then you can just use data.groupby(column).size()

Dalvenjia · Accepted Answer · 2018-05-29 16:57:13Z

1

This is by design. In your example, .column asks the GroupBy object for an attribute literally named column; Python never looks up the value of the variable column in the current scope when resolving dot notation. What you are looking for is the built-in function getattr(), which gets an object's attribute or method by its string name.

def uniques(column):
    count = getattr(data.groupby(column), column).count()
    print(count)

answered May 29, 2018 at 16:57

Dalvenjia


1 Comment

abarnert Over a year ago

With a Pandas dataframe, the columns are dict items. They're also available as attribute names as a convenience, when they happen to be valid identifier names and don't conflict with standard attributes, but that doesn't mean you should look them up as attributes rather than indices when you don't need that convenience. In other words, don't do getattr(df, colname); just do df[colname].
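A small sketch of the point in this comment, using a made-up frame whose only column is named count (a hypothetical name chosen because it collides with a standard DataFrame method). Attribute lookup finds the method, while bracket indexing finds the column.

```python
import pandas as pd

# Hypothetical frame: the column name "count" clashes with DataFrame.count.
df = pd.DataFrame({"count": ["x", "y", "x"]})

by_attr = getattr(df, "count")  # the bound method DataFrame.count, not the column
by_key = df["count"]            # the column, as a Series

print(callable(by_attr))        # the attribute lookup returned something callable
print(type(by_key).__name__)    # the bracket lookup returned a Series

def uniques(frame, column):
    # Bracket indexing keeps working even when the name clashes with a method.
    return frame.groupby(column)[column].count()

print(uniques(df, "count"))
```

With getattr() in place of the brackets, the same function would end up calling .count() on a method object and fail for this column name.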
