My data frame has a column of arrays, and I need to count the elements in each array.

This is the code I'm using:

indi = data_1.query("'2016-11-22' <= login_date <= '2016-12-22'").groupby(['employer_key','account_id','login_date']).count().reset_index()
indi_1 = indi.groupby(['employer_key']).account_id.unique().reset_index()
indi_1

which gives me this:

    employer_key        account_id
0   boeing              [17008601, 17008645, 17008698, 17008952, 17009...]
1   dell_inc            [10892711, 10892747, 10894032, 10894676, 10894...]
2   google              [9215462, 9216605, 9217052, 9218693, 9222937, ...]
3   sprint_corporation  [9858036, 9858809, 9859191, 9859350, 9859498, ...]
4   walmart             [2515412, 2517367, 2519765, 2520049, 2526763, ...]

I want to count the numbers in the array so it looks like this:

    employer_key        account_id
0   boeing              5000
1   dell_inc            289
2   google              789
3   sprint_corporation  154670
4   walmart             4689

How can I do this? I'm using pandas. I'm also very new to Python, so the simpler the better.

1 Answer

If the account_id column contains lists (or arrays, such as those returned by unique()), you can use str.len() to get the number of elements in each cell:

df['account_id_count'] = df.account_id.str.len()
df
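To make this easier to try out, here is a minimal, self-contained sketch. The sample rows are made up for illustration and only stand in for the question's data_1; the date filter from the question is left out to keep it short. It shows the str.len() approach on the arrays produced by unique(), and, as an alternative that may be simpler if only the counts are needed, groupby with nunique(), which skips building the arrays entirely.

```python
import pandas as pd

# Hypothetical sample rows standing in for the question's data_1
data_1 = pd.DataFrame({
    "employer_key": ["boeing", "boeing", "boeing",
                     "google", "google", "google", "google"],
    "account_id": [17008601, 17008645, 17008601,
                   9215462, 9216605, 9215462, 9217052],
})

# Same shape as the question's indi_1: one array of unique IDs per employer
indi_1 = data_1.groupby("employer_key").account_id.unique().reset_index()

# str.len() returns the length of each array in the column
indi_1["account_id_count"] = indi_1.account_id.str.len()
print(indi_1[["employer_key", "account_id_count"]])

# Alternative: count unique IDs directly, without the intermediate arrays
counts = data_1.groupby("employer_key").account_id.nunique().reset_index()
print(counts)
```

Both approaches should give the same per-employer counts; nunique() simply avoids the extra column of arrays when you do not need the IDs themselves.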
