
My pandas function returns None instead of the filtered DataFrame I expect. Why does this happen, and how can I fix it? Thank you!

import pandas as pd
nz_data = pd.read_csv('research-and-development-survey-2016-2019-csv.csv', index_col = 2)

def count_of_mining_biz():
    if "B_Mining" in nz_data[["Breakdown_category"]] and "Count of businesses" in nz_data[["Units"]]:
        return nz_data.loc["2019", "RD_Value"]

print(count_of_mining_biz())

Here is what the data looks like.

I am trying to find the RD_Value for the Mining industry in 2019. I need a condition on the "Units" column because the same category also has rows in other units that are not a count of businesses.

1 Answer


.loc[..., ...] means .loc[row_index, col_index] but there's no row index called 2019.

Try using .loc with boolean masks in this case:

def count_of_mining_biz():
    category = nz_data['Breakdown_category'] == 'B_Mining'
    units = nz_data['Units'] == 'Count of businesses'
    year = nz_data['Year'] == 2019
    return nz_data.loc[category & units & year].RD_Value
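As for why the original function printed None: using in on a DataFrame tests the column labels, not the cell values, so "B_Mining" in nz_data[["Breakdown_category"]] is False, the if body never runs, and the function falls off the end without a return. A minimal sketch of that behaviour follows. The DataFrame here is hypothetical stand-in data (including the "NZ Dollars (millions)" unit), not the real survey file.

```python
import pandas as pd

# Hypothetical stand-in rows; the real CSV is not reproduced here.
df = pd.DataFrame({
    "Breakdown_category": ["B_Mining", "B_Mining", "C_Manufacturing"],
    "Units": ["Count of businesses", "NZ Dollars (millions)", "Count of businesses"],
    "RD_Value": [3, 120, 45],
})

# `in` on a DataFrame looks at column labels, so this is False
# even though "B_Mining" appears in the cells.
label_check = "B_Mining" in df[["Breakdown_category"]]

# Checking the underlying values does find it.
value_check = "B_Mining" in df["Breakdown_category"].values

def original_style_check(frame):
    # Same shape as the function in the question: the condition is False,
    # so no return statement is reached.
    if "B_Mining" in frame[["Breakdown_category"]] and "Count of businesses" in frame[["Units"]]:
        return frame

result = original_style_check(df)  # implicitly None
```

Even with a value-based check, a single if would only say whether the value exists somewhere in the column. It would not select the matching rows, which is why the boolean masks above are the better fit.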

7 Comments

Hi Geof, thank you for your help! I encountered an error that says: KeyError: 'Year'. How can I resolve it?
@pleepc can you copy-paste the output of print(nz_data.columns)
hey Geof, here's the output: Index(['Variable', 'Breakdown_category', 'RD_Value', 'Units'], dtype='object'). However, even after I removed "Year" as my index_col, the output of print(count_of_mining_biz()) gives me: Series([], Name: RD_Value, dtype: object). Am I doing something wrong here?
@pleepc (1) If Year is a str rather than an int, compare against '2019' instead of 2019; check the output of print(type(nz_data.iloc[0].Year)) to see which it is. (2) Units and Breakdown_category might have leading/trailing spaces, so strip() them first just in case (and if Year turns out to be a str, strip() it too): nz_data['Units'] = nz_data['Units'].str.strip(); nz_data['Breakdown_category'] = nz_data['Breakdown_category'].str.strip()
hey Geof, thank you so much, it works now after I removed the 'index_col = 2' argument. I get the correct answer, which is '3', but Python prints another number before it, so I see '5 3'. Could you explain why the '5' is in front?
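On the last comment: the filtered result is still a Series, and printing a Series shows each row's index label next to its value. The leading 5 is most likely the index label of the matching row, not part of the data. The sketch below uses hypothetical stand-in rows (the values and the "Dollars" unit are made up for illustration), arranged so that the matching row sits at label 5. It also applies the strip() suggestion from the comments.

```python
import pandas as pd

# Hypothetical stand-in rows with stray spaces; only the row at label 5
# matches all three conditions once the strings are stripped.
df = pd.DataFrame({
    "Year": [2018, 2018, 2019, 2019, 2019, 2019],
    "Breakdown_category": ["B_Mining", "B_Mining ", "C_Manufacturing",
                           "B_Mining", "C_Manufacturing", " B_Mining "],
    "Units": ["Count of businesses", "Dollars", "Count of businesses",
              "Dollars", "Dollars", "Count of businesses "],
    "RD_Value": [2, 90, 45, 120, 800, 3],
})

# Remove leading/trailing whitespace before comparing.
df["Units"] = df["Units"].str.strip()
df["Breakdown_category"] = df["Breakdown_category"].str.strip()

mask = (
    (df["Breakdown_category"] == "B_Mining")
    & (df["Units"] == "Count of businesses")
    & (df["Year"] == 2019)
)

result = df.loc[mask, "RD_Value"]  # a one-row Series, labelled 5
print(result)                      # shows the index label alongside the value

value = result.iloc[0]             # the bare value, without the index label
print(value)
```

If exactly one row is expected, result.item() is an alternative to result.iloc[0]; it raises an error when the Series does not contain exactly one element, which can catch an over-broad filter.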