0

I am writing an intelligent application that determines what factors lead to 0 children in a relationship based off of data from the UCI Machine Learning Repository's contraceptive method choice dataset cited Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science. I am having trouble writing a lambda expression using the pandas apply function.

I am unsure what to try.

Here is some of the sample file

wife's age, wife's education, husband's education, number of children, wife's religion, wife now working, husband's occupation, standard-of-living index, media exposure, contraceptive method used
24,2,3,3,1,1,2,3,0,1
45,1,3,10,1,1,3,4,0,1
43,2,3,7,1,1,3,4,0,1
42,3,2,9,1,1,3,3,0,1
36,3,3,8,1,1,3,2,0,1
19,4,4,0,1,1,3,3,0,1

and here is my code

#import modules
import pandas as pd

#define functions
def read_datafile():
    d = pd.read_csv('cmc.data.txt', sep=',')
    return d

def create_bin_label(data):
    data['numchildren'] = data.apply(lambda row: 1 if (row['number of children']) <= 0 else 0, axis=1)
    data = data.drop(['number of children'], axis=1)

#read in datafile
data = read_datafile()
print(len(data))

#create a binary label column and delete the old column
bl = create_bin_label(data)
print(data.head())

I expect create_bin_label(data) to isolate one value from a set of numerical values found in a numerical attribute ex) number of children can be any number but I only want 0, I also expect it to add the column "numchildren" as a binary label, and I expect create_bin_label(data) to delete the old column (its called "number of children." What create_bin_label(data) does is return an error that looks like this (although I think the important part is that some str is trying to be processed as an int but I am unsure where that is happening)

Traceback (most recent call last):
  File "C:\Users\Hezekiah\PycharmProjects\Artificial Intelligence 0\venv\lib\site-packages\pandas\core\indexes\base.py", line 4381, in get_value
    return libindex.get_value_box(s, key)
  File "pandas\_libs\index.pyx", line 52, in pandas._libs.index.get_value_box
  File "pandas\_libs\index.pyx", line 48, in pandas._libs.index.get_value_at
  File "pandas\_libs\util.pxd", line 113, in pandas._libs.util.get_value_at
  File "pandas\_libs\util.pxd", line 98, in pandas._libs.util.validate_indexer
TypeError: 'str' object cannot be interpreted as an integer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/Hezekiah/PycharmProjects/Artificial Intelligence 0/Chapter 1 Application Contraception.py", line 24, in <module>
    bl = create_bin_label(data)
  File "C:/Users/Hezekiah/PycharmProjects/Artificial Intelligence 0/Chapter 1 Application Contraception.py", line 14, in create_bin_label
    data['numchildren'] = data.apply(lambda row: 1 if (row['number of children']) <= 0 else 0, axis=1)
  File "C:\Users\Hezekiah\PycharmProjects\Artificial Intelligence 0\venv\lib\site-packages\pandas\core\frame.py", line 6487, in apply
    return op.get_result()
  File "C:\Users\Hezekiah\PycharmProjects\Artificial Intelligence 0\venv\lib\site-packages\pandas\core\apply.py", line 151, in get_result
    return self.apply_standard()
  File "C:\Users\Hezekiah\PycharmProjects\Artificial Intelligence 0\venv\lib\site-packages\pandas\core\apply.py", line 257, in apply_standard
    self.apply_series_generator()
  File "C:\Users\Hezekiah\PycharmProjects\Artificial Intelligence 0\venv\lib\site-packages\pandas\core\apply.py", line 286, in apply_series_generator
    results[i] = self.f(v)
  File "C:/Users/Hezekiah/PycharmProjects/Artificial Intelligence 0/Chapter 1 Application Contraception.py", line 14, in <lambda>
    data['numchildren'] = data.apply(lambda row: 1 if (row['number of children']) <= 0 else 0, axis=1)
  File "C:\Users\Hezekiah\PycharmProjects\Artificial Intelligence 0\venv\lib\site-packages\pandas\core\series.py", line 868, in __getitem__
    result = self.index.get_value(self, key)
  File "C:\Users\Hezekiah\PycharmProjects\Artificial Intelligence 0\venv\lib\site-packages\pandas\core\indexes\base.py", line 4389, in get_value
    raise e1
  File "C:\Users\Hezekiah\PycharmProjects\Artificial Intelligence 0\venv\lib\site-packages\pandas\core\indexes\base.py", line 4375, in get_value
    tz=getattr(series.dtype, 'tz', None))
  File "pandas\_libs\index.pyx", line 81, in pandas._libs.index.IndexEngine.get_value
  File "pandas\_libs\index.pyx", line 89, in pandas._libs.index.IndexEngine.get_value
  File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: ('number of children', 'occurred at index 0')
1
  • 1
    Why not data.apply(lambda row: row['number of children'] <= 0, axis=1) ? It will give you a bool. Easier I think. Commented Apr 2, 2019 at 14:09

1 Answer 1

1
import pandas as pd

#define functions
def read_datafile():
    d = pd.read_csv('cmc.data.txt', sep=',')
    return d

def create_bin_label(data,columns):
    # i added an extra columns argument that holds a list of all column names 
    # the 'number of children' column is on position 3 in the list
    data['numchildren'] = data.apply(lambda row: 1 if (row[columns[3]]) <= 0 else 0, 
                           axis=1)
    data = data.drop([columns[3]], axis=1)

#read in datafile
data = read_datafile()
print(len(data))
columns = data.columns.values #this creates the list of the dataframe's column names

#create a binary label column and delete the old column
bl = create_bin_label(data,columns) # remember to insert the var that holds the cols
print(data)
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.