
I have a function that iterates over the rows of a CSV and, if the value in the Age column is negative, writes the Key and Age values to a text file.

import pandas as pd

def neg_check():
    results = []

    file_path = input('Enter file path: ')
    file_data = pd.read_csv(file_path, encoding='utf-8')

    # Collect (Key, Age) pairs for every row with a negative age
    for index, row in file_data.iterrows():
        if row['Age'] < 0:
            results.append((row['Key'], row['Age']))
    # The with block closes the file on exit, so no explicit close() is needed
    with open('results.txt', 'w') as outfile:
        outfile.write("\n".join(map(str, results)))

To make this code reusable, how can I modify it so that it checks the rows of whichever column starts with "Age"? My files have many columns that start with "Age" but end differently. I tried the following:

if row.startswith['Age'] < 0:

and

if row[row.startswith('Age')] < 0:

but both raise AttributeError: 'Series' object has no attribute 'startswith'.

My csv files:

sample 1

Key   Sex      Age
1     Male     46
2     Female   34

sample 2

Key   Sex      AgeLast
1     Male     46
2     Female   34

sample 3

Key   Sex      AgeFirst
1     Male     46
2     Female   34

1 Answer


I would do this in one step, but there are a few options. One is filter:

v = df[df.filter(like='AgeAt').iloc[:, 0] < 0]

Or,

c = df.columns[df.columns.str.startswith('AgeAt')][0]
v = df[df[c] < 0]
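
One difference between the two options is worth checking on your own data: filter(like=...) matches the text anywhere in the column name, while str.startswith and filter(regex='^...') only match at the start. A minimal sketch, assuming a made-up frame (the "Age" prefix and the MaxAgeSeen column are hypothetical, chosen only to show the difference):

```python
import pandas as pd

# Hypothetical data: one column starts with "Age", another only contains it
df = pd.DataFrame({
    "Key": [1, 2, 3],
    "Sex": ["Male", "Female", "Male"],
    "AgeLast": [46, -34, 20],
    "MaxAgeSeen": [50, 40, 30],
})

# Substring match: picks up every label that contains "Age"
like_cols = list(df.filter(like="Age").columns)

# Prefix match via the string accessor on the column Index
prefix_cols = list(df.columns[df.columns.str.startswith("Age")])

# Prefix match via an anchored regular expression
regex_cols = list(df.filter(regex="^Age").columns)

print(like_cols)
print(prefix_cols)
print(regex_cols)
```

If no column name merely contains the prefix, all three select the same columns.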

Finally, to write to CSV, use

if not v.empty:
    v.to_csv('invalid.csv')

Looping over your data is not necessary with pandas.
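
If a file can have more than one matching column, the same idea extends to all of them at once. This is only a sketch under a few assumptions: the helper name negative_age_rows is hypothetical, the prefix is "Age" as in the question's samples, and every matching column is numeric (a text column such as a hypothetical AgeGroup would make the comparison fail).

```python
import pandas as pd

def negative_age_rows(df, prefix="Age"):
    """Return the rows where any column starting with `prefix` is negative."""
    age_cols = df.columns[df.columns.str.startswith(prefix)]
    if len(age_cols) == 0:
        # No matching column: return an empty frame with the same columns
        return df.iloc[0:0]
    # True for each row that has a negative value in at least one matching column
    mask = (df[age_cols] < 0).any(axis=1)
    return df[mask]

# Made-up sample data in the shape of the question's files
sample = pd.DataFrame({
    "Key": [1, 2, 3],
    "Sex": ["Male", "Female", "Male"],
    "AgeFirst": [46, -3, 20],
    "AgeLast": [50, 34, -1],
})

bad = negative_age_rows(sample)

# Only produce output when something was found; with no path,
# to_csv returns the CSV text instead of writing a file
csv_text = bad.to_csv(index=False) if not bad.empty else ""
print(csv_text)
```

Passing a path, for example bad.to_csv('invalid.csv', index=False), writes the same content to disk.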

3 Comments

Works great, thanks @coldspeed! Can you explain what the [0] means in the first line or direct me to the documentation please?
@n8_ df.columns[df.columns.str.startswith('AgeAt')] returns an Index of the matching column names (in your case, just one). From that Index, I extract the first element with [0].
After some testing, this creates the file even if no negative ages were found (just writes headers). How can I modify to only write to file if negative ages are found? Using an if-statement returns: ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
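
On the ValueError in the last comment: pandas refuses to treat a DataFrame as a plain True/False value, so "if v:" raises whether or not v has rows. The .empty attribute is the explicit check. A small sketch with made-up data (the AgeLast column and its values are assumptions for illustration):

```python
import pandas as pd

df = pd.DataFrame({"Key": [1, 2], "AgeLast": [46, 34]})

# No negative ages here, so v keeps its columns but has zero rows
v = df[df["AgeLast"] < 0]

# Using the DataFrame itself as a condition raises ValueError
try:
    if v:
        pass
    raised = False
except ValueError:
    raised = True

# .empty is True when there are no rows, so this guard skips the write
should_write = not v.empty

print(raised, should_write)
```

Guarding the to_csv call with "if not v.empty:" therefore avoids creating a header-only file.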
