I have a pandas DataFrame with columns Longitude and Latitude, and I'd like to compute X and Y from them. The utm package has a function called from_latlon that does this: it takes a latitude and a longitude and returns a tuple whose first two elements are X and Y. Here's what I do:

    def get_X(row):
        return utm.from_latlon(row['Latitude'], row['Longitude'])[0]

    def get_Y(row):
        return utm.from_latlon(row['Latitude'], row['Longitude'])[1] 

    df['X'] = df.apply(get_X, axis=1)
    df['Y'] = df.apply(get_Y, axis=1)

I'd like to define a single function get_XY so that from_latlon is called only once per row, to save time. I looked at several related questions but could not find a way to create two columns with one apply call. Thanks.

2 Answers

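You can return a list from your function. (The output below is from an older pandas release; since pandas 0.23 the same call returns a Series of lists unless you pass result_type="expand" or result_type="broadcast" to apply.)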

    import pandas

    d = pandas.DataFrame({
        "A": [1, 2, 3, 4, 5],
        "B": [8, 88, 0, -8, -88]
    })

    def foo(row):
        return [row["A"]+row["B"], row["A"]-row["B"]]

    >>> d.apply(foo, axis=1)
        A   B
    0   9  -7
    1  90 -86
    2   3   3
    3  -4  12
    4 -83  93
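
For recent pandas versions, here is a minimal sketch of the list-returning approach using the result_type="expand" option of apply. The column names "sum" and "difference" are my own choice for illustration; without renaming, the expanded columns are simply labelled 0 and 1.

```python
import pandas as pd

d = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "B": [8, 88, 0, -8, -88],
})

def foo(row):
    # Both results are computed in a single call per row.
    return [row["A"] + row["B"], row["A"] - row["B"]]

# result_type="expand" turns each returned list into separate columns
# (labelled 0 and 1) instead of a single column holding lists.
expanded = d.apply(foo, axis=1, result_type="expand")

# Illustrative names only; pick whatever suits your data.
expanded.columns = ["sum", "difference"]
```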

You can also return a Series. This lets you specify the column names of the return value:

    def foo(row):
        return pandas.Series({"X": row["A"]+row["B"], "Y": row["A"]-row["B"]})

    >>> d.apply(foo, axis=1)
        X   Y
    0   9  -7
    1  90 -86
    2   3   3
    3  -4  12
    4 -83  93
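
If the goal is to get those columns back into the original DataFrame, one possible way (a sketch, not the only option) is to join the result on the index. The variable name result is just for illustration.

```python
import pandas as pd

d = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "B": [8, 88, 0, -8, -88],
})

def foo(row):
    # Compute both values once per row and label them.
    return pd.Series({"X": row["A"] + row["B"], "Y": row["A"] - row["B"]})

# apply returns a DataFrame with columns X and Y, indexed like d.
result = d.apply(foo, axis=1)

# join aligns on the index, so every row receives its own X and Y.
d = d.join(result)
```

Because join matches on the index, this keeps working even if d has a non-default index, as long as the index values are unique.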

2 Comments

Based on your first solution, I use temp = d.apply(foo, axis=1) and then do d['sum'] = [item[0] for item in temp] and d['subtract'] = [item[1] for item in temp]. Is there a better way to do it? If I do d[['sum','subtract']] = d.apply(foo, axis=1) I get an error. I guess at this point it's a matter of getting the result back into the original data frame. Unfortunately, your second solution doesn't work for me because of my specific function. Thanks.
@bikhaab: There's no simple way to assign multiple columns into a DataFrame at once, so that's kind of a separate issue. You could use concat or merge to join the result DataFrame with your original one. See this question for some related ideas.

I merged a couple of the answers from a similar thread and now have a generic multi-column in, multi-column out template I use in Jupyter/pandas:

    # plain old function doesn't know about rows/columns, it just does its job.
    def my_func(arg1, arg2):
        return arg1+arg2, arg1-arg2  # return multiple responses

    df['sum'], df['difference'] = zip(*df.apply(lambda x: my_func(x['first'], x['second']), axis=1))
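
Applied to the Latitude/Longitude case from the question, the same template might look like the sketch below. To keep it runnable without the utm package, to_xy is a hypothetical stand-in that returns made-up numbers in the same shape as utm.from_latlon (easting, northing, zone number, zone letter); in real use you would call utm.from_latlon there instead.

```python
import pandas as pd

def to_xy(lat, lon):
    # Hypothetical stand-in for utm.from_latlon. The values are fake;
    # only the shape of the return value (a 4-tuple) mirrors the real call.
    return lon * 1000.0, lat * 1000.0, 31, "N"

df = pd.DataFrame({
    "Latitude": [10.0, 20.0],
    "Longitude": [1.0, 2.0],
})

# One conversion per row; keep only the first two items (X and Y).
xy = df.apply(lambda r: to_xy(r["Latitude"], r["Longitude"])[:2], axis=1)

# zip(*...) transposes the per-row pairs into one sequence per column.
df["X"], df["Y"] = zip(*xy)
```

The conversion function runs once per row, which is the saving the question asks for compared with separate get_X and get_Y calls.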
