
SEE UPDATE AT THE END FOR A MUCH CLEARER DESCRIPTION.

According to http://pandas.pydata.org/pandas-docs/version/0.18.1/generated/pandas.DataFrame.apply.html you can pass external arguments to an apply function, but the same is not true of applymap: http://pandas.pydata.org/pandas-docs/version/0.18.1/generated/pandas.DataFrame.applymap.html#pandas.DataFrame.applymap

I want to apply an elementwise function f(a, i), where a is the element and i is a manually supplied argument. I need this because I will call df.applymap(f) in a loop over i in some_list.

To give an example of what I want, say I have a DataFrame df, where each element is a numpy.ndarray. I want to extract the i-th element of each ndarray and form a new DataFrame from them. So I define my f:

def f(a, i):
    return a[i]

So that I could make a loop which would return the i-th element of each of the np.ndarray contained in df:

for i in some_series:
    b[i] = df.applymap(f, i=i)

so that in each iteration, it would pass my value of i into the function f.

I realise it would all have been easier if I had used MultiIndexing for df but for now, this is what I'm working with. Is there a way to do what I want within pandas? I would ideally like to avoid for-looping through all the columns in df, and I don't see why applymap doesn't take keyword arguments, while apply does.

Also, the way I currently understand it (I may be wrong), when I use df.apply it would give me the i-th element of each row/column, instead of the i-th element of each ndarray contained in df.


UPDATE:

So I just realised I could split df into Series and then use the pd.Series.apply which could do what I want. Let me just generate some data to show what I mean:

import numpy as np
import pandas as pd

def f(a, i):
    return a[i]

b = pd.Series(index=range(10), dtype=object)
for i in b.index:
    b[i] = np.random.rand(5)

b.apply(f,args=(1,))

Does exactly what I expect, and want it to do. However, trying with a DataFrame:

b = pd.DataFrame(index=range(4), columns=range(4), dtype=object)
for i in b.index:
    for col in b.columns:
        b.loc[i,col] = np.random.rand(10)

b.apply(f,args=(1,))

Gives me ValueError: Shape of passed values is (4, 10), indices imply (4, 4).


3 Answers

You can wrap the call in a lambda so that the extra argument is captured, then pass the lambda to map:

def matchValue(value, dictionary):
    return dictionary[value]

a = {'first':  1, 'second':  2}
b = {'first': 10, 'second': 20}
df['column'] = df['column'].map(lambda x: matchValue(x, a))
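The same lambda trick carries over to the elementwise case from the question. The following is only a sketch under assumptions: the sample DataFrame, the column names "x" and "y", and the `elementwise` helper name are made up for illustration, and the fallback from `DataFrame.map` to `applymap` assumes `DataFrame.map` is available only in newer pandas releases.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: every cell holds a length-5 ndarray.
rng = np.random.default_rng(0)
df = pd.DataFrame({col: [rng.random(5) for _ in range(3)] for col in ["x", "y"]})

# Newer pandas exposes the elementwise operation as DataFrame.map;
# older versions only have DataFrame.applymap (assumption: one of the two exists).
elementwise = df.map if hasattr(df, "map") else df.applymap

# One result frame per position i; the lambda captures the current i.
b = {}
for i in range(5):
    b[i] = elementwise(lambda a: a[i])
```

Each `b[i]` has the same shape as `df` and holds the i-th entry of every array. The lambda is called immediately inside each iteration, so it always sees the current value of `i`.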

This is a solution where the argument is stored in a closure, i.e. captured by a nested function:

def f(cell, argument):
    """Do something with cell value and argument"""
    return output

def outer(argument):
    # inner remembers the argument it was created with
    def inner(cell):
        return f(cell, argument)

    return inner

argument = ...
df.applymap(func = outer(argument))
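As an alternative to writing the wrapper by hand, `functools.partial` from the standard library builds the same kind of one-argument callable. This is a minimal sketch, not part of the original answer: the sample data and the `elementwise` helper are assumptions, and `f` is the indexing function from the question.

```python
from functools import partial

import numpy as np
import pandas as pd

def f(a, i):
    # Return the i-th entry of the array stored in a cell.
    return a[i]

# Hypothetical sample data: every cell holds a length-5 ndarray.
rng = np.random.default_rng(1)
df = pd.DataFrame({col: [rng.random(5) for _ in range(3)] for col in ["x", "y"]})

# Assumption: DataFrame.map exists in newer pandas, applymap in older versions.
elementwise = df.map if hasattr(df, "map") else df.applymap

# partial(f, i=1) behaves like lambda a: f(a, i=1)
second = elementwise(partial(f, i=1))
```

Compared with the nested function, `partial` saves a few lines and fixes the value of `i` at creation time.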


In the pandas version linked in the question (0.18.1), applymap doesn't accept extra arguments; its signature is DataFrame.applymap(func). If you want to keep i as state, you can store it in a global variable that func reads or modifies, or use a decorator.

However, I would recommend trying the apply method.
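One way to read that suggestion, sketched here with made-up sample data: `DataFrame.apply` hands each column to the function as a Series, so calling `f` directly on it would index the column by label rather than index each array. Applying `f` per column through `Series.apply`, which does accept `args`, gives the elementwise behaviour.

```python
import numpy as np
import pandas as pd

def f(a, i):
    # Return the i-th entry of the array stored in a cell.
    return a[i]

# Hypothetical sample data: every cell holds a length-5 ndarray.
rng = np.random.default_rng(2)
df = pd.DataFrame({col: [rng.random(5) for _ in range(3)] for col in ["x", "y"]})

# Outer apply iterates over columns; inner Series.apply iterates over cells
# and forwards the extra positional argument to f.
result = df.apply(lambda col: col.apply(f, args=(1,)))
```

The result has the same index and columns as `df`, with each cell replaced by entry 1 of its array.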

4 Comments

See the update. Is there a way to make the apply function do what I want? I don't really understand the error it's giving me (there's a ton of text), but I assume it's trying to return the i-th row of b instead of the i-th element of each element of b.
Do you want to use f on a list or Series, or on a 2D DataFrame? pandas apply applies a function along an axis of the DataFrame, whereas applymap applies a function elementwise, i.e. like doing map(func, series) for each Series in the DataFrame.
Essentially I would like the functionality of applymap (so apply func on each element of df/b), while being able to pass my "external" argument i into func. As you said, it seems I could use global variables or perhaps function attributes or something, or just split df into Series, but I was just wondering if there was a way to do that directly within pandas.
It depends on how you define the i-th element of a 2D array. If it is i = row * n_col + col, pandas doesn't have a direct way to do that, but you could consider using apply twice or flattening the DataFrame to a list first.
