
I have been working in Python for quite a while, but I am stuck on a simple problem. I have to run the crosstab function for several different variables against the same ID variable (MasterUserId):

pd.crosstab(data['MasterUserId'],visittime_cat)
pd.crosstab(data['MasterUserId'],week_cat)

Now I want to do the same thing about 7-8 times. Instead of calling the crosstab function repeatedly, I want to put it inside a loop and generate a crosstab dataset on each iteration. I tried this, but it was not successful:

def cross_tab(id_col,field):
    col_names=['visittime_cat','week_cat','var3','var4']
    for i in col_names:
        'crosstab_{ }'.format(i)=pd.crosstab(id_col,i)

I want to generate datasets named crosstab_visittime_cat, crosstab_week_cat, or alternatively crosstab_1, crosstab_2, and so on.

1 Answer


Might I suggest storing the datasets in a dictionary?

def cross_tab(data_frame, id_col):
    col_names=['visittime_cat','week_cat','var3','var4']
    datasets = {}
    for i in col_names:
        datasets['crosstab_{}'.format(i)] = pd.crosstab(data_frame[id_col], data_frame[i])
    return datasets

Testing with a fictional data set

import numpy as np
import pandas as pd

data = pd.DataFrame({'MasterUserId': ['one', 'one', 'two', 'three'] * 6,
             'visittime_cat': ['A', 'B', 'C'] * 8,
             'week_cat': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 4,
             'var3': np.random.randn(24),
             'var4': np.random.randn(24)})

storage = cross_tab(data, "MasterUserId")

storage.keys()
['crosstab_week_cat', 'crosstab_var4', 'crosstab_visittime_cat', 'crosstab_var3']

storage['crosstab_week_cat']
week_cat      bar  foo
MasterUserId          
one             6    6
three           4    2
two             2    4

[3 rows x 2 columns]
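
If the list of columns changes from run to run, one option is to pass it in rather than hard-code it. The sketch below is only a variant of the function above: the name cross_tab_by and its col_names parameter are made up for illustration, and the sample frame reuses the fictional data without the random columns.

```python
import pandas as pd

def cross_tab_by(data_frame, id_col, col_names):
    """Hypothetical variant of cross_tab: the columns to tabulate are an argument."""
    return {
        'crosstab_{}'.format(col): pd.crosstab(data_frame[id_col], data_frame[col])
        for col in col_names
    }

# Same fictional data as above, limited to the categorical columns.
data = pd.DataFrame({
    'MasterUserId': ['one', 'one', 'two', 'three'] * 6,
    'visittime_cat': ['A', 'B', 'C'] * 8,
    'week_cat': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 4,
})

tables = cross_tab_by(data, 'MasterUserId', ['visittime_cat', 'week_cat'])

# Each crosstab is looked up by its generated name.
for name, table in tables.items():
    print(name, table.shape)
```

Looking tables up by key, as in tables['crosstab_week_cat'], plays the role of the separately named variables asked for in the question.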

6 Comments

Thanks for your response, but it's not working; it raises a KeyError. I tried to fix it by placing quotes inside the brackets, but to no avail: datasets["'crosstab_{ }'.format(i)"]=pd.crosstab(id_col,i)
On which line does it raise the KeyError, and which key does it complain about?
The line I mentioned in my comment above. It displays KeyError: ' '
I'll update with a working version with a fictional data set to see if it fixes the problem for you
It will not run until the error is fixed. It raises a KeyError when this line is executed: datasets['crosstab_{}'.format(i)] = pd.crosstab(data_frame[id_col], data_frame[i])
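
Two things could plausibly produce a KeyError on a line like this. Both are guesses based on the comments above, not a confirmed diagnosis. First, a placeholder written with a space, '{ }', makes str.format look for a keyword argument named ' ', which would match the reported KeyError: ' '. Second, data_frame[i] raises a KeyError when i is not a column of the frame, for example if visittime_cat exists only as a separate variable. A minimal sketch of both, using a made-up two-row frame:

```python
import pandas as pd

# Possible cause 1: a space inside the braces is treated as a keyword name.
try:
    'crosstab_{ }'.format('week_cat')
    format_error = None
except KeyError as exc:
    format_error = exc.args[0]  # the missing "keyword": a single space

# Possible cause 2: indexing a DataFrame with a column it does not have.
# The frame below is illustrative only; it has no 'var3' column.
frame = pd.DataFrame({'MasterUserId': ['one', 'two'],
                      'week_cat': ['foo', 'bar']})
try:
    frame['var3']
    column_error = None
except KeyError as exc:
    column_error = str(exc)

# With '{}' (no space) the key is built as intended.
key = 'crosstab_{}'.format('week_cat')
```

Printing data_frame.columns just before the loop is a quick way to check whether every name in col_names is actually present.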