
I'm looking to extend a pandas DataFrame, creating an object where all of the original DataFrame attributes/methods are intact, while making a few new attributes/methods available. I also need the ability to convert (or copy) objects that are already DataFrames to my new class. What I have seems to work, but I feel like I might have violated some fundamental convention. Is this the proper way of doing this, or should I even be doing it in the first place?

import pandas as pd

class DataFrame(pd.DataFrame):
    def __init__(self, df):
        df.__class__ = DataFrame # effectively 'cast' Pandas DataFrame as my own

The idea is that I could then initialize it directly from a pandas DataFrame, e.g.:

df = DataFrame(pd.read_csv(path))
  • You're mixing up inheritance and composition. Your DataFrame class both "has a" and "is a" pd.DataFrame. Commented Jun 29, 2018 at 19:21
  • self = df doesn't do anything. Commented Jun 29, 2018 at 20:33

3 Answers

1

I'd probably do it this way, if I had to:

import pandas as pd

class CustomDataFrame(pd.DataFrame):
    @classmethod
    def convert_dataframe(cls, df):
        df.__class__ = cls
        return df

    def foo(self):
        return "Works"


df = pd.DataFrame([1,2,3])
print(df)
#print(df.foo())    # Will throw, since .foo() is not defined on pd.DataFrame

cdf = CustomDataFrame.convert_dataframe(df)
print(cdf)
print(cdf.foo())    # "Works"

Note: This permanently changes the class of the df object you pass to convert_dataframe, because df and cdf are the same object:

print(type(df))     # <class '__main__.CustomDataFrame'>
print(type(cdf))    # <class '__main__.CustomDataFrame'>

If you don't want this, you could copy the dataframe inside the classmethod.
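A minimal sketch of that copying variant, in case it helps. The method name from_dataframe is my own choice for illustration and not part of pandas; the class is otherwise the same as above.

```python
import pandas as pd

class CustomDataFrame(pd.DataFrame):
    @classmethod
    def from_dataframe(cls, df):
        # Hypothetical helper: copy first, so the caller's object is untouched.
        new = df.copy()
        # Re-tag only the copy as the subclass.
        new.__class__ = cls
        return new

    def foo(self):
        return "Works"


original = pd.DataFrame([1, 2, 3])
converted = CustomDataFrame.from_dataframe(original)

print(type(original) is pd.DataFrame)        # the input keeps its class
print(type(converted) is CustomDataFrame)    # only the copy is converted
print(converted.foo())
```

Keep in mind that operations which build a new frame from converted may still hand back a plain pd.DataFrame; the pandas documentation on subclassing describes overriding the _constructor property for that case.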


1

If you just want to add methods to a DataFrame, monkey patch the class before you run anything else, as below.

>>> import pandas
>>> def foo(self, x):
...     return x
...
>>> foo
<function foo at 0x00000000009FCC80>
>>> pandas.DataFrame.foo = foo
>>> bar = pandas.DataFrame()
>>> bar
Empty DataFrame
Columns: []
Index: []
>>> bar.foo(5)
5

2 Comments

Thanks for the response. I actually wanted to add some attributes to the DataFrame that specifically interact with the methods; I've updated the question.
You can create an initializer method (not called __init__) that monkey patches your new attributes onto the DataFrame after it is created.
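A rough sketch of what that suggestion might look like. The names init_custom, describe_units and unit_label are made up for illustration; none of them come from pandas or from the question.

```python
import pandas as pd

def describe_units(self):
    # Hypothetical method patched onto the class; it reads an attribute
    # that init_custom() attaches to individual frames.
    return f"{len(self)} rows measured in {self.unit_label}"

# Monkey patch the method onto every DataFrame.
pd.DataFrame.describe_units = describe_units

def init_custom(df, unit_label):
    # Hypothetical initializer (deliberately not __init__): attach the
    # extra attribute to an existing frame and hand the same frame back.
    df.unit_label = unit_label
    return df

frame = init_custom(pd.DataFrame({"length": [1.0, 2.5]}), "cm")
print(frame.describe_units())
```

One caveat with this approach: an attribute set this way lives only on that one object, so a frame derived from it (a copy or a slice, for instance) should not be expected to carry unit_label along.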
0
if __name__ == '__main__':
    app = DataFrame()
    app()

event

super(DataFrame,self).__init__()
