I want to create a class that stores each table I query as an attribute, so that afterwards cd.orders or cd.users returns a dataframe for the corresponding table.

Here's my sample code:

import pandas as pd
import sqlalchemy as sa

class samplecode:
    def __init__(self,credentials):
        c = credentials ('DATABASE', 'USER', 'PASSWORD', 'HOST', 'PORT', 'SCHEMA')
        print('credentials loaded')
        self.connection_string = "postgresql://%s:%s@%s:%s/%s" % (c.USER,
                                                                  c.PASSWORD,
                                                                  c.HOST,
                                                                  str(c.PORT),
                                                                  c.DATABASE)
        # use the attribute set above; a bare connection_string is undefined here
        self.engine = sa.create_engine(self.connection_string)
        print('redshift connected')
        self.data = []

    def get_db(self,tables):
        for t in tables:
            # self.data is reassigned on every pass, so only the last table survives
            self.data = pd.read_sql_query('SELECT * FROM database.{} limit 10'.format(t),self.engine)
            print(self.data.head(2))

cd = samplecode(credential)
# llf.view_obj
cd.get_db(['orders','user'])

What I am hoping is that after calling cd.get_db, the instance has one attribute per table, so that they show up when I type dir(cd).

I should be able to use cd.orders and cd.user and, if I add more names to the list, cd.xyz.

I tried this, but I could only access the most recent df, since each one overwrites the previous one:

class Wrapper(object):
    def __init__(self, data):
        self.data = data
    def __getattr__(self, attr):
        return [d[attr] for d in self.data]

# Wrapper([{'x': 23}, {'x': 42}, {'x': 5}]) 
instancelist = ['orders','user']

for i in instancelist:
    data = Wrapper([{i:'a'}])
cd.data

Hoping for help and clarification on the matter, thanks!

Or, if this is confusing, consider the following:

class BaseClass:
    def __init__(self):
        self.a = []
        self.b = []

    def execute_query(self,table_name):
        for tables in table_name:
            self.table_name = run_query()

table_list = ['D','E','F']
test = BaseClass()
test.execute_query(table_list)

dir(test)
[
 'a',
 'b',
 'D',
 'E',
 'F',
 '__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
]

1 Answer

It sounds like you're looking for the setattr builtin, which assigns an attribute, named by a string, on an object. So rather than printing out your tables, you can assign each one to an attribute named after the table:

def get_db(self,tables):
    for t in tables:
        # self.engine is the engine created in __init__
        data = pd.read_sql_query('SELECT * FROM database.{} limit 10'.format(t), self.engine)
        setattr(self, t, data)
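
To try the setattr approach without a Redshift connection, here is a minimal sketch that swaps in an in-memory sqlite3 database and plain row lists instead of pandas dataframes. The class name TableHolder, the connection attribute, and the sample tables are assumptions made for illustration; they are not part of the original code.

```python
import sqlite3


class TableHolder:
    """Hypothetical stand-in for the asker's class; sqlite3 replaces Redshift."""

    def __init__(self, connection):
        self.connection = connection

    def get_db(self, tables):
        for t in tables:
            # Table names cannot be bound as SQL parameters, so accept only
            # plain identifiers before formatting them into the query.
            if not t.isidentifier():
                raise ValueError("unexpected table name: {!r}".format(t))
            rows = self.connection.execute(
                "SELECT * FROM {} LIMIT 10".format(t)
            ).fetchall()
            # Store the result under an attribute named after the table.
            setattr(self, t, rows)


# Build a throwaway database with two small tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, item TEXT)")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 'book')")
conn.execute("INSERT INTO users VALUES (1, 'ann')")

cd = TableHolder(conn)
cd.get_db(["orders", "users"])
print(cd.orders)
print(cd.users)
```

After get_db runs, orders and users also appear in dir(cd), because setattr stores them as ordinary instance attributes.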

You could also do things in the other direction, and have the lookup of an attribute trigger a database query. For that you'd want to add a __getattr__ method to your class. Python calls it only when an attribute is looked up and not found normally.

def __getattr__(self, name):
    # self.engine is the engine created in __init__
    data = pd.read_sql_query('SELECT * FROM database.{} limit 10'.format(name), self.engine)
    setattr(self, name, data)  # save to an attribute so we don't need to query it again
    return data
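
The lazy variant can be sketched the same way. This again assumes an in-memory sqlite3 database in place of Redshift and row lists in place of dataframes; LazyTables and its connection attribute are hypothetical names. Since __getattr__ is called for every failed lookup, this sketch rejects underscore-prefixed names and turns a failed query into AttributeError, so that hasattr and similar checks still behave.

```python
import sqlite3


class LazyTables:
    """Hypothetical class that queries a table the first time it is accessed."""

    def __init__(self, connection):
        self.connection = connection

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        # Skip private and dunder names so internal lookups never hit the database.
        if name.startswith("_") or not name.isidentifier():
            raise AttributeError(name)
        try:
            rows = self.connection.execute(
                "SELECT * FROM {} LIMIT 10".format(name)
            ).fetchall()
        except sqlite3.OperationalError as exc:
            # For example "no such table": report it as a missing attribute.
            raise AttributeError(name) from exc
        # Cache the result so later lookups bypass __getattr__ entirely.
        setattr(self, name, rows)
        return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, item TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 'book')")

cd = LazyTables(conn)
first = cd.orders    # runs the query and caches the rows
second = cd.orders   # served from the cached attribute
print(first is second)
print(hasattr(cd, "missing"))
```

One thing to keep in mind with this design: the cached attribute never refreshes on its own, so deleting it (del cd.orders) is what forces a new query.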

1 Comment

Hi blacknkight, thanks, it worked. I have a new issue though: I didn't foresee a problem when the table is referenced in a SQL statement in pandasql, so a query like "Select from cls.table" doesn't work with the code above. My workaround is to bind the attribute to a local Python variable first, i.e. o = cls.table.
