Python: code duplication on class attribute definition

Question

I'm trying to implement a simple ORM in Python. I'm facing a code duplication issue and I do not know how to solve it. Here is a simplified example of a class in my project:

class Person:

    TABLE_NAME = 'person'

    FIELDS = [
        ('name', 'VARCHAR(50)'),
        ('age', 'INTEGER')
    ]

    # CODE DUPLICATION: the next two lines should be generated from FIELDS, not hard-coded...
    name: str
    age: int

    def __init__(self, **kwargs):
        self.__dict__ = kwargs

    @classmethod
    def create_sql_table(cls):
        # use TABLE_NAME and FIELDS to create sql table
        pass


alice = Person(name='Alice', age=25)

print(alice.name)

If I remove the two lines name: str and age: int, I lose auto-completion and I get a mypy error on the print line (Error: Person has no attribute name).

But if I keep them, I have code duplication (I write each field name twice).

Is there a way to avoid the code duplication (for instance, by generating these two lines from the FIELDS variable)?

Or is there another way to implement this class that avoids code duplication (without the mypy error and the loss of auto-completion)?

Wombatz · Accepted Answer · 2022-03-03 13:30:24Z

You can use descriptors:

from typing import Generic, TypeVar, Any, overload, Union

T = TypeVar('T')

class Column(Generic[T]):
    sql_type: str  # the field type used for this column

    def __init__(self) -> None:
        self.name = ''  # the name of the column

    # this is called when the Person class (not the instance) is created
    def __set_name__(self, owner: Any, name: str) -> None:
        self.name = name  # now contains the name of the attribute in the class

    # the overload for the case: Person.name -> Column[str]
    @overload
    def __get__(self, instance: None, owner: Any) -> 'Column[T]': ...

    # the overload for the case: Person().name -> str
    @overload
    def __get__(self, instance: Any, owner: Any) -> T: ...

    # the implementation of attribute access
    def __get__(self, instance: Any, owner: Any) -> Union[T, 'Column[T]']:
        if instance is None:
            return self
        # implement your attribute access here
        return getattr(instance, f'_{self.name}')  # type: ignore

    # the implementation for setting attributes
    def __set__(self, instance: Any, value: T) -> None:
        # maybe check here that the type matches
        setattr(instance, f'_{self.name}', value)

Now we can create specializations for each column type:

class Integer(Column[int]):
    sql_type = 'INTEGER'

class VarChar(Column[str]):
    def __init__(self, size: int) -> None:
        self.sql_type = f'VARCHAR({size})'
        super().__init__()

When you define the Person class, you can then use the column types:

class Person:
    TABLE_NAME = 'person'

    name = VarChar(50)
    age = Integer()

    def __init__(self, **kwargs: Any) -> None:
        for key, value in kwargs.items():
            setattr(self, key, value)


    @classmethod
    def create_sql_table(cls) -> None:
        print("CREATE TABLE", cls.TABLE_NAME)
        for key, value in vars(cls).items():
            if isinstance(value, Column):
                print(key, value.sql_type)


Person.create_sql_table()

p = Person(age=10)
print(p.age)
p.age = 20
print(p.age)

This prints:

CREATE TABLE person
name VARCHAR(50)
age INTEGER
10
20

You should probably also create a base Model class that contains the __init__ and the class method of Person.
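A minimal sketch of such a base class follows. The name Model, the columns() helper and create_table_sql() are hypothetical and not part of the answer above, and the Column descriptor is condensed from the full version shown earlier. It only illustrates where the shared __init__ and the table-creation logic could live.

```python
from typing import Any, Dict, Generic, List, Tuple, TypeVar, Union, overload

T = TypeVar('T')


class Column(Generic[T]):
    """Condensed version of the descriptor shown earlier in this answer."""
    sql_type: str

    def __set_name__(self, owner: Any, name: str) -> None:
        self.name = name

    @overload
    def __get__(self, instance: None, owner: Any) -> 'Column[T]': ...

    @overload
    def __get__(self, instance: Any, owner: Any) -> T: ...

    def __get__(self, instance: Any, owner: Any) -> Union[T, 'Column[T]']:
        if instance is None:
            return self
        return getattr(instance, f'_{self.name}')  # type: ignore

    def __set__(self, instance: Any, value: T) -> None:
        setattr(instance, f'_{self.name}', value)


class Integer(Column[int]):
    sql_type = 'INTEGER'


class VarChar(Column[str]):
    def __init__(self, size: int) -> None:
        self.sql_type = f'VARCHAR({size})'


class Model:
    """Hypothetical base class; the name and its methods are illustrative."""
    TABLE_NAME: str

    def __init__(self, **kwargs: Any) -> None:
        for key, value in kwargs.items():
            setattr(self, key, value)

    @classmethod
    def columns(cls) -> List[Tuple[str, str]]:
        # Walk the MRO from base to subclass so inherited columns are included.
        found: Dict[str, str] = {}
        for klass in reversed(cls.__mro__):
            for key, value in vars(klass).items():
                if isinstance(value, Column):
                    found[key] = value.sql_type
        return list(found.items())

    @classmethod
    def create_table_sql(cls) -> str:
        cols = ', '.join(f'{name} {sql_type}' for name, sql_type in cls.columns())
        return f'CREATE TABLE {cls.TABLE_NAME} ({cols})'


class Person(Model):
    TABLE_NAME = 'person'
    name = VarChar(50)
    age = Integer()


print(Person.create_table_sql())
alice = Person(name='Alice', age=25)
print(alice.name, alice.age)
```

With this layout each model class only declares TABLE_NAME and its columns; everything else is inherited.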

You can also extend the Column class to allow nullable columns and add default values.
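As a rough illustration of that extension, here is one possible shape. The parameter names nullable and default and the ddl() helper are assumptions made for this sketch, and the typing overloads from the full Column class are left out to keep it short.

```python
from typing import Any, Generic, Optional, TypeVar

T = TypeVar('T')


class Column(Generic[T]):
    sql_type: str = ''

    # `nullable` and `default` are hypothetical parameter names.
    def __init__(self, nullable: bool = True, default: Optional[T] = None) -> None:
        self.name = ''
        self.nullable = nullable
        self.default = default

    def __set_name__(self, owner: Any, name: str) -> None:
        self.name = name

    def __get__(self, instance: Any, owner: Any) -> Any:
        if instance is None:
            return self
        # Fall back to the column default when the attribute was never set.
        return getattr(instance, f'_{self.name}', self.default)

    def __set__(self, instance: Any, value: Optional[T]) -> None:
        if value is None and not self.nullable:
            raise ValueError(f'{self.name} may not be NULL')
        setattr(instance, f'_{self.name}', value)

    def ddl(self) -> str:
        # Hypothetical helper: renders one column definition.
        # repr() is only good enough for simple literals in this sketch.
        parts = [self.name, self.sql_type]
        if not self.nullable:
            parts.append('NOT NULL')
        if self.default is not None:
            parts.append(f'DEFAULT {self.default!r}')
        return ' '.join(parts)


class Integer(Column[int]):
    sql_type = 'INTEGER'


class Person:
    age = Integer(nullable=False, default=0)
    score = Integer()


p = Person()
print(p.age)             # never assigned, so the default is returned
print(Person.age.ddl())
print(Person.score.ddl())
```

Assigning None to a non-nullable column raises ValueError in this sketch; a real ORM might defer that check until the row is saved.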

Mypy does not complain and correctly infers str for Person().name and int for Person().age (class-level access such as Person.name yields Column[str]).

Which version of python are you using? The __set_name__ mechanism requires at least python 3.6
I'm using 3.7. I found the issue I had, now everything is working fine. Thx.

sudden_appearance · Accepted Answer · 2022-03-03 11:41:10Z

Ok, I ended up with that

class Person:
    # this mapping is incomplete: add every Python type you use, with its matching SQL type
    types = {
        str: 'VARCHAR(50)',
        int: 'INTEGER',
    }  # you should extract that out if you use it elsewhere


    TABLE_NAME = 'person'
    

    # NOTE: only the column fields may be annotated; annotating anything else will break FIELDS
    name: str
    age: int

    def __init__(self, **kwargs):
        self.__dict__ = kwargs

    @property
    def FIELDS(self):
        cls = type(self)  # a property receives the instance, so look up the class explicitly
        return [(key, cls.types[value]) for key, value in cls.__annotations__.items()]


alice = Person(name='Alice', age=25)

print(alice.FIELDS)  # [('name', 'VARCHAR(50)'), ('age', 'INTEGER')]

And mypy reports no errors:

>>> mypy <module>
>>> Success: no issues found in 1 source file


Vince M Over a year ago

Interesting solution, thank you. Unfortunately, it has a downside: only one SQL type per Python type (for instance, if Person has name 'VARCHAR(50)' and description 'TEXT', it does not work). But it still helps me (I did not know it was possible to get class annotation data with cls.__annotations__; I might search in this direction).
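One possible way around the one-SQL-type-per-Python-type limit raised in this comment is to attach the SQL type to each annotation with typing.Annotated. This is only a sketch: it assumes Python 3.9+ (Annotated is available from typing_extensions on older versions), and the SqlType marker class and the fields() method are hypothetical names, not part of the answer above.

```python
from typing import Annotated, Any, List, Tuple, get_type_hints  # Python 3.9+


class SqlType:
    """Hypothetical marker carrying the SQL type of a field."""

    def __init__(self, ddl: str) -> None:
        self.ddl = ddl


class Person:
    TABLE_NAME = 'person'

    # Type checkers treat Annotated[str, ...] as plain str.
    name: Annotated[str, SqlType('VARCHAR(50)')]
    description: Annotated[str, SqlType('TEXT')]
    age: Annotated[int, SqlType('INTEGER')]

    def __init__(self, **kwargs: Any) -> None:
        self.__dict__.update(kwargs)

    @classmethod
    def fields(cls) -> List[Tuple[str, str]]:
        result = []
        # include_extras=True keeps the Annotated metadata instead of stripping it.
        for key, hint in get_type_hints(cls, include_extras=True).items():
            for meta in getattr(hint, '__metadata__', ()):
                if isinstance(meta, SqlType):
                    result.append((key, meta.ddl))
        return result


alice = Person(name='Alice', description='likes cats', age=25)
print(Person.fields())
print(alice.description)
```

Each field name is still written only once, and two str fields can map to different SQL types.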

Anshuman Tiwari · Accepted Answer · 2022-03-03 11:30:09Z

In the Person class, try adding the data types in the constructor.


Vince M Over a year ago

Hi, thank you for the answer. Do you mean like this: def __init__(self, name: str, age: int)? That would be the same code duplication.

Community Over a year ago

As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.
