18

I'm fairly new to Python and have a question regarding the following class:

class Configuration:
    def __init__(self):
        parser = SafeConfigParser()
        try:
            if parser.read(CONFIG_FILE) is None:
                raise IOError('Cannot open configuration file')
        except IOError, error:
            sys.exit(error)
        else:
            self.__parser = parser
            self.fileName = CONFIG_FILE

    def get_section(self):
        p = self.__parser
        result = []
        for s in p.sections():
            result.append('{0}'.format(s))
        return result

    def get_info(self, config_section):
        p = self.__parser
        self.section = config_section
        self.url = p.get(config_section, 'url')
        self.imgexpr = p.get(config_section, 'imgexpr')
        self.imgattr1 = p.get(config_section, 'imgattr1')
        self.imgattr2 = p.get(config_section, 'imgattr2')
        self.destination = p.get(config_section, 'destination')
        self.createzip = p.get(config_section, 'createzip')
        self.pagesnumber = p.get(config_section, 'pagesnumber')

Is it OK to add more instance variables in another function, get_info in this example, or is it best practice to define all instance variables in the constructor? Couldn't it lead to spaghetti code if I define new instance variables all over the place?

EDIT: I'm using this code with a simple image scraper. Via get_section I return all sections in the config file, and then iterate through them to visit each site that I'm scraping images from. For each iteration I make a call to get_section to get the configuration settings for each section in the config file. If anyone can come up with another approach it'll be fine! Thanks!

4
  • stackoverflow.com/questions/2964230/… Commented May 1, 2012 at 11:32
  • 1
    The self.__parser = None should be set at the beginning of the __init__(). The reason is that the __init__() is called as a first mentod of already existing object. If the parser fails to read the config file and raises the exception, the exception may by catched elswhere (the program may not be terminated). Then the object of the Configuration class still exists and the later get_info() will cause the *AttributeError: Configuration instance has no attribute '__parser'. Commented May 1, 2012 at 12:44
  • @pepr Should I read you answer the way that I should add self.__parser = None in the beginning of __init__.py or do you suggest to move the parser initialization from __init__.py to another function? Commented May 1, 2012 at 15:12
  • @happygoat No, the self.__parser should be created at the beginning of the method __init__ of the class Configuration. The __init__.py is completely unrelated to the problem. Commented May 1, 2012 at 21:45

2 Answers 2

15

I would definitely declare all instance variables in __init__. To not do so leads to increased complexity and potential unexpected side effects.

To provide an alternate point of view from David Hall in terms of access, this is from the Google Python style guide.

Access Control:

If an accessor function would be trivial you should use public variables instead of accessor functions to avoid the extra cost of function calls in Python. When more functionality is added you can use property to keep the syntax consistent

On the other hand, if access is more complex, or the cost of accessing the variable is significant, you should use function calls (following the Naming guidelines) such as get_foo() and set_foo(). If the past behavior allowed access through a property, do not bind the new accessor functions to the property. Any code still attempting to access the variable by the old method should break visibly so they are made aware of the change in complexity.

From PEP8

For simple public data attributes, it is best to expose just the attribute name, without complicated accessor/mutator methods. Keep in mind that Python provides an easy path to future enhancement, should you find that a simple data attribute needs to grow functional behavior. In that case, use properties to hide functional implementation behind simple data attribute access syntax.

Note 1: Properties only work on new-style classes.

Note 2: Try to keep the functional behavior side-effect free, although side-effects such as caching are generally fine.

Note 3: Avoid using properties for computationally expensive operations; the attribute notation makes the caller believe that access is (relatively) cheap.

Python isn't java/C#, and it has very strong ideas about how code should look and be written. If you are coding in python, it makes sense to make it look and feel like python. Other people will be able to understand your code more easily and you'll be able to understand other python code better as well.

Sign up to request clarification or add additional context in comments.

8 Comments

+1 because I do agree with the "We are all adults here" philosophy in Python. My main beef is with classes where you have to know to call a certain function to put the class in a valid state.
@DavidHall There was a really interested talk at last years pycon "Stop writing classes", here is the hacker news thread which has some good discussion of it. Both the thread and the video are worth a read. news.ycombinator.com/item?id=3717715
Cheers - I'll have a look. As you can probably tell I'm a c++/c# developer who knows some python so discussions like this are great to read.
Reading that thread reminded me of something that crossed my mind when writing the original answer that got removed while editing - while I personally like classes and such due to my history, EVERY time I've written a class called Configuration in a python project I've refactored it away within two days.
I know how you feel, I work primarily in C#, coming from an embedded C background. For all my freelance/fun projects I've moved to python now though, and it took a while to start trying to think pythonically.
|
6

I would favour setting all the instance variables in the constructor over having functions like get_info() that are required to put the class in a valid state.

With public instance variables that are only instantiated by calls to methods such as your get_info() you create a class that is a bit of a minefield to use.

If you are worried about have certain configuration values which are not always needed and are expensive to calculate (which I guess is why you have get_info(), allowing for deferred execution), then I'd either consider refactoring that subset of config into a second class or introducting properties or functions that return values.

With properties or get style functions you encourage consumers of the class to go through a defined interface and improve the encapsulation 1.

Once you have that encapsulation of the instance variables you give yourself the option to do something more than simply throw a NameError exception - you can perhaps call get_info() yourself, or throw a custom exception.


1.You can't provide 100% encapsulation with Python since private instance variables denoted by a leading double underscore are only private by convention

2 Comments

I don't see the benefit to properties over public instance variables. You can simply make everything a public instance variable, and if you later need it to perform something other than just setting/getting a value you can always convert it to a property then.
Good point - my main problem with public instance variables here is instantiating them in functions like get_info makes for unnecessarily hard to use classes. But you and Andrew Barret have moved me to edit my answer a touch, emphasising constructors.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.