python library directory structure to separate code

Question

I want to have the following directory structure:

project_name (unfortunate, I did not think to name the git repo something different)
|  ... 
|
|--project_name
|  |-- __init__.py
|  |-- module1
|  |   |-- __init__.py
|  |   |-- pyfile1.py
|  |        #contains class1 which pulls from class2 and class3
|  |-- module2
|  |   |-- __init__.py
|  |   |-- pyfile2.py
|  |        #contains class2
|  |-- module3
|      |-- __init__.py
|      |-- pyfile3.py
|           #contains class3
|--tests
|  |--test1   
   ...

When I push the above to GitHub and pip install it from there, the sub modules are installed separately rather than together under project_name.

I want to access class1 succinctly, while class1 pulls from class2 and class3. Currently they import each other fine, I believe, since they pass the pytest tests in my tests directory. Namely:

# in class1
from project_name.module2.pyfile2 import class2
from project_name.module3.pyfile3 import class3

# in test1
from project_name.module1.pyfile1 import class1

I would like to be able to import class1 with from project_name.module1 import class1. Or would I have to use from project_name.module1.pyfile1 import class1 under this structure? That might be kind of ridiculous, I know. I have tried a few organization iterations, such as deleting the submodule structure and putting everything directly under project_name with a single __init__.py. But I would like to have some separation of concerns now, so I want to know how to do this properly.

I figure the separate installs are caused by the multiple __init__.py files together with my setup.py, which looks inside the project_name directory instead of installing the project_name directory itself. If project_name were installed, I could access the submodules the same way the tests do:


import setuptools
import re
import os.path
import sys

with open("README.md", "r") as fh:
    long_description = fh.read()

def read(filename):
    return open(os.path.join(os.path.dirname(__file__), filename)).read()

version = re.search(r'^__version__\s*=\s*[\'"]([^\'"]*)[\'"]', read('project_one/__init__.py'), re.MULTILINE).group(1)

install_requires=['pandas>=1.0.0']
include_packages=['project_one']
tests_require=['pytest>4.0', 'pytest-pylint']
setup_requires=['pytest-runner', 'pytest-pylint']

setuptools.setup(
    name="project_one"
    , version=version
    , license='LICENSE.txt'
    , author=author
    , author_email=email
    , description=description
    , long_description=long_description
    , long_description_content_type="text/markdown"
    , url=url
    , packages=setuptools.find_packages('project_name', exclude=['tests'])
    , classifiers=[
        'Development Status :: 3 - Alpha',
        'Intended Audience :: Developers',
        'Topic :: Software Development :: Build Tools',
        'License :: OSI Approved :: MIT License',
        'Programming Language :: Python :: 3.6',
        'Programming Language :: Python :: 3.7',
        'Programming Language :: Python :: 3.8',
    ]
    , python_requires='>3.0, !=3.1.*, !=3.2.*, !=3.3.*'
    , install_requires=install_requires
    , tests_require=tests_require
    , setup_requires=setup_requires
    #, extras_require={'pandas': ['pandas>=0.14.0']}
)

So how do I stop the subdirectories from being installed separately, have it install the project_name directory instead, and structure the directory and code properly to do so?

Also, my subdirectory __init__.py files are empty. The __init__.py in project_name contains the following, but it is useless if the project_name directory is not installed:

# -*- coding: utf-8 -*-
# PEP 8 Style Guide for Python Code.
# https://www.python.org/dev/peps/pep-0008/

import logging

# Major.Minor.Patch 
__version__ = '0.7.0'
__author__ = 'name'

from .module1.pyfile1 import class1
from .module2.pyfile2 import class2
from .module3.pyfile3 import class3

# Set default logging handler to avoid "No handler found" warnings.
try:  # Python 2.7+
    from logging import NullHandler
except ImportError:
    class NullHandler(logging.Handler):
        """For future error handling purposes"""
        def emit(self, record):
            pass

logging.getLogger(__name__).addHandler(NullHandler())

To clarify some naming: a Python file is a module and a folder with an __init__.py is a package.
– Klaus D., Commented Apr 1, 2021 at 4:21
Thank you for that correction, that is very helpful. It makes a lot more sense in these terms: the three packages are being installed separately in site-packages.
– eccadena, Commented Apr 1, 2021 at 14:29

Paul Whipp · Accepted Answer · 2021-04-01 04:39:58Z

The structure you want looks strange because the file name adds nothing but confusion with respect to the module name. It would look more natural (and work as you expect) with each module named after the class it contains, e.g.:

.
└── project_name
    ├── __init__.py
    ├── class1.py
    ├── class2.py
    └── class3.py

As your modules become more complex you might want to break them up into several files internally without changing any of the code that accesses them. To do that, you can use __init__.py to 'publish' content for convenience (and to keep your calling protocol the same), like this:

.
└── project_name
    ├── __init__.py
    ├── class1
    │   ├── __init__.py
    │   ├── class1.py
    │   └── utils.py
    ├── class2.py
    └── class3.py

In class1/__init__.py you can put from .class1 import <whatever> so the rest of your application does not need to change its imports.
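
As a rough sketch of that re-export, the script below builds a throwaway copy of the layout in a temporary directory and checks that the short import path and the long one reach the same object. The class name Class1 and its name() method are hypothetical stand-ins, not taken from the question.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Build a throwaway copy of the layout above in a temporary directory.
root = Path(tempfile.mkdtemp())
pkg = root / "project_name"
sub = pkg / "class1"
sub.mkdir(parents=True)

(pkg / "__init__.py").write_text("")

# The implementation lives in project_name/class1/class1.py.
# Class1 is a hypothetical stand-in for the asker's class1.
(sub / "class1.py").write_text(textwrap.dedent('''
    class Class1:
        def name(self):
            return "Class1"
'''))

# project_name/class1/__init__.py "publishes" the class one level up.
(sub / "__init__.py").write_text("from .class1 import Class1\n")

# Make the temporary directory importable for this demonstration.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

from project_name.class1 import Class1                      # short path
from project_name.class1.class1 import Class1 as SameClass  # long path

print(Class1 is SameClass)
```

Callers that import from project_name.class1 keep working whether class1 is a single file or a package, which is the point of re-exporting in __init__.py.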

With this approach you can start simple with one module (file) and expand it into a tree of packages as you separate concerns out during development.

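As for your setup.py: passing 'project_name' as the first argument makes find_packages search inside project_name, so it only finds the subpackages. Remove that first argument so the search starts from the directory containing setup.py.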
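
To illustrate the difference, here is a small sketch that recreates the question's layout with empty __init__.py files in a temporary directory and calls setuptools.find_packages both ways. It assumes setuptools is installed. The absolute paths are only there so the demo does not depend on the current directory; in a real setup.py you would simply call find_packages(exclude=['tests']).

```python
import tempfile
from pathlib import Path

from setuptools import find_packages

# Recreate the question's layout with empty __init__.py files.
root = Path(tempfile.mkdtemp())
for package in (
    "project_name",
    "project_name/module1",
    "project_name/module2",
    "project_name/module3",
    "tests",
):
    directory = root / package
    directory.mkdir(parents=True)
    (directory / "__init__.py").write_text("")

# Searching *inside* project_name: only the subpackages are found, as
# top-level names, so they end up installed separately.
inside = sorted(find_packages(where=str(root / "project_name"), exclude=["tests"]))

# Searching from the project root: project_name and its subpackages are
# found under one dotted namespace.
from_root = sorted(find_packages(where=str(root), exclude=["tests"]))

print(inside)     # ['module1', 'module2', 'module3']
print(from_root)  # ['project_name', 'project_name.module1', 'project_name.module2', 'project_name.module3']
```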

2 Comments

eccadena Over a year ago

Thanks for the feedback. I have followed the recommendations and I believe I understand this better. I removed "project_one" from find_packages, and now it breaks, saying "project_one/project_one" does not exist. Also, "modules" 1, 2, and 3 are still being installed separately, not under "project_one", so the imports still break.

eccadena Over a year ago

Ah... turns out my setup.cfg was taking over? I removed it, as it was one of the more recent additions to this project (first one tbh). Can you shed light on that? It just has package dir = project_one. OH, but inside project_one there is no project_one. Got it.
