
I am creating a Python package that needs certain data files in order to work. I've been looking for a way to include these data files with the package installation, and found one using importlib.resources.files(). However, I receive an error when I try to decode the objects it returns.

I've created a barebones example package. The package tree is as follows.

.
├── package
│   ├── __init__.py
│   ├── one.ppn
│   └── two.rhn
├── pyproject.toml
└── setup.py

1 directory, 5 files

The entire point of this example package is to be able to access one.ppn and two.rhn. This is done by identifying their absolute file paths and then saving them as constants to be imported. The code is located in __init__.py.

# package.__init__.py

from importlib.resources import files


PACKAGE_DATA = files('package')

KEYWORD_PATH = PACKAGE_DATA.joinpath('one.ppn')

print(PACKAGE_DATA)
print(KEYWORD_PATH)

CONTEXT_PATH = PACKAGE_DATA.joinpath('two.rhn').read_text()

I have created an editable install (pip3 install -e ../Package) in a separate directory. If I then import package, I receive the following output.

/home/millertime/Desktop/Package/package
/home/millertime/Desktop/Package/package/one.ppn
Traceback (most recent call last):
  File "/home/millertime/Desktop/Test/test.py", line 1, in <module>
    import package
  File "/home/millertime/Desktop/Package/package/__init__.py", line 11, in <module>
    CONTEXT_PATH = PACKAGE_DATA.joinpath('two.rhn').read_text()
  File "/usr/lib/python3.9/pathlib.py", line 1256, in read_text
    return f.read()
  File "/usr/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe0 in position 0: invalid continuation byte

You can see that importlib is functioning perfectly at first, and has correctly identified the absolute file paths to my data files. However, when I try to decode to a str that I can actually use, I receive a UnicodeDecodeError.
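For illustration, the difference between a resource's path and its contents can be reproduced with a throwaway package. This is only a sketch: the package name demo_pkg and the file bytes are made up for the example, and it assumes the package lives on a regular filesystem (not inside a zip), so that str() of the resource is a usable path.

```python
import importlib
import sys
import tempfile
from importlib.resources import files
from pathlib import Path

# Build a throwaway package containing one binary data file.
# "demo_pkg" and the byte values are hypothetical, chosen for the demo.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "one.ppn").write_bytes(b"\xe0\x01\x02 binary payload")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

resource = files("demo_pkg").joinpath("one.ppn")

# The path is available without opening the file at all.
path_as_str = str(resource)

# read_text() opens the file and decodes its *contents*,
# which fails when the file holds binary data.
try:
    resource.read_text(encoding="utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# read_bytes() returns the raw contents with no decoding step.
raw = resource.read_bytes()
```

Here path_as_str is the string form of the file path, while read_text() and read_bytes() both operate on what is stored in the file.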

I'm not sure if my pyproject.toml file is relevant, so I'm going to include it here. The only part I could see contributing to the problem is [tool.setuptools.package-data].

# pyproject.toml

[build-system]
requires = ["setuptools>=61.0.0", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "package"
version = "1.0.0"

[tool.setuptools]
packages = [
    "package"
]

[tool.setuptools.package-data]
package = [
    "one.ppn",
    "two.rhn"
]

I researched other instances of this error and tried a couple of things to solve it.

  1. I attempted to create my own decoding method using a with statement and the read_bytes() method of the object, but received the same error.
  2. I saw that many of the errors were related to the encoder, and thought that maybe I was using the wrong one (utf-8). I installed chardet to tell me what kind I should use, and received another error relating to being unable to decode due to an "invalid continuation byte".
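As a rough way to check whether a data file is text at all, the first few bytes can be inspected directly. The sketch below uses a temporary stand-in file with made-up contents, and looks_like_utf8 is a hypothetical helper, not part of any library. Note that read_bytes() itself never decodes, so a UnicodeDecodeError in the first attempt presumably came from a later decode step.

```python
import tempfile
from pathlib import Path


def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Stand-in for a binary data file; the contents are invented for the demo.
sample = Path(tempfile.mkdtemp()) / "two.rhn"
sample.write_bytes(b"\xe0\x01\x02\x03 model data")

# Show the leading bytes in hex rather than trying to decode them.
head = sample.read_bytes()[:4]
print(head.hex(" "))
print(looks_like_utf8(sample.read_bytes()))
```

If the leading bytes are not valid in any text encoding, no choice of codec will turn the file into a meaningful str.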

It seems to me that this is an internal problem with importlib. I don't see how it could be related to my data file types, given it's just a string representing a file path, not the actual data of the file.

I am currently using Python 3.9.2 on a Raspberry Pi 4. Thanks in advance.

  • byte 0xe0 in position 0 -> the very first byte of the file is problematic. Does the file actually contain text data? Can you show us the first few bytes of the file, and the text that you expect them to represent? Commented Dec 3, 2023 at 17:00
  • @snakecharmerb The bytes do not represent the file data, they represent the file path. Commented Dec 3, 2023 at 17:08
  • Are you sure? It looks to me as if the error is in read_text(), so it is trying to decode the contents of the file that the path points to. (But I have covid at the moment, perhaps it has addled my brain) Commented Dec 3, 2023 at 17:12
  • @snakecharmerb You're right. I'm an idiot. Thanks for solving my problem, will use the proper method this time. Go ahead and post as an answer if you would like. Commented Dec 3, 2023 at 17:14
  • Solved it. Thanks for your help, good luck with COVID. Commented Dec 3, 2023 at 17:28
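The comments do not show the final code, so the following is only a sketch of one way to obtain a file path instead of decoded contents. It uses importlib.resources.as_file(), which yields a real pathlib.Path for the duration of the with block. The package name demo_pkg_paths and the file bytes are hypothetical stand-ins for the example package above.

```python
import importlib
import sys
import tempfile
from importlib.resources import as_file, files
from pathlib import Path

# Throwaway package standing in for the real one (names are hypothetical).
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg_paths"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "two.rhn").write_bytes(b"\xe0\x01\x02 binary payload")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# as_file() gives a concrete filesystem path; the file is never decoded.
with as_file(files("demo_pkg_paths").joinpath("two.rhn")) as context_path:
    context_path_str = str(context_path)
    exists = context_path.is_file()
```

For a package installed on a regular filesystem, as_file() simply hands back the existing path. For a package imported from a zip file it extracts a temporary copy, which is removed when the with block exits, so the path should be used inside the block.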
