I ran into a strange problem. The following code snippet fails when I execute the "load_data.py" file directly, but when I run the same lines in an IDE console they run without problems. Hard-coding the file path works, and I have read solutions that suggest appending the data path to PYTHONPATH, but I'm not sure that is a good solution because I want to push this to a Docker container.
Can someone help me figure out what seems to be the problem here?

import pandas as pd
from os import path


def _preprocess_data(data_path):
    try:
        ## load data ##
        data = pd.read_json(data_path)
    except ValueError:
        print("File not found. Check the path variable and filename")
        exit()


if __name__ == '__main__':
    print('Preprocessing data...')

    #### Preparation ####
    file_path = path.abspath('load_data.py')  # full path of your script
    dir_path = path.dirname(file_path)  # full path of the directory of your script
    json_file_path = path.join(dir_path, 'data/clean_data.json')  # absolute JSON file path

    _preprocess_data(data_path=json_file_path)

print(file_path)
>>> /Users/YalDan/myproject/load_data.py
print(dir_path)
>>> /Users/YalDan/myproject
print(json_file_path)
>>> /Users/YalDan/myproject/data/clean_data.json

This is the output. The full error message is "ValueError: Expected object or value". As far as I can tell, this is the standard error message pandas gives when it can't find the file at the given path. Opening and parsing the JSON file in other ways yielded other errors that were also caused by the file not being found. Thanks for the suggestion with pathlib, I will see if that helps.
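
For reference, here is a minimal sketch of how I understand the pathlib suggestion. It relies on the fact that path.abspath('load_data.py') is resolved against the current working directory, so the result changes depending on where the script is started from, whereas a path built from the script's own location does not. The helper names resolve_next_to and require_file are hypothetical (they are not part of my project or of pandas), and I am assuming the data folder sits next to the script. I have not confirmed that the working directory is the actual cause in my case.

```python
from pathlib import Path


def resolve_next_to(anchor_file, relative_path):
    """Return an absolute path for relative_path, anchored at the
    directory that contains anchor_file.

    Hypothetical helper: the result does not depend on the current
    working directory, unlike os.path.abspath('load_data.py').
    """
    base_dir = Path(anchor_file).resolve().parent
    return base_dir / relative_path


def require_file(file_path):
    """Fail early with FileNotFoundError if file_path is not an existing file.

    Hypothetical helper: checking up front avoids having to guess the
    cause from a parser error raised later.
    """
    file_path = Path(file_path)
    if not file_path.is_file():
        raise FileNotFoundError(f"No such file: {file_path}")
    return file_path


# Intended use inside load_data.py (assumption: data/ is next to the script):
#   json_file_path = resolve_next_to(__file__, 'data/clean_data.json')
#   require_file(json_file_path)
#   data = pd.read_json(json_file_path)
```

If this is right, the same code should behave identically in the IDE console, from the command line, and inside the Docker container, as long as the data folder is copied alongside the script.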