I have a text file that is updated every day, and I want to extract values from its text and append them to a DataFrame. The structure of the file stays (mostly) the same and only the values change, so I've written some code to extract the value preceding each keyword in my list.
To make my life easier I've tried to build a for-loop to automate as much as possible, but frustratingly I'm stuck at appending the values I've extracted to my DataFrame. All the tutorials I've looked at deal with looping over ranges.

    import pandas as pd

    empty_df = pd.DataFrame(columns=["date", "builders", "miners", "roofers"])
    text = "On 10 May 2022, there were 400 builders living in Rome, there were also no miners and approximately 70 roofers"
    text = text.split()
    profession = ["builders", "miners", "roofers"]
    # Print the word that precedes each profession keyword
    for i in text:
        if i in profession:
            print(text[text.index(i) - 1] + " " + i)

This prints:

    400 builders
    no miners
    70 roofers

I've tried to append using:

    for i in text:
        if i in profession:
            empty_df.append(text[text.index(i) - 1] + " " + i)

But it doesn't work, and I'm really unsure how to append multiple calculated variables.
So what I want to know is:
- How can I append the resulting values to my empty DataFrame in the correct columns?
- How could I convert 'no' or 'none' into zero?
- How can I also incorporate the date each time I update this?
Thanks
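
One possible approach to the three points above, as a minimal sketch rather than a definitive answer: build one dict per daily text (keyed by column name) and create the DataFrame from a list of those dicts. The helper name `parse_report`, the `ZERO_WORDS` set, and the date pattern are assumptions made for illustration. The sketch assumes the date is always written like "10 May 2022" and that the word directly before each profession is either a whole number or one of the zero words. Note that `DataFrame.append` was removed in pandas 2.0, so `pd.concat` is used to add later days.

```python
import re

import pandas as pd

PROFESSIONS = ["builders", "miners", "roofers"]
ZERO_WORDS = {"no", "none"}  # assumption: words that should count as zero


def parse_report(text):
    """Return one row as a dict: the date plus a count per profession."""
    row = {}

    # Assumption: the date appears as "<day> <month name> <year>"
    match = re.search(r"\d{1,2} [A-Za-z]+ \d{4}", text)
    row["date"] = pd.to_datetime(match.group()) if match else pd.NaT

    # Strip trailing punctuation so "roofers." would still match
    tokens = [t.strip(".,") for t in text.split()]

    # Walk over (previous word, word) pairs and keep the profession counts
    for prev, word in zip(tokens, tokens[1:]):
        if word in PROFESSIONS:
            # int() raises ValueError if the preceding word is unexpected
            row[word] = 0 if prev.lower() in ZERO_WORDS else int(prev)
    return row


text = ("On 10 May 2022, there were 400 builders living in Rome, "
        "there were also no miners and approximately 70 roofers")

# One dict per day; the dict keys line up with the column names
df = pd.DataFrame([parse_report(text)],
                  columns=["date", "builders", "miners", "roofers"])
print(df)
```

When the next day's text arrives, the new row can be added with something like `df = pd.concat([df, pd.DataFrame([parse_report(new_text)])], ignore_index=True)`, where `new_text` is a hypothetical variable holding that day's string.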