How to edit Excel file using DataFrame and save it back as Excel file?

Question

I have this Excel file. I also put a screenshot of the file below.

I want to edit the data in the pitch-class column according to these three criteria:

removing ' ' mark between the text.
removing 0 values.
removing [] mark.

So, for example, from this text:

['0', 'E3', 'F3', 'F#3 / Gb3', 'G3', 'G#3 / Ab3', 'A3', 'A#3 / Bb3', 'B3', 'C4', 'C#4 / Db4', 'D4']

I want to make it look like this:

[E3, F3, F#3 / Gb3, G3, G#3 / Ab3, A3, A#3 / Bb3, B3, C4, C#4 / Db4, D4]

Of course, I could do this manually one by one, but I have about 20 similar files to edit, so I think I need help from Python.

My idea is to load the Excel file into a DataFrame, edit the data row by row (maybe using the .remove() and .join() methods), and write the result back to the original Excel file, or maybe generate a new one containing the edited pitch-class column.

But I have no idea how to code it. So far, this is what I've tried:

read the Excel file into Python.
read the pitch-class column in that Excel file.
load it into a DataFrame. Below is my current code.

import pandas as pd 

file = '014_twinkle_twinkle 300 0.0001 dataframe.xlsx' # file attached above

df = pd.read_excel(file, index_col=None, usecols="C") # read only pitch-class column

# printing data
for row in df.iterrows():
    print(df['pitch-class'].astype(str))

My question is: how can I edit the pitch-class data per row and write the result back to the original or a new Excel file? I have difficulty accessing the df['pitch-class'] data because I can't get the string value. Is there any way to achieve this in Python?

Tom McLean · Accepted Answer · 2021-04-15 09:56:19Z


In general, you do not want to iterate over every row in a pandas DataFrame; it is very slow. There are many ways (which you can learn with practice over time) to apply functions over a column, a row, or the whole DataFrame in pandas. In this example:

Convert the column to string type, remove the leading '0' entry, and replace each ' character with an empty string:

import pandas as pd

df = pd.read_excel("014_twinkle_twinkle 300 0.0001 dataframe.xlsx")
# Drop the leading '0' entry, then strip the remaining quote marks
df["pitch-class"] = df["pitch-class"].astype(str).str.replace("'0', ", "", regex=False)
df["pitch-class"] = df["pitch-class"].str.replace("'", "", regex=False)
df.to_excel("results.xlsx")
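
Since the question mentions about 20 similar files, here is a rough sketch of how the same two replacements could be wrapped in a function and applied to a whole folder. The names clean_pitch_class and clean_folder, and the idea of writing the cleaned copies to a separate output folder, are my own assumptions for illustration; it also assumes every workbook has a pitch-class column. The last lines check the cleaning step on an in-memory column, so you can try it without any Excel file.

```python
from pathlib import Path

import pandas as pd


def clean_pitch_class(column: pd.Series) -> pd.Series:
    """Drop the leading '0' entry and strip the quote marks from each cell."""
    return (
        column.astype(str)
        .str.replace("'0', ", "", regex=False)
        .str.replace("'", "", regex=False)
    )


def clean_folder(src_dir: str, dst_dir: str) -> None:
    """Clean every .xlsx workbook in src_dir and write the copies to dst_dir.

    Hypothetical helper: assumes each workbook has a 'pitch-class' column.
    """
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(src_dir).glob("*.xlsx")):
        df = pd.read_excel(path)
        df["pitch-class"] = clean_pitch_class(df["pitch-class"])
        df.to_excel(out / path.name)


# Quick check on an in-memory column, so no Excel file is needed.
sample = pd.Series(["['0', 'E3', 'F3', 'F#3 / Gb3']"], name="pitch-class")
print(clean_pitch_class(sample).iloc[0])  # [E3, F3, F#3 / Gb3]
```

Writing to a separate folder keeps the original files untouched, which makes it easy to rerun if the result is not what you wanted.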

edited Apr 15, 2021 at 9:56

answered Apr 15, 2021 at 9:50


2 Comments

Dionisius Pratama Over a year ago

This worked. Thank you. But, the 0 still remains in the data. Can I do df['pitch-class'].remove(0) to omit it?

Tom McLean Over a year ago

@Dionisius Pratama I edited my answer to remove the 0 at the start of the data. If this has answered your question, remember to press the 'Answered' tick :)
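
On the .remove(0) idea from the comment: a pandas Series has no .remove() method, and the string replacement in the answer only catches a '0' that is followed by a comma and a space. If a '0' could show up anywhere in the list, one possible approach is to parse each cell back into a real Python list and filter it. This is only a sketch, assuming every cell holds a Python-style list of strings like the one in the question; drop_zeros is a made-up helper name.

```python
import ast


def drop_zeros(cell: str) -> str:
    """Parse a cell such as "['0', 'E3']" and rebuild it without the '0' items."""
    items = ast.literal_eval(cell)  # turns the text into an actual list of strings
    kept = [item for item in items if item != "0"]
    return "[" + ", ".join(kept) + "]"


print(drop_zeros("['E3', '0', 'F3', '0']"))  # [E3, F3]
```

It could then be applied to the column with df["pitch-class"] = df["pitch-class"].apply(drop_zeros). This is slower than the .str methods, but for a few spreadsheets that should not matter.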
