I have this Excel file. I have also put a screenshot of the file below. [Excel screenshot]

I want to edit the data in the pitch-class column according to these 3 criteria:

  1. removing the ' ' marks around each item.
  2. removing 0 values.
  3. removing the [] marks.

So, for example, from this text:

['0', 'E3', 'F3', 'F#3 / Gb3', 'G3', 'G#3 / Ab3', 'A3', 'A#3 / Bb3', 'B3', 'C4', 'C#4 / Db4', 'D4']

I want to make it look like this:

[E3, F3, F#3 / Gb3, G3, G#3 / Ab3, A3, A#3 / Bb3, B3, C4, C#4 / Db4, D4]

Of course, I could do this manually one by one, but I have about 20 similar files to edit, so that isn't practical. I think I might need help from Python.

My idea is to load the Excel file into a DataFrame, edit the data row by row (maybe using the .remove() and .join() methods), and put the result back into the original Excel file, or maybe generate a new one containing the edited pitch-class column.

But I have no idea how to code it. So far, what I've tried to do is this:

  1. read the Excel files to Python.
  2. read pitch-class column in that Excel file.
  3. load it to a dataframe. Below is my current code.
import pandas as pd 

file = '014_twinkle_twinkle 300 0.0001 dataframe.xlsx' # file attached above

df = pd.read_excel(file, index_col=None, usecols="C") # read only pitch-class column

# printing data
for row in df.iterrows():
    print(df['pitch-class'].astype(str))

My question is: how can I edit the pitch-class data per row and put the result back into the original or a new Excel file? I'm having difficulty accessing the df['pitch-class'] data because I can't get the string value. Is there any way to achieve this in Python?

1 Answer

In general you do not want to iterate over every row in a pandas dataframe, because it is very slow. There are a lot of ways (that you can learn by practice over time) to apply functions over a column, a row, or the whole dataframe in pandas. In this example:

Convert the column to type string, remove the leading '0' entry, and then remove the ' characters by replacing them with an empty string:

import pandas as pd

df = pd.read_excel("014_twinkle_twinkle 300 0.0001 dataframe.xlsx")
# Drop the leading '0' entry, then strip the remaining single quotes
df["pitch-class"] = df["pitch-class"].astype(str).str.replace("'0', ", "", regex=False)
df["pitch-class"] = df["pitch-class"].str.replace("'", "", regex=False)
df.to_excel("results.xlsx")
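
If the '0' can also appear somewhere other than the start of the list, an alternative is to parse each cell as a Python list and filter it. The following is only a sketch under assumptions: `clean_pitch_class` is a hypothetical helper name (not part of pandas), and it assumes every cell holds the text of a list literal like the one in the question. Cells that do not parse are returned unchanged.

```python
import ast


def clean_pitch_class(cell):
    """Turn "['0', 'E3', 'F3']" into "[E3, F3]" (hypothetical helper)."""
    try:
        # Safely parse the cell text into a Python object
        items = ast.literal_eval(str(cell))
    except (ValueError, SyntaxError):
        return cell  # not a list literal: leave the cell as it is
    if not isinstance(items, list):
        return cell
    # Drop every '0' entry, wherever it sits in the list
    kept = [str(item) for item in items if str(item) != "0"]
    # Re-join without quotes, keeping the surrounding brackets
    return "[" + ", ".join(kept) + "]"


example = "['0', 'E3', 'F3', 'F#3 / Gb3', 'G3', 'G#3 / Ab3', 'A3', 'A#3 / Bb3', 'B3', 'C4', 'C#4 / Db4', 'D4']"
print(clean_pitch_class(example))
# prints: [E3, F3, F#3 / Gb3, G3, G#3 / Ab3, A3, A#3 / Bb3, B3, C4, C#4 / Db4, D4]
```

To use it on the dataframe, something like df["pitch-class"] = df["pitch-class"].map(clean_pitch_class) should apply it to every row. For the 20 files, the same read, clean, and write steps could be wrapped in a loop over the file names, writing each result to a new file so the originals stay untouched.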

2 Comments

This worked. Thank you. But, the 0 still remains in the data. Can I do df['pitch-class'].remove(0) to omit it?
@Dionisius Pratama I edited my answer to remove the 0 at the start of the data. If this has answered your question, remember to press the 'Answered' tick :)