6

I'm working with an xlsx-file which looks like this:

enter image description here

My previous task was to modify the columns named 'Entry 1' and 'Entry 2'. I stored those columns in a separate slice of the original dataframe for a better overview. Here is a quick look at that slice:

>>> slice = df.loc[:, 'Entry 1':'Entry 2']
# code to modify the values
>>> slice

    Entry 1     Entry 2
1   Modified 1  Value 1
2   Modified 2  Value 2
3   Modified 3  Value 3 

I now want to overwrite those columns in the original dataframe with the named slice. I already achieved this by using the following:

df.loc[:, 'Entry 1':'Entry 2'] = slice

Question

As you can see, the column headers have a special format. How do I overwrite the values in 'Entry 1' and 'Entry 2', excluding the header, so that the format is kept?

2
  • Are you using something like xlwings? Pandas doesn't store that kind of formatting information about the data. Your best bet is to write back into the original file, starting from row 2. Commented Jan 2, 2019 at 11:53
  • I thought about this approach as well. How do I achieve this? Commented Jan 2, 2019 at 12:10

5 Answers

9

Full disclosure: I'm the author of the suggested library

Unfortunately there is no out-of-the-box way in pandas to achieve that as it does not load the styling data. You can use StyleFrame (that wraps pandas and openpyxl, which I assume you already have installed) that can read xlsx files while keeping (most) of the styling elements.

Using it in this case may look like the following:

from styleframe import StyleFrame

sf = StyleFrame.read_excel('test.xlsx', read_style=True)
# currently you have to specify each value manually,
# using slices will revert to the default style used by StyleFrame
sf.loc[0, 'Entry 1'].value = 'Modified 1'
sf.loc[1, 'Entry 1'].value = 'Modified 2'
sf.loc[2, 'Entry 1'].value = 'Modified 3'
sf.to_excel('test.xlsx').save()

Another alternative using a loop:

sf = StyleFrame.read_excel('test.xlsx', read_style=True)
new_values = ['Modified 1', 'Modified 2', 'Modified 3']
for cell, new_value in zip(sf['Entry 1'], new_values):
    cell.value = new_value
sf.to_excel('test.xlsx').save()

Content of test.xlsx before execution:

enter image description here

and after:

enter image description here

1 Comment

Great library, thanks for it. I was looking for a way to preserve all header and other cell styles, but couldn't find an example of the default style. I would appreciate help finding where the documentation covers this.
4

I know this is more than you need, but in case others are looking for an answer about keeping formatting: as of pandas 1.4, pd.ExcelWriter accepts if_sheet_exists='overlay'.

Original Spreadsheet:

import pandas as pd

df = pd.DataFrame({'Entry1': ['Modified 1', 'Modified 2 ', 'Modified 3'],
                   'Entry2': ['Value 1', 'Value 2','Value 2']})

with pd.ExcelWriter('Original_File.xlsx', engine='openpyxl',
                    mode='a', if_sheet_exists='overlay') as writer:
    
    df.to_excel(writer, sheet_name='SheetName', startrow=1,
                startcol=2, header=False, index=False)

After Overlay

And one can see that this also works if there is formatting in the cell.

Lots of Formatting

Keeps lots of formatting

3

Final answer

To give props to a far more extensive solution that will suit many people passing by, check this.

But for me, this easy way was enough. All you need to do is write back to the original file, starting at row 1 (pandas counts the first row as row 0, so this skips the header row) and leaving out the header and the index. In my case, this is achieved with the following:

# It is also possible to write the dataframe without the header and index.
df4.to_excel(writer, sheet_name='Sheet1',
             startrow=1, startcol=2, header=False, index=False)
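
The zero-based offsets are easy to get wrong, so here is a minimal sketch of how startrow and startcol translate into the cell that Excel shows. The helper name to_a1 is hypothetical (it is not part of pandas or openpyxl) and only illustrates the arithmetic:

```python
def to_a1(startrow: int, startcol: int) -> str:
    """Convert zero-based (row, column) offsets to an Excel A1 reference."""
    if startrow < 0 or startcol < 0:
        raise ValueError("offsets must be non-negative")
    letters = ""
    n = startcol + 1  # Excel columns are one-based: 1 -> A, 27 -> AA
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return f"{letters}{startrow + 1}"


# startrow=1, startcol=2 (the values used in the call above) place the first
# written value in C2, so the formatted header in row 1 is not overwritten.
print(to_a1(1, 2))
```

If your columns sit elsewhere in the sheet, adjust startcol until the reference matches the first data cell under the header.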

1

So, I want to answer this with my workaround pre-pandas 1.4 because I found this page when trying to solve this problem. I'm working in Pandas 1.3.4.

This is not the most elegant or fast solution, but it got the job done for me.

import openpyxl
import pandas as pd

with open(filePath,'rb') as fid:
    DataFrame = pd.read_excel(fid,"sheetName")
dataWorkbook = openpyxl.load_workbook(filePath)
dataSheet = dataWorkbook["sheetName"]

# --> Logic for editing data here

# Iterate over the dataframe and write each value into the matching openpyxl cell
for col, header in enumerate(DataFrame):
    for row in range(len(DataFrame)):
        # row + 2: skip the header row (openpyxl rows start at 1)
        # col + 1: openpyxl columns start at 1
        cellRef = dataSheet.cell(row=row + 2, column=col + 1)
        cellRef.value = DataFrame.loc[row, header]
dataWorkbook.save(filePath)
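
As far as I can tell, the reason this keeps the formatting is that assigning to a cell's value in openpyxl leaves that cell's style alone. Below is a small self-contained sketch of that idea, using an in-memory workbook and made-up data (the sheet contents, colors, and new_rows are assumptions for illustration, not the file from the question):

```python
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill

# Build a small in-memory sheet standing in for the formatted file.
workbook = Workbook()
sheet = workbook.active
yellow = PatternFill(fill_type="solid", start_color="FFFF00")
for col, title in enumerate(["Entry 1", "Entry 2"], start=1):
    header_cell = sheet.cell(row=1, column=col, value=title)
    header_cell.font = Font(bold=True)
    header_cell.fill = yellow

# Give one data cell its own formatting before it is overwritten.
sheet.cell(row=2, column=1, value="Old 1").fill = yellow

# Hypothetical replacement data; data row i goes to sheet row i + 2
# (one row for the header, one because openpyxl rows are 1-based).
new_rows = [("Modified 1", "Value 1"), ("Modified 2", "Value 2")]
for i, row_values in enumerate(new_rows):
    for j, value in enumerate(row_values):
        sheet.cell(row=i + 2, column=j + 1).value = value

# The header was never written to, and the data cell kept its fill
# because only its value was reassigned.
header = sheet.cell(row=1, column=1)
first_data_cell = sheet.cell(row=2, column=1)
print(header.value, header.font.bold, header.fill.fill_type)
print(first_data_cell.value, first_data_cell.fill.fill_type)
```

With a real file you would use openpyxl.load_workbook and save afterwards, as in the code above.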

Disclaimer: I began learning Python in Late August of this year.

0

You can do this using df.to_clipboard(index=False)

from win32com.client import Dispatch
import pandas as pd

xlApp = Dispatch("Excel.Application")
xlApp.Visible = 1
xlApp.Workbooks.Open(r'c:\Chadee\test.xlsx')
xlApp.ActiveSheet.Cells(1, 1).Select()

d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)
df.to_clipboard(index=False)

xlApp.ActiveWorkbook.ActiveSheet.PasteSpecial()

Output:

Note that the cell colors are still the same

Hope that helps! :-)
