Adding empty spaces in a dataframe in one column?

Question

I currently have a data frame that looks like this:

  Temp1       Temp2         Pattern         Errors
 307.858K    303.197K         F0's            0
 297.960K    282.329K         F1's            0
   277K       260K             CA             0
   262K       238K             C5             0
   228K       168K         DATA==ADDR         0
   192K       140K            PRBS            0
   197K       77K             F0's            0
  199.9K     77.3K            F1's            0
  199K       773K              CA             0
                               C5             0
                           DATA==ADDR         0
                              PRBS            0
                              F0's            0
                              F1's            0
                               CA             0
                               C5             0
                           DATA==ADDR         0
                              PRBS            0
                              F0's            0 
                              F1's            0
                               CA             0
                               C5             0
                           DATA==ADDR         0
                              PRBS            0
                               .              . 
                               .              .
                               .              .

Expected output table

  Temp1       Temp2         Pattern         Errors
                              F0's            0
                              F1's            0
                               CA             0
                               C5             0
                          DATA==ADDR          0
                              PRBS            0
 307.858K    303.197K         F0's            0
                              F1's            0
                               CA             0
                               C5             0
                           DATA==ADDR         0
                              PRBS            0
 297.960K    282.329K         F0's            0
                              F1's            0
                               CA             0
                               C5             0
                           DATA==ADDR         0
                              PRBS            0
   277K       260K            F0's            0 
                              F1's            0
                               CA             0
                               C5             0
                           DATA==ADDR         0
                              PRBS            0
   262K       238K             .              . 
                               .              .
                               .              .

I want to change it so that the temperature columns are spread out, with one pair of values per section, i.e. the first two temperature values correspond to the second set of patterns (F0's to PRBS), the next two temperature values correspond to the following set of 6 patterns, and so on. I thought the best way to do this would be to add 6 blank entries before each value, but I don't know whether that is the best approach, and if it is, I'm not sure how to go about it. Any help would be appreciated.

EDIT: This data frame is created by concatenating 3 different dataframes I created earlier by parsing through a log file.

results = pd.concat([tempFrame, patternFrame, errorsFrame], axis=1, sort=False)

The tempFrame contains the first 2 columns, the patternFrame contains the Pattern column and errorsFrame contains the Errors column.

tempFrame:

 tempFrame = tempFrame.assign(newIndex=tempFrame.groupby('Extra').cumcount())
 tempFrame = tempFrame.set_index(['newIndex', 'Extra']).unstack().swaplevel(0, axis=1).sort_index(axis=1, level=0)

It may be easier to fix this further upstream. How is the dataframe generated?
– Silenced Temporarily, Commented Jun 21, 2018 at 13:08
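
Following up on the comment about fixing this upstream, here is a minimal sketch. It assumes tempFrame has flat Temp1/Temp2 columns (not the MultiIndex columns produced by unstack) and that every block has exactly 6 patterns. The sample frames are made-up stand-ins for the parsed log data. The idea is to move the temperature rows to index 6, 12, ... and reindex them to the pattern index before calling pd.concat, so the three frames already line up.

```python
import pandas as pd

# Hypothetical stand-ins for the three frames parsed from the log file
patterns = ["F0's", "F1's", "CA", "C5", "DATA==ADDR", "PRBS"]
tempFrame = pd.DataFrame({
    "Temp1": ["307.858K", "297.960K"],
    "Temp2": ["303.197K", "282.329K"],
})
patternFrame = pd.DataFrame({"Pattern": patterns * 3})
errorsFrame = pd.DataFrame({"Errors": [0] * 18})

block = len(patterns)

# The first block stays blank, so use one temperature row per remaining block
n_used = max(len(patternFrame) // block - 1, 0)
shifted = tempFrame.iloc[:n_used].copy()

# Move temperature row k to the first row of block k + 1 (index 6, 12, ...)
shifted.index = range(block, block * (len(shifted) + 1), block)

# Align to the pattern index; rows without a temperature become empty strings
shifted = shifted.reindex(patternFrame.index).fillna("")

results = pd.concat([shifted, patternFrame, errorsFrame], axis=1, sort=False)
print(results.to_string(index=False))
```

Because all three frames share the same index before the concat, no rows are added or reordered by the join.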

Alok Nayak · Accepted Answer · 2018-06-21 13:31:59Z

You can try some variation of the code below to generate the expected output, given that df is your dataframe.

import numpy as np

# number of complete 6-row pattern blocks (// keeps the slice bound an integer in Python 3)
n_blocks = df.shape[0] // 6
# the first block stays blank, so only n_blocks - 1 leading temperatures are used;
# the new column is then exactly len(df) long when len(df) is a multiple of 6
temp1 = df['Temp1'].iloc[:n_blocks - 1]
# create a numpy array of 6 empty strings followed by (temp, '', '', '', '', '') for each temperature
# (dtype=object keeps the entries as ordinary Python strings)
df['Temp1'] = np.hstack([np.full(6, '', dtype=object)] + [np.append(tmp, np.full(5, '', dtype=object)) for tmp in temp1])
temp2 = df['Temp2'].iloc[:n_blocks - 1]
df['Temp2'] = np.hstack([np.full(6, '', dtype=object)] + [np.append(tmp, np.full(5, '', dtype=object)) for tmp in temp2])
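
A variation on the same idea, as a sketch only: allocate a blank column that is already len(df) long and write the leading values into every 6th position, so the result cannot end up longer or shorter than the index. The helper name spread_leading_values and the sample frame are hypothetical, and flat (non-MultiIndex) columns are assumed.

```python
import numpy as np
import pandas as pd

def spread_leading_values(column, block=6):
    """Return an array as long as `column`: a blank first block,
    then one leading value at the start of each following block."""
    n_blocks = len(column) // block
    n_used = max(n_blocks - 1, 0)
    out = np.full(len(column), "", dtype=object)
    # Positions block, 2*block, ... receive the first n_used values
    out[block : block * (n_used + 1) : block] = column.iloc[:n_used].to_numpy()
    return out

# Hypothetical sample shaped like the frame in the question
patterns = ["F0's", "F1's", "CA", "C5", "DATA==ADDR", "PRBS"]
df = pd.DataFrame({
    "Temp1": ["307.858K", "297.960K"] + [""] * 16,
    "Temp2": ["303.197K", "282.329K"] + [""] * 16,
    "Pattern": patterns * 3,
    "Errors": [0] * 18,
})

for name in ["Temp1", "Temp2"]:
    df[name] = spread_leading_values(df[name])

print(df.to_string(index=False))
```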

edited Jun 21, 2018 at 13:31

answered Jun 21, 2018 at 13:14

2 Comments

shady mccoy Over a year ago

When I tried this I got a ValueError: Length of values does not match length of index. Could it be because I did this (check latest edit) to create a multiindex dataframe originally?

Alok Nayak Over a year ago

Try cross-checking the dimensions of the generated arrays. Maybe try temp1 = df['Temp1'].iloc[:(df.shape[0]//6 - 1)]
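
The dimension check suggested here can be sketched as follows, on a made-up 12-row frame (two blocks of 6). The helper build_column is hypothetical and only mirrors the hstack construction from the answer. Taking one leading value per block gives an array 6 entries longer than the frame, while taking one fewer matches it.

```python
import numpy as np
import pandas as pd

block = 6
# Hypothetical 12-row frame: 2 blocks of 6 rows
df = pd.DataFrame({"Temp1": ["307.858K", "297.960K"] + [""] * 10})
n_blocks = len(df) // block

def build_column(values):
    """6 blanks, then (value, '', '', '', '', '') for each value."""
    parts = [np.full(block, "", dtype=object)]
    parts += [np.append(v, np.full(block - 1, "", dtype=object)) for v in values]
    return np.hstack(parts)

too_long = build_column(df["Temp1"].iloc[:n_blocks])      # one value per block
matching = build_column(df["Temp1"].iloc[:n_blocks - 1])  # first block left blank

# Compare the frame length with the two candidate lengths
print(len(df), len(too_long), len(matching))

# More than one level here would mean MultiIndex columns, in which case
# df['Temp1'] may select several columns rather than a single Series
print(df.columns.nlevels)
```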
