Python CSV: Replace cell value (float) of row with new value (float)

Question

updated Sample CSV Data:

c1,c2,v1,v2,p1,p2,r1,a1,f1,f2,f3,Time_Stamp 

0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:00
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:01
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:02
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:03
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:04
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:05
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:06
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:07
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:08
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:09
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:10
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:11
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:12
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:13
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:14
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:15
415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:16 
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:17
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:18
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:19
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:20
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:21
415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:22
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:23
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:24
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:25
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:26
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:27
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:28
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:29
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:30
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:31
415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:32
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:33
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:34
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:35
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:36

Edited- Reading of CSV, Python code:

import numpy as np
from datetime import date,time,datetime
import pandas as pd

def readcsv(x): #def function to read csv files based on input below
    Data = pd.read_csv(x, parse_dates=['Time_Stamp'], infer_datetime_format=True)
    Data['Date'] = Data.Time_Stamp.dt.date #creating Date Column in the Data Frame ( does not affect the main .csv file)
    Data['Time'] = Data.Time_Stamp.dt.time #creating Time Column in the Data Frame ( also does not affect the main .csv file)
    #print (Data) #<-- prints the output
    #print (Data['Time_Stamp'][6000:7000]) <- print from row 6000 to 7000 of the data frame (has over 150'000 rows)
    Data['Time_Stamp'] = pd.to_datetime(Data['Time_Stamp']) # Time_Stamp Data Frame
    print(Data[1:6])
    return Data

Data = readcsv('data.csv')


#Data = csv file data 

def getMask(start,end,Data):
    mask = (Data['Time_Stamp'] > start) & (Data['Time_Stamp'] <= end)
    return mask;

start = '2017-06-13 16:00:00'
end = '2017-06-13 16:40:00'
timerange = Data.loc[getMask(start, end, Data)]

pspike = timerange.loc[timerange['c1'] <= 5.0]

pspike output:

the row : for e.g. -> After printing pspike , the printed row has time value of 16:38:15 and the next printed row has time value of 16:38:17, that means it skipped a row where the time value is 16:38:16

[ e.g. below ]

13/06/2017 16:38:12
13/06/2017 16:38:13
13/06/2017 16:38:14
13/06/2017 16:38:15
13/06/2017 16:38:17
13/06/2017 16:38:18

After running the code below, it prints out the row( only the Time_Stamp value ) that was skipped, which has time value of 16:38:16, 16:38:22 and 16:38:32 where the rows with that value were skipped based on the output of pspike

for i in range(timerange.shape[0] - 1):
row1 = timerange.iloc[i]
row2 = timerange.iloc[i+1]
if (row2[-1] - row1[-1]).seconds > 1:
    print (row1[-1] + pd.Timedelta('1s'))

Output:

2017-06-13 16:38:16
2017-06-13 16:38:22
2017-06-13 16:38:32

What I'm trying to get is to print the whole row where the Time_Stamp value is 2017-06-13 16:38:16, where its the only row with c1 value that is more than 5.0, in this case (based on the sample code), its 415.7

Instead of this:

13/06/2017 16:38:16

I want to print it like this:

415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:16
415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:22
415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:32

After printing that row out, I have to replace the value of c1 - 415.7 directly above, with 0.0 . How do I do this?

EDIT

What to replace:

The c1 of the missing row in pspike output

Can you simplify your problem statement? What are all the values you want to replace with 0? Basically, what is the condition for replacement. — cs95
– cs95, Commented Aug 22, 2017 at 2:39
What is the column "AC_Input_Current"? I cannot see that in the data — Gayatri
– Gayatri, Commented Aug 22, 2017 at 3:14
My bad, I've edited and changed it to what it suppose to be , which is c1 — Sancta Ignis
– Sancta Ignis, Commented Aug 22, 2017 at 3:27

Sancta Ignis · Accepted Answer · 2017-08-23 08:19:58Z

1

I'm a bit confused here because you could simply do:

pspike = timerange[timerange['c1'].gt(5.0)] #gr=greater than, lt=lower than

Which returns a dataframe with:

16  415.7   12.5    30.2    154.6   4675.2  1   -1  5199.4  0   50  0   2017-06-13 16:38:16

Set values of column "c1" to 0.0

pspike["c1"] = 0.0

Create a string from first row (index=0):

','.join(pspike.astype(str).values.tolist()[0])

Prints:

'0.0,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,2017-06-13 16:38:16'

Update

string = """c1,c2,v1,v2,p1,p2,r1,a1,f1,f2,f3,Time_Stamp
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:00
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:01
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:02
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:03
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:04
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:05
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:06
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:07
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:08
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:09
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:10
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:11
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:12
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:13
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:14
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:15
415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:16 
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:17
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:18
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:19
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:20
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:21
415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:22
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:23
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:24
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:25
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:26
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:27
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:28
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:29
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:30
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:31
415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:32
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:33
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:34
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:35
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:36"""

df = pd.read_csv(io.StringIO(string)) # reads data from string above
#Use : df = pd.read_csv('filename.csv')  for csv file (handling tons of data)
df["Time_Stamp"] = pd.to_datetime(df["Time_Stamp"]) # convert to Datetime
df_filter = df[df["c1"].le(0.5)] # new df with less or equal to 0.5
where = (df_filter[df_filter["Time_Stamp"].diff().dt.total_seconds() > 1] ["Time_Stamp"] - pd.Timedelta("1s")).astype(str).tolist() # Find where diff > 1 second
df_filter2 = df[df["Time_Stamp"].isin(where)] # Create new df with those
df_filter2["c1"] = 0.0 # Set c1 to 0.0

for index, row in df_filter2.iterrows():
    values = row.astype(str).tolist()
    print(','.join(values))

Return

0.0,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,2017-06-13 16:38:16
0.0,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,2017-06-13 16:38:22
0.0,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,2017-06-13 16:38:32

edited Aug 23, 2017 at 8:19

Sancta Ignis

831 silver badge10 bronze badges

answered Aug 22, 2017 at 4:11

Anton vBR

19k6 gold badges47 silver badges47 bronze badges

Sign up to request clarification or add additional context in comments.

12 Comments

Sancta Ignis Over a year ago

Based on the sample data I provided, getting only the row with value of c1 greater than 5.0 would be easier if to go with your method. But the actual data can have repeated rows with c1 values more than 5.0 I want to get the rows where c1 is lesser than 5.0 And if the printed row has skipped 1 second in the time value of the Time_Stamp, then that row that was skipped is the specific row that I want to edit the value of c1` to 0.0

Anton vBR Over a year ago

@SanctaIgnis But wouldn't it make sense to handle those instead?

Sancta Ignis Over a year ago

I tried the updated codes but it only printed the column names and not the values, any idea why?

Anton vBR Over a year ago

@SanctaIgnis Actually no idea why - tried it myself and it did work for me(but those are just the initial steps). I don't feel like the code is super safe though. I'd say you would have to provide a larger data sample and explain the expected output when there are multiple values.

Sancta Ignis Over a year ago

I edited and added little more data and the expected output

|

Collectives™ on Stack Overflow

Python CSV: Replace cell value (float) of row with new value (float)

1 Answer 1

12 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

12 Comments

Your Answer

Sign up or log in

Post as a guest

Related