
I have a number of arrays with the following format:

ic2 2021-05-06 09:10:30.582455  3.556000789 0
ic2 2021-05-06 09:15:30.999870  3.556000789 0
ic2 2021-05-06 09:20:31.784262  3.556000789 0
ic2 2021-05-06 09:25:32.538884  3.556000789 0
ic2 2021-05-06 09:30:32.767391  3.556000789 0
ic2 2021-05-06 09:35:33.293743  3.556000789 0
ic2 2021-05-06 09:40:33.454079  3.556000789 0
ic2 2021-05-06 09:45:33.853965  3.556000789 0
ic2 2021-05-06 09:50:34.421639  3.556000789 0
ic2 2021-05-06 09:55:34.653818  3.556000789 0
ic2 2021-05-06 10:00:35.134874  3.556000789 0

How do I remove the rows where the datetime is later than 2021-05-06 09:20 and earlier than 2021-05-06 09:40?

I have tried with np.delete:

 raindata[10]=np.delete(raindata[10],[(raindata[10]>2021-11-07 16:42) & (raindata[10]< 2022-11-15 04:07)],axis=0)

and np.where:

raindata[10][:,1]=np.where(raindata[10][:,1]>= 2022-11-07 16:42 and <= 2022-11-15 04:07, Nan, raindata[10][:,1])

but always get the error:

SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers

Edit in response to comments:

For ease, I have created the datetime objects with

xx=datetime.datetime(2021,5,6, hour=9, minute=20)
yy=datetime.datetime(2021,5,6, hour=9, minute=40)

and incorporated this into Ben Grossman's solution:

newraindata=np.delete(raindata[10],xx < raindata[10] < yy, axis=0)

which gives TypeError: '>' not supported between instances of 'str' and 'datetime.datetime'. I understand this is because column 0 is a string, so I tried

newraindata=np.delete(raindata[10], (raindata[10][:,1] > xx) & (raindata[10][:,1] < yy), axis=0)

which just returns a copy of raindata[10] with no rows removed.

Ben also asked for code that replicates the array. I'm sorry, but the array is not generated by code; it is data imported from a PGSQL table, which is why I copied the first 11 rows as an example.

1
  • Python doesn't know how to interpret "2022-11-07 16:42", you need to make that date into a datetime object. You can do this using the strptime method for instance. Commented Feb 24, 2022 at 16:59
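To illustrate the comment above: a bare 2021-11-07 16:42 is not a date literal in Python. The parser reads 2021-11-07 as integer subtraction and rejects 07 because of its leading zero, which is where the SyntaxError comes from. A minimal sketch of building the bounds with strptime instead (the variable names and format strings are illustrative assumptions, not taken from the question):

```python
from datetime import datetime

# Format string for "YYYY-MM-DD HH:MM" (assumed; adjust to match your strings).
FMT = "%Y-%m-%d %H:%M"

start = datetime.strptime("2021-05-06 09:20", FMT)
end = datetime.strptime("2021-05-06 09:40", FMT)

# One timestamp in the same style as the sample rows, including microseconds.
stamp = datetime.strptime("2021-05-06 09:25:32.538884", "%Y-%m-%d %H:%M:%S.%f")

# A chained comparison is fine for single datetime objects (not for arrays).
in_window = start < stamp < end
print(in_window)
```

Once the bounds are real datetime objects, they can be compared against a column that also holds datetime objects.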

2 Answers

1

This has been solved using NumPy:

Set the date range:

from datetime import datetime

xx=datetime(2021,11,7, hour=16, minute=42)
yy=datetime(2021,11,15, hour=4, minute=8)

Create a mask:

mask2=(rainmm[:,1] > xx) & (rainmm[:,1] < yy)

Create the new array:

rainmm=rainmm[~mask2].copy()

job done :)
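For reference, here is a self-contained sketch of the same masking approach applied to the sample rows from the question. It assumes the array is a 2-D object array whose column 1 already holds datetime.datetime objects; the rows below are hand-built for illustration rather than imported from a database.

```python
from datetime import datetime
import numpy as np

# Hand-built stand-in for one imported array (assumption: column 1 holds
# datetime objects, column 0 strings, columns 2 and 3 numbers).
rows = [
    ["ic2", datetime(2021, 5, 6, 9, 10, 30, 582455), 3.556000789, 0],
    ["ic2", datetime(2021, 5, 6, 9, 15, 30, 999870), 3.556000789, 0],
    ["ic2", datetime(2021, 5, 6, 9, 20, 31, 784262), 3.556000789, 0],
    ["ic2", datetime(2021, 5, 6, 9, 25, 32, 538884), 3.556000789, 0],
    ["ic2", datetime(2021, 5, 6, 9, 30, 32, 767391), 3.556000789, 0],
    ["ic2", datetime(2021, 5, 6, 9, 35, 33, 293743), 3.556000789, 0],
    ["ic2", datetime(2021, 5, 6, 9, 40, 33, 454079), 3.556000789, 0],
]
rainmm = np.array(rows, dtype=object)

xx = datetime(2021, 5, 6, hour=9, minute=20)
yy = datetime(2021, 5, 6, hour=9, minute=40)

# Compare the datetime column only; & combines the two element-wise tests.
mask = (rainmm[:, 1] > xx) & (rainmm[:, 1] < yy)

# Keep every row that is NOT inside the window.
filtered = rainmm[~mask]
print(filtered[:, 1])
```

The comparison is applied to column 1 only. Comparing the whole array against a datetime also touches the string column, which is what raises the TypeError mentioned in the question.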


0

It turns out that this entire problem can be nicely solved using the pandas module.

import pandas as pd

s = '''ic2 2021-05-06 09:10:30.582455  3.556000789 0
ic2 2021-05-06 09:15:30.999870  3.556000789 0
ic2 2021-05-06 09:20:31.784262  3.556000789 0
ic2 2021-05-06 09:25:32.538884  3.556000789 0
ic2 2021-05-06 09:30:32.767391  3.556000789 0
ic2 2021-05-06 09:35:33.293743  3.556000789 0
ic2 2021-05-06 09:40:33.454079  3.556000789 0
ic2 2021-05-06 09:45:33.853965  3.556000789 0
ic2 2021-05-06 09:50:34.421639  3.556000789 0
ic2 2021-05-06 09:55:34.653818  3.556000789 0
ic2 2021-05-06 10:00:35.134874  3.556000789 0'''

data = [line.split() for line in s.splitlines()]
data = [[line[0],line[1]+' '+line[2],float(line[3]),float(line[4])] for line in data]

df = pd.DataFrame(data)
df[1] = pd.to_datetime(df[1])  # assign the column directly so it gets a datetime64 dtype
df_bool = (df[1] > pd.to_datetime('2021-05-06 09:20')) & (df[1] < pd.to_datetime('2021-05-06 09:40'))
print(df[~df_bool])

The result:

      0                          1         2    3
0   ic2 2021-05-06 09:10:30.582455  3.556001  0.0
1   ic2 2021-05-06 09:15:30.999870  3.556001  0.0
6   ic2 2021-05-06 09:40:33.454079  3.556001  0.0
7   ic2 2021-05-06 09:45:33.853965  3.556001  0.0
8   ic2 2021-05-06 09:50:34.421639  3.556001  0.0
9   ic2 2021-05-06 09:55:34.653818  3.556001  0.0
10  ic2 2021-05-06 10:00:35.134874  3.556001  0.0
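As a variation, the same filter can be written with Series.between. This is only a sketch: it uses a shortened copy of the sample data and made-up column names (station, timestamp, rain_mm). The inclusive="neither" argument, accepted as a string in pandas 1.3 and later, makes both bounds exclusive, matching the > and < comparisons above.

```python
import pandas as pd

# Shortened sample data; the column names are hypothetical.
df = pd.DataFrame(
    {
        "station": ["ic2"] * 4,
        "timestamp": pd.to_datetime(
            [
                "2021-05-06 09:15:30.999870",
                "2021-05-06 09:20:31.784262",
                "2021-05-06 09:35:33.293743",
                "2021-05-06 09:40:33.454079",
            ]
        ),
        "rain_mm": [3.556000789] * 4,
    }
)

start = pd.Timestamp("2021-05-06 09:20")
end = pd.Timestamp("2021-05-06 09:40")

# True for rows strictly between start and end.
in_window = df["timestamp"].between(start, end, inclusive="neither")

# Drop the rows inside the window.
kept = df[~in_window]
print(kept)
```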

8 Comments

Thanks @Ben Grossman. Unfortunately I can't mark your answer as correct because the code does not work. I understand the concept of what you are saying, but your implementation doesn't work.
I had a misplaced bracket; perhaps it works now. It would also be helpful if you could add code to your question that produces (perhaps a smaller version of) the array that you're working with.
Edited in response.
@matt If the code doesn't work, could you say what exactly goes wrong (or what error message you get)?
raindata[10]=np.delete(raindata[10], datetime.datetime(2022,11,07,hour=16,minute=42) < raindata[10] < datetime.datetime(2022,11,15,hour=4,minute=7),axis=0) gives SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers. Take out the '0' in line 2 and it gives TypeError: '>' not supported between instances of 'str' and 'datetime.datetime'.