I have a time series, where one of the columns contains mostly NaN. The dataframe is quite large, so handling all these NaNs is becoming a burden. If I simply drop all of them, plots are getting completely messed up with interpolation between remaining data points.
Here's a short example of what I have:
v x
0.0000 0.000000 NaN
0.0002 0.062791 NaN
0.0004 0.125333 NaN
0.0006 0.187381 95.0
0.0008 0.248690 NaN
0.0010 0.309017 NaN
0.0012 0.368125 NaN
0.0014 0.425779 88.0
0.0016 0.481754 85.0
0.0018 0.535827 91.0
0.0020 0.587785 NaN
0.0022 0.637424 NaN
0.0024 0.684547 NaN
0.0026 0.728969 99.0
...
and what I want to achieve:
v x
0.0004 0.125333 NaN
0.0006 0.187381 95.0
0.0012 0.368125 NaN
0.0014 0.425779 88.0
0.0016 0.481754 85.0
0.0018 0.535827 91.0
0.0024 0.684547 NaN
0.0026 0.728969 99.0
...
Iterating through rows is not an option, as it will be way to slow, but I can't figure out any reasonable approach. Any ideas?
A dataset example (rather short) to work with:
import pandas as pd
import numpy as np
f=50
Tmax = 1
fs= 5000
df = pd.DataFrame(index=np.arange(0, Tmax, 1/fs), data={'x':np.random.randint(0,100, size=int(fs*Tmax))})
df['v'] = np.sin(2*np.pi*f*df.index)
# Most of "x" is NaN
df.loc[df['x']<75, 'x'] = np.NaN