I have a large dataset from a CSV file. It has two columns: the first is Date/Time in hh:mm:ss:ms form and the second is Pressure, which should be a number. The Pressure column has random values throughout that are not numeric (things like 150+AA42BB43). These appear at random positions throughout the 50,000 rows in the file and are not all the same.
I need a way to change these Pressure values to numeric so I can perform data manipulation on them.
df_cleaned = df['Pressure'].loc[~df['Pressure'].map(lambda x: isinstance(x, float) | isinstance(x, int))]
I tried this, but it got rid of my Date/Time values and my headers, and it also did not clean all the Pressure values.
I was wondering if anyone had any suggestions on how I can easily clean the data in the 2nd column while also keeping my Date/Time values in the first column accurate.
Use df_cleaned = df.loc[....] (or even df_cleaned = df[....]) instead of df['Pressure'].loc[...]

With df_cleaned = df['Pressure']... you get only one column (Pressure) and skip the other columns, which is why you don't have Date/Time. And because it is a single column, pandas gives it back as a Series instead of a DataFrame. This can remove your header(s), because a Series (a single column) doesn't need a header.

You can also shorten the type check to isinstance(x, (float, int))
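A minimal sketch of indexing the whole DataFrame with a row mask, so that Date/Time and the headers are kept. Note that it swaps in pd.to_numeric(..., errors="coerce") for the isinstance check: values read from a CSV column that contains text typically arrive as strings, so an isinstance test against float/int may not match them. The sample data and the column names below are assumptions standing in for the real file.

```python
import io
import pandas as pd

# Hypothetical sample standing in for the real CSV (column names assumed).
csv_text = """Date/Time,Pressure
10:00:00:001,101.5
10:00:00:002,150+AA42BB43
10:00:00:003,99
10:00:00:004,ZZ17
"""

df = pd.read_csv(io.StringIO(csv_text))

# Convert Pressure to numbers; anything that cannot be parsed becomes NaN.
pressure = pd.to_numeric(df["Pressure"], errors="coerce")

# Boolean mask: True for rows whose Pressure parsed as a number.
is_numeric = pressure.notna()

# Index the whole DataFrame (not just df['Pressure']) so every column is kept.
df_cleaned = df.loc[is_numeric].copy()

# Store the converted numeric values instead of the original strings.
df_cleaned["Pressure"] = pressure[is_numeric]

print(df_cleaned)
```

If you would rather keep all 50,000 rows and only mark the bad readings, assigning df["Pressure"] = pressure leaves NaN in place of the non-numeric values instead of dropping those rows.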