I am working with data that I download from the web in CSV format. The original data looks like this:
Test Data
"Date","T1","T2","T3","T4","T5","T6","T7","T8"
"105/11/01","123,855","1,150,909","9.30","9.36","9.27","9.28","-0.06","60",
"105/11/02","114,385","1,062,118","9.26","9.42","9.23","9.31","+0.03","78",
"105/11/03","71,350","659,848","9.30","9.30","9.20","9.28","-0.03","42",
I use the following code to read it:
import pandas as pd
df = pd.read_csv("test.csv", skiprows=[0], usecols=[0,3,4,5])
I have also tried:
import pandas as pd
df = pd.read_csv("test.csv", skiprows=[0], usecols=[0,3,4,5], keep_date_col=True)
I always get the following result:
Date T3 T4 T5
105/11/01 9.30 9.36 9.27 NaN
105/11/02 9.26 9.42 9.23 NaN
105/11/03 9.30 9.30 9.20 NaN
This is what I want to get:
Date T3 T4 T5
105/11/01 9.30 9.36 9.27
105/11/02 9.26 9.42 9.23
105/11/03 9.30 9.30 9.20
As you can see, pandas does not treat the date string as part of the data. It uses it as the index and shifts the column labels one position to the left, which causes the last column to be NaN.
I have read the pandas documentation on read_csv() and found that it can parse dates with the parse_dates and keep_date_col parameters, but is there any way to stop it from treating the date column the way it does now?
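One thing that may explain the shift: each data row ends with a trailing comma, so it has one more field than the header line. The read_csv() documentation notes that index_col=False can be used to stop pandas from using the first column as the index for files with delimiters at the end of each line. Below is a minimal sketch under the assumption that the trailing comma is the cause here. The sample is inlined through io.StringIO so it runs on its own, and the variable name raw is only illustrative.

```python
import io

import pandas as pd

# Sample copied from the question; note the trailing comma on each data row.
raw = '''Test Data
"Date","T1","T2","T3","T4","T5","T6","T7","T8"
"105/11/01","123,855","1,150,909","9.30","9.36","9.27","9.28","-0.06","60",
"105/11/02","114,385","1,062,118","9.26","9.42","9.23","9.31","+0.03","78",
"105/11/03","71,350","659,848","9.30","9.30","9.20","9.28","-0.03","42",
'''

df = pd.read_csv(
    io.StringIO(raw),
    skiprows=[0],           # skip the "Test Data" title line
    usecols=[0, 3, 4, 5],   # Date, T3, T4, T5
    index_col=False,        # do not promote the first column to the index
)
print(df)
```

With a real file, io.StringIO(raw) would be replaced by "test.csv". The Date column should stay a plain string column, since parse_dates is not passed.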