When reading a CSV file (kidney_disease.csv from https://www.kaggle.com/mansoordaku/ckdisease/data), pandas assigns the columns pcv, wc and rc the dtype object, although they should be float. Specifying the dtypes explicitly leads to an error:
import numpy as np
import pandas as pd
data = pd.read_csv(file, usecols=["pcv", "wc", "rc"],
                   dtype={"pcv": np.float64, "wc": np.float64, "rc": np.float64})
ValueError: could not convert string to float: '\t?'
Can anyone explain why this happens? All values in these columns are either strings that correspond to numbers or NaN. And is there a way for pandas to "guess" the dtype based on the first 100 rows or something like that?
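For reference, the error message suggests that at least one cell contains the literal text '\t?' (a tab followed by a question mark), which cannot be parsed as a float. A minimal sketch of one possible workaround, assuming such stray entries should simply be treated as missing values: read the columns as text, strip the whitespace, and convert with pd.to_numeric. The sample data below is made up to imitate the problem and is not taken from the actual file.

```python
import io

import pandas as pd

# Made-up sample (not the real file): two cells hold "\t?" instead of a number,
# and one cell is empty.
sample = io.StringIO("pcv,wc,rc\n44,7800,5.2\n\t?,6000,\n38,\t?,4.6\n")

cols = ["pcv", "wc", "rc"]

# Read everything as text first so that no float conversion can fail while parsing.
raw = pd.read_csv(sample, usecols=cols, dtype=str)

# Strip stray whitespace, then convert; anything that is still non-numeric becomes NaN.
data = raw.apply(lambda s: pd.to_numeric(s.str.strip(), errors="coerce"))

print(data.dtypes)
```

With the real file, the StringIO object would be replaced by the file path. Note that errors="coerce" silently turns every unparseable entry into NaN, so it may be worth inspecting the raw text columns first to see which values are affected.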
Thanks a lot!