I'm reading a .csv file in Python with the following command:

data = np.genfromtxt('home_data.csv', dtype=float, delimiter=',', names=True) 

The CSV has a zipcode column whose values are numerals stored as quoted strings, e.g. "85281". After loading, every value in this column is nan:

data['zipcode']
Output : array([ nan,  nan,  nan, ...,  nan,  nan,  nan])

How can I convert these string values to integers, so that I get an array of the actual values instead of nan?

2 Answers

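You have to help genfromtxt a little:

def unquote(b):  # the field is bytes or str, depending on the NumPy version and `encoding`
    return int((b.decode() if isinstance(b, bytes) else b).strip('"'))

data = np.genfromtxt('home_data.csv',
                     dtype=[int, float], delimiter=',', names=True,
                     converters={0: unquote})

Each field is read as raw text (bytes in older NumPy versions). float(b'1\n') returns 1.0, but float(b'"8210"') raises a ValueError because of the quotes, and genfromtxt silently replaces such values with nan. The converters option lets you define, for each field (here field 0), a function that does the proper conversion: it decodes the bytes to a string if necessary (decode), removes the surrounding double quotes (strip), and converts the result to an integer.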
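To see where the nan values come from, here is a minimal sketch that reproduces the problem without touching the real file. The two-row sample and the io.StringIO stand-in for home_data.csv are assumptions made for illustration only.

```python
import io
import numpy as np

# Python's float() rejects a value that still carries its double quotes.
try:
    float('"8210"')
    float_failed = False
except ValueError:
    float_failed = True

# Hypothetical two-row sample standing in for home_data.csv.
sample = 'zipcode,val\n"8210",1\n"8320",2\n'

# Same call as in the question: every quoted field fails the float
# conversion, and genfromtxt fills the slot with nan instead of raising.
data = np.genfromtxt(io.StringIO(sample), dtype=float, delimiter=',', names=True)

print(float_failed)                     # True
print(np.isnan(data['zipcode']).all())  # True
```

The unquoted val column still parses normally, which is why only zipcode is affected.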

If home_data.csv contains:

zipcode,val
"8210",1
"8320",2
"14",3

you will obtain:

data -> array([(8210, 1.0), (8320, 2.0), (14, 3.0)], dtype=[('zipcode', '<i4'), ('val', '<f8')])
data['zipcode'] -> array([8210, 8320,   14])
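For anyone who wants to try the converter approach without the real file, the following is a self-contained sketch. The in-memory CSV_TEXT and the helper name parse_zip are illustrative assumptions; the converter accepts both bytes and str because what genfromtxt passes to it depends on the NumPy version and the encoding argument.

```python
import io
import numpy as np

# Stand-in for home_data.csv, using the sample rows shown above.
CSV_TEXT = 'zipcode,val\n"8210",1\n"8320",2\n"14",3\n'

def parse_zip(field):
    """Strip surrounding double quotes from a field and return an int."""
    if isinstance(field, bytes):
        field = field.decode()
    return int(field.strip().strip('"'))

data = np.genfromtxt(
    io.StringIO(CSV_TEXT),
    dtype=[int, float],          # one type per column, names come from the header
    delimiter=',',
    names=True,
    converters={0: parse_zip},   # column 0 is the quoted zipcode column
)

zipcodes = data['zipcode']
print(zipcodes.tolist())  # [8210, 8320, 14]
```

Replacing io.StringIO(CSV_TEXT) with 'home_data.csv' should give the same behaviour on the real file, provided the zipcode column is the first one.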

Maybe not the most efficient solution, but you can read the column as strings, strip the quotes, and convert it to float afterwards:

data = np.genfromtxt('home_data.csv', dtype=None, delimiter=',', names=True, encoding=None)

# strip the surrounding double quotes, then cast
zipcode = np.char.strip(data['zipcode'], '"').astype(float)

By the way, is there a reason you want to store a zip code as a float?
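On the question of which type to use, here is a rough sketch of the read-as-string route that keeps both a string and an integer version of the column. The sample data, including the zip code "02134", is made up for illustration; it shows that a numeric type drops a leading zero while the string form keeps it.

```python
import io
import numpy as np

# Hypothetical sample; "02134" has a leading zero.
CSV_TEXT = 'zipcode,val\n"85281",1\n"02134",2\n'

raw = np.genfromtxt(
    io.StringIO(CSV_TEXT),
    dtype=None,       # let genfromtxt infer a type for each column
    delimiter=',',
    names=True,
    encoding=None,    # text columns come back as str rather than bytes
)

# The quoted column is inferred as text; remove the quotes element-wise.
zip_str = np.char.strip(raw['zipcode'], '"')
zip_int = zip_str.astype(int)

print(zip_str.tolist())  # ['85281', '02134']
print(zip_int.tolist())  # [85281, 2134]
```

If the zip codes are only used as labels, keeping zip_str may be the safer choice; zip_int is enough when leading zeros cannot occur.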
