I have a CSV file that looks like this from a !cat:
,City,region,Res_Comm,mkt_type,Quradate,National_exp,Alabama_exp,Sales_exp,Inventory_exp,Price_exp,Credit_exp
0,Dothan,South_Central-Montgomery-Auburn-Wiregrass-Dothan,Residential,Rural,2010-01-15,2,2,3,2,3,3
1,Dothan,South_Central-Montgomery-Auburn-Wiregrass-Dothan,Residential,Suburban_Urban,2010-07-15,2,2,3,2,2,2
2,Dothan,South_Central-Montgomery-Auburn-Wiregrass-Dothan,Residential,Suburban_Urban,2011-01-15,2,2,2,2,2,2
When I read it in via read_csv I get a DataFrame where all of the ..._exp fields are single-digit numbers that I need to do basic math with. (It was working great when I was using read_table with another variant of the file.)
df = pd.io.parsers.read_csv('/home/tom/Dropbox/Projects/annonallanswerswithmaster1012013.csv',index_col=0,parse_dates=['Quradate'])
But when I go to do any math I get a type error indicating the column is a string, e.g.:
df['Credit_exp'] = df['Credit_exp']/2
TypeError: unsupported operand type(s) for /: 'str' and 'int'
I don't see how to convert it or get it as an int. I tried specifying field types in the read options, like dtype={'Credit_exp': np.int32, ...}, but it did not like that. I also tried a type conversion, df['Credit_exp'] = int(df['Credit_exp']), which just gave me:
TypeError: only length-1 arrays can be converted to Python scalars
So there is something obvious I'm missing...
df['Credit_exp'].apply(int) could do the trick. NB: your division will be euclidean if you use //. Also check the data itself: it looks like one of the Credit_exp values isn't a single digit but contains what looks like a corrupted end-of-line marker, which would explain why the whole column was read as strings.
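A minimal sketch of the cleanup, assuming the column was read as strings because one value carries a stray character such as a carriage return (the small inline frame below is a hypothetical stand-in for the real file):

```python
import pandas as pd

# Hypothetical data mimicking the symptom: the last value has a stray '\r',
# so pandas reads the whole column with dtype=object (strings).
df = pd.DataFrame({'Credit_exp': ['3', '2', '2\r']})

# Strip stray whitespace/CR characters, then convert to a numeric dtype.
# On older pandas, df['Credit_exp'].str.strip().astype(int) works the same way.
df['Credit_exp'] = pd.to_numeric(df['Credit_exp'].str.strip())

# True division now works and keeps fractional halves;
# use // instead if you actually want floor (euclidean) division.
df['Credit_exp'] = df['Credit_exp'] / 2
```

If you would rather fix it at read time, passing the cleaned file (or a converters={'Credit_exp': ...} function) to read_csv avoids the post-hoc conversion.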