
I have constructed a matrix with integer values for both the columns and the index. The matrix is actually hierarchical, with one block per month. My problem is that indexing and selecting data no longer works as before once I write the data to CSV and load it back as a pandas DataFrame.

Selecting data before writing and reading data to file:

matrix.ix[1][4][3]

would, for example, give 123.

In words: select the month January and get the (travel) flow from origin 4 to destination 3.

After writing the data to CSV and reading it back into pandas, the original indexing fails, but it works if I use a string for the column label:

matrix.ix[1]['4'][3]

... the column names have automatically been transformed from integers into strings. I would prefer the original indexing. Any suggestions?

My current quick fix for handling the data after loading from csv is:

# Writing df to file
mulitindex_df_Travel_monthly.to_csv(
    r'result/Final_monthly_FlightData_countrylevel_v4.csv')

# Loading df from csv
test_matrix = pd.read_csv(
    filepath_inputdata + '/Final_monthly_FlightData_countrylevel_v4.csv',
    index_col=[0, 1])

test_matrix.rename(columns=int, inplace=True)  # Thx, @ayhan
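The round trip above can be reproduced without the real file. The following is only a minimal sketch under stated assumptions: the toy values, the level names "month" and "destination", and the in-memory buffer are hypothetical stand-ins for the actual data. It also uses .loc rather than .ix, since .ix was removed in pandas 1.0.

```python
import io

import pandas as pd

# Hypothetical toy data: rows are (month, destination), columns are origins.
index = pd.MultiIndex.from_tuples(
    [(1, 3), (1, 4), (2, 3), (2, 4)], names=["month", "destination"]
)
df = pd.DataFrame(
    [[0, 123], [10, 0], [0, 456], [20, 0]], index=index, columns=[3, 4]
)

# Write to an in-memory CSV instead of a file on disk, then read it back.
buffer = io.StringIO()
df.to_csv(buffer)
buffer.seek(0)
loaded = pd.read_csv(buffer, index_col=[0, 1])

# The header row is parsed as text, so the column labels come back as strings.
labels_after_load = list(loaded.columns)

# Passing a callable to rename applies it to every column label.
restored = loaded.rename(columns=int)

# Month 1, destination 3, origin 4.
value = restored.loc[(1, 3), 4]
```

The index levels come back as integers because they are parsed as data, whereas the column labels come from the header row and are kept as strings.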

CSV FILE: https://www.dropbox.com/s/4u2opzh65zwcn81/travel_matrix_SO.csv?dl=0


  • Can you share a few lines of the CSV and how you read it? Commented May 15, 2016 at 21:23
  • I added the code I am using to save the data and load it back into pandas. I am only specifying index_col. There is at least one minor issue as well: once loaded, it adds an empty row named "Unnamed: 1". Commented May 15, 2016 at 22:20
  • Add the other arguments: header=None, skiprows=1 Commented May 16, 2016 at 3:01
  • @Parfait, did you test this on the dataset I provided, in your environment? It does not work for me. Commented May 16, 2016 at 12:21

2 Answers


You could also do

df.columns = df.columns.astype(int)

or

df.columns = df.columns.map(int)
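As a small illustration of both variants, here is a sketch on a hypothetical two-column frame whose labels are the strings "3" and "4", as they would be after reading from CSV. The frame and its values are made up for the example.

```python
import pandas as pd

# Hypothetical frame with string column labels, as produced by read_csv.
df = pd.DataFrame([[1, 2], [3, 4]], columns=["3", "4"])

# Variant 1: cast the whole column Index to integers.
via_astype = df.copy()
via_astype.columns = via_astype.columns.astype(int)

# Variant 2: apply int() to each label individually.
via_map = df.copy()
via_map.columns = via_map.columns.map(int)

# Integer labels now select columns again.
column_four = via_astype[4].tolist()
```

Both variants convert every label, so either one should fail with an exception if a label cannot be parsed as an integer (for example a stray "Unnamed: 1" column).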

Related: what is difference between .map(str) and .astype(str) in dataframe


I used something like this:

df = df.rename(columns={str(c): c for c in columns})

where df is a pandas DataFrame and columns is the list of column labels to change.
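A short sketch of how this behaves, assuming a hypothetical frame that mixes numeric-looking labels with a text label ("note"). Only the labels listed in columns are converted; anything else is left as it is.

```python
import pandas as pd

# Hypothetical frame: two numeric-looking labels and one genuine text label.
df = pd.DataFrame([[1, 2, "a"]], columns=["3", "4", "note"])

# Integer labels to restore (an assumption for this example).
columns = [3, 4]

# Map "3" -> 3 and "4" -> 4; labels missing from the dict are untouched.
df = df.rename(columns={str(c): c for c in columns})

labels = list(df.columns)
```

This is useful when some columns legitimately have string names, where converting every label with int would fail.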

3 Comments

If you know columns, then you can use pd.read_csv(..., names=columns).
@bers This code only changes a subset, not necessarily all columns
I suspect you are talking about your code, in which case you are correct. The OP's solution posted before your answer is test_matrix.rename(columns=int, inplace=True), so I suspect we are talking about renaming all columns.
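The names= suggestion from the comments could look roughly like the sketch below. It rests on assumptions that are not in the thread: the CSV text and the level names "month" and "destination" are hypothetical, and header=0 is added so that the existing header row is replaced rather than read as data.

```python
import io

import pandas as pd

# Hypothetical CSV content with a header row of numeric-looking labels.
csv_text = "month,destination,3,4\n1,3,0,123\n1,4,10,0\n"

# Full list of names, including the two index columns (assumed names).
names = ["month", "destination", 3, 4]

loaded = pd.read_csv(
    io.StringIO(csv_text),
    names=names,        # supply the labels directly, integers included
    header=0,           # discard the file's own header row
    index_col=["month", "destination"],
)

labels = list(loaded.columns)
value = loaded.loc[(1, 3), 4]
```

This only applies when all labels are known before the file is read; otherwise converting after loading is simpler.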
