I want to convert an existing Python list into a pandas DataFrame. How can I specify the data type of each column and choose which column is the index?

Here is a sample of my code:

import pandas as pd

data = [[1444990457000286208, 0, 286],
       [1435233159000067840, 0, 68],
       [1431544002000055040, 1, 55]]
df = pd.DataFrame(data, columns=['time', 'value1', 'value2'])

In the example above, I need the following types for the existing columns:

  • time: datetime64[ns]
  • value1: bool
  • value2: int

Additionally, the time column should be used as the index.

By default all three columns are int64, and I can't find a way to specify per-column types when creating the DataFrame.

Thanks!

2 Answers

value2 is already of the correct dtype.

For time you can convert to datetimes with to_datetime and then set the index with set_index.

For value1 you can cast to bool with astype.

df['time'] = pd.to_datetime(df['time'])
df = df.set_index('time')
df['value1'] = df['value1'].astype(bool)
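
For reference, the same three steps can be written as one chained expression and checked against the dtypes requested in the question. This is only a sketch: it reuses the sample data from the question, and passing unit='ns' explicitly is an assumption that the integers are nanoseconds since the Unix epoch.

```python
import pandas as pd

data = [[1444990457000286208, 0, 286],
        [1435233159000067840, 0, 68],
        [1431544002000055040, 1, 55]]

df = (
    pd.DataFrame(data, columns=['time', 'value1', 'value2'])
    .assign(
        # Assumption: the integers are nanoseconds since the Unix epoch.
        time=lambda d: pd.to_datetime(d['time'], unit='ns'),
        # 0/1 integers become False/True.
        value1=lambda d: d['value1'].astype(bool),
    )
    .set_index('time')
)

print(df.dtypes)        # dtypes of value1 and value2
print(df.index.dtype)   # dtype of the time index
```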

1 Comment

Is this the most efficient way? As I understand it, the data is processed once to create the DataFrame and then processed again to change the data types.
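
One way to avoid the second pass, as a rough sketch rather than a benchmarked recommendation, is to split the rows into columns first and hand pandas arrays that already have the target dtypes. The use of NumPy here and the variable names times, value1 and value2 are illustrative assumptions, not something from the answer above.

```python
import numpy as np
import pandas as pd

data = [[1444990457000286208, 0, 286],
        [1435233159000067840, 0, 68],
        [1431544002000055040, 1, 55]]

# Transpose the row-oriented list into one tuple per column.
times, value1, value2 = zip(*data)

# Build the index with its final dtype before the DataFrame exists.
# Assumption: the integers are nanoseconds since the Unix epoch.
index = pd.DatetimeIndex(
    pd.to_datetime(np.array(times, dtype='int64'), unit='ns'),
    name='time',
)

# Each column is created directly with the dtype it should end up with.
df = pd.DataFrame(
    {
        'value1': np.array(value1, dtype=bool),
        'value2': np.array(value2, dtype='int64'),
    },
    index=index,
)

print(df.dtypes)
```

Whether this is actually faster than converting after construction depends on the size of the data, so it is worth timing both on real input.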

You can use the dtype keyword of the pd.DataFrame constructor (see the docs). Please see @alex's answer.

To use a specific column as the index, use the set_index method of the DataFrame.

1 Comment

The dtype kwarg sets a single dtype for the entire DataFrame, not one per column.
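
A small sketch of the difference, using only the two value columns from the question: the constructor's dtype argument takes one dtype and applies it to every column, whereas DataFrame.astype accepts a dict that maps column names to dtypes. The names df_all and df_cols are illustrative.

```python
import pandas as pd

data = [[0, 286], [0, 68], [1, 55]]
columns = ['value1', 'value2']

# dtype= in the constructor takes one dtype and applies it to every column.
df_all = pd.DataFrame(data, columns=columns, dtype='float64')
print(df_all.dtypes)

# astype with a dict sets a different dtype for each named column.
df_cols = pd.DataFrame(data, columns=columns).astype(
    {'value1': bool, 'value2': 'int64'}
)
print(df_cols.dtypes)
```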
