IPython Notebook and Pandas autocomplete

Question

I noticed that when I type df.column_name, I can autocomplete column_name with Tab in the IPython notebook.

Now, the usual syntax for working with a column is df['column_name'], where I am unable to autocomplete (I assume because it is a string?). Is there any other notation or way to simplify typing out column names? I am essentially looking for a solution that would let me tab-autocomplete the column name within df['column_name'].

As you've noticed, you get autocompletion if you use attribute access, df.column_name. I don't think any other way is really going to be possible. In the future, it might be possible if someone writes an IPython notebook plugin designed specifically for pandas.
– Marius, Commented Jan 31, 2014 at 0:42
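To make the comment concrete, here is a small sketch of attribute access next to bracket access. The data frame and its column names are made up for illustration. Attribute access only reaches columns whose names are valid Python identifiers and do not collide with an existing DataFrame attribute or method.

```python
import pandas as pd

# Hypothetical example data; the column names are chosen only to show the edge cases.
df = pd.DataFrame({"price": [1.0, 2.0], "count": [3, 4], "unit cost": [5, 6]})

# A plain identifier-style name works with both notations.
same_column = df.price.equals(df["price"])

# "count" is also a DataFrame method, so df.count is the method, not the column.
count_is_method = callable(df.count)

# A name containing a space cannot be written as an attribute; brackets are required.
unit_cost = df["unit cost"]
```

For columns like "count" or "unit cost", bracket notation remains the only reliable option.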

Maturin · Accepted Answer · 2014-08-21 17:32:30Z

I've found the following method to be useful to me. It basically creates a namedtuple containing the names of all the variables in the data frame as strings.

For example, consider the following data frame containing 2 variables called "variable_1" and "variable_2":

from collections import namedtuple
from pandas import DataFrame
import numpy as np

df = DataFrame({'variable_1':np.arange(5),'variable_2':np.arange(5)})

The following code creates a namedtuple called "var":

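def ntuples():
    # Column names of the global data frame df
    list_of_names = list(df.columns)
    # Map each field name to the identical string, e.g. 'variable_1' -> 'variable_1'
    list_of_names_dict = {x: x for x in list_of_names}

    # namedtuple fields must be valid Python identifiers, so this only works
    # for column names without spaces or other special characters
    Varnames = namedtuple('Varnames', list_of_names)
    return Varnames(**list_of_names_dict)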

var = ntuples()

In a notebook, when I write var. and press Tab, the names of all the variables in the dataframe df will be displayed. Writing var.variable_1 is equivalent to writing 'variable_1'. So the following would work: df[var.variable_1].

The reason I define a function is that you will often add new variables to a data frame. To pick up the new variables, simply call the function again and reassign the result, var = ntuples(), and you are good to go.
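As a variation on the same idea, the helper could take the data frame as an argument and skip column names that namedtuple would reject. This is only a sketch: the name column_names and the filtering rules are my own assumptions, not part of the answer above, and it assumes the usable column names are unique.

```python
from collections import namedtuple
from keyword import iskeyword

import pandas as pd


def column_names(frame):
    """Return a namedtuple mapping each usable column name to itself.

    Hypothetical helper: columns that are not valid namedtuple field names
    (non-strings, names with spaces, keywords, leading underscores) are skipped.
    """
    usable = [
        name for name in frame.columns
        if isinstance(name, str)
        and name.isidentifier()
        and not iskeyword(name)
        and not name.startswith("_")
    ]
    Varnames = namedtuple("Varnames", usable)
    return Varnames(*usable)


df = pd.DataFrame({"variable_1": range(5), "variable_2": range(5), "not valid": range(5)})
var = column_names(df)

# After adding a column, rebuild the namedtuple so the new name is included.
df["variable_3"] = df[var.variable_1] * 2
var = column_names(df)
```

Skipped columns such as "not valid" still have to be typed out as df['not valid'].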

o-90 · Accepted Answer · 2014-01-31 03:26:02Z

I'm not sure how your data is situated but when I am importing a csv/txt file, I specify the names of the columns in a list, such as...

names = ['col_1', 'col_2', 'col_3']

etc... and then import my file as such...

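import pandas as pd
# header=0 tells pandas the first line of the file is a header row to be replaced by names;
# use header=None instead if the file has no header row
data = pd.read_csv('./some_file.txt', header=0, delimiter='\t', names=names)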

You could then do tab completion like...

new_thing = data[names[1]]

where you would hit Tab as you start to type "names", and then all you have to do is specify which item of names you want. I'm not sure if this is any more efficient than simply typing out the word.
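A self-contained sketch of this approach, using an in-memory tab-separated file in place of './some_file.txt' so it can be run as is. The file contents are invented for illustration, and header=0 assumes the file has a header row that should be replaced by names.

```python
import io

import pandas as pd

names = ["col_1", "col_2", "col_3"]

# Stand-in for a tab-separated file on disk (made-up contents with a header row).
raw = io.StringIO("a\tb\tc\n1\t2\t3\n4\t5\t6\n")

# header=0 marks the first line as a header row; names then replaces those labels.
data = pd.read_csv(raw, sep="\t", header=0, names=names)

# Tab completion helps with typing "names"; the index still has to be remembered.
new_thing = data[names[1]]
```

Note that completion here only helps with the list variable, so you need to know which position each column occupies.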

