I noticed that if I type df.column_name, I can autocomplete column_name with Tab in the IPython notebook.

However, the proper syntax for working with a column is df['column_name'], and there I am unable to autocomplete (I assume because it is a string). Is there any other notation or way to simplify typing out column names? I am essentially looking for a solution that would let me tab-autocomplete the column name inside df['column_name'].

    As you've noticed, you get autocompletion if you use attribute access, df.column_name. I don't think any other way is really going to be possible. In the future, it might be possible if someone writes an IPython notebook plugin designed specifically for pandas. Commented Jan 31, 2014 at 0:42

2 Answers

I've found the following method to be useful to me. It basically creates a namedtuple containing the names of all the variables in the data frame as strings.

For example, consider the following data frame containing 2 variables called "variable_1" and "variable_2":

from collections import namedtuple
from pandas import DataFrame
import numpy as np

df = DataFrame({'variable_1':np.arange(5),'variable_2':np.arange(5)})

The following code creates a namedtuple called "var":

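def ntuples():
    # Read the current column names from the global data frame df
    list_of_names = list(df.columns.values)
    # Map every column name to itself, so var.variable_1 == 'variable_1'
    list_of_names_dict = {x: x for x in list_of_names}

    # One namedtuple field per column; names must be valid Python identifiers
    Varnames = namedtuple('Varnames', list_of_names)
    return Varnames(**list_of_names_dict)

var = ntuples()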

In a notebook, when I write var. and press Tab, the names of all the variables in the dataframe df will be displayed. Writing var.variable_1 is equivalent to writing 'variable_1'. So the following would work: df[var.variable_1].

The reason I define a function for this is that you will often add new variables to a data frame. To pick up the new variables in the namedtuple "var", simply call the function again and reassign the result, var = ntuples(), and you are good to go.
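One limitation of the approach above is that namedtuple rejects field names that are not valid Python identifiers, such as column names with spaces or leading digits. A minimal sketch of a more defensive variant follows. The helper name column_namespace and the sample column names are hypothetical, and skipping the unusable names (rather than renaming them) is just one possible choice. It takes any iterable of names, so with a data frame you would pass df.columns.

```python
import keyword
from collections import namedtuple


def column_namespace(names):
    """Build a namedtuple mapping each usable column name to itself.

    Hypothetical helper: names that cannot be namedtuple fields
    (non-identifiers, keywords, leading underscore, duplicates)
    are skipped and still have to be typed out as strings.
    """
    usable = []
    for name in names:
        name = str(name)
        ok = (
            name.isidentifier()
            and not keyword.iskeyword(name)
            and not name.startswith('_')
            and name not in usable
        )
        if ok:
            usable.append(name)
    Varnames = namedtuple('Varnames', usable)
    return Varnames(*usable)


# A plain list stands in for df.columns here (assumed sample names).
columns = ['variable_1', 'variable_2', 'total sales', '2014']
var = column_namespace(columns)
print(var.variable_1)
```

With a data frame, the call would be var = column_namespace(df.columns), and df[var.variable_1] then works as in the original version. Columns that were skipped can still be reached with the usual df['total sales'] syntax.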

I'm not sure how your data is structured, but when I import a csv/txt file, I specify the names of the columns in a list, such as...

names = ['col_1', 'col_2', 'col_3']

etc... and then import my file as such...

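import pandas as pd
# header must be a row number, not a bool; header=0 assumes the file's first
# row is a header, which the list passed as names then replaces
data = pd.read_csv('./some_file.txt', header=0, delimiter='\t', names=names)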

You could then do tab completion like...

new_thing = data[names[1]]

where you would hit Tab as you start to type "names", and then all you have to do is give the index of the item in "names" that you want. I'm not sure if this is any more efficient than simply typing out the column name.
