I'm using the pandas.read_sql() function to get data from my PostgreSQL database. The SQL query is generated generically and selects many columns, of which I only want specific ones, with one column used as the index. Consider an example table test_table like this:

column1 column2 column3
1       2       3
2       4       6
3       6       9

I tried the index_col and columns parameters of pandas.read_sql() to get column1 as the index and column2 as the data (ignoring column3). But it always returns the whole table. Writing columns=['column1', 'column2'] changes nothing either.

I'm using Python 2.7.6 with pandas 0.17.1. Thanks for any help!

Example Code:

import pandas
import psycopg2
import sqlalchemy


def connect():
    connString = (
        "dbname=test_db "
        "host=localhost "
        "port=5432 "
        "user=postgres "
        "password=password"
    )
    return psycopg2.connect(connString)

engine = sqlalchemy.create_engine(
            'postgresql://',
            creator=connect)
sql = (
    'SELECT '
    'column1, '
    'column2, '
    'column3 '
    'FROM test_table'
)
data = pandas.read_sql(
    sql,
    engine,
    index_col=['column1'],
    columns=['column2'])
print(data)
  • Why don't you want to change your SELECT query? Also, I guess you want to use pandas.read_sql_query() instead. Commented Mar 11, 2016 at 10:30
  • The SQL query should be built only once and then used by different functions, each picking specific columns from it. I did not use read_sql_query() because it has no columns parameter (which is not really doing what I want for now). For my code, read_sql() and read_sql_query() do not differ. Commented Mar 11, 2016 at 11:56

1 Answer

I think the columns argument did not work for you because you passed a SQL statement instead of a table name.

As the pandas documentation says:

columns : list, default: None List of column names to select from sql table (only used when reading a table).

Therefore, I think if you try:

pandas.read_sql('test_table', engine, index_col=['column1'], columns=['column2'])

the columns argument will actually work.
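To try this without a PostgreSQL server, here is a minimal sketch that rebuilds test_table in an in-memory SQLite database (an assumption made only so the example is self-contained) and reads it by table name. It uses pandas.read_sql_table(), which is what pandas.read_sql() delegates to when it is given a table name, and it assumes a recent pandas together with SQLAlchemy.

```python
import pandas as pd
import sqlalchemy

# In-memory SQLite stands in for the PostgreSQL database (assumption for this demo).
engine = sqlalchemy.create_engine("sqlite://")

# Recreate the question's test_table.
pd.DataFrame(
    {"column1": [1, 2, 3], "column2": [2, 4, 6], "column3": [3, 6, 9]}
).to_sql("test_table", engine, index=False)

# With a table name, pandas builds the SELECT itself, so `columns` is honored.
data = pd.read_sql_table(
    "test_table",
    engine,
    index_col="column1",
    columns=["column2"],
)
print(data)
```

The resulting frame should have column1 as its index and column2 as its only data column; column3 is never requested from the database.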

1 Comment

It's a pity that it doesn't work with SQL statements.
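If the query string has to stay as it is, one possible workaround is to run the full query and then keep only the wanted columns on the pandas side. The sketch below assumes an in-memory SQLite database in place of PostgreSQL, and read_columns is a hypothetical helper name, not part of pandas.

```python
import sqlite3

import pandas as pd

# In-memory SQLite stands in for the PostgreSQL database (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_table (column1 INTEGER, column2 INTEGER, column3 INTEGER)"
)
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?, ?)",
    [(1, 2, 3), (2, 4, 6), (3, 6, 9)],
)

# The generic query is built once and reused unchanged.
sql = "SELECT column1, column2, column3 FROM test_table"


def read_columns(sql, con, index_col, columns):
    """Hypothetical helper: run the whole query, then keep only `columns`."""
    frame = pd.read_sql_query(sql, con, index_col=index_col)
    return frame[columns]


data = read_columns(sql, conn, "column1", ["column2"])
print(data)
```

Note that this still transfers every column of the query from the database and only discards the unwanted ones afterwards, so it trades some efficiency for keeping the query untouched.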
