I'm using the pandas.read_sql() function to get data from my PostgreSQL database.
The SQL query is generated generically and selects many columns, of which I only want specific ones, with one column used as the index.
Take an example table test_table like this:
column1 column2 column3
1 2 3
2 4 6
3 6 9
I tried to use the index_col and columns parameters of pandas.read_sql() to get column1 as the index and column2 as data (ignoring column3!). But it always returns the whole table. Passing columns=['column1', 'column2'] changes nothing either...
I'm using Python 2.7.6 with pandas 0.17.1. Thanks for your help!
Example Code:
import pandas
import psycopg2
import sqlalchemy


def connect():
    connString = (
        "dbname=test_db "
        "host=localhost "
        "port=5432 "
        "user=postgres "
        "password=password"
    )
    return psycopg2.connect(connString)


engine = sqlalchemy.create_engine(
    'postgresql://',
    creator=connect)

sql = (
    'SELECT '
    'column1, '
    'column2, '
    'column3 '
    'FROM test_table'
)

data = pandas.read_sql(
    sql,
    engine,
    index_col=['column1'],
    columns=['column2'])

print(data)
Note: I'm using pandas.read_sql() instead of read_sql_query(), because the latter has no columns parameter (which is not really doing what I want for now). For my code, read_sql() and read_sql_query() otherwise do not differ...
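
For reference, the pandas documentation describes the columns argument of read_sql() as only used when reading a whole table by name, not when passing a SQL query, which would explain the behaviour above. Here is a minimal sketch of one possible workaround: read the query with index_col and then select the wanted columns from the resulting DataFrame. To keep it runnable without a PostgreSQL server, it assumes an in-memory SQLite database as a stand-in for test_db; the wanted list is a hypothetical name introduced only for this illustration.

```python
import sqlite3

import pandas

# Assumption: an in-memory SQLite database stands in for the PostgreSQL
# database so the example runs anywhere; the table mirrors test_table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_table (column1 INTEGER, column2 INTEGER, column3 INTEGER)"
)
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?, ?)",
    [(1, 2, 3), (2, 4, 6), (3, 6, 9)],
)
conn.commit()

# The generic query still selects every column.
sql = "SELECT column1, column2, column3 FROM test_table"

# Hypothetical list of the data columns to keep (not part of the original code).
wanted = ["column2"]

# index_col is honoured for queries; the column selection is done
# afterwards on the DataFrame instead of through the columns argument.
data = pandas.read_sql(sql, conn, index_col="column1")[wanted]

print(data)
```

If the query itself can be changed, listing only the needed columns in the SELECT clause avoids fetching column3 at all.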