
I have two tables: trainSearchStream and SearchInfo. My code is supposed to merge the two tables on a common column named SearchID. The problem is that the code runs for some time and then terminates without reporting any error in my Eclipse console. The number of rows in each table is: trainSearchStream: 392,356,948 and SearchInfo: 112,159,462. The size of my database (which includes other tables) is 40GB. Update: when I ran the code in a terminal I got a message saying: 'killed: 9'.

import pandas as pd
import sqlite3


# Create connection.
con = sqlite3.connect(path + 'database.sqlite')

# Read into a data frame.
df = pd.read_sql_query("SELECT * FROM trainSearchStream, SearchInfo WHERE trainSearchStream.SearchID = SearchInfo.SearchID;", con)

# Save to file.
df.to_csv(path + 'traindata.csv', index=False, encoding='utf-8')
  • I suppose it terminates during the read_sql_query call and not the to_csv one? Do you know what the size of the tables you want to query would be? And how much memory do you have? 40GB is not something that fits in a typical laptop's memory. Commented Jul 7, 2015 at 11:39
  • BTW, if your only goal of above is to export the data to csv, there are probably better ways to do this than a roundtrip to python. See eg stackoverflow.com/questions/6076984/… Commented Jul 7, 2015 at 11:41
  • A few questions: what is the size of the result of your query? (You can execute it elsewhere than in Python.) Do you know if your Python is 32- or 64-bit? I guess the result of your query loads into RAM, and you are unlikely to have 40GB of RAM. Maybe it's over 2GB and you are on a 32-bit version, and that causes the problem. Commented Jul 7, 2015 at 11:41
  • I am not a Python guy so I don't know exactly what your code does, but I think pd.read_sql_query will load the result of your query into memory. In this case that's not a good idea because of the size of the result. I would stream the query and then write the CSV line by line. For small results that way would be slower. Commented Jul 7, 2015 at 11:48
  • @joris yes, it terminates before to_csv. I have a laptop with 16GB memory and 500GB flash storage. I also have the tables in .tsv format; in that format they have sizes of 9.47GB and 11.02GB. My goal is to join several tables on a common column and then export them to a CSV file. I tried pandas' merge function, but my code would terminate. When I ran the pandas code in a terminal I got a message saying: 'killed: 9'. Commented Jul 7, 2015 at 11:52
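The streaming idea from the comments can be sketched with pandas' `chunksize` parameter, which turns `read_sql_query` into an iterator of DataFrames so only one chunk is in memory at a time. This is a minimal sketch using tiny in-memory stand-in tables (the `AdID` and `Query` columns are made up for the demo; the real code would open the 40GB database from disk and use a much larger chunk size):

```python
import sqlite3
import pandas as pd

# Demo setup: a tiny in-memory stand-in for the real database.
con = sqlite3.connect(':memory:')
con.execute("CREATE TABLE trainSearchStream (SearchID INT, AdID INT)")
con.execute("CREATE TABLE SearchInfo (SearchID INT, Query TEXT)")
con.executemany("INSERT INTO trainSearchStream VALUES (?, ?)",
                [(i, i * 10) for i in range(1000)])
con.executemany("INSERT INTO SearchInfo VALUES (?, ?)",
                [(i, 'q%d' % i) for i in range(1000)])

query = ("SELECT * FROM trainSearchStream, SearchInfo "
         "WHERE trainSearchStream.SearchID = SearchInfo.SearchID;")

# chunksize makes read_sql_query return an iterator of DataFrames,
# so the full join result is never held in memory at once.
first = True
for chunk in pd.read_sql_query(query, con, chunksize=100):
    # Append each chunk to the CSV; write the header only once.
    chunk.to_csv('traindata.csv', mode='w' if first else 'a',
                 header=first, index=False, encoding='utf-8')
    first = False
```

Selecting explicit columns instead of `SELECT *` would also avoid duplicating the SearchID column in the output.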

1 Answer


As mentioned in the comments, the best way, I think, is to use sqlite directly and not Python.

So the solution is, I guess, this:

sqlite> .mode csv
sqlite> .output test.csv
sqlite> SELECT * FROM trainSearchStream, SearchInfo WHERE trainSearchStream.SearchID = SearchInfo.SearchID; /* your query here */
sqlite> .output stdout
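If staying in Python is preferred, the same "write line by line" idea works with a plain `sqlite3` cursor and the standard-library `csv` module, since iterating over a cursor fetches rows lazily. A sketch with made-up demo tables (column names `AdID` and `Query` are assumptions, not the real schema):

```python
import csv
import sqlite3

# Demo setup: tiny stand-in tables (the real DB would be opened from disk).
con = sqlite3.connect(':memory:')
con.execute("CREATE TABLE trainSearchStream (SearchID INT, AdID INT)")
con.execute("CREATE TABLE SearchInfo (SearchID INT, Query TEXT)")
con.executemany("INSERT INTO trainSearchStream VALUES (?, ?)",
                [(i, i) for i in range(500)])
con.executemany("INSERT INTO SearchInfo VALUES (?, ?)",
                [(i, 'query') for i in range(500)])

cur = con.execute(
    "SELECT * FROM trainSearchStream, SearchInfo "
    "WHERE trainSearchStream.SearchID = SearchInfo.SearchID;")

with open('joined.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    # Column names come from the cursor's description attribute.
    writer.writerow([col[0] for col in cur.description])
    for row in cur:  # cursor iteration streams rows one at a time
        writer.writerow(row)
```

This keeps memory usage constant regardless of the size of the join result, which is exactly why the sqlite shell approach above works where the all-at-once pandas call gets killed.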

Sorry, that is not quite an answer, but posting the code in a comment would have destroyed the indentation, and we are using Python...

Could you run this and tell us what gets printed?

import pandas as pd
import sqlite3


# Create connection.
con = sqlite3.connect(path + 'database.sqlite')

try:
    df = pd.read_sql_query("SELECT * FROM trainSearchStream, SearchInfo WHERE trainSearchStream.SearchID = SearchInfo.SearchID;", con)
    print("read_sql_query went well")
except Exception as e:
    print("read_sql_query failed: " + str(e))

try:
    df.to_csv(path + 'traindata.csv', index=False, encoding='utf-8')
    print("to_csv went well")
except Exception as e:
    print("to_csv failed: " + str(e))

1 Comment

The code terminates without leaving any message. @Pholochtairze
