3

I'm unsure how I can accomplish filtering my database using only the time field. Right now I have a class called DatabasePolgygon

class DatabasePolygon(dbBase):
    __tablename__ = 'objects'

    begin_time = Column(DateTime) # starting time range of shape
    end_time = Column(DateTime) # ending time range of shape
    # Other entries not relevant to this question

begin_time, and end_time may be equal to such values as 2006-06-01 14:45:23, they represent the X-axis range that an object(in this case a shape over a plot) covers. I want to allow advanced searching for my users, specifically asking for all objects that appear within a range of time. How can I accomplish this with the DateTime field however?

        # Grab all shapes that appear above this certain time
        query_result = query_result.filter(
            DatabasePolygon.begin_time >= datetime.strptime(rng['btime']), %H:%M:%S')
        )

The problem is i'm comparing a datetime object with a Y-m-d H-M-S to an object with only a H-M-S. An example scenario would be if a user wants all objects, regardless of year/month/day, that appear beyond the range of 14:45:24, so we would have rng['btime']=14:45:24 and begin_time=2006-06-01 14:45:23 which doesn't seem to actually filter anything when compared.

Is there some way I can efficiently compare times within this column of data? I'd love to be able to do something like

        # Grab all shapes that appear above this certain time
        query_result = query_result.filter(
            DatabasePolygon.begin_time.time() >= datetime.strptime(rng['btime']), %H:%M:%S').time()
        )
4
  • The way that the underlying DB index works on a datetime column means that there is no efficient way to do this, unless some very specific things are true. E.g. if your data spans a small number of days, then you could do a separate efficient query for each day's time range. Beyond that, I think your best bet is to pull all the data to python and then filter. Some databases support functional indexes - that would allow you to do what you want but I don't know if sqlalchemy supports them. Commented Jan 17, 2016 at 23:42
  • Further to comment from @Tom re: indexes - SQLite added support for indexes on expressions in version 3.9.0 (ref: here) so if SQLAlchemy produces appropriate SQL queries then SQLite should be able to handle them efficiently. Commented Jan 18, 2016 at 0:41
  • @GordThompson Not quite sure how to make sense of indexes in this case, could you provide some code if you plan on answering? Commented Jan 18, 2016 at 0:48
  • 1
    sqlite.org/expridx.html - in your case, the expression is converting the datetime column to the time part only (which is then indexed). Commented Jan 18, 2016 at 1:02

1 Answer 1

3

It appears to be possible, subject to a few conditions.

 
Objective 1: Doing it (at all).

With a class named Thing to hold the id and begin_time values from the "objects" table:

class Thing(Base):
    __tablename__ = 'objects'

    id = Column(Integer, primary_key=True)
    begin_time = Column(DateTime)

    def __repr__(self):
       return "<Thing(id=%d, begin_time='%s')>" % (self.id, self.begin_time)

and test data in the "objects" table of the SQLite database

id  begin_time
--  -------------------
 1  1971-01-14 17:21:53
 2  1985-05-24 10:11:12
 3  1967-07-01 13:14:15

this, unfortunately, does not work:

engine = create_engine(r'sqlite:///C:\__tmp\test.db', echo=True)

Session = sessionmaker(bind=engine)
session = Session()
for instance in session.query(Thing)\
        .filter(Thing.begin_time[11:]<'17:00:00')\
        .order_by(Thing.id):
    print(instance)

producing

NotImplementedError: Operator 'getitem' is not supported on this expression

However, this does work ...

engine = create_engine(r'sqlite:///C:\__tmp\test.db', echo=True)

conn = engine.connect()
result = conn.execute("SELECT id FROM objects WHERE substr(begin_time,12)<'17:00:00'")
id_list = [row[0] for row in result.fetchall()]
result.close()
conn.close()

Session = sessionmaker(bind=engine)
session = Session()
for instance in session.query(Thing)\
        .filter(Thing.id.in_(id_list))\
        .order_by(Thing.id):
    print(instance)

 
Objective 2: Doing it efficiently.

The console output shows us that the first SELECT is indeed

SELECT id FROM objects WHERE substr(begin_time,12)<'17:00:00'

so if we were using SQLite 3.9.0 or later and had created an "index on expression"

CREATE INDEX time_idx ON objects(substr(begin_time,12));

then SQLite would be able to avoid a table scan. Unfortunately, even the latest release of CPython 2.7 at the moment (2.7.11) still ships with a sqlite3 module that is too old

Python 2.7.11 (v2.7.11:6d1b6a68f775, Dec  5 2015, 20:32:19) [MSC v.1500 32 bit (Intel)] on win32
>>> import sqlite3
>>> sqlite3.sqlite_version
'3.6.21'

so that index cannot exist in the database or SQLAlchemy will choke on it:

sqlalchemy.exc.DatabaseError: (sqlite3.DatabaseError) malformed database schema (time_idx) - near "(": syntax error [SQL: "SELECT id FROM objects WHERE substr(begin_time,12)<'17:00:00'"]

So, if the "efficiently" part is really important then you may need to convince Python to use a more current version of SQLite. Some guidance on that may be found in the question

Force Python to forego native sqlite3 and use the (installed) latest sqlite3 version

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.