1

I have a query that I am running from a python script. It looks like this:

app_phone=str(5555555555)
query_string=""""select biz_name, biz_addr, biz_owner
from business_t
where regexp_replace(biz_phone_1, E'\\\\D|^1', '', 'g') = '"""+app_phone+"""'
or regexp_replace(biz_phone_2, E'\\\\D|^1', '', 'g') = '"""+app_phone+"""'
or regexp_replace(biz_cell_1, E'\\\\D|^1', '', 'g') = '"""+app_phone+"""'
or regexp_replace(biz_cell_2, E'\\\\D|^1', '', 'g') = '"""+app_phone+"""'
;"""
result=run_query(query_string)

the query runs fine - my question here is really what is the best way to write this type of query "grammatically," in an easy to read format that is not going to add unnecessary processing to the script? Or is this a good way of writing this?

This just looks kind of ugly, but maybe that's just the curse of needing raw SQL in your code.

1
  • I'd write that with string.format, because I find it more readable, but it does add a (tiny amount) of run-time overhead. (i.e. query_string = """... {app_phone} ... {app_phone}...""".format(app_phone = "555555555") Commented Sep 26, 2013 at 19:41

5 Answers 5

1

I can recommend the following options:

1) I would embed query parameters directly in the query, I would pass then separately as a tuple/dict to the cursor.execute (see your db api for exact formatting) method:

app_phone = 5555555555
query_string="""SELECT biz_name, biz_addr, biz_owner
                 FROM business_t
                 WHERE regexp_replace(biz_phone_1, E'\\\\D|^1', '', 'g') = '%(phone)s'
                     OR regexp_replace(biz_phone_2, E'\\\\D|^1', '', 'g') = '%(phone)s'
                     OR regexp_replace(biz_cell_1, E'\\\\D|^1', '', 'g') = '%(phone)s'
                     OR regexp_replace(biz_cell_2, E'\\\\D|^1', '', 'g') = '%(phone)s';
             """
result = run_query(query_string, {'phone': app_phone})

This solution will save you from (most) sql injection attacks

2) To build the query you can consider using sql query building library (https://pypi.python.org/pypi/python-sql/0.2). That will allow you to build SQL queries in terms of expressions, not using string editing. Not sure if this query builder will support usage of regexp in where though

3) You can try to use loop, but the question if it's become more readable will be subjective, imho:

app_phone = 5555555555
cmp_phones = "regexp_replace(%s, E'\\\\D|^1', '', 'g') = '%%(phone)s'"
db_phone_columns = (biz_phone_1, biz_phone_2, biz_cell_1, biz_cell_2)
where_condition = 'OR'.join(cmp_phones % phone for phone in db_phone_columns)
result = run_query(query_string, {'phone': app_phone}
query_string="""SELECT biz_name, biz_addr, biz_owner
                 FROM business_t
                 WHERE %(where_condition)s;""" % 
             {'where_condition': where_condition}
result = run_query(query_string, {'phone': app_phone})

I personally find solution 1) most readable

4) Use stored procedure with phone as a parameter

5) Formatting of the query inside the query string that I personally prefer is demonstrated in the examples

Sign up to request clarification or add additional context in comments.

Comments

1

I would do it something like this:

app_phone = 5555555555
query_string = r"""
    SELECT biz_name, biz_addr, biz_owner
    FROM   business_t
    WHERE  regexp_replace(biz_phone_1, E'\\D|^1', '', 'g') = '{0}'
        OR regexp_replace(biz_phone_2, E'\\D|^1', '', 'g') = '{0}'
        OR regexp_replace(biz_cell_1,  E'\\D|^1', '', 'g') = '{0}'
        OR regexp_replace(biz_cell_2,  E'\\D|^1', '', 'g') = '{0}'
    ;""".format(app_phone)
result = run_query(query_string)

Note that I've made it a raw string with the r prefix to get rid of one level of backslash-escaping. I think using the .format() feature is overall the best way to do string substitution in Python (and certainly better than concatenation), particularly when you want to repeat the same substituted value multiple times. Some may argue you should use meaningful names instead of numbers (i.e. replace {0} with a name), and that's fine. Some may also argue that you should use SQL's parameters instead of Python's string substitution, and I think that's good advice. But you seemed focused mainly on how to manage long strings, especially with concatenation, so my answer focuses on using Python's string features.

1 Comment

the r prefix would definitely clean things up a bit - good additional tip.
0

I usually clean up my raw query strings by lining up the SQL and using caps for SQL terms:

app_phone = 5555555555
query_string = """SELECT biz_name, biz_addr, biz_owner
                  FROM   business_t
                  WHERE  regexp_replace(biz_phone_1, E'\\\\D|^1', '', 'g') = '"""+app_phone+"""'
                  OR     regexp_replace(biz_phone_2, E'\\\\D|^1', '', 'g') = '"""+app_phone+"""'
                  OR     regexp_replace(biz_cell_1, E'\\\\D|^1', '', 'g')  = '"""+app_phone+"""'
                  OR     regexp_replace(biz_cell_2, E'\\\\D|^1', '', 'g')  = '"""+app_phone+"""'
                  ;"""
result = run_query(query_string)

3 Comments

Fair point, but I was more concerned with the stringing together of strings
This might be better suited as just a SQL question then. The python tag suggests python formatting.
It's not the SQL that I am concerned with so much as it is the string concatenation in python. I'm also wondering how people would normally write this, if they would pull out a regex and make that a variable, or if they would run a for loop to create all the where clauses.
0

Well you could automate it a bit, without adding too much overhead:

val = '55555'
cols = ('biz_phone_1','biz_phone_2','biz_cell_1','biz_cell_2')
q = "SELECT biz_name, biz_addr, biz_owner FROM business_t WHERE "
q += ' OR '.join("regexp_replace({0}, E'\\\\D|^1', '', 'g') = '{1}'"
               .format(x,val)
               for x in  cols)

However you should probably use prepared statements which your driver should support.

Comments

0

I haven't done much SQL from Python, but in other languages I've just arranged things neatly by whatever conventions the language has, which usually involves breaking it down line by line:

query_stmt = "SELECT stuff FROM tables ",
             "WHERE conditions = something"

I've also built queries in sections and pieced them all together, so that you end up with something like:

queryStmt = "SELECT " + selectClause + " FROM " + fromClause + " WHERE " + whereClause

Although I've run into different ways to put a query together and to run it, I've never had anything bark at me for assembling a query in any of the ways it's valid to assemble strings ~ so as long as you're creating a valid SQL with the right quotes and escape sequences needed when running it, you have leeway to make it look pretty.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.