I have a table with three columns: id, word, and essay. I want to run a query using a ? placeholder. The SQL statement is sql1 = "select id,? from training_data". My code is below:

import sqlite3

def dbConnect(db_name, sql, flag):
    conn = sqlite3.connect(db_name)
    cursor = conn.cursor()
    if flag == "danci":
        itm = 'word'
    elif flag == "wenzhang":
        itm = 'essay'
    cursor.execute(sql, (itm,))
    res1 = cursor.fetchall()
    return res1

However, when I print dbConnect("data.db",sql1,"danci"), the result I get is [(1,'word'),(2,'word'),(3,'word')...]. What I really want is [(1,'the content of word column'),(2,'the content of word column')...]. What should I do? Please give me some ideas.

1 Answer

You can't use placeholders for identifiers -- only for literal values.
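A minimal sketch of that behaviour, using Python's sqlite3 module as in the question. The in-memory database and the sample rows are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# In-memory database with made-up sample rows (assumption, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE training_data (id INTEGER, word TEXT, essay TEXT)")
conn.executemany(
    "INSERT INTO training_data VALUES (?, ?, ?)",
    [(1, "apple", "An essay about apples."), (2, "pear", "An essay about pears.")],
)

# The bound value is treated as a string literal, not as a column name,
# so every row comes back with the text 'word' itself.
bound = conn.execute(
    "SELECT id, ? FROM training_data ORDER BY id", ("word",)
).fetchall()

# Naming the column in the SQL text returns the column's contents.
named = conn.execute("SELECT id, word FROM training_data ORDER BY id").fetchall()

print(bound)
print(named)
conn.close()
```

The first query reproduces the result described in the question; the second is what the fixed SQL strings further down achieve.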

I don't know what to suggest in this case, as your function takes a database name, an SQL string, and a flag to say how to modify that string. I think it would be better to pass just the first two, and write something like

sql = {
    "danci":    "SELECT id, word  FROM training_data",
    "wenzhang": "SELECT id, essay FROM training_data",
}

and then call it with one of

dbConnect("data.db", sql['danci'])

or

dbConnect("data.db", sql['wenzhang'])

But a lot depends on why you are asking dbConnect to decide on the columns to fetch based on a string passed in from outside; it's an unusual design.
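If the flag really does have to be passed in, one possible approach is to map it to a column name through a fixed lookup table and reject anything else. This is only a sketch: the COLUMNS dictionary and the fetch_column function are hypothetical names, and the sample rows are made up.

```python
import sqlite3

# Hypothetical lookup table: each accepted flag maps to a fixed column name.
COLUMNS = {"danci": "word", "wenzhang": "essay"}


def fetch_column(conn, flag):
    """Return (id, value) rows for the column selected by flag."""
    try:
        column = COLUMNS[flag]
    except KeyError:
        raise ValueError(f"unknown flag: {flag!r}") from None
    # Formatting is acceptable here only because column can be nothing
    # other than one of the literals hard-coded in COLUMNS.
    sql = f"SELECT id, {column} FROM training_data ORDER BY id"
    return conn.execute(sql).fetchall()


# Made-up sample data (assumption, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE training_data (id INTEGER, word TEXT, essay TEXT)")
conn.executemany(
    "INSERT INTO training_data VALUES (?, ?, ?)",
    [(1, "apple", "An essay about apples."), (2, "pear", "An essay about pears.")],
)

rows_word = fetch_column(conn, "danci")
rows_essay = fetch_column(conn, "wenzhang")
print(rows_word)
print(rows_essay)
```

The column name never comes directly from the caller, so an unexpected flag raises ValueError instead of ending up in the SQL text.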


Update - SQL Injection

The problems with SQL injection and tainted data are well documented, but here is a summary.

The principle is that, in theory, a programmer can write safe and secure programs as long as all the sources of data are under their control. As soon as they use any information from outside the program without checking its integrity, security is under threat.

Such information ranges from the obvious -- the parameters passed on the command line -- to the obscure -- if the PATH environment variable is modifiable then someone could induce a program to execute a completely different file from the intended one.

Perl provides direct help to avoid such situations with Taint Checking, but SQL Injection is the open door that is relevant here.

Suppose you take the value for a database column from an unverified external source, and that value appears in your program as $val. Then, if you write

my $sql = "INSERT INTO logs (date) VALUES ('$val')";
$dbh->do($sql);

then it looks like it's going to be okay. For instance, if $val is set to 2014-10-27 then $sql becomes

INSERT INTO logs (date) VALUES ('2014-10-27')

and everything's fine. But now suppose that our data is being provided by someone less than scrupulous or downright malicious, and your $val, having originated elsewhere, contains this

2014-10-27'); DROP TABLE logs; SELECT COUNT(*) FROM security WHERE name != '

Now it doesn't look so good. $sql is set to this (with added newlines)

INSERT INTO logs (date) VALUES ('2014-10-27');
DROP TABLE logs;
SELECT COUNT(*) FROM security WHERE name != '')

which adds an entry to the logs table as before, and then goes ahead and drops the entire logs table and counts the number of records in the security table. That isn't what we had in mind at all, and it is something we must guard against.

The immediate solution is to use ? placeholders in a prepared statement and pass the actual values later in a call to execute. This not only speeds things up, because the SQL statement can be prepared (compiled) just once, but also protects the database from malicious data by quoting every supplied value appropriately for the data type and escaping any embedded quotes, so that it is impossible to close one statement and open another.
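The same idea in Python's sqlite3 module, as a rough sketch rather than a translation of the Perl code above. The logs table is created in an in-memory database for illustration, and val holds the malicious string from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (date TEXT)")

# The malicious value from the example above.
val = "2014-10-27'); DROP TABLE logs; SELECT COUNT(*) FROM security WHERE name != '"

# The value is bound to the placeholder, so it is stored as plain data
# and never interpreted as part of the SQL statement.
conn.execute("INSERT INTO logs (date) VALUES (?)", (val,))

stored = conn.execute("SELECT date FROM logs").fetchall()
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(stored)
print(tables)
```

The whole string ends up as a single, odd-looking row in logs, and the table is still there afterwards.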

This whole concept was humourised in Randall Munroe's excellent XKCD comic.

15 Comments

I would stress more the reasoning behind not simply replacing the word (sql-injection). Other than that, good answer.
@deets: Yes, in general, however, there's not really a SQL injection vulnerability here because the column name is embedded in the code. In this specific case it is safe to use string formatting to build the query.
Thanks for your answers. You mean I can't use '?' for column name?
@wanglan8498: Yes, I mean exactly that. As you've seen, if you use a placeholder then it replaces it with the string value of the bound variable rather than a column name.
@Borodin I know this, you know this, and your code isn't vulnerable. The OP, though, presumably doesn't know, so I tend to emphasise this whenever I give answers that essentially boil down to modifying queries depending on user input.