
I'm new to Apache Cassandra (using Python 3) and I'm trying to create a table based on a CSV file. Here's what the file looks like: https://i.sstatic.net/aYRS1.jpg (sorry, I don't have enough reputation points to post the image here)

First I create the table:

query1 = "CREATE TABLE IF NOT EXISTS table1(artist text, title text, \
            length text, sessionId text, itemInSession text, PRIMARY KEY (sessionId, title, artist))"     

session.execute(query1)

And then I try to read the file and insert the desired data into the table:

file = 'event_datafile_new.csv'

with open(file, encoding = 'utf8') as f:
    csvreader = csv.reader(f)
    next(csvreader) # skip header
    for line in csvreader:
        query = "INSERT INTO table1(artist, title, length, sessionId, itemInSession)"
        query = query + "VALUES(%s, %s, %s, %s, %s)"
        session.execute(query, (line[0], line[9], line[5], line[8], line[3]))

However, I get the following error:

---> 13         session.execute(query, (line[0], line[9], line[5], line[8], line[3]))

/opt/conda/lib/python3.6/site-packages/cassandra/cluster.cpython-36m-x86_64-linux-gnu.so in cassandra.cluster.Session.execute (cassandra/cluster.c:38536)()

/opt/conda/lib/python3.6/site-packages/cassandra/cluster.cpython-36m-x86_64-linux-gnu.so in cassandra.cluster.ResponseFuture.result (cassandra/cluster.c:80834)()

InvalidRequest: Error from server: code=2200 [Invalid query] message="Invalid STRING constant (288.9922) for "length" of type float"

Even when I tried changing the type of "length" to float (and %s to %f in the INSERT statement), it didn't work. Does anyone know what the issue might be? Many thanks! :)

  • FWIW, it makes sense to change the datatype of length to float. And here in your program the line query = query + "VALUES(%s, %s, %s, %s, %s)" you may want to substitute the values and then call session.execute. It may help to print out what values are being received before executing session.execute (in case some rows are not being parsed as desired). Commented Mar 22, 2019 at 0:15
  • Can you execute the following commands in cqlsh: use put_your_keyspace_here; and describe table1; ? Commented Mar 22, 2019 at 7:50
  • Also, I'm not sure about data conversion in the Python csv module. By default every field is a string, but it can convert to float if some options are specified. Can you print type(line[5]) before executing the insert? Commented Mar 22, 2019 at 7:52
  • Have you tried to put single "ticks" around the "%s"? Not a python guy, but as these are text elements, text needs single ticks. E.g.: VALUES('%s', '%s', '%s', '%s', '%s') Commented Mar 22, 2019 at 15:16
  • This question is an assignment on the Udacity Data engineering programme. Commented Apr 12, 2020 at 13:17

1 Answer


Whenever you read from a file with csv.reader, every field comes back as a string: "Each row read from the csv file is returned as a list of strings. No automatic data type conversion is performed unless the QUOTE_NONNUMERIC format option is specified" (from https://docs.python.org/3/library/csv.html). That is why the server rejects the string '288.9922' for the numeric "length" column.
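To see this behaviour without a Cassandra cluster, here is a minimal sketch that reads from an in-memory file. The sample rows are made up for illustration and are not taken from event_datafile_new.csv.

```python
import csv
import io

# Hypothetical two-column sample: a header line plus one data row.
sample = io.StringIO('artist,length\n"Faithless",495.3073\n')
reader = csv.reader(sample)
next(reader)            # skip header
row = next(reader)
# Every field is a str, including the numeric-looking one.
field_types = [type(value).__name__ for value in row]

# With QUOTE_NONNUMERIC, unquoted fields are converted to float.
# (An unquoted, non-numeric header line would raise ValueError here,
# so this sample has no header.)
sample_nn = io.StringIO('"Faithless",495.3073\n')
row_nn = next(csv.reader(sample_nn, quoting=csv.QUOTE_NONNUMERIC))
```

QUOTE_NONNUMERIC only helps when the file quotes all of its text fields, so explicit casting is usually the simpler route.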

With a table defined with types such as:

"CREATE TABLE IF NOT EXISTS table1(artist text, title text, \
            length double, sessionId int, itemInSession int, PRIMARY KEY (sessionId, title, artist))"

If you cast your values to the correct types, it should work. I tried this and it worked:

session.execute(query, (line[0], line[9], float(line[5]), int(line[8]), int(line[3])))
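Putting the pieces together, the loop could be sketched as below. The helper names row_to_params and insert_rows are hypothetical (they are not part of the driver), the column indices are the ones used in the question, and session is assumed to be an already connected cassandra.cluster.Session with the keyspace set.

```python
INSERT_QUERY = (
    "INSERT INTO table1 (artist, title, length, sessionId, itemInSession) "
    "VALUES (%s, %s, %s, %s, %s)"
)


def row_to_params(line):
    """Cast the CSV strings to the Python types matching the table columns."""
    return (
        line[0],          # artist        -> text
        line[9],          # title         -> text
        float(line[5]),   # length        -> double
        int(line[8]),     # sessionId     -> int
        int(line[3]),     # itemInSession -> int
    )


def insert_rows(session, rows):
    """Insert every CSV row through session.execute; return the row count."""
    count = 0
    for line in rows:
        session.execute(INSERT_QUERY, row_to_params(line))
        count += 1
    return count
```

A call would look like insert_rows(session, csvreader) after skipping the header. Note that the %s markers stay as they are for every column type: the driver encodes each value according to its Python type, so %f is not needed and the placeholders should not be wrapped in quotes.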