I'm trying to load a .csv file into an Apache Cassandra database with Python. The COPY command passed to session.execute doesn't seem to work. Python reports an unexpected indent at =',', and from what I've read, COPY is not supported when used this way.

In this script, time_test and p are two float variables.

from cassandra.cluster import Cluster

cluster = Cluster()

session = cluster.connect('myKEYSPACE')


rows = session.execute('COPY table_test (time_test, p) 
                        from'/home/mypc/Desktop/testfile.csv' with delimiter=',' and header=true;
                       ')
                                                                     

print('DONE')

Thank you for your help!

1 Answer

The main problem here is that COPY is not a CQL command but a cqlsh command, so it cannot be executed via session.execute.
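If the load has to stay inside a Python script, one option is to read the CSV yourself and insert each row with a prepared statement. The following is only a minimal sketch, assuming the cassandra-driver package is installed, that the CSV header uses the column names time_test and p, and that the file is comma-delimited as in the question. The helper name load_rows is hypothetical.

```python
import csv


def load_rows(session, lines, table="table_test"):
    """Insert CSV rows into `table` and return how many were inserted.

    `session` is a cassandra-driver Session; `lines` is an open text file
    (or any iterable of CSV lines) whose first line is the header.
    """
    # Prepared statements use ? placeholders and are parsed only once.
    insert = session.prepare(
        f"INSERT INTO {table} (time_test, p) VALUES (?, ?)"
    )
    count = 0
    for row in csv.DictReader(lines, delimiter=","):
        # Assumption: both columns hold floating-point values.
        session.execute(insert, (float(row["time_test"]), float(row["p"])))
        count += 1
    return count


def main():
    # Imported here so load_rows can be tried without the driver installed.
    from cassandra.cluster import Cluster

    cluster = Cluster()
    session = cluster.connect('myKEYSPACE')
    with open("/home/mypc/Desktop/testfile.csv", newline="") as f:
        print(load_rows(session, f), "rows inserted")
    cluster.shutdown()
```

Calling main() runs the load against a local cluster. Rows are inserted one at a time, so this is only reasonable for small files.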

I recommend using DSBulk to load data into Cassandra: it's very flexible, performant, and doesn't require programming. In the simplest case, when the column names in the CSV header map directly to the column names in the table, the command line is very simple:

dsbulk load -url file.csv -k keyspace -t table -header true
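If the command above still needs to be triggered from a Python script, it can be launched as a subprocess. This is a rough sketch, assuming the dsbulk executable is on the PATH; the function names dsbulk_load_command and run_dsbulk_load are hypothetical, and only the flags shown in the command above are used.

```python
import subprocess


def dsbulk_load_command(csv_path, keyspace, table):
    """Build the argument list for the dsbulk load command shown above."""
    return [
        "dsbulk", "load",
        "-url", csv_path,
        "-k", keyspace,
        "-t", table,
        "-header", "true",
    ]


def run_dsbulk_load(csv_path, keyspace, table):
    """Run dsbulk and raise CalledProcessError if it exits with an error."""
    # Assumption: the dsbulk executable can be found on the PATH.
    subprocess.run(dsbulk_load_command(csv_path, keyspace, table), check=True)
```

For the table in the question, the call would look like run_dsbulk_load("/home/mypc/Desktop/testfile.csv", "myKEYSPACE", "table_test").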

There is a series of blog posts about DSBulk that covers a lot of topics:

3 Comments

Thank you. Do I need to use DataStax, though? I'm using Apache Cassandra on Ubuntu.
No, it should work with any supported Cassandra version.
And no, it's not the only way, but it's a better way than manually written code. P.S. COPY isn't very scalable and is buggy; that was one of the reasons DSBulk was created.
