We have a requirement where an application reads a file and inserts the data into a Cassandra database. However, the table can grow up to 300+ MB in one shot during the day. The table will have the structure below:
    create table if not exists orders (
        id uuid,
        record text,
        status varchar,
        create_date timestamp,
        modified_date timestamp,
        primary key (status, create_date)
    );
The 'status' column can have the values [Started, Completed, Done]. According to a couple of documents on the internet, read performance is best when a partition is < 100 MB, and an index should be used on a column that is rarely modified (so I cannot use the 'status' column as an index). Also, if I use buckets with TWCS set to minutes, there will be lots of buckets, which may impact performance.
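To make the bucketing idea concrete, here is a rough sketch of the kind of bucketed table I have in mind. The table name `orders_by_bucket`, the `bucket` column, the one-hour bucket width, and the one-minute TWCS window are all assumptions for illustration; I have not tested this, and I added `id` as a clustering column only so that two rows with the same `create_date` would not overwrite each other.

```cql
-- Hypothetical bucketed variant of the orders table (names are assumptions).
create table if not exists orders_by_bucket (
    status varchar,
    bucket timestamp,        -- create_date truncated to the hour by the application (assumed width)
    create_date timestamp,
    id uuid,
    record text,
    modified_date timestamp,
    -- Composite partition key: one partition per (status, hour bucket).
    primary key ((status, bucket), create_date, id)
) with compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'MINUTES',   -- the "TWCS as minutes" option mentioned above
    'compaction_window_size': '1'
};

-- Reading one status for one bucket (the timestamp value is made up):
select * from orders_by_bucket
where status = 'Started' and bucket = '2024-01-01 10:00:00+0000';
```

With this layout, each read needs both the status and the bucket, and since `status` is part of the primary key, moving a record from Started to Completed would mean deleting the row and inserting it again rather than updating it.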
So, how can I make better use of partitions and/or buckets to insert data evenly across partitions and to read records with the appropriate status?
Thank you in advance.