This is really a design question, and I am not sure that writing to and reading from a file is the ideal solution here. Nonetheless, here is what I am trying to do.
I have the following static method. Once the reqStreamingData method of cs is called, it starts retrieving data from the client's server continuously, one update every 150 milliseconds.
public static void streamingDataOperations(ClientSocket cs) throws InterruptedException, IOException {
    // This call retrieves streaming data continuously from the client's server
    // and writes one line to the CSV file every 150 milliseconds,
    // using a BufferedWriter and a PrintWriter (print method).
    // Note that the BufferedWriter's flush method is never called, so
    // I assume the data is held in the in-memory buffer
    // and not yet in the actual file.
    cs.reqStreamingData(output_file); // <- this method comes from the client's API.

    // I would like to start another thread (the data processing thread) that runs every 15 minutes.
    // I am aware I can do that by creating a class that extends TimerTask and scheduling it with a Timer.
    // When this thread runs, it should:
    // 1. flush the last 15 minutes of data to output_file (note that no synchronized methods or statements are used here, so no object is locked)
    // 2. process the data in R
    // 3. wait for the output from R to come back
    // 4. clear the file contents, so that the file only ever holds data from the last 15 minutes
}
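The four steps in the comments could be sketched roughly as follows. This is only a sketch under assumptions, not a tested design: it assumes the lines can be written through my own code rather than inside the client's API, and the class `WindowedCsvFile`, its methods, and the snapshot file are hypothetical names introduced for illustration. Instead of clearing the file while R reads it, the sketch moves the current file aside under a short lock and hands that snapshot to the processing step.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper (not part of the client's API): collects streamed CSV
// lines and hands one window of data at a time to a processing step.
public class WindowedCsvFile {
    private final Object lock = new Object(); // guards writer and the live file
    private final Path liveFile;
    private final Path snapshotFile;
    private BufferedWriter writer;

    public WindowedCsvFile(Path liveFile, Path snapshotFile) {
        this.liveFile = liveFile;
        this.snapshotFile = snapshotFile;
        this.writer = openEmpty();
    }

    // With no open options, newBufferedWriter creates the file or truncates it.
    private BufferedWriter openEmpty() {
        try {
            return Files.newBufferedWriter(liveFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Called by the streaming thread for every update (about every 150 ms).
    public void append(String csvLine) {
        synchronized (lock) {
            try {
                writer.write(csvLine);
                writer.newLine();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Steps 1 to 4: flush, process, wait, clear.
    public void processWindow(Consumer<Path> processor) {
        synchronized (lock) {
            try {
                writer.close(); // step 1: close() flushes buffered lines to disk
                Files.move(liveFile, snapshotFile, StandardCopyOption.REPLACE_EXISTING);
                writer = openEmpty(); // step 4: the live file starts empty again
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        // Steps 2 and 3 run outside the lock, so append() is not blocked meanwhile.
        processor.accept(snapshotFile);
    }

    // Runs processWindow every `period` units on a single background thread.
    public ScheduledExecutorService schedule(Consumer<Path> processor, long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> processWindow(processor), period, period, unit);
        return scheduler;
    }
}
```

Usage might look like `buffer.schedule(path -> runR(path), 15, TimeUnit.MINUTES)`, where `runR` is a hypothetical method that could start `Rscript` through `ProcessBuilder` and block on `Process.waitFor()` until R finishes. With `scheduleAtFixedRate`, a run that takes longer than the period delays the next run rather than overlapping with it.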
Now, I am not well versed in multithreading. My concerns are:
- The request data thread and the data processing thread would be writing to and reading from the same file at the same time, but at different rates. I am not sure whether the data processing thread would delay the request data thread significantly, since it has more computationally heavy work to do. Given that they are two separate threads, would any error or exception occur here?
- I am not keen on writing and reading the same file at the same time, but because I have to use R to process the data and store it in an R data frame in real time, I cannot think of another way to approach this. Are there better alternatives?
- Is there a better design for tackling this problem?
I understand that this is a lengthy question. Please let me know if you need more information.