We have separate Java programs, each running in its own JVM instance, that process a raw weblog like this:

JVM instance 1: read fileA --> process
JVM instance 2: read fileA --> process
JVM instance 3: read fileA --> process
...

I think that as the number of JVM instances increases, the amount of disk I/O will increase too, and at some point this solution will no longer work properly.

Can anyone suggest another solution that reduces the disk I/O?

One idea I have is to use a JMS server (such as Apache ActiveMQ): read the file, store its contents in a queue, and process them from there.

Would there be any problems if I use JMS?

Any help would be appreciated.

  • I don't think that adding a queue will reduce the amount of I/O in your system. May I ask why you have a growing number of JVM instances? (I'm assuming that they are all on one server.) Commented Jul 26, 2012 at 9:05
  • Every JVM instance is a program with its own, completely separate responsibility, so I don't want to use multithreading here. Can you suggest an idea for this? Commented Jul 26, 2012 at 9:10
  • Do I read correctly that every instance reads the same fileA, or is that just a typo in the example? Commented Jul 26, 2012 at 9:12
  • If each process reads the same file, then it sounds reasonable to introduce a central node that monitors and reads the file and posts log updates as new messages on a topic. From an I/O perspective, that is much better than having multiple nodes read the same file over and over. But there's a question: do new nodes require the full content of the log file, or are the latest updates sufficient? Commented Jul 26, 2012 at 9:15
  • It would be beneficial if you could add more information to your question. Is every process reading the same file? What does the processing afterwards do? If every JVM is reading the same file, then you can reduce I/O by reading the file ONLY ONCE and then distributing the information to every process. Separation of responsibilities can be achieved in one JVM as well. There's no need for multiple JVMs replicating the same I/O work. Commented Jul 26, 2012 at 9:20
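
The "read the file only once, then distribute" idea from the comments above can be sketched inside a single JVM using only the standard library. This is a minimal illustration, not the asker's code: the class name `LogFanOut`, the method `fanOut`, and the two example handlers are hypothetical, and it assumes each responsibility can be expressed as a `Consumer<String>` that handles one log line. Every handler gets its own queue and thread, so the responsibilities stay separated while the file is read a single time.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

final class LogFanOut {
    // Unique sentinel object; compared by identity to signal end of input.
    private static final String EOF = new String("<<EOF>>");

    /** Reads the source once and delivers every line, in order, to every handler. */
    static void fanOut(BufferedReader source, List<Consumer<String>> handlers) {
        List<BlockingQueue<String>> queues = new ArrayList<>();
        List<Thread> workers = new ArrayList<>();
        for (Consumer<String> handler : handlers) {
            BlockingQueue<String> queue = new LinkedBlockingQueue<>();
            queues.add(queue);
            // One worker thread per handler keeps the responsibilities independent.
            Thread worker = new Thread(() -> {
                try {
                    for (String line = queue.take(); line != EOF; line = queue.take()) {
                        handler.accept(line);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            workers.add(worker);
        }
        try {
            String line;
            while ((line = source.readLine()) != null) {
                // The only read of the input; each line is copied to every queue.
                for (BlockingQueue<String> queue : queues) {
                    queue.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // Always tell the workers to stop, even if reading failed.
            for (BlockingQueue<String> queue : queues) {
                queue.add(EOF);
            }
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical handlers standing in for the separate programs.
        List<Consumer<String>> handlers = Arrays.asList(
                line -> System.out.println("[stats]  " + line),
                line -> System.out.println("[alerts] " + line));
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(args[0]))) {
            fanOut(reader, handlers);
        }
    }
}
```

The queues here are unbounded for simplicity, so a slow handler lets lines pile up in memory. If the handlers really must live in separate JVMs, the same shape maps onto one reader process publishing to a messaging topic.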

1 Answer

An event-driven solution is certainly a good option here, so JMS would probably be a good fit.

Keep in mind, though, that if your consumers can't keep up with the producer and you are using persistent delivery, messages will be stored on your hard drive, and this will cause disk I/O. I don't think this will be a problem, as you can always increase the number of concurrent consumers, or even use a cluster (which is really easy to configure with ActiveMQ, for example) to keep up with the load.
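
The effect of adding concurrent consumers can be illustrated without a broker. The sketch below is an in-memory stand-in, not JMS or ActiveMQ code: the class `CompetingConsumers`, the method `drain`, and its parameters are hypothetical names. A bounded queue plays the role of the destination, the producer blocks when it is full (where a broker with persistent delivery would instead write the backlog to disk), and `workerCount` threads compete for messages so that each message is handled exactly once.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

final class CompetingConsumers {
    // Unique sentinel object; compared by identity to tell a worker to stop.
    private static final String STOP = new String("<<STOP>>");

    /**
     * Pushes all messages through a bounded queue drained by workerCount threads.
     * Returns how many messages were handled.
     */
    static int drain(List<String> messages, int workerCount, int capacity,
                     Consumer<String> handler) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        AtomicInteger handled = new AtomicInteger();
        Thread[] workers = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++) {
            workers[i] = new Thread(() -> {
                try {
                    // Each message is taken by exactly one worker.
                    for (String m = queue.take(); m != STOP; m = queue.take()) {
                        handler.accept(m);
                        handled.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        try {
            for (String m : messages) {
                queue.put(m); // blocks while the queue is full (backpressure)
            }
            for (int i = 0; i < workerCount; i++) {
                queue.put(STOP); // one stop marker per worker
            }
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handled.get();
    }
}
```

Raising `workerCount` lets the backlog drain faster when the handler is the bottleneck, which is the in-process analogue of adding concurrent consumers on a queue.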

To summarize, I think that JMS would be a great solution to your problem: you won't need to actively poll the filesystem for changes, and it makes it really easy to scale your processing application.

If you are interested in the topic of integration, you might visit the Enterprise Integration Patterns site and read the extremely good book by Gregor Hohpe and Bobby Woolf on the subject. You can find a link to it on that site. In it you'll find the pros and cons of both approaches and can familiarize yourself with the other options available. Either way, messaging is definitely a great way to go.

You might consider using the Apache Camel framework as an implementation of the patterns described there.

3 Comments

Well, I would like to know what 'process' is doing before prescribing JMS or any other solution. From a distance, it "smells" like a lightweight producer/consumer thread model could address the problem at hand.
@maasg you are completely right. But I assumed that if the current solution uses several JVM instances, they use lots of memory for processing, as this is a common approach for fighting long GC stop-the-world pauses (another solution is to use a good garbage collector, of course). As such, I thought that wasn't an option and that they were looking for something really scalable, and JMS is a great choice for scalability. But you're certainly right that knowing more about the processors in this particular case would be very useful.
Indeed, there could be a few reasons to use several JVMs on one system, but this comment makes me think that there could be a better solution through a design change (and not bloating the system with middleware when the problem can be solved in the under ware): "Every jvm instance is a program have an exactly separate responsibility ! so i don't want to use multithreads in here". Anyway, it looks like the OP is not interested.