I have implemented Spring Batch local step partitioning on Windows with a grid size of 15 and both corePoolSize and maxPoolSize set to 10. When I execute it, 10 threads run in parallel: with 1 million records on a machine with 8 GB of RAM, the job completed in 50 seconds.

I wanted to test with more data, so we ran the same jar on Linux with 10 million records and the same configuration (grid size 15, pool size 10). There the job started with only one thread, then after some time it started two more, and so on. The Linux machine is a server with more than 100 GB of RAM, yet the 10 million records took about 16 minutes to complete, which feels very slow to me. Based on my configuration, 10 threads should run in parallel, so I am confused.

The XML configuration is:

<batch:step id="step6">
    <batch:partition step="loadFlatFiles" partitioner="multiFileResourcePartitioner">
        <batch:handler grid-size="15" task-executor="loadCustomerTaskExecutor" />
    </batch:partition>
</batch:step>

<bean id="loadCustomerTaskExecutor" class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
    <property name="corePoolSize" value="10" />
    <property name="maxPoolSize" value="10" />
    <property name="allowCoreThreadTimeOut" value="true" />
</bean>

<batch:step id="loadFlatFiles">
    <batch:tasklet>
        <batch:chunk reader="masterFileItemReader" writer="masterFileWriter" processor="itemProcessor" commit-interval="5000" skip-limit="1000000">
            <batch:skippable-exception-classes>
                <batch:include class="org.springframework.batch.item.file.FlatFileParseException"/>
            </batch:skippable-exception-classes>
            <batch:listeners>
                <batch:listener ref="recordSkipListener"/>
            </batch:listeners>
        </batch:chunk>
    </batch:tasklet>
</batch:step>

<bean id="recordSkipListener" class="com.cdi.batch.listener.RecordSkipListener" scope="step">
</bean>

<bean id="multiFileResourcePartitioner" class="com.cdi.batch.partitioner.MultiFileResourcePartitioner"
      scope="step">
    <property name="keyName" value="fileResource" />
    <property name="fileName" value="fileName" />
    <property name="directory" value="file:${input.files.location}" />
</bean>
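
The grid size is only a hint to the partitioner, so it may be worth checking how many partitions the Linux run can actually produce. The sketch below assumes that MultiFileResourcePartitioner, like Spring Batch's built-in MultiResourcePartitioner, creates one partition per input file; that is an assumption about this custom class, not something shown in the configuration. PartitionCountCheck and its methods are hypothetical helper names, and the input directory is passed as a command-line argument.

```java
import java.io.File;

class PartitionCountCheck {

    /** Counts regular files directly inside the directory; returns 0 if it cannot be listed. */
    static int countInputFiles(File directory) {
        File[] entries = directory.listFiles();
        if (entries == null) {
            return 0; // not a directory, missing, or not readable
        }
        int count = 0;
        for (File entry : entries) {
            if (entry.isFile()) {
                count++;
            }
        }
        return count;
    }

    /** Upper bound on partitions that can run at once, ASSUMING one partition per file. */
    static int maxParallelPartitions(int fileCount, int poolSize) {
        return Math.min(fileCount, poolSize);
    }

    public static void main(String[] args) {
        File directory = new File(args.length > 0 ? args[0] : ".");
        int files = countInputFiles(directory);
        System.out.println("input files: " + files);
        System.out.println("max parallel partitions with pool size 10: "
                + maxParallelPartitions(files, 10));
    }
}
```

Under that assumption, a directory holding fewer than 10 files could never keep all 10 threads busy, whatever the grid size says.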

Has anyone faced the same issue? I would like to know why it is behaving like this.

Update: I am storing the job metadata in memory:

<!-- stored job-meta in memory -->
<bean id="jobRepository"
    class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
    <property name="transactionManager" ref="transactionManager" />
</bean>

The code was implemented using Java 6 and Spring Batch 3.
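
To separate the executor settings from the rest of the job, a minimal sketch like the following can be run on both machines. It assumes that ThreadPoolTaskExecutor with its default unbounded queue behaves like a plain java.util.concurrent.ThreadPoolExecutor with the same core and maximum size. The class name PoolRampUpCheck and the method peakConcurrency are made up for illustration, and no Spring classes are involved.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolRampUpCheck {

    /** Submits `tasks` jobs to a pool of `poolSize` threads and returns the peak number running at once. */
    static int peakConcurrency(int poolSize, int tasks) {
        // Mirrors corePoolSize = maxPoolSize with an unbounded queue and core timeout enabled.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());
        pool.allowCoreThreadTimeOut(true);

        final AtomicInteger running = new AtomicInteger();
        final AtomicInteger peak = new AtomicInteger();
        // The first tasks hold their threads until the pool is full (or 5 seconds pass).
        final CountDownLatch poolFull = new CountDownLatch(Math.min(poolSize, tasks));
        final CountDownLatch done = new CountDownLatch(tasks);

        for (int i = 0; i < tasks; i++) {
            pool.execute(new Runnable() {
                public void run() {
                    int now = running.incrementAndGet();
                    int seen = peak.get();
                    // Record the highest concurrency observed so far.
                    while (now > seen && !peak.compareAndSet(seen, now)) {
                        seen = peak.get();
                    }
                    poolFull.countDown();
                    try {
                        poolFull.await(5, TimeUnit.SECONDS);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        running.decrementAndGet();
                        done.countDown();
                    }
                }
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrent tasks: " + peakConcurrency(10, 15));
    }
}
```

If this reports 10 on the Linux server, the pool itself can start 10 threads there, and the slow ramp-up more likely comes from how partitions are created or handed to the executor than from the executor settings.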

  • This looks like a problem with the Linux JVM or its configuration. Which JVM version do you use to run the batch on your server? Commented Feb 18, 2015 at 10:53
  • This link should help explain why the threads are few and slow to start: stackoverflow.com/questions/7726871/… Commented Feb 18, 2015 at 11:00
  • Thanks for the link, I will look into it. By the way, I am using JDK 6 on Linux. Commented Feb 19, 2015 at 9:03
  • While googling, I found that Linux can impose a maximum thread limit at the OS level. You could check the value of this limit on your server and ask for it to be raised. Commented Feb 20, 2015 at 10:53
