
Background

I have a Spring Batch job where:

  1. FlatFileItemReader - Reads one row at a time from the file
  2. ItemProcessor - Transforms the row from the file into a List<MyObject> and returns the List. That is, each row in the file is broken down into a List<MyObject> (1 row in the file is transformed into many output rows).
  3. ItemWriter - Writes the List<MyObject> to a database table. (I used this implementation to unpack the list received from the processor and delegate to a JdbcBatchItemWriter; a sketch of the setup follows this list.)
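
For reference, here is a rough sketch of the current setup on Spring Batch 4 (FileRow is a placeholder for my input type, and the unpacking writer is simplified):

import java.util.ArrayList;
import java.util.List;

import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class LoadStepConfig {

    @Bean
    public Step loadStep(StepBuilderFactory steps,
                         FlatFileItemReader<FileRow> reader,
                         ItemProcessor<FileRow, List<MyObject>> processor,
                         JdbcBatchItemWriter<MyObject> jdbcWriter) {
        // Writer that unpacks each List<MyObject> and delegates the flattened items
        // to the JdbcBatchItemWriter.
        ItemWriter<List<MyObject>> unpackingWriter = lists -> {
            List<MyObject> flattened = new ArrayList<>();
            lists.forEach(flattened::addAll);
            jdbcWriter.write(flattened);
        };
        return steps.get("loadStep")
                .<FileRow, List<MyObject>>chunk(10) // commit-interval: 10 file rows per chunk
                .reader(reader)
                .processor(processor)
                .writer(unpackingWriter)
                .build();
    }
}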

Question

  • At point 2), the processor can return a List of 100000 MyObject instances.
  • At point 3), the delegate JdbcBatchItemWriter will end up writing the entire List of 100000 objects to the database.

My question is: the JdbcBatchItemWriter does not allow a custom batch size. For all practical purposes, the batch size equals the commit-interval for the step. With this in mind, is there another ItemWriter implementation available in Spring Batch that writes to the database and allows a configurable batch size? If not, how do I go about writing a custom writer myself to achieve this?

3 Answers


I see no obvious way to set the batch size on the JdbcBatchItemWriter. However, you can extend the writer and use a custom BatchPreparedStatementSetter to specify the batch size. Here is a quick example:

import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.jdbc.core.BatchPreparedStatementSetter;

public class MyCustomWriter<T> extends JdbcBatchItemWriter<T> {

    @Override
    public void write(List<? extends T> items) throws Exception {
        // namedParameterJdbcTemplate is the protected field inherited from JdbcBatchItemWriter
        namedParameterJdbcTemplate.getJdbcOperations().batchUpdate("your sql", new BatchPreparedStatementSetter() {
            @Override
            public void setValues(PreparedStatement ps, int i) throws SQLException {
                // set values on your sql for the i-th item
            }

            @Override
            public int getBatchSize() {
                return items.size(); // or any other value you want
            }
        });
    }

}

The StagingItemWriter in the samples is an example of how to use a custom BatchPreparedStatementSetter as well.


7 Comments

@MahmoudBenHassine This looks promising. Regarding the StagingItemWriter example, instead of using a BatchPreparedStatementSetter, I could simply call getJdbcTemplate().batchUpdate in the write method, passing it the required batch size?
Yes, that's another option.
I've got the necessary direction to go forward. Thanks.
glad it helped! However, this is tricky. There will be a single transaction for the whole chunk, no matter how many items will result from the processing logic. For example, If commit-size=10 and for each item the processor returns 100 items, then the writer will receive 1000 items per chunk. Those 1000 items will be committed in a single transaction, even if the jdbc batch_size = 500. The batch size governs how many queries will go to the db (2 batchUpdate queries in this case, 500 items in each batch) and not how many transactions (a single transaction driven by Spring Batch in this case).
There are no issues. The idea of chunk processing is to process the whole chunk as a unit (all or nothing semantics). Just wanted to clarify that the batch size you are trying to customize does not influence this model (it only influences how many queries will be sent to the db).

The answer from Mahmoud Ben Hassine and the comments cover pretty much all aspects of the solution, and that is the accepted answer.

Here is the implementation I used, if anyone is interested:

import java.util.Collections;
import java.util.List;

import org.springframework.batch.item.ItemWriter;
import org.springframework.jdbc.core.ParameterizedPreparedStatementSetter;
import org.springframework.jdbc.core.support.JdbcDaoSupport;

public class JdbcCustomBatchSizeItemWriter<W> extends JdbcDaoSupport implements ItemWriter<W> {

    private int batchSize;
    private ParameterizedPreparedStatementSetter<W> preparedStatementSetter;
    private String sqlFileLocation;
    private String sql;

    // Configured as the bean's init method: loads the sql from the external file.
    public void initReader() {
        this.setSql(FileUtilities.getFileContent(sqlFileLocation));
    }

    @Override
    public void write(List<? extends W> items) throws Exception {
        getJdbcTemplate().batchUpdate(sql, Collections.unmodifiableList(items), batchSize, preparedStatementSetter);
    }

    public void setBatchSize(int batchSize) {
        this.batchSize = batchSize;
    }

    public void setPreparedStatementSetter(ParameterizedPreparedStatementSetter<W> preparedStatementSetter) {
        this.preparedStatementSetter = preparedStatementSetter;
    }

    public void setSqlFileLocation(String sqlFileLocation) {
        this.sqlFileLocation = sqlFileLocation;
    }

    public void setSql(String sql) {
        this.sql = sql;
    }
}

Note:

  1. The use of Collections.unmodifiableList prevents the need for any explicit casting.
  2. I use sqlFileLocation to specify an external file that contains the sql, and FileUtilities.getFileContent simply returns the contents of this sql file. This can be skipped, and the sql can be passed directly to the class while creating the bean (see the wiring sketch below).
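
For completeness, this is roughly how I wire the writer up as a bean when passing the sql directly (the table, columns and getters below are illustrative, not my actual schema):

import javax.sql.DataSource;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class WriterConfig {

    @Bean
    public JdbcCustomBatchSizeItemWriter<MyObject> myObjectWriter(DataSource dataSource) {
        JdbcCustomBatchSizeItemWriter<MyObject> writer = new JdbcCustomBatchSizeItemWriter<>();
        writer.setDataSource(dataSource); // inherited from JdbcDaoSupport
        writer.setBatchSize(500);         // rows per JDBC batch, independent of the commit-interval
        writer.setSql("INSERT INTO my_table (col_a, col_b) VALUES (?, ?)"); // illustrative sql
        writer.setPreparedStatementSetter((ps, item) -> {
            ps.setString(1, item.getColA()); // illustrative getters on MyObject
            ps.setString(2, item.getColB());
        });
        return writer;
    }
}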



I wouldn't do this. It presents issues for restartability. Instead, modify your reader to produce individual items rather than having your processor take in an object and return a list.
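
To illustrate, here is a minimal (non-restartable) sketch of such a reader, where FileRow and the expand() method stand in for your actual row type and transformation logic:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

import org.springframework.batch.item.ItemReader;

public class ExpandingItemReader implements ItemReader<MyObject> {

    private final ItemReader<FileRow> delegate; // e.g. a FlatFileItemReader<FileRow>
    private final Deque<MyObject> buffer = new ArrayDeque<>();

    public ExpandingItemReader(ItemReader<FileRow> delegate) {
        this.delegate = delegate;
    }

    @Override
    public MyObject read() throws Exception {
        // Refill the buffer from the next file row whenever it runs dry.
        while (buffer.isEmpty()) {
            FileRow row = delegate.read();
            if (row == null) {
                return null; // end of input
            }
            buffer.addAll(expand(row)); // one file row -> many items (the former processor logic)
        }
        return buffer.poll();
    }

    private List<MyObject> expand(FileRow row) {
        // Placeholder for the row-to-items transformation that currently lives in the processor.
        throw new UnsupportedOperationException("transformation not shown");
    }
}

For restart support, you would build this on top of AbstractItemCountingItemStreamItemReader (or otherwise save the item count in the ExecutionContext) rather than a plain ItemReader, so that a restart can skip the items already written.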

2 Comments

The processing required to convert 1 row in the file into multiple rows is relatively complex to put in an ItemReader. Considering my jobs never need to be restarted, is there any other issue with this approach that you see?
Much time has passed, but I did want to follow up on this. It would not be that complicated to create a custom ItemReader to return individual items (multiple per row). You would just need to appropriately track your itemCount to preserve restartability. This is much cleaner than the alternative you proposed.
