I have a use case that I'm not sure can be solved the way I want with Spring Batch.

Use case

  1. Read two types of XML files with different structures from a directory (there can be multiple files of each type) into two types of objects
  2. Process these objects
  3. Write a new flat file (.txt) which serves as a report, using data from both files/objects

Issue

If I understand the design correctly, ItemReaders read one kind of object, ItemProcessors take one kind of object and return another, and ItemWriters write one kind of object. As far as I know, there is no way to do chunk processing on multiple files with different structures, that is, to have two readers, one processor and one writer in a single step.

Any suggestions on how this can be solved in a good way?

I believe this kind of processing can be achieved using Tasklets, but I find the code often gets messy when data has to be held between steps through the execution context.

This is a work-in-progress reader for one of the types ("Car" is just to make the example more readable):

@Bean
fun multiResourceItemReader(): ItemReader<CarData> {
    val patternResolver: ResourcePatternResolver = PathMatchingResourcePatternResolver()
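    // Assumes a "file:" location pattern with a wildcard; a bare directory path does not match the files inside it
    val resources: Array<Resource> = patternResolver.getResources("file:/some/directory/*.xml")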
    val reader: MultiResourceItemReader<CarData> = MultiResourceItemReader()
    reader.setResources(resources)
    reader.setDelegate(carItemReader())
    return reader
}

@Bean
fun carItemReader(): StaxEventItemReader<CarData> =
    StaxEventItemReaderBuilder<CarData>()
        .name("CarItemReader")
        .addFragmentRootElements("Car")
        .unmarshaller(carDataMarshaller())
        .build()

@Bean
fun carDataMarshaller(): XStreamMarshaller {
    val aliases: MutableMap<String, Class<*>> = HashMap()
    aliases["CarDetails"] = CarDetailType::class.java
    aliases["carProp1"] = Int::class.java
    aliases["carProp2"] = Int::class.java
    aliases["carProp3"] = Int::class.java
    aliases["carProp4"] = Int::class.java
    aliases["carProp5"] = Int::class.java

    val marshaller: XStreamMarshaller = XStreamMarshaller()
    marshaller.setAliases(aliases)
    return marshaller
}

A Step definition for a single reader would typically look something like this, but I haven't gotten that far, as I'm still working out how to implement the use case at all:

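    // Kotlin cannot infer the chunk's input/output types, so they are given explicitly
    stepBuilderFactory.get("step1").chunk<CarData, CarData>(5)
        .reader(multiResourceItemReader())
        .writer(someWriter())
        .build()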

1 Answer

Since a chunk-oriented step takes only one reader, this trick could work around the issue:

Implement a pre-processing step that merges the two XML files, placing each file's content under a dedicated root node, rootNodeA and rootNodeB.

Encapsulate the two XML classes in a wrapper class:

    @XmlRootElement(name = "root")
    @XmlAccessorType(XmlAccessType.FIELD)
    public class AB {
    
        @XmlElement(name = "rootNodeA")
        private A a = new A();
    
        @XmlElement(name = "rootNodeB")
        private B b = new B();

        //Getters & Setters
    }

AB can then be read and processed in the usual way, with a single reader, processor and writer.
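The merge in the pre-processing step could be sketched roughly as follows, using only the JDK's DOM and transformer APIs. This is a minimal sketch, not a definitive implementation: the class name `XmlMerger`, the `merge` method and the `Truck` sample element are hypothetical, and it assumes each source file's root element is copied as-is under rootNodeA or rootNodeB. If A and B are meant to map directly onto those nodes, the children of each root would be copied instead.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper: merges two XML documents under one root element.
class XmlMerger {

    /** Parses an XML string into a DOM document. */
    static Document parse(DocumentBuilder builder, String xml) throws Exception {
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /** Copies the root element of the source document under a new wrapper element. */
    static void wrap(Document target, Element parent, String wrapperName, Document source) {
        Element wrapper = target.createElement(wrapperName);
        // importNode with deep = true copies the whole subtree into the target document.
        wrapper.appendChild(target.importNode(source.getDocumentElement(), true));
        parent.appendChild(wrapper);
    }

    /** Builds root/rootNodeA and root/rootNodeB from the two inputs and serializes the result. */
    static String merge(String xmlA, String xmlB) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document merged = builder.newDocument();
        Element root = merged.createElement("root");
        merged.appendChild(root);

        wrap(merged, root, "rootNodeA", parse(builder, xmlA));
        wrap(merged, root, "rootNodeB", parse(builder, xmlB));

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(merged), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // "Truck" stands in for the second, unnamed file type.
        System.out.println(merge(
                "<Car><carProp1>1</carProp1></Car>",
                "<Truck><load>2</load></Truck>"));
    }
}
```

In a real job the inputs would come from the files in the directory and the result would be written to a merged file that the single reader then consumes. With several files of each type, `wrap` could be called once per file.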

NB: The pre-processing can also be done in the beforeStep method of a StepExecutionListener, with the merged file deleted in afterStep if disk space is a potential issue.
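The create-then-delete lifecycle of the merged file might look something like the sketch below. It is written in plain Java so that it runs without Spring on the classpath: the class `MergedFileLifecycle` is hypothetical, and its `beforeStep` and `afterStep` methods only mirror the names of the listener callbacks. The real `StepExecutionListener` methods have different signatures, so in an actual job these bodies would be moved into a listener implementation.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for a step listener that owns the merged temp file.
class MergedFileLifecycle {

    private Path mergedFile;

    /** Intended for the beforeStep callback: writes the merged XML to a temp file. */
    Path beforeStep(String mergedXml) throws IOException {
        mergedFile = Files.createTempFile("merged-", ".xml");
        Files.write(mergedFile, mergedXml.getBytes(StandardCharsets.UTF_8));
        return mergedFile;
    }

    /** Intended for the afterStep callback: deletes the temp file; returns true if a file was removed. */
    boolean afterStep() throws IOException {
        if (mergedFile == null) {
            return false; // nothing was created, so nothing to clean up
        }
        boolean deleted = Files.deleteIfExists(mergedFile);
        mergedFile = null;
        return deleted;
    }
}
```

The path returned by `beforeStep` is what the step's reader would be pointed at. Keeping the path in the listener avoids passing it between steps through the execution context.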
