
I have a requirement where an item reader is used to convert each line of a file into a composite object. I need to take the different objects inside that composite object and write each one as a separate JSON file. That means for a single line of the CSV file there will be multiple JSON files created, and these need to be written to a MarkLogic database. I have used a multi item writer to convert a file into a single output file, but now I need to split each line into multiple documents and write them to MarkLogic. Any idea how a single line can be split into multiple files and written to a MarkLogic database?

Here is an example of the composite object created by the item reader (this is just an illustration, not the actual scenario):

    public class Person {
        private HomeAddress homeAdd;
        private OfficeAddress officeAdd;
    }

A single line of the CSV represents both the home address and the office address. As output I need two JSON files/objects (one for each type of address) written to the MarkLogic database. Thanks.
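For illustration, a writer along these lines is roughly what I am aiming for. This is a simplified sketch, not my actual code: the `write(List)` signature assumes Spring Batch 4.x, Jackson is assumed for JSON serialization, and the getters, `getId()`, URIs and connection handling are placeholders:

    import java.util.List;

    import org.springframework.batch.item.ItemWriter;

    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.marklogic.client.DatabaseClient;
    import com.marklogic.client.document.JSONDocumentManager;
    import com.marklogic.client.io.Format;
    import com.marklogic.client.io.StringHandle;

    public class PersonSplittingItemWriter implements ItemWriter<Person> {

        private final DatabaseClient client;            // MarkLogic Java Client API connection
        private final ObjectMapper mapper = new ObjectMapper();

        public PersonSplittingItemWriter(DatabaseClient client) {
            this.client = client;
        }

        @Override
        public void write(List<? extends Person> items) throws Exception {
            JSONDocumentManager docMgr = client.newJSONDocumentManager();
            for (Person person : items) {
                // Serialize each part of the composite object to an in-memory JSON string
                String homeJson = mapper.writeValueAsString(person.getHomeAdd());
                String officeJson = mapper.writeValueAsString(person.getOfficeAdd());

                // Write each piece as its own document; nothing is written to disk.
                // getId() is assumed here only to build unique URIs.
                docMgr.write("/person/" + person.getId() + "/home.json",
                        new StringHandle(homeJson).withFormat(Format.JSON));
                docMgr.write("/person/" + person.getId() + "/office.json",
                        new StringHandle(officeJson).withFormat(Format.JSON));
            }
        }
    }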

1 Answer


If you were using MLCP to load the CSV as one record per line, then you could also define a transform rule on the input and hijack that process to parse and insert the additional documents.

You could also use a post-commit trigger: after the initial insert, process the documents into the required pieces. If this is high-volume, you may decide to do this via CoRB2.

You could also pre-process the CSV into multiple CSV files suitable for immediate ingestion.

Considering all of the options above, you could use the Data Movement SDK to author your solution: https://developer.marklogic.com/learn/data-movement-sdk (or even the MLCP/Hadoop-related libraries).
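As a rough sketch of the Data Movement SDK route (host, port, credentials, URIs and the inline JSON payloads below are placeholders; in your case the payloads would be whatever your processor produces in memory):

    import com.marklogic.client.DatabaseClient;
    import com.marklogic.client.DatabaseClientFactory;
    import com.marklogic.client.datamovement.DataMovementManager;
    import com.marklogic.client.datamovement.WriteBatcher;
    import com.marklogic.client.io.Format;
    import com.marklogic.client.io.StringHandle;

    public class DmsdkWriteExample {
        public static void main(String[] args) {
            // Placeholder connection details
            DatabaseClient client = DatabaseClientFactory.newClient("localhost", 8000,
                    new DatabaseClientFactory.DigestAuthContext("admin", "admin"));

            DataMovementManager dmm = client.newDataMovementManager();
            WriteBatcher batcher = dmm.newWriteBatcher()
                    .withBatchSize(100)    // documents per batch
                    .withThreadCount(4);   // parallel writer threads
            dmm.startJob(batcher);

            // One CSV line -> two JSON documents, added straight from memory
            batcher.add("/person/1/home.json",
                    new StringHandle("{\"street\":\"1 Home St\"}").withFormat(Format.JSON));
            batcher.add("/person/1/office.json",
                    new StringHandle("{\"street\":\"2 Office Rd\"}").withFormat(Format.JSON));

            batcher.flushAndWait();   // push any remaining queued documents
            dmm.stopJob(batcher);
            client.release();
        }
    }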


4 Comments

Thanks David. I am using an ItemReader with custom implementations of ItemProcessor and ItemWriter for this. I have split the object into more than one object and created the files in the writer. Now I'm looking into ways to write them back to the MarkLogic database without writing the files to disk.
If you're reading data from a file, I think David's first suggestion of using an MLCP transform to split the line into two documents is the easiest way to go. MLCP does well when your data is in a file; I normally bring in Spring Batch when data has to be retrieved from a different source.
Actually, the data has to be transformed before writing it to MarkLogic, which is why a processor is used to process the data. In the writer I am trying to avoid disk writes by populating streams instead of files as output and then inserting them into MarkLogic.
MLCP has an option to transform data on the way in. Why does it have to be transformed externally?
