I have a file as input which contains a JSON array:

[ {
  ...,
  ...
  },
  {
  ...,
  ...
  },
  {
  ...,
  ...
  }
]

I want to read it without breaking the Spring Batch principles (the same way as FlatFileItemReader or StaxEventItemReader).

I didn't find any way to do it with the readers already implemented in Spring Batch.

What's the best way to implement this reader ?

Thanks in Advance

2 Answers

Assuming you want to model the StaxEventItemReader in that you want to read each item of the JSON array as an item in Spring Batch, here's what I'd recommend:

  • RecordSeparatorPolicy - You'll need to implement your own RecordSeparatorPolicy that indicates whether you've finished reading in the full item or not. You can also use RecordSeparatorPolicy#postProcess to clean up the beginning and ending [] you'll need to deal with, as well as the comma delimiters.
  • LineTokenizer - You'll then want to create your own LineTokenizer that parses JSON. I was just working on one today for a project so you can use that code as a start (consider it untested):

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    
    import com.fasterxml.jackson.databind.ObjectMapper;
    import org.springframework.batch.item.file.transform.DefaultFieldSet;
    import org.springframework.batch.item.file.transform.FieldSet;
    import org.springframework.batch.item.file.transform.LineTokenizer;
    
    public class JsonLineTokenizer implements LineTokenizer {
    
        @Override
        public FieldSet tokenize(String line) {
            List<String> tokens = new ArrayList<>();
    
            try {
                // Parse the JSON object into a map and pull out the fields of interest
                HashMap<String, Object> result =
                        new ObjectMapper().readValue(line, HashMap.class);
    
                tokens.add((String) result.get("field1"));
                tokens.add((String) result.get("field2"));
    
            } catch (IOException e) {
                throw new RuntimeException("Unable to parse json: " + line, e);
            }
    
            return new DefaultFieldSet(tokens.toArray(new String[0]),
                    new String[] {"field1", "field2"});
        }
    }
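For completeness, here is an untested sketch of how those two custom pieces might be wired into a standard FlatFileItemReader. The MyItem target bean, the input.json path, and the JsonRecordSeparatorPolicy class name are assumptions for illustration:

```java
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.core.io.FileSystemResource;

public class JsonReaderConfig {

    // MyItem is a hypothetical bean with field1/field2 properties
    public FlatFileItemReader<MyItem> jsonItemReader() {
        FlatFileItemReader<MyItem> reader = new FlatFileItemReader<>();
        reader.setResource(new FileSystemResource("input.json")); // illustrative path

        // Custom policy that groups physical lines into one JSON object per record
        reader.setRecordSeparatorPolicy(new JsonRecordSeparatorPolicy());

        DefaultLineMapper<MyItem> lineMapper = new DefaultLineMapper<>();
        lineMapper.setLineTokenizer(new JsonLineTokenizer());

        // Map the field1/field2 tokens onto MyItem's properties by name
        BeanWrapperFieldSetMapper<MyItem> fieldSetMapper = new BeanWrapperFieldSetMapper<>();
        fieldSetMapper.setTargetType(MyItem.class);
        lineMapper.setFieldSetMapper(fieldSetMapper);

        reader.setLineMapper(lineMapper);
        return reader;
    }
}
```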
    
2 Comments

Thanks for your response. The RecordSeparatorPolicy will be difficult to implement because I don't have a simple JSON item format, but a complex one that can itself contain a JSON array. Any idea?
The only complexity that I can see is that the RecordSeparatorPolicy will need to be able to ignore the [] that wrap the entire document and the commas in between each item. Am I missing something?

This is the record separator policy I wrote, starting from your suggestions and from the default implementation. I use an internal plain-string representation for the read record, but I found it very simple to parse the JSON with the Codehaus Jettison JSONObject.

import org.springframework.batch.item.file.separator.RecordSeparatorPolicy;
import org.springframework.batch.item.file.separator.SimpleRecordSeparatorPolicy;
import org.springframework.util.StringUtils;

public class JsonRecordSeparatorPolicy extends SimpleRecordSeparatorPolicy {

    /**
     * True if the line can be parsed to a JSON object.
     *
     * @see RecordSeparatorPolicy#isEndOfRecord(String)
     */
    @Override
    public boolean isEndOfRecord(String line) {
        return StringUtils.countOccurrencesOf(line, "{") == StringUtils.countOccurrencesOf(line, "}")
                && (line.trim().endsWith("}") || line.trim().endsWith(",") || line.trim().endsWith("]"));
    }

    @Override
    public String postProcess(String record) {
        if (record.startsWith("[")) record = record.substring(1);
        if (record.endsWith("]")) record = record.substring(0, record.length() - 1);
        if (record.endsWith(",")) record = record.substring(0, record.length() - 1);
        return super.postProcess(record);
    }
}
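To illustrate how a line-oriented reader drives such a policy, here is a standalone sketch in plain Java (no Spring on the classpath, so Spring's StringUtils.countOccurrencesOf and the reader loop are re-implemented here as assumptions): physical lines are buffered until isEndOfRecord returns true, then postProcess strips the array brackets and trailing commas, leaving one JSON object string per record.

```java
import java.util.ArrayList;
import java.util.List;

public class JsonRecordDemo {

    // Same check as JsonRecordSeparatorPolicy.isEndOfRecord,
    // with Spring's StringUtils.countOccurrencesOf replaced by a stream count
    static boolean isEndOfRecord(String line) {
        long opens = line.chars().filter(c -> c == '{').count();
        long closes = line.chars().filter(c -> c == '}').count();
        String t = line.trim();
        return opens == closes && (t.endsWith("}") || t.endsWith(",") || t.endsWith("]"));
    }

    // Same cleanup as JsonRecordSeparatorPolicy.postProcess
    static String postProcess(String record) {
        if (record.startsWith("[")) record = record.substring(1);
        if (record.endsWith("]")) record = record.substring(0, record.length() - 1);
        if (record.endsWith(",")) record = record.substring(0, record.length() - 1);
        return record.trim();
    }

    // Buffer physical lines into logical records, roughly the way
    // FlatFileItemReader drives a RecordSeparatorPolicy
    static List<String> readRecords(List<String> lines) {
        List<String> records = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        for (String line : lines) {
            buffer.append(line.trim());
            if (isEndOfRecord(buffer.toString())) {
                String record = postProcess(buffer.toString());
                if (!record.isEmpty()) { // the closing "]" alone yields an empty record
                    records.add(record);
                }
                buffer.setLength(0);
            }
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> file = List.of(
                "[ {",
                "  \"field1\": \"a\", \"field2\": \"b\"",
                "  },",
                "  { \"field1\": \"c\", \"field2\": \"d\" }",
                "]");
        // Each element printed is now a standalone JSON object string
        readRecords(file).forEach(System.out::println);
    }
}
```

Note that a closing "]" on its own line would otherwise produce an empty record, which the real LineTokenizer would choke on; the sketch simply skips it.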

