I've been writing some batch processing code for CSV data. I found a tutorial online and have been using it without really understanding how or why it works, so I'm unable to solve a problem I'm now facing.
The code I'm working with is below:
    @Bean
    public LineMapper<Employee> lineMapper() {
        DefaultLineMapper<Employee> lineMapper = new DefaultLineMapper<>();

        DelimitedLineTokenizer lineTokenizer = new DelimitedLineTokenizer();
        lineTokenizer.setNames(new String[] { "id", "firstName", "lastName" });
        lineTokenizer.setIncludedFields(new int[] { 0, 1, 2 });

        BeanWrapperFieldSetMapper<Employee> fieldSetMapper = new BeanWrapperFieldSetMapper<>();
        fieldSetMapper.setTargetType(Employee.class);

        lineMapper.setLineTokenizer(lineTokenizer);
        lineMapper.setFieldSetMapper(fieldSetMapper);
        return lineMapper;
    }
I'm not entirely clear on what setNames and setIncludedFields are really doing. I've looked through the docs, but I still don't understand what's happening under the hood. Why do we need to give names to the lineTokenizer? Why can't it just be told how many columns of data there will be? Is the only purpose of the names to tell the fieldSetMapper which column maps to which property of the target object, and if so, do they all need to match the field names in the POJO?
I now have CSVs with a large number of columns (about 25-35) that I need to process. Is there a way to generate the array passed to setNames programmatically from the field names of the POJO, rather than typing the names in by hand?
Edit:
An example input file may be something like:
test.csv:
field1,field2,field3
a,b,c
d,e,f
g,h,j
The DTO:
    public class Test {
        private String field1;
        private String field2;
        private String field3;
        // setters, getters and constructor
    }
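
To make the last question concrete, this is roughly what I have in mind. It is only a sketch using plain Java reflection, not anything I found in Spring Batch. The names FieldNames and TestDto are hypothetical (TestDto stands in for the DTO above), and it assumes the POJO field names match the CSV column names exactly. As far as I can tell, the Javadoc for getDeclaredFields makes no promise about the order of the returned fields, so assuming that it matches the column order is exactly that, an assumption.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spring Batch.
public class FieldNames {

    // Stand-in for the DTO above, reduced to its fields.
    public static class TestDto {
        private String field1;
        private String field2;
        private String field3;
    }

    // Collects the names of the instance fields declared directly on the given class.
    // Static and compiler-generated (synthetic) fields are skipped.
    // Caveat: the order of getDeclaredFields() is not guaranteed by the Javadoc.
    public static String[] of(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            names.add(field.getName());
        }
        return names.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Prints the collected names joined by commas.
        System.out.println(String.join(",", of(TestDto.class)));
    }
}
```

The resulting array would then go to lineTokenizer.setNames(...) in place of the hand-written list, provided the ordering caveat above turns out not to be a problem.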
