0

I am working on a Spring Batch project where I have to push data from a CSV file into a DB. I managed to implement the batch and the rest, and the data is currently being pushed as it should, but I wonder if there's any way to skip some of the columns in the CSV file, as some of them are irrelevant.

I did a bit of research but I wasn't able to find an answer, unless I missed something.

Sample of my code below.

<bean id="mysqlItemWriter"
      class="org.springframework.batch.item.database.JdbcBatchItemWriter">
    <property name="dataSource" ref="dataSource" />
    <property name="sql">
        <value>
            <![CDATA[
            insert into WEBREPORT.RAWREPORT(CLIENT,CLIENTUSER,GPS,EXTENSION) values (:client, :clientuser, :gps, :extension)
        ]]>
        </value>
    </property>
</bean>

2 Answers

1

You can implement your own FieldSetMapper, which maps the fields of each line to your POJO in the reader.

Let's say you have:

name, surname, email
Mike, Evans, [email protected]

And you have a Person model with only name and email; you are not interested in surname. Here is a reader example:

@Component
@StepScope
public class PersonReader extends FlatFileItemReader<Person> {

    @Override
    public void afterPropertiesSet() throws Exception {
        //load file in csvResource variable
        setResource(csvResource);
        setLineMapper(new DefaultLineMapper<Person>() {
            {
                setLineTokenizer(new DelimitedLineTokenizer());
                setFieldSetMapper(new PersonFieldSetMapper());
            }
        });
        super.afterPropertiesSet();
    }
}

And you can define PersonFieldSetMapper:

@Component
@JobScope
public class PersonFieldSetMapper implements FieldSetMapper<Person> {

    @Override
    public Person mapFieldSet(final FieldSet fieldSet) throws BindException
    {
        final Person person = new Person();
        person.setName(fieldSet.readString(0)); // columns are zero based
        person.setEmail(fieldSet.readString(2));

        return person;
    }
}

This is for skipping columns, which, if I understood correctly, is what you want. If you want to skip rows, that can be done as well; I explained how to skip blank lines, for example, in this question.
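
To make the index-based mapping above concrete without a Spring context, here is a minimal plain-Java sketch of the same idea: split a line into columns and read only positions 0 and 2. The `Person` class, `ColumnSkippingSketch`, and the sample address `mike@example.com` are hypothetical stand-ins, not part of the original post, and the naive comma split is an assumption that ignores quoted fields.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal Person model: only the two columns we care about.
final class Person {
    final String name;
    final String email;

    Person(String name, String email) {
        this.name = name;
        this.email = email;
    }
}

final class ColumnSkippingSketch {

    // Splits one CSV line on commas and keeps columns 0 and 2, ignoring column 1.
    // Naive split: does not handle quoted fields that contain commas.
    static Person mapLine(String line) {
        String[] columns = line.split(",", -1);
        return new Person(columns[0].trim(), columns[2].trim());
    }

    // Maps every data line; the loop starts at index 1 to skip the header row.
    static List<Person> mapAll(List<String> lines) {
        List<Person> people = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) {
            people.add(mapLine(lines.get(i)));
        }
        return people;
    }
}
```

In Spring Batch itself, DelimitedLineTokenizer also has a setIncludedFields(int...) setter that drops unwanted columns at tokenizing time, which may remove the need for a hand-written mapper in simple cases.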


1 Comment

@user2342259 did you mean to skip rows or skip columns? Was this answer what you were looking for?
0

If the check for the skip is simple and does not need a database round trip, you can use a simple ItemProcessor that returns null for skipped items.

Really simple pseudocode:

public class SkipProcessor implements ItemProcessor<Foo,Foo>{
    public Foo process(Foo foo) throws Exception {
        //check for a skip
        if(skip(foo)) {
          return null;
        } else {
          return foo;
        }
    }
}
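
As a rough, framework-free illustration of the pseudocode above, the sketch below imitates how null results are filtered out before writing. The `Processor` interface is a hypothetical stand-in for Spring Batch's ItemProcessor (the real `process` method declares `throws Exception`, omitted here for brevity), and `SkipProcessorSketch` and `FilterDemo` are made-up names for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for the ItemProcessor contract: return null to drop an item.
interface Processor<I, O> {
    O process(I item);
}

// Drops every item that matches the supplied skip condition.
final class SkipProcessorSketch<T> implements Processor<T, T> {
    private final Predicate<T> skipCondition;

    SkipProcessorSketch(Predicate<T> skipCondition) {
        this.skipCondition = skipCondition;
    }

    @Override
    public T process(T item) {
        return skipCondition.test(item) ? null : item;
    }
}

final class FilterDemo {
    // Imitates the step's handling of processor output: null results are not passed on.
    static <T> List<T> runThrough(Processor<T, T> processor, List<T> items) {
        List<T> toWrite = new ArrayList<>();
        for (T item : items) {
            T result = processor.process(item);
            if (result != null) {
                toWrite.add(result);
            }
        }
        return toWrite;
    }
}
```

Passing the condition in as a Predicate keeps the skip rule separate from the processor, so the same class can be reused for different checks.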

If the skip check is more complex and needs a database round trip, you can still use the item processor, but performance (if it matters) will suffer.

If performance is critical, well, then it depends on your setup, requirements, and possibilities. I would try it with two steps: the first step loads the CSV into the database (without any checks), and the second step reads the data from the database, with the skip check done by a clever SQL JOIN in the item reader's query.
