I am new to Spring Batch, and I'm running into an issue when using multiple data sources in my batch job.

Let me explain.

I am using 2 databases in my server with Spring Boot.

So far everything has worked fine with my implementation of RoutingDataSource:

@Component("dataSource")
public class RoutingDataSource extends AbstractRoutingDataSource {

  @Autowired
  @Qualifier("datasourceA")
  DataSource datasourceA;

  @Autowired
  @Qualifier("datasourceB")
  DataSource datasourceB;

  @PostConstruct
  public void init() {
    setDefaultTargetDataSource(datasourceA);
    final Map<Object, Object> map = new HashMap<>();
    map.put(Database.A, datasourceA);
    map.put(Database.B, datasourceB);
    setTargetDataSources(map);
  }

  
  @Override
  protected Object determineCurrentLookupKey() {
    return DatabaseContextHolder.getDatabase();
  }
}

The implementation requires a DatabaseContextHolder; here it is:

public class DatabaseContextHolder {
    private static final ThreadLocal<Database> contextHolder = new ThreadLocal<>();

    public static void setDatabase(final Database dbConnection) {
        contextHolder.set(dbConnection);
    }

    public static Database getDatabase() {
        return contextHolder.get();
    }
}

When I receive a request on my server, a basic interceptor sets the current database based on some input in the request, using the method DatabaseContextHolder.setDatabase(db). Everything works fine with my actual controllers.

It gets more complicated when I try to run a job with one tasklet.

One of my controllers starts an async task like this:

@GetMapping("/batch")
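public void startBatch() throws Exception {
  // JobLauncher.run takes the Job bean itself (assumed here to be an injected field named job), not its name
  jobLauncher.run(job, new JobParameters());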
}

@EnableBatchProcessing
@Configuration
public class MyBatch extends DefaultBatchConfigurer {


  @Autowired private JobBuilderFactory jobs;

  @Autowired private StepBuilderFactory steps;

  @Autowired private MyTasklet tasklet;

  @Bean
  public Job job(Step step) {
    return jobs.get("myJob").start(step).build();
  }

  @Bean
  protected Step registeredDeliveryTask() {
    return steps.get("myTask").tasklet(tasklet).build();
  }

  /** Overriding the JobLauncher getter to make it asynchronous */
  @Override
  public JobLauncher getJobLauncher() {
    try {
      SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
      jobLauncher.setJobRepository(super.getJobRepository());
      jobLauncher.setTaskExecutor(new SimpleAsyncTaskExecutor());
      jobLauncher.afterPropertiesSet();
      return jobLauncher;
    } catch (Exception e) {
      throw new BatchConfigurationException(e);
    }
  }
}

And my Tasklet:

@Component
public class MyTasklet implements Tasklet {

  @Autowired
  private UserRepository repository;

  @Override
  public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {

    // Do stuff with the repository.

    return RepeatStatus.FINISHED;
  }
}

But the RoutingDataSource doesn't work, even if I set my context before starting the job. For example, if I set my database to B, the repository still works on database A. The default data source is always selected (because of the line setDefaultTargetDataSource(datasourceA);).
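To make the symptom concrete: AbstractRoutingDataSource uses its default target whenever determineCurrentLookupKey() returns null. The sketch below models only that fallback in plain Java. It is not Spring's class: the names RoutingFallbackDemo, resolveTarget, setDatabase and clear are hypothetical, and strings stand in for the real DataSource beans.

```java
import java.util.EnumMap;
import java.util.Map;

/** Simplified model of a routing lookup; it is not Spring's actual class. */
final class RoutingFallbackDemo {

    // Hypothetical stand-in for the Database enum from the question.
    enum Database { A, B }

    // Same role as DatabaseContextHolder: one value per thread.
    private static final ThreadLocal<Database> CONTEXT = new ThreadLocal<>();

    // Strings stand in for the real DataSource beans.
    private static final Map<Database, String> TARGETS = new EnumMap<>(Database.class);
    private static final String DEFAULT_TARGET = "datasourceA";

    static {
        TARGETS.put(Database.A, "datasourceA");
        TARGETS.put(Database.B, "datasourceB");
    }

    static void setDatabase(Database database) {
        CONTEXT.set(database);
    }

    static void clear() {
        CONTEXT.remove();
    }

    /** Resolves the target for the calling thread, falling back to the default when no key is set. */
    static String resolveTarget() {
        Database key = CONTEXT.get();
        if (key == null) {
            return DEFAULT_TARGET;
        }
        return TARGETS.getOrDefault(key, DEFAULT_TARGET);
    }

    public static void main(String[] args) {
        System.out.println(resolveTarget()); // datasourceA (no key set on this thread)
        setDatabase(Database.B);
        System.out.println(resolveTarget()); // datasourceB
        clear();
        System.out.println(resolveTarget()); // datasourceA
    }
}
```

So "always database A" is what one would expect to see whenever the thread asking for a connection has no value in the context holder at that moment.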

I also tried passing the database as a job parameter and setting it inside the tasklet, but I still got the same issue.

@GetMapping("/batch")
public void startBatch() throws Exception {
  Map<String, JobParameter> parameters = new HashMap<>();
  // Capture the database chosen by the interceptor on the request thread
  parameters.put("database", new JobParameter(DatabaseContextHolder.getDatabase().toString()));
  // JobLauncher.run takes the Job bean itself (assumed here to be an injected field named job), not its name
  jobLauncher.run(job, new JobParameters(parameters));
}

And in the tasklet:

  @Override
  public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {

    String database =
        chunkContext.getStepContext().getStepExecution().getJobParameters().getString("database");
    DatabaseContextHolder.setDatabase(Database.valueOf(database));
    // Do stuff with the repository.

    return RepeatStatus.FINISHED;
  }

I suspect the problem is that the database was set in a different thread, since my job is asynchronous, so the job cannot see the database that was set before launching it. But I couldn't find any solution so far.
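The thread hypothesis can be checked in isolation with plain Java. The sketch below shows that a value stored in a ThreadLocal on one thread is not visible on a freshly started thread unless it is captured and set again there. The class and method names are hypothetical; it says nothing about when Spring Batch or the repository actually obtains its connection.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

final class ThreadLocalVisibilityDemo {

    // Hypothetical stand-in for the Database enum from the question.
    enum Database { A, B }

    // Same role as DatabaseContextHolder: one value per thread.
    private static final ThreadLocal<Database> CONTEXT = new ThreadLocal<>();

    static void setDatabase(Database database) {
        CONTEXT.set(database);
    }

    static void clear() {
        CONTEXT.remove();
    }

    /** Reads the context from a new thread without passing anything along. */
    static Database readWithoutPropagation() {
        return runOnNewThread(CONTEXT::get);
    }

    /** Captures the caller's value and sets it again inside the worker thread. */
    static Database readWithPropagation() {
        Database captured = CONTEXT.get(); // read on the calling thread
        return runOnNewThread(() -> {
            CONTEXT.set(captured);         // set on the worker thread
            try {
                return CONTEXT.get();
            } finally {
                CONTEXT.remove();          // avoid leaking the value if the thread is reused
            }
        });
    }

    private static Database runOnNewThread(Supplier<Database> task) {
        AtomicReference<Database> seen = new AtomicReference<>();
        Thread worker = new Thread(() -> seen.set(task.get()));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        setDatabase(Database.B);
        System.out.println(readWithoutPropagation()); // null
        System.out.println(readWithPropagation());    // B
        clear();
    }
}
```

Passing the value as a job parameter and calling setDatabase inside the tasklet is the same capture-and-set idea. Since that attempt still ended up on database A, thread visibility alone may not be the whole story.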

Regards

1 Answer

Your routing datasource is being used for Spring Batch's meta-data, which means the job repository will interact with a different database depending on the thread processing the request. This is not needed for batch jobs. You need to configure Spring Batch to work with a fixed data source.
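One way this could look, as a sketch only: it assumes Spring Batch 4.x (where DefaultBatchConfigurer exposes setDataSource) and assumes that the meta-data tables live in datasourceA. The class name is hypothetical, and since there should be only one BatchConfigurer, the override would in practice go into the existing MyBatch class from the question rather than into a second configuration class.

```java
import javax.sql.DataSource;

import org.springframework.batch.core.configuration.annotation.DefaultBatchConfigurer;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing
public class FixedMetaDataBatchConfig extends DefaultBatchConfigurer {

    // Assumption: datasourceA holds the Spring Batch meta-data tables.
    // The qualifier pins the job repository to that one data source
    // instead of the routing bean named "dataSource".
    @Override
    @Autowired
    public void setDataSource(@Qualifier("datasourceA") DataSource dataSource) {
        super.setDataSource(dataSource);
    }
}
```

The routing data source stays available to the repositories used by the tasklet. Only the job repository is tied to a fixed database.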

4 Comments

Can we not have a case where there is a fixed destination data source but multiple source data sources? Spring Batch can use the destination for meta-data, while all the data sources remain part of the routing.
It's a matter of configuration. It is up to you to set the datasource you want to use for meta-data on the job repository.
Yes, that is correct. But I was wondering whether a setup that reads from multiple data sources within the same job is readily available.
I think it is a very common use case for Spring Batch (I may be wrong). The issue with a Tasklet is passing data to the reader (what if it is a big collection?). (Also, it was not my question, so I cannot accept the answer.)
