2

I have a table of 20GB data having 50 million rows. Need to migrate to ElasticSearch using logstash jdbc input plugin. I have tried all basic implementation but need help in migrating data in batch i.e only 10,000 rows at a time. I am not sure how and where to specify this count and how to update it the next time i run logstash. Please help me solve this issue

This is what i have:

input {
    jdbc {
        jdbc_driver_library => "mysql-connector-java-5.1.12-bin.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        jdbc_connection_string => "jdbc:mysql://localhost/db"
        jdbc_validate_connection => true
        jdbc_user => "root"
        jdbc_password => "root"
        clean_run => true
        record_last_run => true
        use_column_value => true
        jdbc_paging_enabled => true
        jdbc_page_size => 5
        tracking_column => id
        statement => "select * from employee"
    }
}

Thanks in advance.

7
  • 1
    Does this help? elastic.co/guide/en/logstash/current/… Commented May 13, 2016 at 14:45
  • Thank you Val, can you please give me some example how to implement it. Commented May 16, 2016 at 5:02
  • There's an excellent blog post that provides an example from A to Z: elastic.co/blog/logstash-jdbc-input-plugin Commented May 16, 2016 at 5:03
  • 1
    Yes, i went through this blog, but it does not satisfy my requirement. I can limit the query to fetch 10,000 records at a time but how will it know the next time to fetch 10,000 records starting from 10,001. So this is some parameter which needs to be passed to the plugin but not sure how to pass this parameter externally. ex : statement => "select * from employee where id > :sql_last_value limit 10" takes only 10 records the first time and stores id of 10th row in the last run record. But next time when i run it, it again fetches the same rows but not the 11-20th rows Commented May 16, 2016 at 6:01
  • 1
    You can set jdbc_paging_enabled to true in order to achieve what you need. Commented May 16, 2016 at 11:06

1 Answer 1

5

You need to set jdbc_paging_enabled to true in order for pagniation to work.

But you also need to make sure that clean_run is set to false, otherwise pagination won't work.

Sign up to request clarification or add additional context in comments.

4 Comments

thank you so much for your guidance, i could achieve what i wanted.
@Val How does pagination behave if aggregate is used ?, that is, if it just cuts a page with a certain id and on the next page some identical id remained, the aggregate will clean and the data will be lost or added to the existing one?
@Max please create a new question with your exact needs
@Val Here's a similar unanswered question, so I didn't want to duplicate the question. :) stackoverflow.com/questions/61032373/…

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.