
I am using the PutHBaseJSON processor to fetch data from an HDFS location and put it into HBase. The data in the HDFS location is in the format below, all in a single file.

{"EMPID": "17", "EMPNAME": "b17", "DEPTID": "DNA"}            
{"EMPID": "18", "EMPNAME": "b18", "DEPTID": "DNA"}
{"EMPID": "19", "EMPNAME": "b19", "DEPTID": "DNA"}

When I execute the PutHBaseJSON processor, it only fetches the first row and puts it into the HBase table I created. Is it possible to fetch all the rows present in that file using this processor? Or how can I get all the records from the single file into HBase?

2 Answers


PutHBaseJSON takes a single JSON document as input. After fetching from HDFS, you should be able to use the SplitText processor with a line count of 1 to get each of your JSON documents into a single flow file.

If you have millions of JSON records in a single HDFS file, then you should perform a two-phase split: the first SplitText should split with a line count of, say, 10,000, and then a second SplitText should split those down to 1 line each.
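A rough standalone sketch (plain Python, not NiFi code) of what the two-phase split does to the file content; the `split_lines` helper and the small line counts are illustrative stand-ins for the SplitText processors and their Line Split Count settings:

```python
import json

# Stand-in for the HDFS file content: newline-delimited JSON documents.
ndjson = (
    '{"EMPID": "17", "EMPNAME": "b17", "DEPTID": "DNA"}\n'
    '{"EMPID": "18", "EMPNAME": "b18", "DEPTID": "DNA"}\n'
    '{"EMPID": "19", "EMPNAME": "b19", "DEPTID": "DNA"}\n'
)

def split_lines(text, line_count):
    """Mimic SplitText: group non-empty lines into chunks of line_count lines."""
    lines = [l for l in text.splitlines() if l.strip()]
    return ["\n".join(lines[i:i + line_count])
            for i in range(0, len(lines), line_count)]

# Phase 1: large chunks (10,000 lines each in the real flow; 2 here for demo).
chunks = split_lines(ndjson, 2)

# Phase 2: one JSON document per flow file, ready for PutHBaseJSON.
records = [doc for chunk in chunks for doc in split_lines(chunk, 1)]

for doc in records:
    row = json.loads(doc)  # each resulting flow file is now valid standalone JSON
    print(row["EMPID"], row["EMPNAME"])
```

Each entry in `records` corresponds to one flow file that PutHBaseJSON can consume on its own.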




You can use the SplitJson processor to split the content into individual records; they will then be sent serially to PutHBaseJSON.
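A plain-Python sketch of the splitting behavior this answer relies on. Note the assumption here: SplitJson evaluates a JsonPath expression (e.g. `$[*]`) against well-formed JSON, so the records would first need to be wrapped in a JSON array rather than left as bare newline-delimited objects:

```python
import json

# Assumed input: the employee records wrapped into a valid JSON array,
# since SplitJson needs well-formed JSON to evaluate its JsonPath against.
array_json = '''[
  {"EMPID": "17", "EMPNAME": "b17", "DEPTID": "DNA"},
  {"EMPID": "18", "EMPNAME": "b18", "DEPTID": "DNA"},
  {"EMPID": "19", "EMPNAME": "b19", "DEPTID": "DNA"}
]'''

# Equivalent of SplitJson with JsonPath "$[*]": each array element becomes
# its own flow file, serialized back to JSON for PutHBaseJSON downstream.
flow_files = [json.dumps(rec) for rec in json.loads(array_json)]

for ff in flow_files:
    print(ff)
```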
