I have two CSV files that are funnelled into a MergeContent processor, and I want them merged into one. Both have the same columns. If the first and second CSVs look like this:

First CSV:

id, name
12,John
11,Keels

Second CSV:

id, name
22,Kelly
25,Felder

My output should look like this:

id, name
12,John
11,Keels
22,Kelly
25,Felder

I have tried doing this with the MergeContent processor, but it changes the data into a different format, which I don't want. Both the input files and the output file must be .csv, and the output must have the same name as the input files. (The input files share the same name.)

1 Answer

Use the MergeRecord processor with a common attribute as the correlation key. For example, if both flow files have the same attribute, such as filename = test.csv, you can configure MergeRecord as follows:

Record Reader                      CSVReader
Record Writer                      CSVRecordSetWriter
Merge Strategy                     Bin-Packing Algorithm
Correlation Attribute Name         filename
Attribute Strategy                 Keep Only Common Attributes
Minimum Number of Records          3

The important setting is Minimum Number of Records, which is the minimum number of rows a bin must hold before it is merged. In this case it should be larger than 2, because each CSV has 2 rows. The first CSV's records then wait in the bin until the second CSV arrives and the row count reaches the minimum.
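To make the waiting behaviour concrete, here is a minimal Python sketch of the idea. It is a toy model under stated assumptions, not NiFi's implementation: flow files are represented as (attributes, content) pairs, and the function name `merge_by_attribute` is hypothetical. Rows are grouped by the `filename` attribute and a merged CSV is only emitted once the group holds at least the minimum number of records.

```python
import csv
import io
from collections import defaultdict

MIN_RECORDS = 3  # mirrors "Minimum Number of Records" in the settings above


def merge_by_attribute(flowfiles, min_records=MIN_RECORDS):
    """Toy model (hypothetical helper, not NiFi code).

    Groups CSV flow files by their 'filename' attribute and emits one
    merged CSV per group once it holds at least min_records data rows.
    """
    bins = defaultdict(list)  # filename -> accumulated data rows
    headers = {}              # filename -> header row of the first file seen
    merged = {}               # filename -> merged CSV text

    for attributes, content in flowfiles:
        key = attributes["filename"]
        rows = list(csv.reader(io.StringIO(content)))
        if not rows:
            continue  # nothing to merge from an empty file
        headers.setdefault(key, rows[0])
        bins[key].extend(rows[1:])

        # The bin is released only when the minimum is reached.
        if len(bins[key]) >= min_records:
            out = io.StringIO()
            writer = csv.writer(out, lineterminator="\n")
            writer.writerow(headers[key])
            writer.writerows(bins.pop(key))
            merged[key] = out.getvalue()

    return merged


first = ({"filename": "test.csv"}, "id, name\n12,John\n11,Keels\n")
second = ({"filename": "test.csv"}, "id, name\n22,Kelly\n25,Felder\n")
result = merge_by_attribute([first, second])
```

With only `first` as input nothing is emitted, because 2 rows are below the minimum of 3. With both inputs, the result holds a single `test.csv` entry with one header line followed by the four data rows.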

14 Comments

I get an error saying that a file with the same name already exists when I try to use a PutFile processor.
If you input several files, you have to make sure the output filenames are not duplicated. That is not a problem with merging but with how the filename is defined. You could set the filename attribute with a timestamp, for example `${filename:append(${now():format("yyyy-MM-dd_HH:mm:ss", "GMT")}):append('.csv')}`.
I have a CSV file which I split into two flow files and process independently. I then send both to the MergeRecord processor, so both of them have the same filename. I am confused as to why it complains about a file with the same name, since there should be only one file at the end.
Oh, I see. Why don't you set Correlation Attribute Name to filename? And since you split the records from a single file, it is better to use the Defragment strategy.
What exactly is Correlation Attribute Name? Also, when I changed the MergeRecord processor's strategy to Defragment, I got an error saying `Could not merge bin with 1 flowfiles because the fragment.count attribute was not present on any of the flowfiles`.
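The error message quoted above indicates that the Defragment strategy expects a `fragment.count` attribute on the incoming flow files. NiFi's split processors normally write `fragment.identifier`, `fragment.index` and `fragment.count`, so flow files split by other means may lack them. The Python sketch below is only an illustration of that kind of check, assuming each flow file is represented by a plain attribute dictionary. The function name `can_defragment` and the reason strings are hypothetical and are not NiFi's own code or messages.

```python
def can_defragment(bin_attributes):
    """Illustrative check (hypothetical helper, not NiFi code).

    bin_attributes is a list of attribute dicts, one per flow file in a bin.
    Returns (ok, reason).
    """
    counts = {
        attrs["fragment.count"]
        for attrs in bin_attributes
        if "fragment.count" in attrs
    }
    if not counts:
        return False, "fragment.count not present on any flow file"
    if len(counts) > 1:
        return False, "flow files disagree on fragment.count"

    expected = int(next(iter(counts)))
    if len(bin_attributes) != expected:
        return False, f"expected {expected} fragments, got {len(bin_attributes)}"
    return True, "ok"


# A flow file with no fragment attributes, as in the error above.
print(can_defragment([{"filename": "test.csv"}]))

# Two fragments of one original file, each tagged with the split attributes.
fragments = [
    {"fragment.identifier": "abc", "fragment.index": "0", "fragment.count": "2"},
    {"fragment.identifier": "abc", "fragment.index": "1", "fragment.count": "2"},
]
print(can_defragment(fragments))
```

If the two flow files are produced without those attributes, either set them before the merge or stay with the Bin-Packing Algorithm and a correlation attribute.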