
I am trying to execute the following pipeline in Azure Data Factory with the following activities: [screenshot: pipeline activities]

The following additional activities are inside the ForEach activity: [screenshot: activities inside the ForEach]

My issue is that after the CSV files are successfully copied to the database and deleted from the input container, the pipeline triggers again (once per blob created / file uploaded), but by the time those subsequent runs start there are no files left in the input container.

So although everything works fine, the Delete activity throws an error because it cannot delete a non-existent file, which makes sense in this setup.

What can I do to avoid this error? Do I need to restructure the pipeline logic? If so, how?

  • How about setting the "wait on completion" flag in the Execute Pipeline activity?

3 Answers


I agree with @Nandan that setting concurrency can be a good option in your case.

But you can try the approach below as well.

I have observed that you are copying every CSV file in the list and deleting each file after the copy. With your current pipeline structure, the first trigger run copies and deletes every file in the source, so each subsequent trigger run only needs to copy the single CSV file that triggered it.

So, first copy and delete every existing CSV file in the source with a debug run of the pipeline in its current structure, and then change the pipeline structure for trigger runs as described below.

In the trigger, add a filter condition for CSV files, as shown below.

[screenshot: storage event trigger filtered to .csv blobs]

For the file path, use the trigger parameters such as @triggerBody().fileName, and store the trigger parameter in a pipeline parameter.
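As a rough sketch, such a storage event trigger could look like the JSON below. The trigger name, container, pipeline name, and the pipeline parameter fileName are placeholders I am assuming here; the scope placeholders must point to your own storage account.

```json
{
  "name": "CsvBlobCreatedTrigger",
  "properties": {
    "type": "BlobEventsTrigger",
    "typeProperties": {
      "blobPathBeginsWith": "/input/blobs/",
      "blobPathEndsWith": ".csv",
      "ignoreEmptyBlobs": true,
      "events": [ "Microsoft.Storage.BlobCreated" ],
      "scope": "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account>"
    },
    "pipelines": [
      {
        "pipelineReference": {
          "referenceName": "CopyCsvToSql",
          "type": "PipelineReference"
        },
        "parameters": {
          "fileName": "@triggerBody().fileName"
        }
      }
    ]
  }
}
```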

Use a dataset parameter for the file name of the source dataset (use this dataset as the source of the copy activity and in the delete activity).
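A minimal sketch of such a parameterized source dataset, assuming a DelimitedText dataset over an "input" container; the dataset and linked service names are placeholders, and the key part is the @dataset().fileName expression:

```json
{
  "name": "SourceCsvDataset",
  "properties": {
    "type": "DelimitedText",
    "linkedServiceName": {
      "referenceName": "AzureBlobStorageLinkedService",
      "type": "LinkedServiceReference"
    },
    "parameters": {
      "fileName": { "type": "string" }
    },
    "typeProperties": {
      "location": {
        "type": "AzureBlobStorageLocation",
        "container": "input",
        "fileName": {
          "value": "@dataset().fileName",
          "type": "Expression"
        }
      },
      "columnDelimiter": ",",
      "firstRowAsHeader": true
    }
  }
}
```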

  • First, add the copy activity and use this dataset as its source. In the copy activity source, pass the pipeline parameter to the dataset parameter for the file name. Go through this SO answer to understand how to use trigger parameters in a pipeline.
  • In the sink, give your target SQL table name (you can use a dataset parameter for the table name and pass the same pipeline parameter value to it in the copy activity sink) and enable Auto create table.
  • After the copy activity, add the delete activity and give it the same dataset. This deletes exactly the CSV file that triggered the run; a sketch of both activities follows below.
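A sketch of the two activities as an excerpt of the pipeline's activities array, assuming the SourceCsvDataset above, a sink dataset named SqlSinkDataset, and a pipeline parameter fileName (all placeholder names); the pipeline parameter is passed down to the dataset parameter in both activities:

```json
[
  {
    "name": "CopyCsvToTable",
    "type": "Copy",
    "inputs": [
      {
        "referenceName": "SourceCsvDataset",
        "type": "DatasetReference",
        "parameters": { "fileName": "@pipeline().parameters.fileName" }
      }
    ],
    "outputs": [
      { "referenceName": "SqlSinkDataset", "type": "DatasetReference" }
    ],
    "typeProperties": {
      "source": {
        "type": "DelimitedTextSource",
        "storeSettings": { "type": "AzureBlobStorageReadSettings", "recursive": false },
        "formatSettings": { "type": "DelimitedTextReadSettings" }
      },
      "sink": { "type": "AzureSqlSink", "tableOption": "autoCreate" }
    }
  },
  {
    "name": "DeleteTriggeredCsv",
    "type": "Delete",
    "dependsOn": [
      { "activity": "CopyCsvToTable", "dependencyConditions": [ "Succeeded" ] }
    ],
    "typeProperties": {
      "dataset": {
        "referenceName": "SourceCsvDataset",
        "type": "DatasetReference",
        "parameters": { "fileName": "@pipeline().parameters.fileName" }
      },
      "storeSettings": { "type": "AzureBlobStorageReadSettings", "recursive": false },
      "enableLogging": false
    }
  }
]
```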

This approach triggers the pipeline once per CSV file, copies that file to the target table, deletes it after the copy activity, and is not affected by concurrent trigger runs.




Verify that in the ForEach Items setting you have passed the output of the Filter activity, not the output of the Get Metadata activity.

If that setting is correct, check the input to the Delete activity for the failed iteration.
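For reference, a minimal sketch of a ForEach whose Items expression references the Filter activity's output (activity names such as 'FilterCsvFiles' are placeholders, and the child activities are omitted). The raw Get Metadata output would instead be referenced as @activity('Get Metadata1').output.childItems, which is what to avoid here.

```json
{
  "name": "ForEachCsvFile",
  "type": "ForEach",
  "typeProperties": {
    "isSequential": true,
    "items": {
      "value": "@activity('FilterCsvFiles').output.Value",
      "type": "Expression"
    }
  }
}
```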



Can you please confirm whether the concurrency property of your pipeline is set to 1: [screenshot: pipeline concurrency setting]

If that is set, the pipeline runs execute sequentially: in the first run all CSV files are processed and deleted, and in the subsequent trigger runs there are no CSV files left to iterate over, so the ForEach is never entered.
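For illustration, the same setting expressed in the pipeline JSON is the concurrency property at the pipeline level (the pipeline name is a placeholder and the activities are omitted):

```json
{
  "name": "CopyCsvFilesPipeline",
  "properties": {
    "concurrency": 1,
    "activities": []
  }
}
```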

My blog post https://datasharkx.wordpress.com/2022/09/24/event-trigger-of-synapse-azure-data-factory-pipeline-on-arrival-of-nth-file-in-azure-blob-storage/ covers much the same aspects.

