I created a state machine to run some Glue ETL jobs in parallel. I'm experimenting with the Map state to take advantage of dynamic parallelism. Here is the state machine definition:

{
 "StartAt": "Map",
 "States": {
   "Map": {
     "Type": "Map",
     "InputPath": "$.data",
     "ItemsPath": "$.array",
     "MaxConcurrency": 2,
     "Iterator": {
       "StartAt": "glue Job",
       "States": {
         "glue Job": {
           "Type": "Task",
           "Resource": "arn:aws:states:::glue:startJobRun.sync",
           "End": true,
           "Parameters": {
             "JobName": "glue-etl-job",
             "Arguments": {
               "--db": "db-dev",
               "--file": "$.file",
               "--bucket": "$.bucket"
             }
           }
         }
       }
     },
     "Catch": [
       {
         "ErrorEquals": [
           "States.ALL"
         ],
         "Next": "NotifyError"
       }
     ],
     "Next": "NotifySuccess"
   },

 }
}

The input passed to the state machine looks like this:

{
 "data": {
   "array": [
     {"file": "path-to-file1", "bucket": "bucket-name1"},
     {"file": "path-to-file2", "bucket": "bucket-name2"}
   ]
 }
}

The problem is that the file and bucket job arguments are not resolved: they are passed to the Glue job as the literal strings $.file and $.bucket. How can I pass the actual values from the input?

1 Answer

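You need to append '.$' to the parameter name when its value is a path into the state input. Without that suffix, the value is treated as a literal string, which is why the Glue job receives $.file and $.bucket unchanged.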

"--file.$": "$.file",
"--bucket.$": "$.bucket"

For a complete guide, check out the Amazon States Language spec: https://states-language.net/spec.html#parameters
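
Applied to the task state from the question, the change might look like the sketch below. It assumes the Map state keeps its default behavior of passing each element of $.data.array to the iterator as that iteration's input, so $.file and $.bucket are read from the individual array item. Only the two argument keys differ from the original definition; "--db" stays a plain key because its value is a literal.

```json
{
  "glue Job": {
    "Type": "Task",
    "Resource": "arn:aws:states:::glue:startJobRun.sync",
    "Parameters": {
      "JobName": "glue-etl-job",
      "Arguments": {
        "--db": "db-dev",
        "--file.$": "$.file",
        "--bucket.$": "$.bucket"
      }
    },
    "End": true
  }
}
```

The '.$' suffix is removed when the parameters are resolved, so the job should still receive its arguments under the names --file and --bucket.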
