0

I'm running a couple of PowerShell scripts on Windows Azure VMs as part of our DevOps pipelines, using az vm run-command create. Sometimes the commands freeze and don't complete in a reasonable time. When this happens, I can no longer execute those commands, and the freeze starts to block the DevOps pipeline runs. If I then try to delete the run-command with az vm run-command delete, that freezes as well.

Example of run-command execution (and deletion) as part of the DevOps pipeline:

# Execute the script on VM with run-command
az vm run-command create \
    --name RecreateBinariesFolder$(Build.BuildId) \
    --vm-name ${{ parameters.vmName }} \
    --resource-group my-group-${{ parameters.environment }} \
    --timeout-in-seconds 600 \
    --script "if(Test-Path C:\\temp\\packages\\){ Remove-Item -Path c:\\temp\\packages -Force -Recurse } ; if(!(Test-Path C:\\temp\\packages\\)){ mkdir c:\\temp\\packages }"
# Delete the run-command from VM
az vm run-command delete \
    --name RecreateBinariesFolder$(Build.BuildId) \
    --vm-name ${{ parameters.vmName }} \
    --resource-group my-group-${{ parameters.environment }} \
    --yes

It usually works well on a fresh VM, but executions start to freeze after a while in later pipeline runs.

Is there a way to execute run-commands more reliably, or is there another convenient and more stable way to run commands on Windows VMs?


Update 2024-01-23: After re-creating the VMs and running the delete command with the --no-wait option, we haven't seen the freezing again in tens of pipeline runs. Most likely there was some unexpected issue with the VM itself or with the CustomScriptExtension that runs the commands on the VM.
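
The approach in this update could be sketched roughly as follows. This is only an illustration, assuming GNU coreutils `timeout` is available on the agent. `run_bounded` and `delete_run_command` are hypothetical helper names, not part of the Azure CLI, and the 120-second limit is an arbitrary assumption.

```shell
# Hypothetical helper: run any command, but give up after LIMIT seconds
# so that a hung call fails the step instead of blocking the pipeline.
# Assumes GNU coreutils `timeout` is installed on the agent.
run_bounded() {
  local limit="$1"
  shift
  timeout "$limit" "$@"
}

# Hypothetical helper: delete a Run Command without waiting for the
# long-running operation to finish (--no-wait), bounded to 120 seconds
# (an arbitrary value chosen for this sketch).
delete_run_command() {
  local name="$1" vm="$2" rg="$3"
  run_bounded 120 az vm run-command delete \
    --name "$name" --vm-name "$vm" --resource-group "$rg" \
    --yes --no-wait
}
```

With GNU `timeout`, a call that exceeds the limit returns exit status 124, which the pipeline step can treat as a failure instead of hanging indefinitely.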


Update 2024-03-12: We also started deleting all existing Run Commands from all the VMs in the selected resource group before running any new ones. This has helped in some cases, as pointed out in the selected answer.

RG_NAME=my-resource-group
# List only the VMs in the selected resource group
VM_NAMES=$(az vm list --resource-group "$RG_NAME" -o json | jq -r '.[].name')
for VM_NAME in $VM_NAMES; do
  # Strip stray CR/LF characters from the name
  VM_NAME=$(echo "$VM_NAME" | tr -d '\r\n')
  RUN_COMMANDS=$(az vm run-command list --vm-name "$VM_NAME" --resource-group "$RG_NAME" -o json)
  for CMD_NAME in $(echo "$RUN_COMMANDS" | jq -r '.[].name'); do
    CMD_NAME=$(echo "$CMD_NAME" | tr -d '\r\n')
    # Delete without waiting for the operation to complete
    az vm run-command delete --name "$CMD_NAME" --vm-name "$VM_NAME" --resource-group "$RG_NAME" --yes --no-wait
  done
done
5 Comments
  • 1. Are you using Microsoft-hosted agent or a self-hosted agent? 2. If you do not use the DevOps pipeline, what will be the result of running these commands multiple times on the same VM directly in the Azure portal? Will it also freeze? 3. Please add variable system.debug and set the value to true. Then trigger the pipeline and check if there is any useful info in the debug log. Commented Jan 17, 2024 at 3:24
  • 1. I'm running the pipeline from self-hosted agent as I want to secure the network so that only the agent VM can connect to the VMs. 2. When it started to freeze yes, then it didn't even work from the Portal. When I today re-created the VMs and started running the run-commands in similar way but also adding --no-wait option for the delete now I haven't seen freezing for quite many runs. Commented Jan 17, 2024 at 17:11
  • I have a hunch that it might have been an issue with the Microsoft.Compute.CustomScriptExtension that handles run-commands on the server (C:\Packages\Plugins\Microsoft.Compute.CustomScriptExtension\..). I did see .NET-related exceptions from CustomScriptExtension in the Windows Event Log. Unfortunately I didn't get a screenshot before re-creating the VMs. If I see those again, I'll report them to Microsoft. Commented Jan 17, 2024 at 17:11
  • "When I today re-created the VMs and started running the run-commands in similar way ...now I haven't seen freezing for quite many runs." - Do you mean that the pipeline works fine on your new VM? To narrow down the issue, you can create a new VM, run the same commands multiple times on it directly in the Azure portal, and check whether the VM freezes. If it still freezes, the issue may be related to the VM; otherwise, it may be related to the DevOps self-hosted agent. Commented Jan 22, 2024 at 8:44
  • @ZiyangLiu-MSFT yes, the pipeline has now worked multiple times with the same VMs. I have also re-created the VMs a couple of times, with no freezes so far after tens of pipeline runs. So most likely the issue was on the VM and/or the CustomScriptExtension app on the server. Thanks for all the suggestions and help! Commented Jan 23, 2024 at 10:11

2 Answers

1

Solution: After re-creating the VMs and running the delete command with the --no-wait option, the asker no longer saw the freezing.

If you have the same issue, you can try the following steps to narrow it down.

  1. Use a Microsoft-hosted agent to run the pipeline. If the issue also occurs there, it is probably not specific to your self-hosted agent and may be related to the VM or your scripts.
  2. Run the same commands multiple times on the affected VM directly in the Azure portal.
  3. Create a new VM and run the same commands multiple times on it directly in the Azure portal. If the old VM freezes but the new one works, the issue may be related to the old VM itself. If both the old and the new VM work in the portal but freeze when used from the pipeline, the issue may be related to the agent.
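
When narrowing the issue down, it may also help to look at the state Azure reports for a specific Run Command. A minimal sketch, assuming the CLI is already logged in. `run_command_state` is a hypothetical helper name, and the `instanceView.executionState` query path is an assumption about the shape of the JSON returned with `--instance-view`.

```shell
# Hypothetical helper: print the execution state that Azure reports
# for a named Run Command on a VM.
# The JMESPath query below assumes the instance view exposes an
# `executionState` field; adjust it if your CLI output differs.
run_command_state() {
  local name="$1" vm="$2" rg="$3"
  az vm run-command show \
    --name "$name" --vm-name "$vm" --resource-group "$rg" \
    --instance-view \
    --query "instanceView.executionState" -o tsv
}
```

For example, `run_command_state RecreateBinariesFolder123 my-vm my-group-dev` (placeholder names) prints the state for that command, which can show whether it is still reported as running long after the script should have finished.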

0

If your requirement is to run a PowerShell script, why not simply use the built-in RunPowerShellScript command, as described in the third example of the az vm run-command invoke documentation?

RunPowerShellScript is one of the available commands for Windows VMs, as described in "Run scripts in your Windows VM by using action Run Commands".

This could be the command in your case:

az vm run-command invoke \
  --command-id RunPowerShellScript \
  --name ${{ parameters.vmName }} \
  -g my-group-${{ parameters.environment }} \
  --scripts 'if(Test-Path C:\\temp\\packages\\){ Remove-Item -Path c:\\temp\\packages -Force -Recurse } ; if(!(Test-Path C:\\temp\\packages\\)){ mkdir c:\\temp\\packages }'

This way you don't need to create a new command and remove it afterwards.

In addition, your code creates a new command but never invokes it. Did you leave out the code that invokes it?

It seems that the maximum number of commands that can be created on a VM is 25 (see docs), so if you try to create more, it will probably stop working.
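
If the limit of 25 mentioned above applies, a pipeline could check how many Run Commands already exist on the VM before creating a new one. This is a rough sketch only. `count_run_commands` and `has_run_command_capacity` are hypothetical helper names, and the default limit of 25 is taken from the statement above, not verified here.

```shell
# Hypothetical helper: print how many Run Commands exist on a VM.
count_run_commands() {
  local vm="$1" rg="$2"
  az vm run-command list --vm-name "$vm" --resource-group "$rg" \
    --query "length(@)" -o tsv
}

# Hypothetical helper: succeed (exit 0) only while the VM holds fewer
# Run Commands than the limit (assumed to be 25 unless overridden).
has_run_command_capacity() {
  local vm="$1" rg="$2" limit="${3:-25}"
  local count
  count=$(count_run_commands "$vm" "$rg") || return 1
  [ "$count" -lt "$limit" ]
}
```

A pipeline step could then run something like `has_run_command_capacity my-vm my-group-dev || echo "clean up old Run Commands first"` (placeholder names) before calling az vm run-command create.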

3 Comments

I did use invoke with RunPowerShellScript first, but that would also freeze randomly after some runs.
The docs for create vs. invoke are really bad. The Azure CLI docs mention invoke, and based on that you might expect that you should first create and then invoke. However, the Windows Run Command docs say that create also executes the command, and they don't even mention invoke.
And in some use cases I needed to run those commands as a different user, so I used create, as it allows the --run-as-user and --run-as-password options to run the script as a given local user.
