Role Assignment Error with UAA Constraints

Introduction

Recently I ran into one of those infrastructure failures that seems to appear out of nowhere. A deployment pipeline that had been working for months suddenly failed with an authorization error on a role assignment. The error looked like a permissions problem on the Service Connection. The initial fix, removing the constraint from the Azure DevOps Service Connection’s User Access Administrator role assignment, appeared to resolve it. But removing a security control is not a permanent fix.

It took a while to track down, but the root cause was a combination of two things: a stricter ARM template validation change that rolled out in December 2025, and a silent bug in the RBAC assignment module that had gone undetected because the old validation didn’t catch it.

The Error

The error we were seeing in the pipeline looked like this:

The template deployment failed with error: 'Authorization failed for template resource
<GUID> of type Microsoft.Authorization/roleAssignment. The client <GUID> with object id
<GUID> does not have permissions to perform action
'Microsoft.Authorization/roleAssignment/write' at scope
/subscriptions/<GUID>/resourceGroups/<RGNAME>/providers/Microsoft.Authorization/roleAssignment/<GUID>'.

On the surface this reads as a permissions problem. The service principal does not have enough access to create the role assignment. Open up the IAM blade, check the role assignments, and everything looks correct. The Managed Identity behind the Service Connection has User Access Administrator (UAA) scoped appropriately. The pipeline has been running this same deployment for months if not years.

UAA with Conditions

In Azure, when you assign UAA to a service principal, you can add conditions that constrain what that administrator is actually allowed to do. Common constraints include limiting which roles can be delegated and restricting what principal types can be targeted.

This is good practice. UAA without constraints is a highly privileged role and constraining it reduces the blast radius if the service principal is ever compromised.
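
As a rough illustration of what a constrained assignment can look like in Bicep, the sketch below grants UAA at resource group scope but only lets the principal assign the Reader role, and only to service principals. The parameter name pipelinePrincipalId and the choice of Reader as the delegable role are assumptions made for this example, not our actual configuration. The condition string follows the format Azure documents for delegating role assignment management and covers only the write action, so check it against the current documentation before relying on it.

```bicep
// Sketch only: a conditioned User Access Administrator assignment
// at resource group scope. Names and the delegable role are assumptions.
targetScope = 'resourceGroup'

@description('Hypothetical parameter: object ID of the identity behind the Service Connection.')
param pipelinePrincipalId string

// Built-in User Access Administrator role GUID.
var uaaRoleGuid = '18d7d88d-d35e-4fb5-a5c3-7773c20a72d9'

// Multi-line strings avoid escaping the single quotes inside the condition.
// Assumption: only Reader (acdd72a7-...) may be assigned, and only to service principals.
var delegationCondition = '''((!(ActionMatches{'Microsoft.Authorization/roleAssignments/write'})) OR (@Request[Microsoft.Authorization/roleAssignments:RoleDefinitionId] ForAnyOfAnyValues:GuidEquals {acdd72a7-3385-48ef-bd42-f606fba81ae7} AND @Request[Microsoft.Authorization/roleAssignments:PrincipalType] ForAnyOfAnyValues:StringEqualsIgnoreCase {'ServicePrincipal'}))'''

resource uaaAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  name: guid(resourceGroup().id, pipelinePrincipalId, uaaRoleGuid)
  properties: {
    principalType: 'ServicePrincipal'
    principalId: pipelinePrincipalId
    roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', uaaRoleGuid)
    condition: delegationCondition
    conditionVersion: '2.0'
  }
}
```

A production condition would normally add a matching clause for the roleAssignments/delete action as well; it is left out here to keep the sketch short.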

When we removed the conditions from the UAA assignment, the deployment started working again. This pointed firmly at the authorization layer and at a change in the ARM / Bicep APIs. Nothing had changed in our codebase, so we felt it had to be a Microsoft issue.

The Actual Bug

After digging into the module code, we found a typo that had been sitting there, quietly, for a long time:

resource roleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  // Deterministic name derived from the target, the principal, and the Reader role GUID
  name: guid(targetResource.id, principalId, 'acdd72a7-3385-48ef-bd42-f606fba81ae7')
  scope: targetResource
  properties: {
    principalType: 'ServicePrincipal'
    principalId: principalId
    // The bug was here - roleAssignments instead of roleDefinitions
    roleDefinitionId: resourceId('Microsoft.Authorization/roleAssignments', 'acdd72a7-3385-48ef-bd42-f606fba81ae7')
  }
}

Corrected resource block:

resource roleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  // Deterministic name derived from the target, the principal, and the Reader role GUID
  name: guid(targetResource.id, principalId, 'acdd72a7-3385-48ef-bd42-f606fba81ae7')
  scope: targetResource
  properties: {
    principalType: 'ServicePrincipal'
    principalId: principalId
    // Fixed: the role definition ID now uses the roleDefinitions resource type
    roleDefinitionId: resourceId('Microsoft.Authorization/roleDefinitions', 'acdd72a7-3385-48ef-bd42-f606fba81ae7')
  }
}

Why It Worked Before December

The obvious question is why this module worked without issue for months. The answer lies in how ARM was validating role definition IDs prior to the December 2025 change.

ARM was previously resolving role definitions by examining only the last segment of the resource ID, the role GUID. Given that resourceId('Microsoft.Authorization/roleAssignments', 'acdd72a7-3385-48ef-bd42-f606fba81ae7') still produces a resource ID ending in acdd72a7-3385-48ef-bd42-f606fba81ae7, ARM could extract the GUID and find the correct role definition. The wrong resource type in the middle of the path was effectively ignored.
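
To make the difference concrete, here is a small sketch that builds both IDs and compares them two ways. It only illustrates the idea of last-segment matching versus full-ID matching using Bicep's own string functions; it does not reproduce ARM's internal resolution logic, and the variable and output names are made up for the example.

```bicep
// Sketch only: both IDs end in the same GUID but differ in resource type.
var roleGuid = 'acdd72a7-3385-48ef-bd42-f606fba81ae7' // built-in Reader role

var wrongTypeId = resourceId('Microsoft.Authorization/roleAssignments', roleGuid)
var rightTypeId = resourceId('Microsoft.Authorization/roleDefinitions', roleGuid)

output wrongTypeIdValue string = wrongTypeId
output rightTypeIdValue string = rightTypeId

// true: a check that looks only at the last path segment cannot tell the two apart.
output sameTrailingGuid bool = last(split(wrongTypeId, '/')) == last(split(rightTypeId, '/'))

// false: a check that compares the full ID can.
output sameFullId bool = wrongTypeId == rightTypeId
```

Deploying this to any resource group and inspecting the outputs shows the two IDs side by side, with the wrong type sitting in the middle of the path.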

Around the beginning of December 2025, Microsoft rolled out stricter ARM validation. Instead of extracting only the last segment, ARM began validating the full resource ID. A path containing Microsoft.Authorization/roleAssignments as the type no longer resolves to a valid role definition, because the type is wrong. This is the same issue reported in Bicep GitHub issue #18696, where multiple teams hit the same breaking validation change around the same time.

Why the Conditions Made the Error Misleading

Here’s where the conditions complicated the diagnosis. Without conditions on the UAA assignment, the stricter ARM validation did surface the validation error. With conditions in place, evaluation took a different path: the system attempted to evaluate the condition before running the validation. The malformed role definition ID caused that evaluation to fail, triggering a fallback attempt with alternative permissions. Those permissions were insufficient, so the result was an authorization error rather than the underlying validation error.

Removing the conditions bypassed that condition-based check, which masked the real problem and made the deployment succeed temporarily. This is why removing security controls appeared to fix the issue, even though the underlying bug remained in the module.

The Fix

The fix was straightforward once we had found the root cause. Correcting the resource type in the roleDefinitionId reference resolved everything:

  • The conditions were reinstated on the UAA assignment
  • The corrected module produces properly formed role definition IDs
  • The condition evaluation now passes correctly
  • The deployment runs cleanly

Key Lessons

There are a few things I took away from this.

ARM validation changes can surface existing bugs. The December change didn’t introduce a bug in the codebase; it exposed one that had existed for a long time. If you see deployment failures appear after a period of stability, check whether an Azure backend change has made previously loose validation stricter.

Authorization error messages don’t always mean permissions. The error pointed to the service principal not having permission to write a role assignment. The actual cause was a malformed role definition ID that failed the condition evaluation. Don’t take the error message at face value if your permissions look correct.

Review your roleDefinitionId references. The distinction between Microsoft.Authorization/roleAssignments and Microsoft.Authorization/roleDefinitions is subtle and easy to miss during code review. A wrong type here will now fail with stricter ARM validation, whereas it previously would have worked silently.
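
One pattern that can make this mistake easier to spot is to stop hand-building the ID string and instead reference the built-in role as an existing resource, then use its id property. This is an alternative to the resourceId call we corrected above, not what our module does today. The storage account target and the storageAccountName parameter are hypothetical and only there to make the sketch self-contained. It does not make the typo impossible, but a symbol called readerRole declared with a roleAssignments type stands out far more in review than one wrong word inside a function argument.

```bicep
// Sketch only: reference the built-in role definition instead of building its ID by hand.
@description('Object ID of the principal receiving the role.')
param principalId string

@description('Hypothetical parameter: name of an existing storage account used as the target.')
param storageAccountName string

resource targetResource 'Microsoft.Storage/storageAccounts@2023-01-01' existing = {
  name: storageAccountName
}

// Built-in Reader role, declared with its real resource type at subscription scope.
resource readerRole 'Microsoft.Authorization/roleDefinitions@2022-04-01' existing = {
  scope: subscription()
  name: 'acdd72a7-3385-48ef-bd42-f606fba81ae7'
}

resource roleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  name: guid(targetResource.id, principalId, readerRole.id)
  scope: targetResource
  properties: {
    principalType: 'ServicePrincipal'
    principalId: principalId
    // The ID comes from the declared resource, so its type is stated once, in one place.
    roleDefinitionId: readerRole.id
  }
}
```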

Further Resources

  • Bicep GitHub Issue #18696 - the original issue documenting the subscriptionResourceId-related breaking change, with community discussion around the same December validation update
  • Bicep resourceId function - reference for correct resource ID construction