
All customer data in Azure is encrypted “at rest” in the Microsoft datacenters. By default, the keys used in the cryptographic operations are fully managed by Microsoft throughout their entire lifecycle: they are called “Microsoft Managed Keys” (MMKs), and are sometimes also referred to as “Platform Managed Keys” or “Service Managed Keys”.

The vast majority of Azure PaaS and IaaS services that store customer data allow this data to be encrypted with a “Customer Managed Key” (CMK), generated by the customer (possibly in an on-premises HSM), updated and managed by the customer, and securely transferred to Azure Key Vault (AKV) or, when supported, to an Azure Managed HSM (M-HSM) module. This enables “Bring Your Own Key” (BYOK) scenarios in Azure; a great article about how BYOK works is here. Each of these PaaS/IaaS services has its own procedures and constraints for moving from MMK to CMK.

Recently I supported customers in testing these “CMK activation procedures” for a few of these services: Storage Accounts, VM Disks, Information Protection and PaaS Databases (specifically MySQL). Each time, after the configuration change was made, there was a need to verify that the change was effective.

Since we were operating in test environments, we approached these post-change verifications by temporarily disabling the new CMK(s) in AKV. As expected, after a short time this made it impossible to access the data managed by each related service and, in most cases, blocked the service itself entirely.

[Image: key-disabled]

In our simple tests we saw these expected behaviors:

  • It was no longer possible to access the data in the Storage Account by using Storage browser. [Image: err-storage-account]

  • It was no longer possible to start the Virtual Machine using the Disk Encryption Set. [Image: err-VM]

  • It was no longer possible to apply a Sensitivity Label with encryption to an email sent in OWA. [Image: err-sensitivity-label]

  • The MySQL single server instance went into the “Inaccessible” state. [Image: err-mysql]

It was enough to re-enable the key(s) to restore the operation of the services and the access to their data (for MySQL it was necessary to perform a simple additional action manually: revalidate the customer-managed key in the data encryption settings).

In a test environment, temporarily disabling the new CMK in AKV is also quite instructive, because it gives a first-hand view of the possible consequences of accidental mistakes or deliberately destructive actions in key management operations. Fortunately, AKV offers soft-delete and purge protection features that allow recovery from this kind of incident.

A more general way to test the effectiveness of the CMK configuration, clearly better suited to production environments where temporarily disabling an encryption key is not an option, is to leverage the AKV diagnostic logging capability and the power of Azure Log Analytics. Specifically, in the “Diagnostic settings” configuration of AKV, it is possible to specify that the “AuditEvent” category must be collected and archived in a Log Analytics workspace.

[Image: akv-diagnostic-settings]
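
Once the diagnostic setting is saved, a quick sanity check is to confirm that audit events are actually reaching the workspace. The following is a minimal sketch, assuming the events land in the “AzureDiagnostics” table with the Category value “AuditEvent”; the first records may not appear immediately after the setting is created.

```kql
// Minimal sketch: confirm that AKV audit events are arriving in the workspace
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.KEYVAULT" and Category == "AuditEvent"
// One row per vault, with the number of events and the time of the most recent one
| summarize events = count(), lastEvent = max(TimeGenerated) by Resource
| order by lastEvent desc
```

If a vault is missing from the result, it is worth re-checking its diagnostic setting before drawing any conclusion from the queries on key usage.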

The audit events generated by AKV can be identified in the “AzureDiagnostics” table by filtering for the value “MICROSOFT.KEYVAULT” in the field “ResourceProvider”. The following sample KQL query shows the total number of events logged in the specified time frame by:

  • Target Key Vault name, Key name and Key version
  • Caller ID (the GUIDs of the Managed Identity or of the Enterprise Application accessing AKV, according to the permissions specified in the AKV data plane)
  • Operation name and result

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.KEYVAULT"
// id_s is expected to hold the key URL: https://<vault>.vault.azure.net/keys/<name>/<version>
| extend targetKeyVaultName = replace_string(tostring(split(id_s, "/")[2]), ".vault.azure.net", "")
| extend targetKeyName = tostring(split(id_s, "/")[4])
| extend targetKeyVersion = tostring(split(id_s, "/")[5])
// Count the calls per key, caller, operation and result
| summarize CallsNumber = count() by targetKeyVaultName, targetKeyName, targetKeyVersion, callerId = identity_claim_appid_g, OperationName, ResultSignature, ResultDescription
| order by CallsNumber desc

[Image: query1]
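
To verify one specific CMK after a configuration change, the same parsing can be narrowed to a single vault and key over a recent time window. This is only a sketch: the vault name “kv-example” and the key name “cmk-example” are hypothetical placeholders, and the 24-hour window is an arbitrary choice.

```kql
// Hypothetical values: replace them with your own vault and key names
let vaultName = "kv-example";
let keyName = "cmk-example";
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.KEYVAULT" and TimeGenerated > ago(24h)
// Same id_s parsing used in the query above, compared case-insensitively
| where tostring(split(id_s, "/")[2]) =~ strcat(vaultName, ".vault.azure.net")
| where tostring(split(id_s, "/")[4]) =~ keyName
| summarize calls = count(), lastCall = max(TimeGenerated) by OperationName, ResultSignature, callerId = identity_claim_appid_g
| order by lastCall desc
```

Recent successful calls from the identity of the service under test suggest that the service is really using the CMK; an empty result suggests that the change did not take effect or that logging is not in place.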

When the “Diagnostic settings” section of Azure Active Directory is also configured to send the “NonInteractiveUserSignInLogs” and “ManagedIdentitySignInLogs” categories to the same Log Analytics workspace, the Caller ID in the query shown above can easily be replaced with the name of the “managed identity” (e.g., the storage account) or “enterprise application” (e.g., Information Protection) accessing the AKV. These Caller IDs are the GUIDs of the identity principals added in the “Access Policy” page of the AKV.

The following query shows the most recent successful and unsuccessful calls to key operations such as Sign and Decrypt (every operation whose name starts with “Key”, except KeyGet). Those are exactly the accesses made or attempted by Storage Accounts, Disk Encryption Sets and Azure Information Protection to reach their respective CMKs.


// Key operations (except KeyGet) performed by managed identities, resolved to their names
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.KEYVAULT" and OperationName startswith "Key" and not(OperationName == "KeyGet")
| join kind=inner (
    AADManagedIdentitySignInLogs
    | distinct AppId, ServicePrincipalName
    )
    on $left.identity_claim_appid_g == $right.AppId
| extend callerName = ServicePrincipalName
| union (
    // Same key operations performed by enterprise applications, resolved to their names
    AzureDiagnostics
    | where ResourceProvider == "MICROSOFT.KEYVAULT" and OperationName startswith "Key" and not(OperationName == "KeyGet")
    | join kind=inner (
        AADNonInteractiveUserSignInLogs
        | distinct ResourceIdentity, ResourceDisplayName
        )
        on $left.identity_claim_appid_g == $right.ResourceIdentity
    | extend callerName = ResourceDisplayName
    )
// Exclude the Key Vault portal extension and the Key Vault service itself
| where callerName != "Microsoft Azure KeyVault portal extension" and callerName != "Azure Key Vault"
// Split the key URL in id_s into vault name, key name and key version
| extend targetKeyVaultName = replace_string(tostring(split(id_s, "/")[2]), ".vault.azure.net", "")
| extend targetKeyName = tostring(split(id_s, "/")[4])
| extend targetKeyVersion = tostring(split(id_s, "/")[5])
// Keep only the most recent call for each caller, key and result
| summarize arg_max(lastCallTime=TimeGenerated, OperationName) by callerName, targetKeyVaultName, targetKeyName, targetKeyVersion, ResultSignature, ResultDescription
| order by targetKeyVaultName, targetKeyName, targetKeyVersion, ResultSignature, lastCallTime
| project targetKeyVaultName, targetKeyName, targetKeyVersion, OperationName, ResultSignature, ResultDescription, lastCallTime, callerName

[Image: query2]

Alternatively, without the two “join” operations in the query above, the name of a calling application can be read directly from the “trustedService_s” field of the AzureDiagnostics table. In the same table, the name of a calling managed identity can instead be retrieved by parsing the content of the “identity_claim_xms_mirid_s” field. For a managed identity related to a Storage Account, the field has a content similar to this example:

/subscriptions/«subscription-id»/resourcegroups/«rg-name»/providers/Microsoft.Storage/storageAccounts/staccsecuritylabwe001
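
As a rough sketch of this join-free approach, the query below takes the last path segment of “identity_claim_xms_mirid_s” as the managed identity name (for the example above, the storage account name) and falls back to “trustedService_s”, then to the raw application ID. The column names callerName and managedIdentityName are illustrative, and the query assumes that both source fields exist in the workspace schema, which may not be the case if they were never populated.

```kql
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.KEYVAULT" and OperationName startswith "Key" and OperationName != "KeyGet"
// Last path segment of the managed identity resource ID (empty when the field is empty)
| extend managedIdentityName = extract(@"([^/]+)$", 1, identity_claim_xms_mirid_s)
// Prefer the managed identity name, then the trusted service name, then the application ID
| extend callerName = case(
    isnotempty(managedIdentityName), managedIdentityName,
    isnotempty(trustedService_s), trustedService_s,
    tostring(identity_claim_appid_g))
| summarize lastCallTime = max(TimeGenerated) by callerName, OperationName, ResultSignature
| order by lastCallTime desc
```

Compared with the joins on the sign-in tables, this variant does not depend on the Azure Active Directory diagnostic settings, at the cost of a less uniform caller name.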

By writing a custom KQL query as shown above, it is possible to extract exactly the information desired (key name, version, etc.) and filter on any time range that contains data. Instead of writing a custom KQL query, it is also possible to use the information provided in the built-in “Insights” page of the Azure Key Vault resource, on the “Operations” tab. The KQL query behind this table can be easily accessed and used as a starting point for further customization.

[Image: keyvault-insights]

As a side note, in my lab environment the query shown above returns quite a high number of “Forbidden” results, due to the keys being repeatedly disabled for testing purposes.
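
To see when such “Forbidden” results occur and which keys they affect, a small query like the following can help. It is a sketch only, reusing the id_s parsing from the earlier queries; the one-hour bin size is an arbitrary choice.

```kql
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.KEYVAULT" and ResultSignature == "Forbidden"
| extend targetKeyName = tostring(split(id_s, "/")[4])
// Hourly count of rejected calls per key and operation
| summarize forbiddenCalls = count() by targetKeyName, OperationName, bin(TimeGenerated, 1h)
| order by TimeGenerated asc
```

In a lab, the peaks should line up with the periods in which a key was disabled; in production, any such peak deserves a closer look.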

From a monitoring point of view, it is important to note that the diagnostics and insights capabilities of AKV make it possible to keep the performance and failures of the existing vaults under continuous control. This is an extremely important capability, especially when AKV hosts CMKs that may be frequently accessed by their respective Azure services. The service limits of AKV are described here.

[Image: azure-monitor-for-AKV]
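
The same audit logs can give a rough view of failures, throttling and latency per vault. The sketch below assumes that the “httpStatusCode_d” and “DurationMs” columns are populated for Key Vault events, and treats HTTP 429 responses as throttled requests; the thresholds and the one-hour bin are illustrative.

```kql
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.KEYVAULT"
| summarize
    totalCalls = count(),
    failedCalls = countif(httpStatusCode_d >= 400),
    throttledCalls = countif(httpStatusCode_d == 429),
    avgDurationMs = avg(DurationMs)
    by Resource, bin(TimeGenerated, 1h)
| order by TimeGenerated desc
```

A growing number of throttled calls on a vault hosting CMKs is a hint that the vault is approaching its service limits.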

It is also extremely important to set alerts on the monitoring parameters related to the AKV health, like the “saturation”. Details on how to set these alerts can be found here.

As a final thought, it’s worth remembering that AKV is a “Tier 0” service, hosting the most valuable assets in the organization’s IT infrastructure. Because of that, it must be carefully protected and monitored from a security perspective. In terms of security posture optimization and threat monitoring, the “Defender for Key Vault” protection plan in Microsoft Defender for Cloud provides actionable recommendations on the issues to be corrected and clear alerts on the threats to be investigated.

[Image: defender-for-keyvault]

Defender for Cloud also continuously verifies the compliance of Azure resources against the Azure Security Benchmark (ASB). ASB includes a specific baseline for AKV.

[Image: asb-compliance-for-keyvault]

In terms of additional detection, correlation and investigation, Microsoft Sentinel offers predefined Analytic Rules, a connector, a honeytoken solution and a Workbook specific for AKV protection.

The following image shows the available predefined Analytic Rules templates related to AKV. As their names suggest, these detection capabilities are extremely important. You may notice that one of these rules is of type “Near Real Time” (NRT); this ensures that “Sensitive Azure Key Vault operations” are detected almost immediately and that any response actions defined for these events are also triggered almost immediately.

[Image: sentinel-analytic-rules]
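
To give an idea of what a detection on sensitive operations looks at, here is a simplified hunting-style sketch over the same audit logs. It is not the query used by the built-in Sentinel rule: the list of operations below is an illustrative assumption and should be adapted to what is considered sensitive in each environment.

```kql
// Illustrative list only (an assumption); not the list used by the built-in Sentinel rule
let sensitiveOperations = dynamic(["VaultDelete", "KeyDelete", "KeyPurge", "KeyBackup", "SecretDelete", "SecretPurge", "SecretBackup"]);
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.KEYVAULT" and OperationName in (sensitiveOperations)
// Who did what, from where, and on which object
| project TimeGenerated, Resource, OperationName, ResultSignature, callerId = identity_claim_appid_g, CallerIPAddress, id_s
| order by TimeGenerated desc
```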

Additional recommendations for custom Analytic Rules or Hunting queries in Sentinel can be found on the internet. For example, this article contains interesting queries: Protecting Azure Key Vault with Azure Sentinel – Microsoft Sentinel 101 (learnsentinel.blog)

As always, I hope that the information provided in this post may be useful.

PS. Feedback on this blog post can be added in this related LinkedIn post