Azure Virtual Machine Scale Sets (VMSS)
Azure Virtual Machine Scale Sets (VMSS) provide a powerful, scalable computing service designed to automate the deployment and management of identical virtual machines (VMs) in Azure. This document offers a technical primer focused on self-managed VMSS deployments, explaining core concepts, features, and best practices for advanced users. Although VMSS technology underpins services like Azure Kubernetes Service (AKS) and Azure CycleCloud, the scope here will specifically address standalone usage.
For detailed official documentation, see:
Overview and Key Concepts
What are VM Scale Sets?
VMSS allows you to create and manage a group of identical, load-balanced VMs. These sets are highly scalable, resilient, and ideal for workloads that require automatic scaling and redundancy. With VMSS, you can efficiently manage large-scale deployments without manual intervention, providing significant operational efficiencies.
Key Benefits
- Automated Scaling: Automatically add or remove instances based on defined metrics or schedules.
- High Availability: Distribute VMs across fault domains and availability zones.
- Simplified Management: Centralized deployment, updating, and management through declarative templates.
- Integration: Seamlessly integrates with load balancers, application gateways, and Azure Monitor.
Further reading:
Technical Implementation (Bicep Example)
The following practical example utilizes Azure Bicep, Azure’s declarative infrastructure-as-code language, providing clarity and simplicity compared to traditional ARM templates.
Summary of Attached Bicep File (mi300x-vmss.bicep
)
The provided Bicep file sets up a comprehensive environment in the francecentral
region with:
- A Virtual Machine Scale Set (
Standard_ND96isr_MI300X_v5
SKU) intended for high-performance computing workloads. - Azure Storage Account and a corresponding Azure File Share for shared storage.
- Virtual Network (VNet) with a default subnet.
- Network Security Group (NSG) with rules to allow SSH access.
- Custom initialization script (cloud-init) via the
customData
parameter for additional configuration at VM boot.
Resources Defined:
- Virtual Machine Scale Set (VMSS)
- Storage Account (Standard_LRS)
- Azure File Share
- Network Security Group (NSG) allowing SSH (
TCP/22
) - Virtual Network (
myVnet
) and subnet (default
)
Special Notes:
- The script leverages dynamic parameters to securely pass storage account credentials at runtime via cloud-init script processing (
sed
replacement within the VM during initialization). - SSH key-based authentication is enforced, enhancing security.
Cloud-Init Script (mi300x-cloudinit.sh
)
The VM initialization script automates mounting an Azure File Share (vmshare
) onto each VM instance.
Key operations in the script:
- Updates package lists and installs required tools (
cifs-utils
). - Securely stores Azure Storage credentials.
- Configures file share mounting via
/etc/fstab
for automatic persistence.
Security Consideration:
Credentials are securely handled by:
- Placing them in a restricted permissions file (
600
mode). - Ensuring credentials are dynamically generated at VM creation, avoiding hardcoded secrets.
Example Bicep Code
param location string = 'francecentral'
param vmssName string = 'mi300x-vmss'
param instanceCount int = 16
param adminUsername string = 'azureuser'
@secure()
param sshPublicKey string
param initScriptUrl string = 'https://public-url-to-cloudinit/mi300x-cloudinit.sh'
resource storageAcct 'Microsoft.Storage/storageAccounts@2024-11-01' = {
name: toLower('stg${resourceGroup().name}')
location: location
sku: {
name: 'Standard_LRS'
}
kind: 'StorageV2'
properties: {
accessTier: 'Hot'
}
}
resource fileShare 'Microsoft.Storage/storageAccounts/fileServices/shares@2024-11-01' = {
name: '${storageAcct.name}/default/vmshare'
properties: {}
}
resource nsg 'Microsoft.Network/networkSecurityGroups@2024-11-01' = {
name: '${vmssName}-nsg'
location: location
properties: {
securityRules: [
{
name: 'Allow-SSH'
properties: {
priority: 1000
protocol: 'Tcp'
access: 'Allow'
direction: 'Inbound'
sourceAddressPrefix: '*'
sourcePortRange: '*'
destinationAddressPrefix: '*'
destinationPortRange: '22'
}
}
]
}
}
resource vmss 'Microsoft.Compute/virtualMachineScaleSets@2024-11-01' = {
name: vmssName
location: location
sku: {
name: 'Standard_ND96isr_MI300X_v5'
tier: 'Standard'
capacity: instanceCount
}
identity: {
type: 'SystemAssigned'
}
properties: {
upgradePolicy: {
mode: 'Manual'
}
virtualMachineProfile: {
storageProfile: {
imageReference: {
publisher: 'microsoft-dsvm'
offer: 'ubuntu-hpc'
sku: '2204-rocm'
version: 'latest'
}
osDisk: {
createOption: 'FromImage'
}
}
diagnosticsProfile: {
bootDiagnostics: {
enabled: true
storageUri: storageAcct.properties.primaryEndpoints.blob
}
}
osProfile: {
computerNamePrefix: vmssName
adminUsername: adminUsername
linuxConfiguration: {
disablePasswordAuthentication: true
ssh: {
publicKeys: [
{
path: '/home/${adminUsername}/.ssh/authorized_keys'
keyData: sshPublicKey
}
]
}
}
customData: base64('#cloud-config\nruncmd:\n - curl -s "${initScriptUrl}" -o /tmp/init.sh\n - sed -i "s|__STORAGE_ACCOUNT__|${storageAcct.name}|g" /tmp/init.sh\n - sed -i "s|__STORAGE_KEY__|${listKeys(storageAcct.id, storageAcct.apiVersion).keys[0].value}|g" /tmp/init.sh\n - bash /tmp/init.sh')
}
networkProfile: {
networkInterfaceConfigurations: [
{
name: '${vmssName}-nic'
properties: {
primary: true
networkSecurityGroup: {
id: nsg.id
}
ipConfigurations: [
{
name: '${vmssName}-ipconfig'
properties: {
subnet: {
id: resourceId('Microsoft.Network/virtualNetworks/subnets', 'myVnet', 'default')
}
publicIPAddressConfiguration: {
name: '${vmssName}-pip'
properties: {
idleTimeoutInMinutes: 10
}
}
}
}
]
}
}
]
}
}
overprovision: true
}
}
resource vnet 'Microsoft.Network/virtualNetworks@2024-05-01' = {
name: 'myVnet'
location: location
properties: {
addressSpace: {
addressPrefixes: [
'10.0.0.0/16'
]
}
subnets: [
{
name: 'default'
properties: {
addressPrefix: '10.0.0.0/24'
}
}
]
}
}
Summary and Documentation of Initialization Script (mi300x-cloudinit.sh
)
This bash-based cloud-init script is executed at first boot of each VMSS instance. It ensures the necessary environment and connectivity setup are completed:
- Logs VM initialization timestamps and configuration details to
/var/log/cloud-init.log
. - Installs dependencies (
cifs-utils
) for mounting Azure File Shares. - Dynamically generates SMB credentials file securely.
- Mounts an Azure File Share (
vmshare
) on each VM to enable shared storage access, facilitating collaboration or shared workloads.
Example CloudInit
#!/bin/bash
echo "=== CLOUD INIT START ===" >> /var/log/cloud-init.log
echo "VM initialized at $(date)" >> /var/log/cloud-init.log
echo "Hello from cloud-init!" >> /var/log/cloud-init.log
echo "Storage Acct: __STORAGE_ACCOUNT__" >> /var/log/cloud-init.log
# These values are substituted via Bicep using `sed`
STORAGE_ACCOUNT=__STORAGE_ACCOUNT__
STORAGE_KEY=__STORAGE_KEY__
# Install necessary tools
apt-get update
apt-get install -y cifs-utils
# Mount Azure File Share
mkdir -p /mnt/vmshare
mkdir -p /etc/smbcredentials
# Create credentials file
cat <<EOF > /etc/smbcredentials/${STORAGE_ACCOUNT}.cred
username=${STORAGE_ACCOUNT}
password=${STORAGE_KEY}
EOF
chmod 600 /etc/smbcredentials/${STORAGE_ACCOUNT}.cred
# Add to /etc/fstab
echo "//${STORAGE_ACCOUNT}.file.core.windows.net/vmshare /mnt/vmshare cifs nofail,vers=3.0,credentials=/etc/smbcredentials/${STORAGE_ACCOUNT}.cred,dir_mode=0777,file_mode=0777,serverino" >> /etc/fstab
# Mount file share
mount -a
echo "=== CLOUD INIT END ===" >> /var/log/cloud-init.log
Deploying the Bicep Template
Prerequisites
- Azure CLI installed (version 2.30.0 or later).
-
Logged in to Azure:
az login
-
(Optional) Set the right subscription if you have multiple:
az account set --subscription <YOUR_SUBSCRIPTION_ID>
1. Create the Resource Group
Before you deploy the VMSS you must have a resource group:
az group create \
--name myResourceGroup \
--location francecentral
This will create (or confirm) a resource group named myResourceGroup
in the francecentral region.
2. Deploy the VMSS via Bicep
With your RG in place, deploy the template:
az deployment group create \
--resource-group myResourceGroup \
--template-file mi300x-vmss.bicep \
--parameters \
sshPublicKey="$(cat ~/.ssh/id_rsa.pub)" \
instanceCount=16 \
adminUsername=azureuser \
# (Optional) override init script URL:
# initScriptUrl="https://path/to/your-cloudinit.sh"
- sshPublicKey: Reads your local public key for VM SSH access.
- instanceCount: Number of VM instances in the scale set.
- adminUsername: Linux administrator user name.
- initScriptUrl: URL to the cloud-init script (if you’ve forked or customized it).
Once complete, Azure will provision the storage account, file share, VNet, NSG, and VM Scale Set as defined in mi300x-vmss.bicep
.
Next Steps and Further Learning
For further refinement of this implementation, explore advanced scenarios like automatic scaling based on custom metrics, integration with Azure Monitor, or continuous delivery pipelines.
Refer to: