Introduction

Azure Data Lake Storage Gen2 is a set of capabilities dedicated to big data analytics, built on Azure Blob Storage.

Data Lake Storage Gen2 converges the capabilities of Azure Data Lake Storage Gen1 with Azure Blob Storage. For example, Data Lake Storage Gen2 provides file system semantics, file-level security, and scale. Because these capabilities are built on Blob storage, you’ll also get low-cost, tiered storage, with high availability/disaster recovery capabilities.

Create with Azure CLI

The commands below perform the following initial steps:

  • Logs in to your Azure account.
  • Sets the active subscription where the create operations will be done.
  • Creates a new resource group for the new deployment activities.
  • Creates a user-assigned managed identity.
  • Adds an extension to the Azure CLI to use features for Data Lake Storage Gen2.
  • Creates a new storage account with Data Lake Storage Gen2 enabled by using the --hns true flag (the short form of --enable-hierarchical-namespace true).

Azure CLI Login

az login
az account set --subscription <SUBSCRIPTION_ID>

NOTE: If you don’t know your subscription ID, list your subscriptions first (for example, with az account list).
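
A minimal sketch of that lookup, assuming you are already logged in with az login. The function name list_subscriptions is only an illustrative helper, not part of the Azure CLI; the --query expression trims the output to the three fields you typically need.

```shell
# Print every subscription the signed-in account can access as a table.
# list_subscriptions is a hypothetical helper name, not an Azure CLI command.
list_subscriptions() {
    az account list \
        --query "[].{Name:name, SubscriptionId:id, IsDefault:isDefault}" \
        --output table
}
```

Run list_subscriptions, copy the value from the SubscriptionId column, and pass it to az account set --subscription.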

Creates a new resource group

If you already have a resource group, list your groups (az group list) and use the existing one. If you don’t, create a new one (az group create).
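
The check-then-create logic can be sketched as a small shell function. This is one possible approach rather than a required step; the helper name ensure_resource_group and its two positional arguments (group name and region) are assumptions made for illustration.

```shell
# Create the resource group only if it does not already exist.
# ensure_resource_group is a hypothetical helper name; pass your own group name and region.
ensure_resource_group() {
    local name="$1" location="$2"
    # `az group exists` prints the string "true" or "false".
    if [ "$(az group exists --name "$name")" = "true" ]; then
        echo "Resource group '$name' already exists."
    else
        az group create --name "$name" --location "$location"
    fi
}
```

For example, ensure_resource_group permanent eastus reuses the group named permanent if it is present and creates it in eastus otherwise.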

Creates a user-assigned managed identity

# Create managed identity
az identity create -g <RESOURCEGROUPNAME> -n <MANAGEDIDENTITYNAME>

For example:

az identity create -g permanent -n boyangIdentity
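
To confirm that the identity was created, you can read back its resource ID. The sketch below assumes the same resource group and identity names used above; identity_resource_id is a hypothetical helper name introduced here for illustration.

```shell
# Print the full Azure resource ID of a user-assigned managed identity.
# identity_resource_id is a hypothetical helper; $1 = resource group, $2 = identity name.
identity_resource_id() {
    az identity show \
        --resource-group "$1" \
        --name "$2" \
        --query id \
        --output tsv
}
```

For example, identity_resource_id permanent boyangIdentity should print the resource ID if the identity exists, and an error from the CLI if it does not.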

Adds an extension

az extension add --name storage-preview

Creates a new storage account with Data Lake Storage Gen2

az storage account create --name <STORAGEACCOUNTNAME> \
    --resource-group <RESOURCEGROUPNAME> \
    --location eastus --sku Standard_LRS \
    --kind StorageV2 --hns true
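
After the account is created, it can be worth checking that the hierarchical namespace is actually enabled, since that is what makes the account a Data Lake Storage Gen2 account. A minimal sketch, assuming the account and resource group names from the command above; is_hns_enabled is a hypothetical helper name.

```shell
# Query the isHnsEnabled property of a storage account.
# is_hns_enabled is a hypothetical helper; $1 = storage account name, $2 = resource group.
is_hns_enabled() {
    az storage account show \
        --name "$1" \
        --resource-group "$2" \
        --query isHnsEnabled \
        --output tsv
}
```

Calling is_hns_enabled with your account name and resource group should report true for an account created with --hns true.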

Reference List

  1. https://learn.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-use-data-lake-storage-gen2-azure-cli