Azure Data Lake

Setumo Raphela
Aug 3, 2022

· Hierarchical namespaces organize blob data into directories and store metadata about each directory and the files within it.
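
To make the idea of per-directory metadata concrete, here is a minimal conceptual sketch in plain Python. It is not how Azure implements hierarchical namespaces; the function name `build_directory_index`, the sample paths, and the sizes are all made up for illustration. It only shows what it means for each directory to carry information about the files beneath it.

```python
from collections import defaultdict


def build_directory_index(blobs):
    """Aggregate per-directory metadata from (path, size_bytes) pairs.

    Conceptual model only: every ancestor directory of a file records
    how many files sit beneath it and their combined size.
    """
    index = defaultdict(lambda: {"files": 0, "bytes": 0})
    for path, size in blobs:
        parts = path.strip("/").split("/")
        # Walk each ancestor directory: "raw", "raw/2022", "raw/2022/08", ...
        for depth in range(1, len(parts)):
            directory = "/".join(parts[:depth])
            index[directory]["files"] += 1
            index[directory]["bytes"] += size
    return dict(index)


# Hypothetical file listing (paths and sizes are illustrative).
blobs = [
    ("raw/2022/08/sales.csv", 1_000),
    ("raw/2022/08/returns.csv", 500),
    ("raw/2022/07/sales.csv", 2_000),
    ("curated/sales.parquet", 800),
]
index = build_directory_index(blobs)
print(index["raw/2022/08"])  # {'files': 2, 'bytes': 1500}
print(index["raw"])          # {'files': 3, 'bytes': 3500}
```

With a flat namespace the "directories" above would only be name prefixes; with a hierarchical namespace they are real objects that can hold their own metadata and permissions.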

· A shared access signature (SAS) is a string that contains a security token that can be attached to a URI. Use a SAS to delegate access to storage objects and specify constraints, such as the permissions and the time range of access.

· Along with role-based access control (RBAC), Azure Data Lake Storage Gen2 provides access control lists (ACLs) that are POSIX-compliant, and that restrict access to only authorized users, groups, or service principals.

· ACLs apply restrictions in a way that’s flexible, fine-grained, and manageable.

  • Storing your data as many small files can negatively affect performance. In general, organize your data into larger files (256 MB to 100 GB in size) for better performance.
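
One way to act on the file-size guidance is to plan how small files could be merged before they land in the lake. The sketch below is an assumption-laden illustration, not an Azure feature: `plan_compaction` is a hypothetical helper that greedily groups file sizes into batches of at least 256 MB, and the input sizes are invented.

```python
MB = 1024 ** 2


def plan_compaction(file_sizes, target_bytes=256 * MB):
    """Greedily group files (in order) into batches of at least target_bytes."""
    batches, current, current_size = [], [], 0
    for size in file_sizes:
        current.append(size)
        current_size += size
        if current_size >= target_bytes:
            batches.append(current)
            current, current_size = [], 0
    if current:
        # Leftover files form a final, possibly undersized batch.
        batches.append(current)
    return batches


small_files = [64 * MB] * 10  # ten 64 MB files (hypothetical sizes)
batches = plan_compaction(small_files)
print([sum(b) // MB for b in batches])  # [256, 256, 128]
```

Each batch would then be written out as one larger file by whatever processing engine you use; the merge step itself is outside the scope of this sketch.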

· A benefit of Data Lake Storage Gen2 is that you can treat the data as if it’s stored in a Hadoop Distributed File System.
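
In practice, Hadoop-compatible tools address Data Lake Storage Gen2 through the ABFS driver, using URIs of the form `abfss://<container>@<account>.dfs.core.windows.net/<path>`. The helper below is a small sketch that builds such a URI; the function name `abfss_uri` and the container and account names are hypothetical.

```python
def abfss_uri(container, account, path=""):
    """Build a secure ABFS URI for a path in a Data Lake Storage Gen2 account."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


# Container and account names are placeholders, not real resources.
uri = abfss_uri("datalake", "examplestorageacct", "/raw/2022/08/sales.csv")
print(uri)
# abfss://datalake@examplestorageacct.dfs.core.windows.net/raw/2022/08/sales.csv
```

A URI like this is what you would typically pass to a Hadoop-compatible engine as the input or output path.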

· Data Lake Storage Gen2 supports access control lists (ACLs) and Portable Operating System Interface (POSIX) permissions.

· You can set permissions at a directory level or file level for the data stored within the data lake.
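
To show how POSIX-style entries translate into an allow or deny decision, here is a simplified evaluator in plain Python. It is a sketch under stated assumptions: it uses the familiar `type:name:perms` short form, ignores the owning group and the mask entry, and the names `parse_acl`, `is_allowed`, `alice`, `bob`, and `analysts` are invented for the example. It is not the algorithm Azure runs.

```python
def parse_acl(acl_text):
    """Parse 'type:name:perms' entries into a dict keyed by (type, name)."""
    entries = {}
    for entry in acl_text.split(","):
        kind, name, perms = entry.strip().split(":")
        entries[(kind, name)] = perms
    return entries


def is_allowed(acl_text, owner, user, groups, wanted):
    """Return True if `user` holds every permission letter in `wanted` (e.g. 'rw').

    Simplified: the owning group and the mask entry are ignored.
    """
    acl = parse_acl(acl_text)
    if user == owner:
        perms = acl[("user", "")]          # owning-user entry
    elif ("user", user) in acl:
        perms = acl[("user", user)]        # named-user entry
    else:
        group_perms = [acl[("group", g)] for g in groups if ("group", g) in acl]
        if group_perms:
            # Any single matching group entry may grant the access.
            return any(all(p in gp for p in wanted) for gp in group_perms)
        perms = acl[("other", "")]         # everyone else
    return all(p in perms for p in wanted)


# Hypothetical ACL on one directory or file.
acl = "user::rwx,user:alice:r-x,group:analysts:r--,other::---"
print(is_allowed(acl, "owner1", "alice", [], "r"))          # True
print(is_allowed(acl, "owner1", "alice", [], "w"))          # False
print(is_allowed(acl, "owner1", "bob", ["analysts"], "r"))  # True
print(is_allowed(acl, "owner1", "carol", [], "r"))          # False
```

Because the same kind of entry can be attached to a directory or to a single file, the check works identically at either level.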

· Data Lake Storage Gen2 takes advantage of the Azure Blob replication models, which provide data redundancy within a single data center through locally redundant storage (LRS), or to a secondary region through geo-redundant storage (GRS).

· Azure Data Lake Storage Gen2 provides a cloud storage service that is available, secure, durable, scalable, and redundant.

· Azure Storage accounts provide several high-level security benefits for the data in the cloud:

  • Protect the data at rest: encryption
  • All data written to Azure Storage is automatically encrypted by Storage Service Encryption (SSE) with a 256-bit Advanced Encryption Standard (AES) cipher, and is FIPS 140-2 compliant.
  • Protect the data in transit: SSL/TLS transport-layer security
  • Keep your data secure by enabling transport-level security between Azure and the client.
  • Always use HTTPS to secure communication over the public internet.
  • Support browser cross-domain access: CORS
  • Azure Storage supports cross-domain access through cross-origin resource sharing (CORS).
  • Control who can access data: ACLs and POSIX permissions
  • Audit storage access
  • Azure creates two storage account keys (primary and secondary) for each storage account you create.
  • The keys give access to everything in the account.
  • A shared access signature (SAS) is a string that contains a security token that can be attached to a URI.
  • Use a SAS to delegate access to storage objects and specify constraints, such as the permissions and the time range of access.
  • By default, storage accounts accept connections from clients on any network.
  • Microsoft Defender for Storage provides an extra layer of security intelligence that detects unusual and potentially harmful attempts to access or exploit storage accounts.
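
To illustrate how a SAS can carry permissions and a time range in a URI, here is a toy sketch using only the Python standard library. It is deliberately simplified: the query parameter names `sp`, `se`, and `sig` mirror the ones Azure uses, but the string-to-sign here is invented, so Azure would not accept these tokens. The key, account name, and function names are all hypothetical.

```python
import base64
import hashlib
import hmac
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlencode, urlparse

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def sign(key, resource, permissions, expiry):
    """HMAC-SHA256 over a simplified string-to-sign (not Azure's real format)."""
    message = f"{permissions}\n{expiry}\n{resource}".encode()
    return base64.b64encode(hmac.new(key, message, hashlib.sha256).digest()).decode()


def make_sas_url(key, resource, permissions, expires_at):
    """Attach permissions, expiry, and a signature to the resource URI."""
    expiry = expires_at.strftime(TIME_FORMAT)
    query = urlencode({"sp": permissions, "se": expiry,
                       "sig": sign(key, resource, permissions, expiry)})
    return f"{resource}?{query}"


def verify_sas_url(key, url, wanted, now):
    """Check the signature, the expiry time, and the requested permissions."""
    parsed = urlparse(url)
    params = {k: v[0] for k, v in parse_qs(parsed.query).items()}
    resource = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
    expected = sign(key, resource, params["sp"], params["se"])
    expiry = datetime.strptime(params["se"], TIME_FORMAT).replace(tzinfo=timezone.utc)
    return (hmac.compare_digest(expected, params["sig"])
            and now < expiry
            and all(p in params["sp"] for p in wanted))


key = b"demo-account-key"  # placeholder, not a real storage account key
resource = "https://examplestorageacct.blob.core.windows.net/datalake/raw/sales.csv"
issued = datetime(2022, 8, 3, 12, 0, tzinfo=timezone.utc)
url = make_sas_url(key, resource, "r", issued + timedelta(hours=1))

print(verify_sas_url(key, url, "r", issued + timedelta(minutes=30)))  # True
print(verify_sas_url(key, url, "w", issued + timedelta(minutes=30)))  # False
print(verify_sas_url(key, url, "r", issued + timedelta(hours=2)))     # False
```

The point of the design is that the holder of the URL never sees the account key: the signature proves that someone with the key approved exactly these permissions and this expiry, and any change to either one invalidates the token.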

Setumo Raphela

Entrepreneur | Data Scientist | AI | Jet Skier | Author | Oracle