This article serves as a reference for the storage types offered by the major cloud providers, and for a trend I have observed: using FUSE to expose “object storage” as “file storage”.
Even though “object storage” is very popular in the cloud, there are many use cases for shared “file storage”, where multiple compute instances need to access the same filesystem (e.g. loading machine learning models or media files). Traditionally, all major cloud providers have offered file-share services, e.g. Azure Files, AWS EFS, and GCP Filestore. However, that usually means storing data in two places: object storage and file storage. Can you simplify the process and maintain a single data repository? Can you store data in “object storage” and mount it as a local filesystem? Recently, I found that FUSE serves this purpose, and Azure, AWS, and GCP all have similar technologies. In addition, Azure Storage keeps improving its bandwidth, which is now even higher than that of Azure Files. Next, we introduce Azure blobfuse.
Basic steps
We will be using blobfuse v2
Install blobfuse v2
Build Blobfuse2 from source code
Install from DEB file
Download link: https://github.com/Azure/azure-storage-fuse/releases/tag/blobfuse2-2.0.2
- Debian 11
- Configure the Microsoft package repository
wget https://github.com/Azure/azure-storage-fuse/releases/download/blobfuse2-2.0.2/blobfuse2-2.0.2-Debian-11.0-x86-64.deb
sudo apt-get update
sudo apt-get install libfuse3-dev fuse3 -y
sudo dpkg -i blobfuse2-2.0.2-Debian-11.0-x86-64.deb
NOTE: Check your Debian version first (for example, with cat /etc/os-release) so that you download the matching package.
- sudo apt-get install blobfuse2
- Arch Linux
paru -S azure-storage-fuse
blobfuse2 --help
blobfuse2 --version
Configure BlobFuse2
Modify /etc/fuse.conf and uncomment user_allow_other
sudo sed -i '/user_allow_other/s/^#//' /etc/fuse.conf
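The one-liner above acts on every line that mentions user_allow_other, which on some distributions includes explanatory comment lines. A stricter variant is sketched below as an illustration only: it uncomments a line only when the option is the whole line. The helper name enable_user_allow_other is made up for this sketch, and the demo runs against a scratch file rather than /etc/fuse.conf.

```shell
#!/usr/bin/env bash
# Uncomment user_allow_other in a fuse.conf-style file.
# enable_user_allow_other is a hypothetical helper name, not part of blobfuse2.
enable_user_allow_other() {
    local conf="$1"
    # Strip the leading "#" only when the option is the whole line,
    # so explanatory comments that mention the option stay untouched.
    sed -i -E 's/^#[[:space:]]*(user_allow_other)[[:space:]]*$/\1/' "$conf"
    # Succeed only if the option is now active.
    grep -qx 'user_allow_other' "$conf"
}

# Dry run against a scratch file that mimics a stock fuse.conf.
scratch="$(mktemp)"
printf '%s\n' \
    '# user_allow_other - lets non-root users pass the allow_other option' \
    '#user_allow_other' > "$scratch"

if enable_user_allow_other "$scratch"; then
    echo "user_allow_other enabled in $scratch"
fi
```

To apply it to the real file, call the function as root with /etc/fuse.conf as the argument.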
Configure blobfuse v2 caching (blobfuse uses a local cache to speed up repeated file retrieval)
- RAM
sudo mkdir /mnt/ramdisk
sudo mount -t tmpfs -o size=2g tmpfs /mnt/ramdisk
sudo mkdir /mnt/ramdisk/blobfuse2tmp
sudo chown $USER /mnt/ramdisk/blobfuse2tmp
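Before pointing the cache at the RAM disk, it may help to confirm that the directory really sits on tmpfs and to see how large it is, since the file_cache max-size-mb value in the config should not exceed it. The following is a small sketch using GNU stat and df; the helper names fs_type and fs_size_mb and the CACHE_DIR variable are invented for this example.

```shell
#!/usr/bin/env bash
# Report the filesystem type and total size (in MB) backing a directory.
# fs_type and fs_size_mb are hypothetical helper names.
fs_type()    { stat -f -c %T "$1"; }
fs_size_mb() { df -Pm "$1" | awk 'NR == 2 { print $2 }'; }

# CACHE_DIR is an assumed override; the default is the path created above.
cache_dir="${CACHE_DIR:-/mnt/ramdisk/blobfuse2tmp}"
if [ -d "$cache_dir" ]; then
    echo "$cache_dir: type=$(fs_type "$cache_dir") size=$(fs_size_mb "$cache_dir")MB"
else
    echo "$cache_dir does not exist yet"
fi
```

With the mount command above, the type should read tmpfs and the size should be close to 2048 MB.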
Set up rsyslog
sudo mkdir -p /etc/rsyslog.d
cd /etc/rsyslog.d
sudo wget https://raw.githubusercontent.com/Azure/azure-storage-fuse/main/setup/11-blobfuse2.conf
Set up logrotate
sudo wget -O /etc/logrotate.d/blobfuse2 https://raw.githubusercontent.com/Azure/azure-storage-fuse/main/setup/blobfuse2-logrotate
Create mount folder
mkdir ~/mycontainer
Use this config file from blobfuse repo to populate config.yaml
# Refer ./setup/baseConfig.yaml for full set of config parameters
allow-other: false
logging:
  type: syslog
  level: log_debug
components:
  - libfuse
  - file_cache
  - attr_cache
  - azstorage
libfuse:
  attribute-expiration-sec: 120
  entry-expiration-sec: 120
  negative-entry-expiration-sec: 240
file_cache:
  path: /mnt/ramdisk/blobfuse2tmp
  timeout-sec: 120
  max-size-mb: 4096
attr_cache:
  timeout-sec: 7200
azstorage:
  type: block
  account-name: mystorageaccount
  account-key: mystoragekey
  endpoint: https://mystorageaccount.blob.core.windows.net
  mode: key
  container: mycontainer
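Because this file holds the storage account key, one option is to generate it from environment variables and restrict its permissions to the owner. The script below is only a sketch of that idea: the BF2_* variable names and the default output path are placeholders chosen for this example, and the YAML body is the same sample config shown above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder variable names (assumptions for this sketch); export real values first.
account="${BF2_ACCOUNT:-mystorageaccount}"
key="${BF2_ACCOUNT_KEY:-mystoragekey}"
container="${BF2_CONTAINER:-mycontainer}"
config="${BF2_CONFIG:-$PWD/config.yaml}"

umask 077   # files created from here on are accessible by the owner only
cat > "$config" <<EOF
allow-other: false
logging:
  type: syslog
  level: log_debug
components:
  - libfuse
  - file_cache
  - attr_cache
  - azstorage
libfuse:
  attribute-expiration-sec: 120
  entry-expiration-sec: 120
  negative-entry-expiration-sec: 240
file_cache:
  path: /mnt/ramdisk/blobfuse2tmp
  timeout-sec: 120
  max-size-mb: 4096
attr_cache:
  timeout-sec: 7200
azstorage:
  type: block
  account-name: $account
  account-key: $key
  endpoint: https://$account.blob.core.windows.net
  mode: key
  container: $container
EOF
chmod 600 "$config"   # also tightens a file that already existed
echo "wrote $config"
```

The key is still stored in plain text, so this only narrows who can read it.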
Create group:
sudo groupadd fuse
Add to group:
sudo usermod -aG fuse yanboyang713
Mount with blobfuse
blobfuse2 mount /home/yanboyang713/mycontainer/ --config-file=/home/yanboyang713/fileCacheConfig-ok.yaml --ignore-open-flags --foreground=true --allow-other
NOTE: The mount may print the warning "Ignoring invalid max threads value 4294967295 > max (100000)". To address it, see "Set the max threads value".
Now you can access the blob container through the mounted directory, and files you create there show up in the container
cd ~/mycontainer
mkdir test
echo "hello world" > test/blob.txt
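The manual check above can be wrapped into a repeatable read/write test. This is a minimal sketch: smoke_test is a made-up helper name, and when no directory is given the script falls back to a scratch directory, so pass the mount point (for example ~/mycontainer) as the first argument to test the real mount.

```shell
#!/usr/bin/env bash
# Write a small file under a directory, read it back, and clean up.
# smoke_test is a hypothetical helper name.
smoke_test() {
    local dir="$1" probe expected actual
    probe="$dir/.bf2-smoke-$$.txt"
    expected="hello world"
    echo "$expected" > "$probe" || return 1
    actual="$(cat "$probe")"
    rm -f "$probe"
    [ "$actual" = "$expected" ]
}

# First argument: directory to test; defaults to a scratch directory.
target="${1:-$(mktemp -d)}"
if smoke_test "$target"; then
    echo "read/write OK in $target"
else
    echo "read/write FAILED in $target" >&2
fi
```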
To unmount
sudo blobfuse2 unmount ~/mycontainer
If unmounting fails because /usr/bin/fusermount is missing:
blobfuse2 unmount ~/bf2a/
Error: failed to unmount home/yanboyang713/bf2a [exec: "fusermount": executable file not found in $PATH]
Solution: sudo ln -s /usr/bin/fusermount3 /usr/bin/fusermount
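The symlink fix can be made safe to re-run by creating the link only when fusermount is absent and fusermount3 is present. The sketch below shows that guard; link_fusermount is a name invented here, and the demo works in a scratch directory with a stand-in binary instead of touching /usr/bin.

```shell
#!/usr/bin/env bash
# Link fusermount to fusermount3 inside bin_dir when only the latter exists.
# link_fusermount is a hypothetical helper name.
link_fusermount() {
    local bin_dir="$1"
    if [ -e "$bin_dir/fusermount" ]; then
        echo "fusermount already present in $bin_dir"
    elif [ -x "$bin_dir/fusermount3" ]; then
        ln -s "$bin_dir/fusermount3" "$bin_dir/fusermount"
        echo "linked $bin_dir/fusermount -> fusermount3"
    else
        echo "fusermount3 not found in $bin_dir; install fuse3 first" >&2
        return 1
    fi
}

# Dry run in a scratch directory with a stand-in fusermount3.
demo="$(mktemp -d)"
printf '#!/bin/sh\n' > "$demo/fusermount3"
chmod +x "$demo/fusermount3"
link_fusermount "$demo"
```

On a real system, run the function as root with /usr/bin as the argument.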
Show mount
blobfuse2 mount list
Create User
sudo useradd -m azure
Create directories
mkdir azure-storage-fuse
mkdir mntblobfuse
Create the blob config file: BlobConfigFile=/home/azure/azure-storage-fuse/blobfuse2.yaml
On modern Linux, systemd manages services in a robust way, providing fault tolerance and proper initialization. The following is an example systemd unit for blobfuse.
systemd
/etc/systemd/system/blobfuse2.service
[Unit]
Description=A virtual file system adapter for Azure Blob storage.
After=network.target
[Service]
# Configures the mount point.
Environment=BlobMountingPoint=<path/to/the/mounting/point>
# Config file path
Environment=BlobConfigFile=<path/to/the/config/file>
Type=forking
ExecStart=/usr/bin/blobfuse2 mount ${BlobMountingPoint} --config-file=${BlobConfigFile}
ExecStop=/usr/bin/blobfuse2 unmount ${BlobMountingPoint}
[Install]
WantedBy=multi-user.target
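NOTE: Pay attention to the foreground setting in the config file. Type=forking expects the process to detach into the background, which a mount with foreground: true does not do.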
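To fill in the two placeholders, the unit can be rendered from shell variables into a local file and reviewed before it is copied into place. The script below is a sketch under assumptions: the mount point /home/azure/mntblobfuse is inferred from the directories created earlier, the BF2_* variable names are invented, and the rendered file starts with the [Unit] section header that systemd expects.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed defaults based on the azure user and directories created earlier.
mount_point="${BF2_MOUNT_POINT:-/home/azure/mntblobfuse}"
config_file="${BF2_CONFIG_FILE:-/home/azure/azure-storage-fuse/blobfuse2.yaml}"
unit_out="${BF2_UNIT_OUT:-$PWD/blobfuse2.service}"

# \${...} is escaped so systemd, not the shell, expands it at service start.
cat > "$unit_out" <<EOF
[Unit]
Description=A virtual file system adapter for Azure Blob storage.
After=network.target

[Service]
# Configures the mount point.
Environment=BlobMountingPoint=$mount_point
# Config file path
Environment=BlobConfigFile=$config_file
Type=forking
ExecStart=/usr/bin/blobfuse2 mount \${BlobMountingPoint} --config-file=\${BlobConfigFile}
ExecStop=/usr/bin/blobfuse2 unmount \${BlobMountingPoint}

[Install]
WantedBy=multi-user.target
EOF

echo "wrote $unit_out; review it, then copy it to /etc/systemd/system/blobfuse2.service"
```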
Start systemd unit
sudo systemctl daemon-reload
sudo systemctl start blobfuse2
sudo systemctl status blobfuse2
sudo systemctl enable blobfuse2
https://github.com/mikaelweave/blobfuse-automount/tree/master/etc
https://github.com/Azure/azure-storage-fuse/tree/c8fa8aab4936dcfc32254b8d4f1de818b45bb7ac/systemd/without-config-file
Add an Existing User Account to a Group
sudo usermod -a -G examplegroup exampleusername
How can we make this more secure? As you can see, the storage account key is stored as plain text in a file, and keeping a secret in a file is not very secure. Developers can store secrets securely in Azure Key Vault, but services still need a way to authenticate to Azure Key Vault. Managed identities provide an automatically managed identity in Azure Active Directory (Azure AD) for applications to use when connecting to resources that support Azure AD authentication. Applications can use managed identities to obtain Azure AD tokens without having to manage any credentials. Many Azure services support managed identities; for example, you can assign a managed identity to an Azure VM, and the VM can then use it to access Azure resources. (Think of the identity as belonging to a specific application rather than to a single VM, so that the multiple VMs that make up an application can share it to access services.)
Use managed identity
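As a rough illustration, the azstorage section can drop account-key and switch mode from key to msi. The sketch below writes such a section to a local file. It assumes the VM already has a managed identity with a data-plane role on the storage account (such as Storage Blob Data Contributor), that a system-assigned identity needs no extra fields, and that a user-assigned identity is selected through appid. The BF2_* variable names and the output path are placeholders; check the BlobFuse2 configuration reference before relying on the exact keys.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder variable names (assumptions for this sketch).
account="${BF2_ACCOUNT:-mystorageaccount}"
container="${BF2_CONTAINER:-mycontainer}"
client_id="${BF2_MSI_CLIENT_ID:-}"          # set only for a user-assigned identity
out="${BF2_MSI_SNIPPET:-$PWD/azstorage-msi.yaml}"

{
    echo "azstorage:"
    echo "  type: block"
    echo "  account-name: $account"
    echo "  endpoint: https://$account.blob.core.windows.net"
    echo "  mode: msi"
    # Assumption: appid is needed only to pick a specific user-assigned identity.
    if [ -n "$client_id" ]; then
        echo "  appid: $client_id"
    fi
    echo "  container: $container"
} > "$out"

cat "$out"
```

The generated section replaces the azstorage block in the config file, and no secret is written to disk.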
Troubleshoot
Log file: /var/log/blobfuse2.log
Troubleshooting guide: https://github.com/Azure/azure-storage-fuse/blob/main/TSG.md
Reference List
- https://learn.microsoft.com/en-us/azure/storage/blobs/blobfuse2-what-is
- https://learn.microsoft.com/en-us/azure/storage/blobs/storage-how-to-mount-container-linux
- https://github.com/Azure/azure-storage-fuse
- https://aur.archlinux.org/packages/azure-storage-fuse
- https://learn.microsoft.com/en-us/azure/storage/blobs/blobfuse2-configuration
- https://toggen.com.au/it-tips/blobfuse2/