Slow iSCSI performance on ZFS Volumes (zvol)

TL;DR: For reasons, don’t use ZVOLs for iSCSI volumes. Instead, just use a generic file.

I’ve been reorganizing my lab a bit to consolidate some storage and wanted to experiment with iSCSI. I thought “wow, what a great use-case for ZFS ZVOLS…”.

If you recall, ZFS has the ability to create block devices called ZVOLs. When you do this, you get a new device presented on the machine under /dev/zvol/<poolname>/ that you can use as you would any other disk. As part of my consolidation effort, I decided to use one and present it over iSCSI to my workstation. To my surprise, the performance was dismal, maxing out at around 30MB/s when writing to it over iSCSI.

Here are the steps I took to create the ZVOL and present it over iSCSI. Note, I’m using FreeBSD as my storage server.

# zfs create -V 500gb zroot/luns/backup
# cat > /etc/ctl.conf <<EOF
portal-group pg0 {
	discovery-auth-group no-authentication
	listen 0.0.0.0
}
target iqn.2020-01.life.shaner:target0 {
	portal-group pg0
	lun 0 {
		path /dev/zvol/zroot/luns/backup
		size 500G
	}
}
EOF
# sysrc ctld_enable=YES
# service ctld start
# ctladm lunlist
(7:1:0/0): <FREEBSD CTLDISK 0001> Fixed Direct Access SPC-5 SCSI device
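
Before starting ctld, it can be worth confirming that every LUN path in the config actually exists, since a typo there is easy to miss. The following is only a sketch and not part of ctld itself: lun_paths and check_lun_paths are hypothetical helpers, and they assume each path sits on its own "path <value>" line as in the config above.

```shell
#!/bin/sh
# Hypothetical helper: print every LUN backing path in a ctl.conf-style file.
# Assumes each path is written on its own line as "path <value>".
lun_paths() {
    awk '$1 == "path" { print $2 }' "$1"
}

# Hypothetical helper: report any backing path that does not exist locally.
check_lun_paths() {
    lun_paths "$1" | while read -r p; do
        [ -e "$p" ] || echo "missing: $p"
    done
}
```

Running check_lun_paths /etc/ctl.conf on the storage server should print nothing when every backing device or file is present.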

With this in place, we can move to the (Linux) client machine (initiator in iSCSI parlance) and initiate a connection to the iSCSI drive then format it.

# iscsiadm --mode discovery -t sendtargets --portal 192.168.1.10
192.168.1.10:3260,-1 iqn.2020-01.life.shaner:target0

# iscsiadm --mode node --targetname  iqn.2020-01.life.shaner:target0 --portal 192.168.1.10 --login
Logging in to [iface: default, target: iqn.2020-01.life.shaner:target0, portal: 192.168.1.10,3260]
Login to [iface: default, target: iqn.2020-01.life.shaner:target0, portal: 192.168.1.10,3260] successful.

# dmesg |tail
[117514.525034] sd 9:0:0:0: Attached scsi generic sg5 type 0
[117514.525245] sd 9:0:0:0: Power-on or device reset occurred
[117514.527424] sd 9:0:0:0: [sdg] 1048576000 512-byte logical blocks: (537 GB/500 GiB)
[117514.527428] sd 9:0:0:0: [sdg] 131072-byte physical blocks
[117514.527706] sd 9:0:0:0: [sdg] Write Protect is off
[117514.527709] sd 9:0:0:0: [sdg] Mode Sense: 7f 00 10 08
[117514.528159] sd 9:0:0:0: [sdg] Write cache: enabled, read cache: enabled, supports DPO and FUA
[117514.528750] sd 9:0:0:0: [sdg] Optimal transfer size 8388608 bytes
[117514.675486] sd 9:0:0:0: [sdg] Attached SCSI disk

Let’s format it then mount it. Note this drive will eventually be mounted by a Windows machine thus we’re formatting it with NTFS.

# mkfs.ntfs -Q /dev/sdg
# mount -t ntfs3 /dev/sdg /mnt

At this point I proceeded to copy data onto the drive, where throughput maxed out at 35MB/s. Abysmal. So, I decided to switch from a ZVOL to a plain file on disk and use that instead.

# zfs destroy zroot/luns/backup
# zfs create -o mountpoint=/luns/backup zroot/luns/backup
# cd /luns/backup
# truncate -s 500G disk.img
# sed -i '' 's|path /dev/zvol/.*|path /luns/backup/disk.img|' /etc/ctl.conf
# service ctld restart

After setting it up this way I was maxing out my 1Gb connection with write speeds of over 100MB/s, roughly a 3x improvement in speed.
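
For context on those numbers: a 1Gb/s link tops out at 125MB/s before any TCP or iSCSI overhead, so 100MB/s is close to wire speed. Here is a small sketch of the arithmetic; the helper names are made up for illustration.

```shell
#!/bin/sh
# Raw ceiling of a link in MB/s (decimal), ignoring protocol overhead.
# $1 = link speed in Mbit/s
link_ceiling_mbs() {
    echo $(( $1 / 8 ))
}

# Ratio between two throughput figures, to one decimal place.
# $1 = before (MB/s), $2 = after (MB/s)
speedup() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", b / a }'
}

link_ceiling_mbs 1000   # 1Gb/s link, prints 125
speedup 35 100          # ZVOL-backed vs. file-backed LUN, prints 2.9
```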

Lesson learned.

Finding Idle Cloud Desktops (Linux)

Suppose you’re hosting remote Linux desktops in your cloud environment and want to discover which ones could be shut down to save on valuable resources like money, RAM, or CPU.

Most Linux remote desktop protocols still utilize Xorg (as opposed to Wayland) for their display server. Prime examples would be tigervnc, tightvnc, or X2go. Because of this, the utility xprintidle is still useful for determining how long an X session has been idle, as its name suggests. With it, we can automate the discovery process with a simple script, querying each desktop to see when it was last used. This assumes you have permissions on the host to run commands as the user actually running the X server (or have access to their .Xauthority file).

Depending on your infrastructure you might choose to run something like the below script via SSH, Ansible, Salt Stack, Puppet, or something else.

This is a rough example and assumes the username is the same on all hosts. You’ll likely have different usernames on each host so you’d need to adjust the script to filter out the users and corresponding display number.

#!/usr/bin/env bash
# A contrived example of checking for idle X sessions on remote systems.

HOSTS="host1 host2 host3"
# X_USER/X_DISPLAY are used so we don't clobber the shell's own USER and DISPLAY.
X_USER=shaner          # the user running the X session
X_DISPLAY=:1           # typical/default display for most VNC servers
XPIPATH=./xprintidle   # path to 'xprintidle' binary.

for h in ${HOSTS}; do
  # copy xprintidle to the host, then run it as the owner of the X session
  echo "put ${XPIPATH} /usr/local/bin/" | sftp -b - "root@${h}" >/dev/null
  IDLE_MS=$(ssh "root@${h}" sudo -u "${X_USER}" DISPLAY="${X_DISPLAY}" /usr/local/bin/xprintidle)
  IDLE_MIN=$(( IDLE_MS / 1000 / 60 ))  # xprintidle reports milliseconds
  printf '[*] %s@%s%s idle for %s minutes\n' "${X_USER}" "${h}" "${X_DISPLAY}" "${IDLE_MIN}"
done

Here’s what it looks like in practice. From the output, we could probably shut down host1 for the time being.

$ ./check_idles.sh
[*] shaner@host1:1 idle for 18564 minutes
[*] shaner@host2:1 idle for 20 minutes
[*] shaner@host3:1 idle for 108 minutes
$
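
If you’d rather have a script pick out the shutdown candidates for you, the raw xprintidle values (milliseconds) can be compared against a cutoff. Below is a minimal sketch, assuming you’ve collected "host idle_ms" pairs one per line; idle_hosts and the one-day threshold in the usage note are illustrative and not part of xprintidle.

```shell
#!/bin/sh
# Hypothetical helper: print hosts whose idle time exceeds a threshold.
# stdin: lines of "<host> <idle_ms>"; $1: threshold in minutes.
idle_hosts() {
    threshold_min=$1
    while read -r host idle_ms; do
        # xprintidle reports milliseconds; there are 60000 ms in a minute
        if [ $(( $idle_ms / 60000 )) -gt "$threshold_min" ]; then
            echo "$host"
        fi
    done
}
```

For example, printf 'host1 1113840000\nhost2 1200000\n' | idle_hosts 1440 lists only host1, since it is the only one idle for more than a day.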

Hashicorp Vault Dev Mode

Ever needed to spin up a quick Vault cluster to test commands or functionality? Sure, you could spin up minikube and deploy a helm chart, but what if you could do it even faster, without Kubernetes?

Vault actually has some *currently* undocumented command-line options that can save you a ton of time. Read on, brother.

I debated even writing a post about it because it’s so simple. It’s literally a single command-line flag: -dev-three-node. Below, I’m sending both STDOUT and STDERR to a file called output and running the server in the background, in case you’re not a Linux fan.

$ vault server -dev-three-node -dev-root-token-id="root" > output 2>&1 &

I redirect to a file because the output is too fast to catch the needed info. Let’s use head to see the useful bits.

$ head -30 output
==> Vault server configuration:

                     Cgo: disabled
 Cluster Parameters Path: /tmp/vault-test-cluster-282710121
              Go Version: go1.16.12
               Log Level: info
      Node 0 Api Address: https://127.0.0.1:8200
      Node 1 Api Address: https://127.0.0.1:8201
      Node 2 Api Address: https://127.0.0.1:8202
                 Version: Vault v1.7.9
             Version Sha: 571cd46419fe273d75de1e0d5aa46af60a222961

==> Three node dev mode is enabled

The unseal key and root token are reproduced below in case you
want to seal/unseal the Vault or play with authentication.

Unseal Key 1: +V7oGQ/q3lHGgWoVjRgKxS0OLUs9KZs8aDppOMWcYDFj
Unseal Key 2: ZlmQLgpPohGOAb7m1XUfikiHSneei+AFIwxyqmkNAq5H
Unseal Key 3: tHr08qqUd7GAtcfY+ynqo6+Go2vovj1wbdGIQtSWJ/r0

Root Token: root


Useful env vars:
VAULT_TOKEN=root
VAULT_ADDR=https://127.0.0.1:8200
VAULT_CACERT=/tmp/vault-test-cluster-282710121/ca_cert.pem

==> Vault server started! Log data will stream in below:

Alrighty, let’s just export those variables and we can begin using our cluster!

$ export VAULT_TOKEN=root
$ export VAULT_ADDR=https://127.0.0.1:8200
$ export VAULT_CACERT=/tmp/vault-test-cluster-282710121/ca_cert.pem
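
Since the temp directory name changes on every run, copying those values by hand gets old quickly. As a sketch (the helper name vault_dev_env is made up here, and it assumes the "Useful env vars" lines look exactly as they do in the output above), you could generate the export statements straight from the captured file:

```shell
#!/bin/sh
# Hypothetical helper: turn the VAULT_*=value lines from the captured
# server output into export statements.
# $1 = file holding the server output (called "output" in this post)
vault_dev_env() {
    awk '/^VAULT_[A-Z_]+=/ { print "export " $0 }' "$1"
}
```

Then eval "$(vault_dev_env output)" sets all three variables in the current shell.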

Ok, let’s make sure vault is on the same page as us by checking its status.

$ vault status
Key             Value
---             -----
Seal Type       shamir
Initialized     true
Sealed          false
Total Shares    3
Threshold       3
Version         1.7.9
Storage Type    n/a
Cluster Name    vault-cluster-7a71b0b6
Cluster ID      75e763bc-78f1-9783-8cc4-505a5a5861d9
HA Enabled      true
HA Cluster      https://127.0.0.1:45555
HA Mode         active
Active Since    2022-03-09T02:12:27.947440981Z
$

Looks good! We can now start testing whatever we need. In future posts, we’ll explore more of the cluster and play with some of the available vault secrets engines.
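
If you want to script that status check rather than eyeball it, one option is to pull a single field out of the table. This is only a sketch: status_field is a hypothetical helper, and it assumes the two-column layout shown above, with at least two spaces between key and value.

```shell
#!/bin/sh
# Hypothetical helper: read `vault status` table output on stdin and print
# the value for one key from the left-hand column.
# $1 = key exactly as displayed, e.g. "Sealed" or "HA Mode"
status_field() {
    awk -v key="$1" '
        index($0, key) == 1 {
            rest = substr($0, length(key) + 1)
            # require the column gap so "Seal" does not match "Seal Type"
            if (rest ~ /^  +/) { sub(/^ +/, "", rest); print rest; exit }
        }'
}
```

For example, vault status | status_field Sealed should print false for the cluster above.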

Setup Package Cache Server on Ubuntu 18.04

If you’re like me and have several Debian/Ubuntu machines on your network, there’s going to come a time when you need to upgrade them. Doing so will use up a lot of bandwidth, since every machine will likely be downloading the same packages. This may or may not upset your significant other who’s binge-watching Gilmore Girls on Netflix.

Since you’ve slowed the Internet down to a crawl, this might be a good excuse to leave the computer and get outside for some fresh air. HA, who am I kidding, we got stuff to do. Let’s set up a cache!

Here, I’ll be using Ubuntu 18.04 LTS and setting up apt-cacher-ng.

While we could set up Squid to function in the same way, and cache way more than just Debian/Ubuntu packages, using apt-cacher-ng is a quick win and requires hardly any configuration to get going. Maybe I’ll cover how to set up Squid in a future post.

First, we’ll make sure everything is up to date, then install apt-cacher-ng.

sudo apt-get update
sudo apt-get dist-upgrade -y
sudo apt-get install apt-cacher-ng -y

Let’s go over a few config options. We won’t go over every single one, just the ones that might be relevant. Open /etc/apt-cacher-ng/acng.conf using your favorite text editor and let’s start.

CacheDir. This is where acng will actually do its caching and store packages as they’re downloaded. You may want to change this if you’d like to save packages to a different partition with more space.

CacheDir: /var/cache/apt-cacher-ng

Port. Here, you can change the TCP port apt-cacher-ng will listen on. Note, this should be 1024 or higher. Otherwise, you would need to run acng as root.

Port: 3142

BindAddress. If you have a multi-homed server with several IP addresses, you might want acng to only listen on one. Just provide the IP here to do so. By default, it will listen on all interfaces (0.0.0.0).

BindAddress

ReportPage. If you’d like to see some misc statistics about the caching of packages (hit/miss ratio, space usage, etc…) set the page name here. To disable it, just comment out this line.

ReportPage: acng-report.html

Here’s how the page looks from my server:

ExThreshold. The number of days before deleting unreferenced files. You may want to tweak this to cache packages for longer periods of time.

ExThreshold: 4

MaxDlSpeed. Here, you can limit how much bandwidth acng will use up. Very handy in some environments. Units are KiB/s.

MaxDlSpeed: 250
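
To double-check what a setting currently resolves to without scrolling through the whole file, something like the following sketch can help. acng_opt is a hypothetical helper, not an apt-cacher-ng tool, and it assumes the simple "Key: value" layout shown above (commented-out lines are skipped because they start with "#").

```shell
#!/bin/sh
# Hypothetical helper: print the value of one option from an acng.conf-style file.
# $1 = config file, $2 = option name (e.g. Port, CacheDir)
acng_opt() {
    awk -F': *' -v key="$2" '$1 == key { print $2; exit }' "$1"
}
```

For example, acng_opt /etc/apt-cacher-ng/acng.conf Port should print 3142 with the settings above.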

There are several other options laid out in the config file, feel free to read more on them and tweak as needed. For now, let’s move on to setting up our hosts to point at our new cache server for packages.

Log into one of your Ubuntu/Debian machines and create a new file at /etc/apt/apt.conf.d/21acng. Replace SERVER_IP with the IP address of the cache server you set up. If you specified a BindAddress above, use that one instead.

echo 'Acquire::http::Proxy "http://SERVER_IP:3142";' | sudo tee /etc/apt/apt.conf.d/21acng
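
If you have more than a couple of clients to configure, it may be easier to generate that line than to edit it by hand each time. Here is a minimal sketch; acng_proxy_line is a made-up helper name and the 10.0.0.5 address in the usage note is only a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: print the apt proxy line for a cache server.
# $1 = server IP or hostname, $2 = port (defaults to 3142)
acng_proxy_line() {
    printf 'Acquire::http::Proxy "http://%s:%s";\n' "$1" "${2:-3142}"
}
```

Usage would look like: acng_proxy_line 10.0.0.5 | sudo tee /etc/apt/apt.conf.d/21acng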

Now, let’s try it out! From this client machine, run:

sudo apt-get update

Back on the caching server, you can see what’s happening by tailing the log file.

sudo tail -f /var/log/apt-cacher-ng/apt-cacher.log
.......
............
1532529680|O|227|10.137.5.1|ppa.launchpad.net/openjdk-r/ppa/ubuntu/dists/bionic/InRelease
1532529680|O|228|10.137.5.1|packages.cloud.google.com/apt/dists/cloud-sdk-bionic/InRelease
1532529680|O|217|10.137.5.1|dl.google.com/linux/chrome/deb/dists/stable/Release
.........
..............
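
The log is pipe-delimited, so it’s easy to summarize with awk. The sketch below totals bytes served per client. It assumes the field layout is epoch|direction|bytes|client|path and that "O" marks data sent out to a client; that is my reading of the lines above, so verify it against your own log. bytes_per_client is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: sum bytes served per client from apt-cacher.log-style
# lines on stdin.
# Assumed layout: epoch|direction|bytes|client|path, "O" = sent to a client.
bytes_per_client() {
    awk -F'|' '$2 == "O" { total[$4] += $3 }
               END { for (c in total) print c, total[c] }'
}
```

For example: sudo cat /var/log/apt-cacher-ng/apt-cacher.log | bytes_per_client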

Don’t forget, you can access the statistics page by opening a web browser to http://SERVER_IP:3142/acng-report.html