Slow iSCSI performance on ZFS Volumes (zvol)

TL;DR: Don’t use ZVOLs to back iSCSI volumes; the write performance is terrible. Just use a plain file on a dataset instead.

I’ve been reorganizing my lab a bit to consolidate some storage and wanted to experiment with iSCSI. I thought “wow, what a great use-case for ZFS ZVOLS…”.

If you recall, ZFS has the ability to create block devices called ZVOLs. When you do this, you get a new device presented on the machine under /dev/zvol/<poolname>/ that you can use as you would any other disk. As part of my consolidation effort, I decided to use one and present it over iSCSI to my workstation. To my surprise, the performance was dismal, maxing out at around 30MB/s when writing to it over iSCSI.

Here are the steps I took to create the ZVOL and present over iSCSI. Note, I’m using FreeBSD as my storage server.

# zfs create -V 500gb zroot/luns/backup
# cat > /etc/ctl.conf <<EOF
portal-group pg0 {
	discovery-auth-group no-authentication
	listen 0.0.0.0
}
target iqn.2020-01.life.shaner:target0 {
	portal-group pg0
	lun 0 {
		path /dev/zvol/zroot/luns/backup
		size 500G
	}
}
EOF
# sysrc ctld_enable=YES
# service ctld start
# ctladm lunlist
(7:1:0/0): <FREEBSD CTLDISK 0001> Fixed Direct Access SPC-5 SCSI device

With this in place, we can move to the (Linux) client machine (the initiator, in iSCSI parlance) and connect to the iSCSI target.

# iscsiadm --mode discovery -t sendtargets --portal 192.168.1.10
192.168.1.10:3260,-1 iqn.2020-01.life.shaner:target0

# iscsiadm --mode node --targetname  iqn.2020-01.life.shaner:target0 --portal 192.168.1.10 --login
Logging in to [iface: default, target: iqn.2020-01.life.shaner:target0, portal: 192.168.1.10,3260]
Login to [iface: default, target: iqn.2020-01.life.shaner:target0, portal: 192.168.1.10,3260] successful.

# dmesg |tail
[117514.525034] sd 9:0:0:0: Attached scsi generic sg5 type 0
[117514.525245] sd 9:0:0:0: Power-on or device reset occurred
[117514.527424] sd 9:0:0:0: [sdg] 1048576000 512-byte logical blocks: (537 GB/500 GiB)
[117514.527428] sd 9:0:0:0: [sdg] 131072-byte physical blocks
[117514.527706] sd 9:0:0:0: [sdg] Write Protect is off
[117514.527709] sd 9:0:0:0: [sdg] Mode Sense: 7f 00 10 08
[117514.528159] sd 9:0:0:0: [sdg] Write cache: enabled, read cache: enabled, supports DPO and FUA
[117514.528750] sd 9:0:0:0: [sdg] Optimal transfer size 8388608 bytes
[117514.675486] sd 9:0:0:0: [sdg] Attached SCSI disk

Let’s format the new disk, then mount it. Note: this drive will eventually be mounted by a Windows machine, so we’re formatting it with NTFS.

# mkfs.ntfs -Q /dev/sdg
# mount -t ntfs3 /dev/sdg /mnt

At this point I proceeded to copy data onto the drive where it maxed out at 35MB/s. Abysmal. So, I decided to switch from ZVOL to a plain file on disk and use that instead.

# zfs destroy zroot/luns/backup
# zfs create -o mountpoint=/luns/backup zroot/luns/backup
# cd /luns/backup
# truncate -s 500G disk.img
# sed -i '' 's|/dev/zvol/zroot/luns/backup|/luns/backup/disk.img|' /etc/ctl.conf
# service ctld restart
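
If the sed worked, the lun block in /etc/ctl.conf should now read:

	lun 0 {
		path /luns/backup/disk.img
		size 500G
	}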

After setting it up this way I was maxing out my 1Gb connection with write speeds of over 100MB/s, roughly a 3x improvement.
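
For reference, here’s roughly how I was checking sequential write speed from the client side (a crude sketch; the test file path is just an example):

# dd if=/dev/zero of=/mnt/speedtest.bin bs=1M count=4096 conv=fsync status=progress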

Lesson learned.

Finding Idle Cloud Desktops (Linux)

Suppose you’re hosting remote Linux desktops in your cloud environment and want to discover which ones could be shut down to save valuable resources like money, RAM, or CPU.

Most Linux remote desktop protocols still utilize Xorg (as opposed to Wayland) for their display server. Prime examples would be tigervnc, tightvnc, or X2go. Because of this, the utility xprintidle is still useful for determining how long an X session has been idle, as its name suggests. With it, we can automate the discovery process with a simple script, querying each desktop to see when it was last used. This assumes you have permissions on the host to run commands as the user actually running the X server (or have access to their .Xauthority file).

Depending on your infrastructure you might choose to run something like the below script via SSH, Ansible, Salt Stack, Puppet, or something else.

This is a rough example and assumes the username is the same on all hosts. In reality you’ll likely have different usernames on each host, so you’d need to adjust the script to look up each user and their corresponding display number.

#!/usr/bin/env bash
# A contrived example of checking for idle X sessions on remote systems.

HOSTS="host1 host2 host3"
USER=shaner  # the user running the X session
DISPLAY=:1  # typical/default display for most VNC servers
XPIPATH=./xprintidle  # path to 'xprintidle' binary.

for h in ${HOSTS}; do
  echo "put ${XPIPATH} /usr/local/bin/" | sftp -b- root@$h >/dev/null
  IDLE=$(ssh root@$h sudo -u ${USER} DISPLAY=${DISPLAY} /usr/local/bin/xprintidle)
  IDLE=$(echo $IDLE/1000/60 | bc)
  printf "[*] ${USER}@${h}${DISPLAY} idle for ${IDLE} minutes\n"
done

Here’s what it looks like in practice. From the output, we could probably shutdown host1 for the time being.

$ ./check_idles.sh
[*] shaner@host1:1 idle for 18564 minutes
[*] shaner@host2:1 idle for 20 minutes
[*] shaner@host3:1 idle for 108 minutes
$
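
From there it’s a small step to act on the results automatically. A rough sketch, dropped into the loop body above (assuming an idle threshold of 24 hours and that you’re comfortable powering hosts off over SSH):

THRESHOLD=1440  # minutes (24 hours)
if [ "${IDLE}" -gt "${THRESHOLD}" ]; then
  echo "[*] ${h} has been idle for ${IDLE} minutes, shutting it down"
  ssh root@${h} poweroff
fi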

Create SSL Cert and Key

Sometimes during development you may find yourself needing an SSL certificate and key to test with. I’ve had to do this so much I went ahead and added the below function to my ~/.bashrc file.

createss () 
{ 
    openssl req -x509 -nodes -newkey rsa:4096 \
      -keyout ${1}.key -out ${1}.crt -days 365 \
      -subj "/C=US/ST=Ohio/L=Elida/O=ShanerOPS/OU=OPS/CN=${1}"
}

Now, I can create certs on-the-fly without having to look it up in my notes. Here’s how it looks in practice.

$ createss demo.site.local
Generating a RSA private key
...........snip....

$ ls -l
-rw------- 1 shane shane 3272 Jul 20 20:47 demo.site.local.key
-rw-rw-r-- 1 shane shane 2033 Jul 20 20:47 demo.site.local.crt
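
If you want to double-check what was generated, openssl will print the subject and validity dates back out:

$ openssl x509 -in demo.site.local.crt -noout -subject -dates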

Windows VM using LXD

It’s not entirely obvious how to create a Windows virtual machine when using LXD. Here are the most basic steps to get it up and running. This is largely for my own documentation, but it will probably help someone else out there.

The easiest option is to embed the VirtIO drivers directly into the Windows ISO using distrobuilder (via snap). This is the method I’ll be demonstrating. Alternatively, you could attach both ISOs and, when it comes time to select a drive to install to in the Windows installer, click ‘load drivers from removable media’ and select them from there.

First, install the needed tools. Note, you’re more than welcome to compile distrobuilder from source, but using the snap is much quicker.

sudo snap install distrobuilder --classic
sudo apt install -y libguestfs-tools wimtools

Proceed to download the needed ISOs: the Windows 10 ISO from Microsoft and the virtio-win drivers ISO from the Fedora project.

Now we’re ready to embed the drivers directly into the Windows ISO. Here’s the command:

sudo distrobuilder repack-windows /home/shaner/Downloads/win10_21H2.iso ./win10_packed.iso --drivers=/home/shaner/Downloads/virtio-win.iso

Next, we create an empty machine (VM), set the disk size, and attach our custom ISO with the VirtIO drivers.

lxc init win10pro --empty --vm -c security.secureboot=false
lxc config device override win10pro root size=40GiB
lxc config device add win10pro iso disk source=/home/shaner/win10_packed.iso  boot.priority=10

Now, start the machine and attach to the VGA console to walk-through the installer:

lxc start win10pro --console=vga

Once the install is complete, be sure to enable Remote Desktop, as it’s a much better experience than using the SPICE console.
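
Since Windows guests don’t run the LXD agent, lxc list won’t show the VM’s IP address, so grab it from your DHCP server (or from inside Windows) and connect with the RDP client of your choice. For example, with xfreerdp (the username is just an example):

xfreerdp /v:<vm-ip> /u:shaner /dynamic-resolution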

Hashicorp Vault Dev Mode

Ever needed to spin-up a quick Vault cluster to test commands or functionality? Sure, you could spin up minikube and deploy a helm chart, but what if you could do it even faster, without Kubernetes?

Vault actually has some *currently* undocumented command-line options that can save you a ton of time. Read on, brother.

I debated even writing a post about it because it’s so simple. It’s literally a command-line flag: -dev-three-node. Below, I’m redirecting STDERR to STDOUT and sending everything to a file called output (that’s the > output 2>&1 bit, in case you’re not a Linux person).

$ vault server -dev-three-node -dev-root-token-id="root" > output 2>&1 &

I redirect to a file because the output is too fast to catch the needed info. Let’s use head to see the useful bits.

$ head -30 output
==> Vault server configuration:

                     Cgo: disabled
 Cluster Parameters Path: /tmp/vault-test-cluster-282710121
              Go Version: go1.16.12
               Log Level: info
      Node 0 Api Address: https://127.0.0.1:8200
      Node 1 Api Address: https://127.0.0.1:8201
      Node 2 Api Address: https://127.0.0.1:8202
                 Version: Vault v1.7.9
             Version Sha: 571cd46419fe273d75de1e0d5aa46af60a222961

==> Three node dev mode is enabled

The unseal key and root token are reproduced below in case you
want to seal/unseal the Vault or play with authentication.

Unseal Key 1: +V7oGQ/q3lHGgWoVjRgKxS0OLUs9KZs8aDppOMWcYDFj
Unseal Key 2: ZlmQLgpPohGOAb7m1XUfikiHSneei+AFIwxyqmkNAq5H
Unseal Key 3: tHr08qqUd7GAtcfY+ynqo6+Go2vovj1wbdGIQtSWJ/r0

Root Token: root


Useful env vars:
VAULT_TOKEN=root
VAULT_ADDR=https://127.0.0.1:8200
VAULT_CACERT=/tmp/vault-test-cluster-282710121/ca_cert.pem

==> Vault server started! Log data will stream in below:

Alrighty, let’s just export those variables and we can begin using our cluster!

$ export VAULT_TOKEN=root
$ export VAULT_ADDR=https://127.0.0.1:8200
$ export VAULT_CACERT=/tmp/vault-test-cluster-282710121/ca_cert.pem

Ok, let’s make sure vault is on the same page as us by checking its status.

$ vault status
Key             Value
---             -----
Seal Type       shamir
Initialized     true
Sealed          false
Total Shares    3
Threshold       3
Version         1.7.9
Storage Type    n/a
Cluster Name    vault-cluster-7a71b0b6
Cluster ID      75e763bc-78f1-9783-8cc4-505a5a5861d9
HA Enabled      true
HA Cluster      https://127.0.0.1:45555
HA Mode         active
Active Since    2022-03-09T02:12:27.947440981Z
$

Looks good! We can now start testing whatever we need. In future posts, we’ll explore more of the cluster and play with some of the available vault secrets engines.
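
If you want a quick smoke test in the meantime, enabling a kv-v2 engine and writing/reading a secret is enough to prove the cluster works (the mount path and key here are arbitrary):

$ vault secrets enable -path=kv kv-v2
$ vault kv put kv/demo user=shaner
$ vault kv get kv/demo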

IP Address Discovery for LXD Machines

I’m currently working on a side project that uses LXD as the primary back-end hypervisor (more backends to come, looking at libcloud). It seems pretty well thought out and, so far, it’s been really nice to work with.

I did run into a snag however, but it’s not really a fault of LXD. I needed to be able to learn the IP address of a VM that doesn’t have the LXD-agent running. This is the case for any machine images not provided by Canonical’s upstream image servers. For example, Windows, FreeBSD, OpenBSD, and any custom Linux distribution certainly will not have it available by default.

Doing some research, I discovered that the dnsmasq DHCP server has the ability to call external programs when creating new leases. This is perfect and should be easy enough to implement. Here’s a snippet from the dnsmasq man page:

--dhcp-script=<path> Whenever a new DHCP lease is created, or an old one destroyed, or a TFTP file transfer completes, the executable specified by this option is run. <path> must be an absolute pathname, no PATH search occurs. The arguments to the process are "add", "old" or "del", the MAC address of the host (or DUID for IPv6) , the IP address, and the hostname, if known.
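
In other words, whenever a lease is created dnsmasq will invoke whatever script we point it at, roughly like this (the MAC, IP, and hostname here are made up):

/path/to/script add 00:16:3e:aa:bb:cc 172.16.0.57 win10pro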

So, I needed some sort of middleware that dnsmasq could push new leases to as they’re created. Since I already had a redis server running to handle celery tasks, I decided to just use that.

Great, let’s whip up a python script for dnsmasq to call out to and update redis.

#!/usr/bin/env python3
# /usr/local/bin/dhcpredis.py
# Called by dnsmasq (dhcp-script) as: <add|old|del> <MAC> <IP> [hostname]

import sys

import redis


if len(sys.argv) < 4:
    sys.exit(1)

op = sys.argv[1]   # "add", "old", or "del"
mac = sys.argv[2]  # MAC address (or DUID for IPv6)
ip = sys.argv[3]
# hostname = sys.argv[4]  # Not interested at this time.

RHOST = '172.16.0.252'
RPORT = 6379
RDB = 0

r = redis.Redis(host=RHOST, port=RPORT, db=RDB)
if 'add' in op:
    r.set(mac, ip)

if 'del' in op:
    r.delete(mac)

Certainly some room for improvement in the script above, but it gets the point across. If it’s not apparent, we’re using the MAC address as the redis key and the IP as its value. Now, make the script executable, configure dnsmasq to point at it, and restart dnsmasq to pick up the changes.

# chmod +x /usr/local/bin/dhcpredis.py
# echo "dhcp-script=/usr/local/bin/dhcpredis.py" >> /etc/dnsmasq.conf
# systemctl restart dnsmasq
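
To sanity-check that leases are actually landing in redis, you can query it directly (the MAC address here is made up):

# redis-cli -h 172.16.0.252 get 00:16:3e:aa:bb:cc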

Awesome, we’re good to go! The last thing I had to do was update my code to pull the KEY:VALUE pair out of redis. If you’re curious what that might look like, here’s the (poorly-coded) function I used. Note, I’m using the python LXD library (pylxd) and getting the MAC address from LXD directly.

def get_mach_ip(mach):
    """Poll redis until the DHCP lease for this machine's MAC shows up."""
    retries = 60
    sleep_time = 5
    hwaddr = None
    r = redis.Redis(host=app.config['REDIS_HOST'],
                    port=app.config['REDIS_PORT'])
    while retries > 0:
        retries -= 1
        try:
            # LXD records the NIC's MAC address in the instance config.
            hwaddr = mach.config['volatile.eth0.hwaddr']
            cip = r.get(hwaddr)
            if cip:
                return cip.decode('utf-8')
        except Exception:
            pass
        logging.info(f'Waiting on IP for ether {hwaddr}')
        time.sleep(sleep_time)
    return None

That’s pretty much it!