Performance Optimization

Overview

To ensure optimal performance of the Replicated PV Mayastor, it is crucial to allocate dedicated CPU resources to its I/O engine. This document outlines best practices and advanced configuration options to reduce latency, increase throughput, and optimize the use of available CPU and network infrastructure.

Performance Optimization: CPU Isolation

The Replicated PV Mayastor fully utilizes the CPU cores assigned to it by spawning a dedicated reactor thread on each one. These reactor threads run continuously, serving I/O operations without sleeping or blocking. Other threads within the I/O engine, which are not bound to specific CPUs, may block or sleep as needed.

For optimal performance, it is important that these bound reactor threads experience minimal interruptions. Ideally, they should only be interrupted by essential kernel-based time accounting processes. In practice, this is difficult to achieve, but improvements can be made using the isolcpus kernel parameter.

The isolcpus boot parameter does not prevent kernel threads or other Kubernetes pods from running on the isolated CPUs. However, it does prevent system services such as kubelet from interfering with the I/O engine's dedicated cores.

Configure Kernel Boot Parameters

Add the isolcpus kernel parameter to instruct the Linux scheduler to isolate specific CPU cores from general scheduling.

The location of the GRUB configuration file may vary depending on your Linux distribution. For example:

  • Most Linux distributions: /etc/default/grub
  • Ubuntu 20.04 on AWS EC2: /etc/default/grub.d/50-cloudimg-settings.cfg

In this example, we isolate CPU cores 2 and 3 (on a 4-core system).

Add isolcpus parameter to GRUB
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash isolcpus=2,3"

Update GRUB Configuration

After modifying the GRUB configuration file, update the bootloader to apply changes.

Update GRUB
sudo update-grub
Sample Output
Sourcing file `/etc/default/grub'
Sourcing file `/etc/default/grub.d/40-force-partuuid.cfg'
Sourcing file `/etc/default/grub.d/50-cloudimg-settings.cfg'
Sourcing file `/etc/default/grub.d/init-select.cfg'
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.8.0-29-generic
Found initrd image: /boot/microcode.cpio /boot/initrd.img-5.8.0-29-generic
Found linux image: /boot/vmlinuz-5.4.0-1037-aws
Found initrd image: /boot/microcode.cpio /boot/initrd.img-5.4.0-1037-aws
Found Ubuntu 20.04.2 LTS (20.04) on /dev/xvda1
done
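
The update-grub command is specific to Debian-based distributions such as Ubuntu. On distributions that do not ship it (for example, RHEL or Fedora), the equivalent step is to regenerate the configuration with grub2-mkconfig; note that the output path may differ on EFI systems.

Update GRUB (Non-Debian Distributions)
sudo grub2-mkconfig -o /boot/grub2/grub.cfg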

Reboot the System

Reboot the system to enable the new kernel parameters.

Reboot
sudo reboot

Verify Isolated CPU Cores

Once the system is back online, confirm that the isolcpus parameter is active and functioning as expected.

View Kernel Boot Parameters
cat /proc/cmdline
Sample Output
BOOT_IMAGE=/boot/vmlinuz-5.8.0-29-generic root=PARTUUID=7213a253-01 ro console=tty1 console=ttyS0 nvme_core.io_timeout=4294967295 isolcpus=2,3 panic=-1
View Isolated CPUs
cat /sys/devices/system/cpu/isolated
Sample Output
2-3
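
As an additional spot check, list the threads currently running on the isolated cores. On a correctly isolated system this should show little beyond per-CPU kernel threads. This is a quick sketch, assuming cores 2 and 3 as in the example above:

List Threads on the Isolated Cores
ps -eLo psr,pid,comm | awk '$1 == 2 || $1 == 3'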

Update Helm Configuration

To ensure the Replicated PV Mayastor utilizes the isolated cores, update its configuration using the kubectl puls8 mayastor plugin.

Ensure that the kubectl puls8 mayastor plugin is installed and matches the Helm chart version of your deployment.

Update CPU Core Allocation
kubectl puls8 mayastor upgrade -n <namespace> --set 'openebs.mayastor.io_engine.coreList={2,3}'

CPU core indexing begins at 0. Therefore, coreList={2,3} corresponds to the third and fourth cores.
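
After the upgrade completes, you can confirm on the storage node that the I/O engine's reactor threads landed on the isolated cores. This is a sketch that assumes the engine's process name contains io-engine; the psr column shows the core each thread is currently assigned to:

Verify Reactor Thread Placement
ps -eLo psr,pid,comm | grep io-engine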

Performance Optimization: RDMA Enablement

Remote Direct Memory Access (RDMA) support in Replicated PV Mayastor enables significant improvements in storage performance by reducing latency and increasing throughput for workloads using NVMe-over-Fabrics (NVMe-oF). This feature utilizes RDMA-capable network interfaces (RNICs) to achieve high-speed, low-latency communication across nodes.

Requirements

Interface Validation

Ensure the interface specified by the openebs.mayastor.io-engine.target.nvmf.iface Helm parameter exists on all io-engine nodes and is RDMA-capable. If it is not, those nodes will fall back to TCP communication.

Application Node Requirements

Application nodes must also have RDMA-capable devices to establish RDMA connections. This requirement is independent of the iface parameter and specific to where the application is scheduled.
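
To confirm whether a node exposes an RDMA-capable device, list its RDMA links using the rdma utility from the iproute2 package; an empty result means no RDMA device is visible on that node:

List RDMA Devices
rdma link show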

Enabling RDMA via Helm

To enable the RDMA feature via Helm:

  1. Set openebs.mayastor.io-engine.target.nvmf.rdma.enabled to true.
  2. Set openebs.mayastor.io-engine.target.nvmf.iface to a valid network interface name that exists on an RNIC.
  3. Verify that all nodes are properly configured with RDMA-capable hardware and that network interfaces are correctly identified and accessible.
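
For example, both values can be set using the same plugin-based upgrade flow shown for the core list above. This is a sketch; replace <rdma-interface> with the name of an RNIC-backed interface present on your io-engine nodes:

Enable RDMA
kubectl puls8 mayastor upgrade -n <namespace> --set 'openebs.mayastor.io-engine.target.nvmf.rdma.enabled=true' --set 'openebs.mayastor.io-engine.target.nvmf.iface=<rdma-interface>'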
  • Once enabled, all Replicated PV Mayastor volumes will attempt RDMA connections.
  • If an application runs on a non-RDMA-capable node, it will fall back to TCP unless fallback is disabled by setting the following Helm value to false:
Disable TCP Fallback
openebs.mayastor.csi.node.nvme.tcpFallback=false

When fallback is disabled, pods on non-RDMA nodes will fail to connect to volumes. Either re-enable fallback or move the pods to RDMA-capable nodes.

  • Software-emulated RDMA (Soft-RoCEv2) is supported on nodes without RNICs. Create a virtual RDMA device using:
Create RDMA Device on a Standard Ethernet Interface
rdma link add rxe0 type rxe netdev eth0

GID assignment on Soft-RoCEv2 depends on CNI and cluster networking. Variability in behavior has not been fully tested.
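
After creating the link, the new device should appear under the kernel's InfiniBand class directory. This is a quick check, assuming the rxe0 name used above; note that rxe links created this way do not persist across reboots:

Verify the Soft-RoCE Device
ls /sys/class/infiniband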
