Known Issues
The following is intended to make DataCore Software users aware of issues that may affect performance or access, or that may give unexpected results under specific conditions, when SANsymphony is used in configurations with VMware ESXi hosts.
Some of these Known Issues were found during DataCore’s own testing; others have been reported by users. Often, the solutions identified for these issues were not related to DataCore's own products.
DataCore cannot be held responsible for incorrect information regarding another vendor’s products, and no assumption should be made that DataCore has any communication with these other vendors regarding the issues listed here.
We recommend contacting the vendors directly for more information on anything listed in this section.
For ‘Known Issues’ that apply specifically to DataCore Software’s own products, refer to the relevant DataCore Software Component’s release notes.
This section includes the following topics:
ESXi Host Settings
Affects ESXi 7.0 Update 2 and 2a, and ESXi 6.7 Update 3
Using DataCore’s recommended DiskMaxIOSize of 512KB causes unexpected IO timeouts and latency.
As reported in ESXi 7.0 PR 2751564 and ESXi 6.7 PR 2752542
If the DiskMaxIOSize advanced configuration option is changed to a lower value, I/Os with large block sizes might be incorrectly split and queue at the PSA path. As a result, ESXi host I/O operations might time out and fail, causing unexpected IO timeouts on the host and increased latency.
Workaround:
Users running ESXi 7.0 Update 2 or 2a, or ESXi 6.7 Update 3, should increase the DiskMaxIOSize setting from the DataCore-recommended 512KB to 1024KB as a workaround. Refer to VMware ESXi Host Settings for how to change the ESXi DiskMaxIOSize setting.
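As a sketch, assuming shell (SSH) access to the host, the setting can also be checked and changed with the standard esxcli advanced-settings commands (the value is specified in KB):
# Show the current value
esxcli system settings advanced list -o /Disk/DiskMaxIOSize
# Raise the maximum I/O size to 1024KB
esxcli system settings advanced set -o /Disk/DiskMaxIOSize -i 1024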
Permanent Fix:
ESXi 7.0 Update 2 and 2a
Apply either ESXi 7.0 Update 2c or Update 3 and later.
A description of the fix can be found in the Release Notes.
ESXi 6.7 Update 3
A fix is available in Patch Release ESXi670-202111001 or later.
A description of the fix can be found in the Release Notes.
High Availability
Affects ESXi 6.x and 7.x
Hosts may see premature APD events during failover when a DataCore Server is stopped or shut down if the SATP ‘action_OnRetryErrors’ default setting is used
The default ‘action_OnRetryErrors’ setting has alternated between disabled and enabled, and back to disabled again, across successive releases of ESXi 6.x. Therefore, to guarantee the expected failover behavior when a DataCore Server is stopped or shut down, DataCore requires that the SATP ‘action_OnRetryErrors’ setting is always explicitly set to ‘disabled’ (i.e., off) on any ESXi 6.x host. Also see VMware Path Selection Policies – Round Robin PSP for more information.
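As an illustrative sketch only, using the standard esxcli NMP commands (the vendor/model strings assume DataCore virtual disks report as vendor ‘DataCore’, model ‘Virtual Disk’, and the Round Robin PSP follows DataCore’s recommended settings; verify both against your own configuration):
# Disable action_OnRetryErrors on one existing device (naa.xxx is a placeholder device ID)
esxcli storage nmp satp generic deviceconfig set -c disable_action_OnRetryErrors -d naa.xxx
# Or add a claim rule so the setting is applied whenever DataCore devices are claimed
esxcli storage nmp satp rule add -s VMW_SATP_ALUA -V DataCore -M "Virtual Disk" -P VMW_PSP_RR -o disable_action_OnRetryErrors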
Affects all ESXi Versions
Sharing the same ‘inter-site’ connection for both Front-end (FE) and Mirror (MR) ports may result in loss of access to virtual disks for ESXi hosts if a failure occurs on that shared connection.
Sharing the same physical connection for both FE and MR ports will work as expected while everything is healthy, but any kind of failure event on that single link may cause both MR and FE I/O to fail at the same time. This will make virtual disks unexpectedly inaccessible to hosts even though there is, technically, still an available I/O path to one of the DataCore Servers.
This is not a DataCore issue per se, as the correct SCSI notification (LUN_NOT_AVAILABLE) is sent back to the ESXi hosts to inform them that a path to the virtual disk is no longer available. However, the ESXi host will ignore this SCSI response and continue to try to access the virtual disk on a path reported as either 'Permanent Device Loss' (PDL) or 'All-Paths-Down' (APD); the host will not attempt any 'failover' (HA) or ‘move’ (Fault Tolerance) and so will lose access to the virtual disk.
Because of this ESXi behaviour, DataCore cannot guarantee failover when hosts are being served virtual disks (FE I/O) over the same physical link(s) that are also used for MR I/O, and recommends that at least two physically separate links are used: one for MR I/O and the other for FE I/O.
Affects ESXi 6.7
Failover/Failback may take significantly longer than expected.
Users have reported that, before ESXi 6.7 Patch Release ESXi-6.7.0-20180804001 (or later) was applied, ESXi failover could take more than 5 minutes.
DataCore recommends applying the most up-to-date patches to your hosts.
Fibre Channel Adaptors
Affects all ESXi versions
Hosts with Fibre Channel HBAs that are FC-NVME capable may report their storage adapters twice in ESXi and be unable to discover SANsymphony Virtual Disks served to them
FC-NVME is not supported for SANsymphony Virtual Disk mappings. Turn off the FC-NVME feature on the ESXi host’s HBA driver by using one of the following commands:
Qlogic HBAs:
esxcfg-module -s 'ql2xnvmesupport=0' qlnativefc
Emulex HBAs:
esxcli system module parameters set -m lpfc -p lpfc_enable_fc4_type=1
A reboot of the ESXi host is required in both cases.
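After the reboot, the parameter values can be confirmed from the ESXi shell, for example:
# QLogic: confirm ql2xnvmesupport is now 0
esxcli system module parameters list -m qlnativefc | grep ql2xnvmesupport
# Emulex: confirm lpfc_enable_fc4_type is now 1
esxcli system module parameters list -m lpfc | grep lpfc_enable_fc4_type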
Affects ESXi 7.x and 6.7
Hosts with QLogic HBAs connected to DataCore front-end (FE) ports remain ‘logged out’
Any event that causes a fibre channel port to log out (e.g. after stopping and starting a DataCore Server) may result in the ESXi host logs reporting many ‘ADISC failure’ messages, preventing the port from logging back in. Manually re-initializing the DataCore FE port may work around the problem. For ESXi 7.x, this affects all ESXi qlnativefc driver versions between 4.1.34.0 and 4.1.35.0.
For ESXi 6.7, the affected version is 3.1.36.0.
DataCore recommends a minimum qlnativefc driver version of 4.1.35.0 for ESXi 7.x or 3.1.65.0 for ESXi 6.7.
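The driver version currently loaded on a host can be checked from the ESXi shell, for example:
# Show the qlnativefc module details, including its version
esxcli system module get -m qlnativefc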
Affects ESXi 7.0 Update 2
Increased read latency on hosts with QLogic 16Gb Fibre Channel adapters using the qlnativefc driver under certain conditions
Refer to the Networking Issues section for more information.
Affects ESXi 6.7
When using QLogic's Dual-Port, 10Gbps Ethernet-to-PCIe Converged Network Adaptor
When used in Cisco UCS solutions, disable both the adaptor's ‘BIOS’ settings and its 'Select a LUN to Boot from' option to avoid misleading and unexpected ‘disconnection-type’ messages being reported by the DataCore Server Front End ports during a reboot of hosts sharing these adaptors.
iSCSI Connections
Affects all ESXi Versions
ESXi hosts may experience degraded IO performance when Delayed ACK is 'enabled' on ESXi’s iSCSI initiator
For more specific information on how to disable the 'Delayed ACK' feature on ESXi hosts, refer to the section ‘Configuring Delayed ACK in ESXi’ in VMware’s own knowledge base article. Note that a reboot of the ESXi host will be required.
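As a sketch only (adapter names vary per host, and VMware’s knowledge base article remains the authoritative procedure), the setting can typically be changed for a software iSCSI adapter from the ESXi shell:
# Check the current DelayedAck value for the iSCSI adapter (vmhba64 is a placeholder)
esxcli iscsi adapter param get -A vmhba64
# Disable Delayed ACK, then reboot the host
esxcli iscsi adapter param set -A vmhba64 -k DelayedAck -v false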
Native Drive Sector Size Considerations
Affects all ESXi Versions
Support statement for 512e and 4K Native drives for VMware vSphere and vSAN
Refer to the Knowledge base article for more information.
Also see the section ‘Device Sector Formats’ in VMware’s own vSphere Storage guide for ESXi 8.0 for further limitations of 4K-enabled storage devices. Note that it is possible to create 4KB virtual disks from a 512B-configured Disk Pool.
Refer to the 4 KB Sector Support documentation for more information.
(Un)Serving Virtual Disks
Affects all ESXi Versions
ESXi hosts need to perform a ‘manual’ rescan whenever virtual disks are unserved
Without this rescan, ESXi hosts will continue to send SCSI requests to the DataCore Servers for these now-unserved virtual disks. The DataCore Servers will respond appropriately (i.e. with an ILLEGAL_REQUEST), but in extreme cases (e.g. when large numbers of virtual disks are suddenly unserved, or when one or more virtual disks have been unserved from a large number of ESXi hosts) the volume of continual SCSI responses generated between the ESXi hosts and the DataCore Servers can significantly interfere with normal IO on the Front End ports. This impacts overall performance for any other host using the same front-end ports that the virtual disks were served from.
Refer to VMware Knowledge Base articles 2004605 and 1003988.
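The manual rescan can be performed from the vSphere client or, for example, from the ESXi shell:
# Rescan all storage adapters so the host stops probing the unserved virtual disks
esxcli storage core adapter rescan --all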
VAAI
Affects ESXi 8.x, 7.x, and 6.7
Under ‘heavy’ load the VMFS heartbeat may fail with a 'false' ATS miscompare message.
Previously, the ESXi VMFS 'heartbeat' used normal SCSI reads and writes to perform its function. Starting with ESXi 6.0, the heartbeat method was changed to send ESXi's VAAI ATS commands directly to the storage array (i.e., the DataCore Server). DataCore Servers do not require (and thus do not support) these ATS commands, so it is recommended to disable the VAAI ATS heartbeat setting.
Refer to the Knowledge base article.
If ESXi hosts are connected to other storage arrays, contact VMware to confirm whether it is safe to disable this setting for those arrays.
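For reference, a sketch of checking and disabling the ATS heartbeat from the ESXi shell, using the VMware-documented advanced option for VMFS5 datastores (confirm against the knowledge base article above before applying):
# Show the current setting (1 = ATS is used for the VMFS heartbeat)
esxcli system settings advanced list -o /VMFS3/UseATSForHBOnVMFS5
# Disable the ATS-based heartbeat
esxcli system settings advanced set -i 0 -o /VMFS3/UseATSForHBOnVMFS5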
VMotion
Affects ESXi 6.7
VMs get corrupted on vVOL datastores after vMotion
When a VM residing on a vVOL datastore is migrated to another host using vMotion, either by DRS or manually, and the VM has one or more of the following features enabled:
- CBT
- VFRC
- IOFilter
- VM Encryption
corruption of data, backups or replicas, and/or performance degradation, may be experienced after the vMotion. Refer to the Knowledge base article for more information.
vSphere
Affects ESXi 7.x and 6.7
Cannot extend datastore using VMware’s vCenter
If a SANsymphony virtual disk served to more than one ESXi host does not use the same LUN number on all front-end paths for all hosts, and its logical size is then extended, vSphere may not be able to display the LUN in its UI to expand the VMware datastore. The following VMware article provides steps to work around the issue:
Refer to the Knowledge base article for more information.
While SANsymphony will always attempt to use a matching LUN number on all hosts for the same virtual disk, in some cases this is not possible (e.g. if the LUN number is already in use by an already-mapped virtual disk). This has no functional impact on SANsymphony. Also see the Serving Virtual Disks section, 'To more than one host port'.
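To see which LUN number each host is using for a given virtual disk, the paths can be listed from each ESXi shell; the LUN number is the L field of each path's runtime name, e.g. vmhba2:C0:T1:L4 (a sketch, with naa.xxx as a placeholder device ID):
# List all paths for the device; compare the L<number> values across hosts
esxcli storage core path list -d naa.xxx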
Microsoft Clusters
Affects ESXi 7.x and 6.7
The SCSI-3 Persistent Reserve tests fail for Windows 2012 Microsoft Clusters running in VMware ESXi Virtual Machines.
This is expected. Refer to the Knowledge base article for more information.
Specifically, read the 'additional notes' (under the section 'VMware vSphere support for running Microsoft clustered configurations').
Affects all ESXi Versions
ESXi/ESX hosts with visibility to RDM LUNs being used by MSCS nodes may take a long time to boot or to complete a LUN rescan.
Refer to the Knowledge base article for more information.
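VMware's guidance for this issue generally involves marking the MSCS RDM LUNs as 'perennially reserved' so the host skips them during boot and rescan; a sketch, with naa.xxx as a placeholder for the RDM device ID (verify against the knowledge base article above):
# Mark the RDM LUN as perennially reserved on each ESXi host that can see it
esxcli storage core device setconfig -d naa.xxx --perennially-reserved=true
# Verify the flag
esxcli storage core device list -d naa.xxx | grep -i perennially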
DataCore Servers Running in Virtual Machines (HCI)
Affects ESXi 8.x and 7.x
DataCore Servers running in ESXi Virtual Machines may experience unexpected mirror IO failure causing loss of access for some or all paths to a DataCore Virtual Disk
This problem has been reported as resolved in the following VMware Knowledge base article:
“3rd party Hyper Converged Infrastructure setups experience a soft lock up and goes unresponsive indefinitely”
Refer to the Knowledge base article for more information.
However, while DataCore was working with VMware, it was discovered that the problem still existed in the version listed in the article (i.e. ESXi 7.0.3i), and that it was not until ESXi 7.0.3o that the problem was fixed.
DataCore recommends a minimum ESXi version of:
- ESXi 7.0.3o (build number 22348816) or later, or
- ESXi 8.0.2 (build number 22380479) or later
This problem does not apply to DataCore Servers running in ‘physical’ Windows servers or when running in non-VMware Virtual Machines (e.g. Hyper-V).
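The build number a host is currently running can be confirmed from the ESXi shell, for example:
# Show the ESXi version and build number
vmware -vl
esxcli system version get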