Sizing

Both the Anvil and DSX auto-size internal tunables based on the resources available at boot time. This section provides sizing recommendations; even so, the infrastructure should be monitored with your preferred tools to ensure that sufficient CPU and memory remain available to sustain high performance.

Anvil

To deliver fast metadata operations, it is highly recommended to place the Anvil's metadata disk on NVMe storage.

The size of the metadata disk determines how many files and snapshots can be stored in the system.

It is recommended to size the system at 4,096 bytes per unique file or directory. This covers the most common configurations and regular snapshot usage (daily or weekly snapshots, for example). If a large number of snapshots (256+) is kept, double this to 8,192 bytes to ensure sufficient metadata space. Note that once the file count scales into the tens of millions, optimizations built into the metadata storage will reduce the actual capacity consumed per file. This reduction is not taken into account in the table below, as it is difficult to predict programmatically.

Metadata Storage Capacity

The following simple sizing table uses 4,096 bytes as a reference.

Number of Files and Directories    Recommended Metadata Capacity
10 million                         41 GB
100 million                        410 GB
500 million                        2 TB
2 billion                          8 TB
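
As a quick sanity check, the minimal Python sketch below reproduces the table from the 4,096-byte guideline (with the 8,192-byte figure for heavy snapshot retention). The function and variable names are illustrative, not part of the product:

    # Recommended Anvil metadata capacity from the per-entry guidelines above.
    BYTES_PER_ENTRY = 4096        # standard guideline per unique file or directory
    BYTES_PER_ENTRY_HEAVY = 8192  # when a large number of snapshots (256+) is kept

    def metadata_capacity_bytes(entries, heavy_snapshots=False):
        """Recommended metadata capacity for a given file/directory count."""
        per_entry = BYTES_PER_ENTRY_HEAVY if heavy_snapshots else BYTES_PER_ENTRY
        return entries * per_entry

    for count in (10_000_000, 100_000_000, 500_000_000, 2_000_000_000):
        size = metadata_capacity_bytes(count)
        print(f"{count:>13,} entries -> {size / 1e12:.2f} TB")

For example, 10 million entries works out to roughly 0.04 TB (41 GB), matching the first row of the table.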

The other sizing vector is memory consumption on the Anvil. Each file that is opened and actively used by a client consumes a small amount of memory, approximately 12 KB per file.
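
A back-of-the-envelope example under the approximately 12 KB figure (the helper name is hypothetical):

    # Approximate Anvil memory consumed by files actively open on clients.
    OPEN_FILE_BYTES = 12_000  # approx. 12 KB per open, actively used file

    def anvil_open_file_memory_gb(open_files):
        return open_files * OPEN_FILE_BYTES / 1e9

    print(anvil_open_file_memory_gb(1_000_000))  # 1 million open files -> ~12 GB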

DSX

The DSX has multiple roles, and sizing differs slightly depending on the function.

Network throughput is critical, as the DSX is either serving data, storing data, or moving data to the cloud or between NAS volumes. At least 10 Gbit networking is recommended.

Each open SMB connection to a DSX consumes approximately 1 MB of memory; 1,000 connected SMB clients would therefore consume about 1 GB of memory.
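
The same arithmetic as a small sketch (names are illustrative):

    # Approximate DSX memory consumed by concurrent SMB connections.
    SMB_CONNECTION_BYTES = 1_000_000  # approx. 1 MB per open SMB connection

    def smb_connection_memory_gb(connections):
        return connections * SMB_CONNECTION_BYTES / 1e9

    print(smb_connection_memory_gb(1_000))  # 1,000 clients -> 1.0 GB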

CPU consumption on the DSX depends on the task. Moving data to and from the cloud is the most CPU-intensive task due to compression. For environments where 100+ GB of data will be moved to and from object/cloud storage every day, a minimum of 8 CPUs and 32 GB of RAM is recommended for optimal performance.
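
If you automate capacity planning, the rule of thumb above can be encoded directly. The threshold and minimums are the guideline values quoted in this section; the function itself is only a sketch:

    # Guideline: 100+ GB/day of cloud movement warrants 8 CPUs / 32 GB RAM.
    HEAVY_CLOUD_GB_PER_DAY = 100

    def dsx_meets_cloud_mover_guideline(cpus, ram_gb, daily_cloud_gb):
        """True if a DSX meets this section's minimums for its cloud workload."""
        if daily_cloud_gb >= HEAVY_CLOUD_GB_PER_DAY:
            return cpus >= 8 and ram_gb >= 32
        return True  # no specific minimum is quoted for lighter workloads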