Architecture of DataCore Bolt

This section describes the components of DataCore Bolt, the function of each, and how they interact.

Below is an architectural diagram of DataCore Bolt representing high-level interactions between the components.

The table below provides a high-level overview of the resources and components that are an integral part of the DataCore Bolt architecture.

| Name | Resource Type | Function | Frequency in the cluster |
| --- | --- | --- | --- |
| agent-core | Deployment | Principal control plane actor | Single |
| csi-controller | Deployment | Hosts Bolt's CSI controller implementation and the CSI provisioner sidecar | Single |
| api-rest | Pod | Hosts the public REST API server | Single |
| api-rest | Service | Exposes the REST API server via NodePort | - |
| operator-diskpool | Deployment | Hosts Bolt's pool operator | Single |
| csi-node | DaemonSet | Hosts the CSI driver node plugin containers | All worker nodes |
| etcd | StatefulSet | Hosts the etcd server container | Configurable (recommended: three replicas) |
| etcd | Service | Exposes the etcd DB endpoint | Single |
| etcd-headless | Service | Exposes the etcd DB endpoint | Single |
| io-engine | DaemonSet | Hosts the Bolt I/O engine | User-selected nodes |
| DiskPool | CRD | Declares a DiskPool's desired state and reflects its current state | User-defined, one or many |

Additional services

| Name | Resource Type | Function | Frequency in the cluster |
| --- | --- | --- | --- |
| metrics-exporter-pool | Sidecar container (within the io-engine DaemonSet) | Exports pool-related metrics in Prometheus format | All worker nodes |
| pool-metrics-exporter | Service | Exposes the exporter API endpoint to Prometheus | Single |
| promtail | DaemonSet | Scrapes logs of Bolt-specific pods and exports them to Loki | All worker nodes |
| loki | StatefulSet | Stores the historical logs exported by the promtail pods | Single |
| loki | Service | Exposes the Loki API endpoint via ClusterIP | Single |
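These resources can be inspected on a running cluster with standard kubectl commands; the sketch below assumes DataCore Bolt was installed into a namespace named datacore-bolt (a placeholder; substitute the namespace used in your deployment).

  # List the Bolt workloads and services in the installation namespace
  # ("datacore-bolt" is an assumed namespace name; use your own)
  kubectl get deployments,daemonsets,statefulsets,pods,services -n datacore-bolt

  # Inspect the DiskPool custom resources declared in the cluster
  # (the plural resource name "diskpools" is inferred from the DiskPool CRD)
  kubectl get diskpools -n datacore-bolt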

Component Roles

io-engine

The io-engine pods encapsulate DataCore Bolt containers, which implement the I/O path from the block devices at the persistence layer up to the relevant initiators on the worker nodes that mount volume claims.

The instance of the DataCore Bolt binary running inside the container performs four major classes of functions:

  • Presents a gRPC interface to the control plane components, which allows them to orchestrate the creation, configuration, and deletion of DataCore Bolt-managed objects hosted by that instance.

  • Creates and manages storage pools hosted on that node.

  • Creates, exports, and manages volume controller objects hosted on that node.

  • Creates replicas from storage pools hosted on that node and shares them over NVMe-TCP.

When an io-engine pod starts running, an init container attempts to verify connectivity to the agent-core in the namespace where DataCore Bolt has been deployed. If a connection is established, the Bolt container is started, and the instance performs registration with the control plane. In this way, the agent-core maintains a registry of nodes and their current state.

The scheduling of these pods is determined declaratively, using a DaemonSet specification. By default, a nodeSelector field within the pod spec selects all worker nodes to which the user has attached the label datacore.com/engine=bolt; each of those nodes receives an io-engine pod. In this way, the node count and placement can be matched to the hardware configuration of the worker nodes and to the capacity and performance demands of the cluster.
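For example, a node can be made eligible for an io-engine pod by applying that label; a minimal sketch, assuming a worker node named worker-1:

  # Mark a worker node as a recipient of an io-engine pod
  kubectl label node worker-1 datacore.com/engine=bolt

  # Excerpt of the matching selector in the DaemonSet pod spec
  nodeSelector:
    datacore.com/engine: bolt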

csi-node

The csi-node pods within a cluster implement the node plugin component of DataCore Bolt's CSI driver. As such, their function is to orchestrate the mounting of DataCore Bolt-provisioned volumes on the worker nodes where the application pods consuming those volumes are scheduled. By default, a csi-node pod is scheduled on every node in the target cluster, as determined by a DaemonSet resource of the same name. Each of these pods encapsulates two containers: bolt-csi and csi-driver-registrar.

The node plugin does not need to run on every worker node in a cluster; this behavior can be modified, if desired, by labeling the appropriate nodes and adding a corresponding nodeSelector entry to the pod spec of the csi-node DaemonSet, as sketched below. Note that if a node does not host a plugin pod, it will not be possible to schedule on that node any pod configured to mount DataCore Bolt volumes.
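A minimal sketch of such a restriction, assuming a hypothetical label datacore.com/csi-node=bolt (the label key and value are illustrative, not prescribed by DataCore Bolt) and the namespace placeholder used earlier:

  # Label only the nodes that should host the CSI node plugin
  kubectl label node worker-1 datacore.com/csi-node=bolt

  # Add a matching nodeSelector to the csi-node DaemonSet (name taken from the table above)
  kubectl patch daemonset csi-node -n datacore-bolt --type merge \
    -p '{"spec":{"template":{"spec":{"nodeSelector":{"datacore.com/csi-node":"bolt"}}}}}'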

etcd

etcd is a distributed, reliable key-value store for the most critical data of a distributed system. DataCore Bolt uses etcd as a reliable persistent store for its configuration and state data.
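Since the replicas are ordinary StatefulSet pods, standard etcd tooling can be used to check on the store; a sketch, assuming default StatefulSet pod naming (etcd-0, etcd-1, ...), the namespace placeholder used earlier, and an etcd deployment that does not require additional authentication flags:

  # Check the health of one etcd replica from inside its container
  kubectl exec -n datacore-bolt etcd-0 -- etcdctl endpoint health

  # List the keys Bolt has persisted (read-only inspection)
  kubectl exec -n datacore-bolt etcd-0 -- etcdctl get --prefix --keys-only ""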

Supportability CLI

The supportability CLI is a plugin tool that creates support bundles (archive files) by interacting with multiple services present in the system. These bundles contain information about the entire DataCore Bolt system as well as specific DataCore Bolt resources such as volumes, pools, and nodes, and can be used for debugging. The tool can collect the following information:

  • Topological information of DataCore Bolt resource(s) by interacting with the REST service.

  • Historical logs by interacting with Loki. If Loki is unavailable, it interacts with the kube-apiserver to fetch logs.

  • DataCore Bolt-specific Kubernetes resources by interacting with the kube-apiserver.

  • DataCore Bolt-specific information from etcd (internal) by interacting with the etcd server.

Loki

Loki aggregates and centrally stores logs from all DataCore Bolt containers which are deployed to the cluster.

Promtail

Promtail is a log collector built specifically for Loki. It uses a configuration file for target discovery and provides features, analogous to Prometheus's, for labeling, transforming, and filtering logs before shipping them to Loki.
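A minimal sketch of what such a configuration excerpt can look like, assuming the loki Service and ClusterIP exposure from the table above, Loki's default HTTP port (3100), and an illustrative app label for selecting Bolt-specific pods (the label keys and values shipped with DataCore Bolt may differ):

  # Illustrative promtail scrape configuration excerpt (not the shipped config)
  clients:
    - url: http://loki:3100/loki/api/v1/push   # "loki" Service from the table; 3100 is Loki's default port
  scrape_configs:
    - job_name: bolt-pods
      kubernetes_sd_configs:
        - role: pod                             # discover pods via the Kubernetes API
      relabel_configs:
        # Keep only pods carrying an app label marking them as Bolt components (assumed label)
        - source_labels: [__meta_kubernetes_pod_label_app]
          regex: datacore-bolt
          action: keep
        # Point promtail at each container's log file on the node's filesystem
        - source_labels: [__meta_kubernetes_pod_uid, __meta_kubernetes_pod_container_name]
          separator: /
          target_label: __path__
          replacement: /var/log/pods/*$1/*.log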