Eventing

Overview

Eventing enables real-time monitoring and observability by broadcasting system events across a message bus. This mechanism supports proactive operations and seamless integration with observability tools. Events are transmitted using a NATS-based event bus, which allows different components to publish and subscribe to significant system changes, such as the lifecycle operations of volumes, replicas, and pools.
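The publish/subscribe flow described above can be sketched with a small in-memory stand-in for the bus. This is only an illustration of the pattern: `InMemoryBus`, the `Event` fields, and the `target` identifier are hypothetical names, and the code does not use the NATS client API or the real event schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class Event:
    category: str  # e.g. "Volume", "Replica", "Pool"
    action: str    # e.g. "Create", "Delete", "StateChange"
    source: str    # "Control Plane" or "Data Plane"
    target: str    # hypothetical identifier of the affected resource


class InMemoryBus:
    """Toy stand-in for the event bus (assumption: one channel per category)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, category: str, handler: Callable[[Event], None]) -> None:
        # Register a handler for every event published under this category.
        self._subscribers[category].append(handler)

    def publish(self, event: Event) -> int:
        # Deliver the event to each subscriber of its category and
        # return how many handlers received it.
        handlers = self._subscribers.get(event.category, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


bus = InMemoryBus()
received: List[Event] = []
bus.subscribe("Volume", received.append)

delivered = bus.publish(Event("Volume", "Create", "Control Plane", "vol-1"))
ignored = bus.publish(Event("Pool", "Create", "Data Plane", "pool-1"))
```

Here the subscriber sees only the `Volume` event; the `Pool` event has no subscriber and is dropped, which mirrors how a consumer only receives the events it subscribes to.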

Currently, the call-home service is the primary consumer of these events. It listens on the event bus through the obs-callhome-stats container, collecting and exporting the events as part of the telemetry data sent by Callhome. This facilitates system health checks, diagnostics, and customer support insights.
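As a rough illustration of what collecting events for telemetry could look like, the sketch below tallies events by category and action. The `summarize` function and the `Category.Action` key format are assumptions made for this example; the actual export format used by obs-callhome-stats is not described on this page.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def summarize(events: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count events per (category, action) pair.

    The "Category.Action" key format is hypothetical.
    """
    counts = Counter(f"{category}.{action}" for category, action in events)
    return dict(counts)


# Hypothetical stream of (category, action) pairs seen on the bus.
observed = [
    ("Volume", "Create"),
    ("Volume", "Create"),
    ("Volume", "Delete"),
    ("Pool", "Create"),
]
stats = summarize(observed)
```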

Events Consumed by Call-home

The following tables list the events currently collected and consumed by the call-home service. Each table gives the event category (the resource type), the action, the source that emits the event, and a description.

Volume Events

| Category | Action | Source | Description |
|----------|--------|--------|-------------|
| Volume | Create | Control Plane | Triggered when a volume is successfully created |
| Volume | Delete | Control Plane | Triggered when a volume is successfully deleted |

Replica Events

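| Category | Action | Source | Description |
|----------|--------|--------|-------------|
| Replica | Create | Data Plane | Triggered when a replica is successfully created |
| Replica | Delete | Data Plane | Triggered when a replica is successfully deleted |
| Replica | StateChange | Data Plane | Triggered upon a change in replica state |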

Pool Events

| Category | Action | Source | Description |
|----------|--------|--------|-------------|
| Pool | Create | Data Plane | Triggered when a pool is successfully created |
| Pool | Delete | Data Plane | Triggered when a pool is successfully deleted |

Nexus Events

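| Category | Action | Source | Description |
|----------|--------|--------|-------------|
| Nexus | Create | Data Plane | Triggered when a nexus is created |
| Nexus | Delete | Data Plane | Triggered when a nexus is deleted |
| Nexus | StateChange | Data Plane | Triggered when the state of a nexus changes |
| Nexus | RebuildBegun | Data Plane | Triggered when a rebuild operation begins |
| Nexus | RebuildEnd | Data Plane | Triggered when a rebuild operation completes |
| Nexus | AddChild | Data Plane | Triggered when a child device is added to a nexus |
| Nexus | RemoveChild | Data Plane | Triggered when a child device is removed from a nexus |
| Nexus | OnlineChild | Data Plane | Triggered when a child device becomes online |
| Nexus | SubsystemPause | Data Plane | Triggered when an I/O subsystem is paused |
| Nexus | SubsystemResume | Data Plane | Triggered when an I/O subsystem is resumed |
| Nexus | Init | Data Plane | Triggered when a nexus enters the initialization state |
| Nexus | Reconfiguring | Data Plane | Triggered when a nexus enters the reconfiguring state |
| Nexus | Shutdown | Data Plane | Triggered when a nexus is destroyed |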
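One way a consumer might use the RebuildBegun and RebuildEnd actions is to pair them and estimate how long each rebuild took. The sketch below assumes each event carries a nexus identifier and a timestamp in seconds; those fields, and the `rebuild_durations` helper, are hypothetical. It also assumes at most one rebuild in flight per nexus, which is a simplification.

```python
from typing import Dict, Iterable, List, Tuple


def rebuild_durations(
    events: Iterable[Tuple[str, str, float]]
) -> List[Tuple[str, float]]:
    """Pair RebuildBegun/RebuildEnd per nexus and return (nexus, seconds).

    Each event is a hypothetical (nexus_id, action, timestamp) tuple.
    """
    started: Dict[str, float] = {}
    finished: List[Tuple[str, float]] = []
    for nexus, action, timestamp in events:
        if action == "RebuildBegun":
            started[nexus] = timestamp
        elif action == "RebuildEnd" and nexus in started:
            finished.append((nexus, timestamp - started.pop(nexus)))
        # All other nexus actions are ignored by this helper.
    return finished


sample = [
    ("nexus-a", "RebuildBegun", 10.0),
    ("nexus-b", "RebuildBegun", 12.0),
    ("nexus-a", "RebuildEnd", 25.0),
    ("nexus-b", "StateChange", 26.0),
]
durations = rebuild_durations(sample)
```

In this sample only `nexus-a` has a completed rebuild; `nexus-b` is still rebuilding, so it does not appear in the result.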

Node Events

| Category | Action | Source | Description |
|----------|--------|--------|-------------|
| Node | StateChange | Control Plane | Triggered upon a change in node state |

High Availability Events

| Category | Action | Source | Description |
|----------|--------|--------|-------------|
| HighAvailability | SwitchOver | Control Plane | Triggered during the initiation, failure, or completion of a volume switchover |

NVMe Path Events

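| Category | Action | Source | Description |
|----------|--------|--------|-------------|
| NvmePath | NvmePathSuspect | Control Plane | Triggered when an NVMe path is suspected of failure |
| NvmePath | NvmePathFail | Control Plane | Triggered when an NVMe path is confirmed as failed |
| NvmePath | NvmePathFix | Control Plane | Triggered when an NVMe controller reconnects to a nexus |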
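A consumer could fold these three actions into a per-path status view. The sketch below is an assumption-laden illustration: the status labels (`suspect`, `failed`, `healthy`), the path identifiers, and the `latest_path_states` helper are invented for this example and are not part of the event schema.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical mapping from event action to a human-readable path status.
STATUS_AFTER = {
    "NvmePathSuspect": "suspect",
    "NvmePathFail": "failed",
    "NvmePathFix": "healthy",
}


def latest_path_states(events: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Return the most recent status per path from ordered (path, action) pairs."""
    states: Dict[str, str] = {}
    for path, action in events:
        if action in STATUS_AFTER:
            states[path] = STATUS_AFTER[action]
    return states


path_events = [
    ("path-1", "NvmePathSuspect"),
    ("path-2", "NvmePathSuspect"),
    ("path-1", "NvmePathFail"),
    ("path-1", "NvmePathFix"),
]
path_states = latest_path_states(path_events)
```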

Host Initiator Events

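| Category | Action | Source | Description |
|----------|--------|--------|-------------|
| HostInitiator | NvmeConnect | Data Plane | Triggered upon a host connecting to a nexus |
| HostInitiator | NvmeDisconnect | Data Plane | Triggered upon a host disconnecting from a nexus |
| HostInitiator | NvmeKeepAliveTimeout | Data Plane | Triggered when a keep-alive timeout occurs on a nexus |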

IO-Engine Events

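| Category | Action | Source | Description |
|----------|--------|--------|-------------|
| IoEngineCategory | Start | Data Plane | Triggered when the IO-Engine initializes |
| IoEngineCategory | Shutdown | Data Plane | Triggered when the IO-Engine shutdown begins |
| IoEngineCategory | Stop | Data Plane | Triggered when the IO-Engine stops |
| IoEngineCategory | ReactorUnfreeze | Data Plane | Triggered when the IO-Engine reactor becomes healthy |
| IoEngineCategory | ReactorFreeze | Data Plane | Triggered when the IO-Engine reactor freezes |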

Snapshot and Clone Events

| Category | Action | Source | Description |
|----------|--------|--------|-------------|
| Snapshot | Create | Data Plane | Triggered when a snapshot is successfully created |
| Clone | Create | Data Plane | Triggered when a clone is successfully created |

Benefits of Eventing

  • Enables real-time visibility into system operations and state transitions.
  • Supports proactive incident detection and faster troubleshooting.
  • Facilitates automated workflows and integrations with monitoring tools.
