Core Concepts¶
All applications in this repository are configured using a shared rule syntax. A rule is a YAML object composed of an optional name, a trigger, optional conditions, and an action.
# A rule is defined by an optional name, a trigger, optional conditions, and an action.
- name: my-rule # optional: used as the "rule_name" label in Prometheus metrics
  trigger:
    # ... defines what starts the rule evaluation (e.g., a NATS message or an HTTP request)
  conditions:
    # ... defines the logic to determine if the action should run (e.g., field value checks)
  action:
    # ... defines what to do if the conditions pass (e.g., publish a NATS message or make an HTTP call)
When name is omitted, the trigger subject (NATS) or path (HTTP) is used as the metric label instead.
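The fallback can be pictured with a small Python sketch. It is only an illustration of the rule stated above, not the engine's code; the function name metric_label and the dict layout are assumptions made for the example.

```python
def metric_label(rule: dict) -> str:
    """Return the value used for the "rule_name" metric label (illustrative only)."""
    name = rule.get("name")
    if name:
        return name
    trigger = rule.get("trigger", {})
    if "nats" in trigger:
        return trigger["nats"]["subject"]  # NATS rules fall back to the subject
    if "http" in trigger:
        return trigger["http"]["path"]  # HTTP rules fall back to the path
    raise ValueError("rule has no NATS or HTTP trigger")


named = {"name": "my-rule", "trigger": {"nats": {"subject": "sensors.temperature.>"}}}
unnamed_http = {"trigger": {"http": {"path": "/webhooks/github"}}}
```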
1. Triggers (The "If")¶
A trigger defines the event that initiates a rule evaluation.
NATS Trigger: Evaluates a message from a NATS subject.
HTTP Trigger: Evaluates an incoming HTTP request.
trigger:
  http:
    path: "/webhooks/github" # Exact path match
    method: "POST" # Optional, defaults to all methods
2. Conditions (The "When")¶
Conditions are an optional block of logic that must evaluate to true for the action to be executed.
Template Syntax¶
All field references must use explicit template syntax: {variable}
This consistent syntax applies to both the field (left side) and value (right side) of conditions.
Supported Variable Types:
- Message fields: {temperature}, {user.name}
- System variables: {@time.hour}, {@subject.1}, {@header.X-Device-ID}
- KV lookups: {@kv.device_status.{device_id}:status}
Values can be:
- Templates (for variable comparisons): {threshold}, {@kv.config:max_temp}
- Literals: 30, "active", true
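To make the nested form concrete, here is a rough Python sketch of how a template such as {@kv.device_status.{device_id}:status} could be resolved from the inside out. It is not the engine's implementation. The helper names (lookup, resolve), the plain-dict stand-in for the KV store, and the assumed layout @kv.<bucket>.<key>:<field path> are all assumptions for illustration. It covers only message fields and KV lookups in condition-style strings, not system variables or JSON payloads.

```python
import re

# Matches an innermost template: a brace pair with no other braces inside.
INNERMOST = re.compile(r"\{([^{}]+)\}")


def lookup(name, message, kv):
    """Resolve one template name to its native value (hypothetical helper)."""
    if name.startswith("@kv."):
        # Assumed layout: @kv.<bucket>.<key>:<field path>
        location, _, field_path = name[len("@kv."):].partition(":")
        bucket, _, key = location.partition(".")
        value = kv[bucket][key]
        parts = field_path.split(".") if field_path else []
    else:
        value = message
        parts = name.split(".")  # dotted path into the message, e.g. user.name
    for part in parts:
        value = value[part]
    return value


def resolve(template, message, kv):
    """Resolve nested templates from the inside out."""
    text = template
    while True:
        match = INNERMOST.search(text)
        if match is None:
            return text  # no templates left: plain string
        value = lookup(match.group(1), message, kv)
        if match.group(0) == text:
            return value  # the whole string is one template: keep the native type
        text = text[:match.start()] + str(value) + text[match.end():]


message = {"device_id": "dev-1", "temperature": 25.5, "user": {"name": "ada"}}
kv = {"device_status": {"dev-1": {"status": "active"}}}
```

In this sketch the inner {device_id} is replaced first, which turns the outer template into {@kv.device_status.dev-1:status} before the KV lookup runs.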
Basic Condition Examples¶
conditions:
  operator: and # or "or"
  items:
    # Compare message field to literal
    - field: "{temperature}"
      operator: gte
      value: 30
    # Check an HTTP header
    - field: "{@header.X-Device-Auth}"
      operator: "exists"
    # Check the time of day
    - field: "{@time.hour}"
      operator: gte
      value: 9
    # Check a value from a NATS KV store
    - field: "{@kv.device_status.{device_id}:status}"
      operator: "eq"
      value: "active"
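As a mental model, a conditions block combines its items with the group operator. The Python sketch below illustrates that idea under simplifying assumptions: field values are already resolved and passed in as a dict keyed by the template text, only the basic comparison operators and exists are covered, and a missing field is assumed to fail every comparison. The names check_item and evaluate are hypothetical.

```python
import operator

COMPARATORS = {
    "eq": operator.eq, "neq": operator.ne,
    "gt": operator.gt, "lt": operator.lt,
    "gte": operator.ge, "lte": operator.le,
}


def check_item(item, fields):
    """Evaluate one condition against already-resolved field values."""
    actual = fields.get(item["field"])
    if item["operator"] == "exists":
        return actual is not None
    if actual is None:
        return False  # assumption: a missing field never satisfies a comparison
    return COMPARATORS[item["operator"]](actual, item["value"])


def evaluate(conditions, fields):
    """Combine item results with the group operator ("and" or "or")."""
    results = [check_item(item, fields) for item in conditions["items"]]
    if conditions["operator"] == "and":
        return all(results)
    return any(results)


conditions = {
    "operator": "and",
    "items": [
        {"field": "{temperature}", "operator": "gte", "value": 30},
        {"field": "{@header.X-Device-Auth}", "operator": "exists"},
        {"field": "{@time.hour}", "operator": "gte", "value": 9},
    ],
}
fields = {"{temperature}": 31, "{@header.X-Device-Auth}": "token", "{@time.hour}": 10}
```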
Variable-to-Variable Comparisons¶
The rule engine now supports comparing variables to other variables, enabling dynamic thresholds and cross-field validation:
conditions:
  operator: and
  items:
    # Compare two message fields
    - field: "{end_timestamp}"
      operator: gt
      value: "{start_timestamp}"
    # Compare message field to KV value
    - field: "{temperature}"
      operator: gt
      value: "{@kv.sensor_config.{sensor_id}:max_temp}"
    # Compare to system variable
    - field: "{last_update_hour}"
      operator: eq
      value: "{@time.hour}"
    # Permission level check
    - field: "{user_level}"
      operator: gte
      value: "{@kv.permissions.{resource_id}:required_level}"
Type Handling¶
Variables are resolved to their native types for accurate comparison:
- Numbers remain numbers: {temperature} → 25.5 (float)
- Strings remain strings: {status} → "active"
- Booleans remain booleans: {enabled} → true
- Automatic type coercion: when the two sides have different types (e.g., string "30" vs number 30), a numeric string is coerced to a number before the comparison
This ensures numeric comparisons work correctly:
# These both work correctly
- field: "{count}" # count = 42 (number)
  operator: gt
  value: "{limit}" # limit = 50 (number)
- field: "{count}" # count = 42 (number)
  operator: gt
  value: "40" # String "40" coerced to number
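One way to picture the coercion is the Python sketch below. The exact rules the engine applies are not spelled out here, so the behaviour shown is an assumption: numeric strings are converted to numbers when both sides can be read as numbers, and values of unrelated types never compare as equal or ordered. The names as_number and compare are hypothetical.

```python
import operator

OPS = {
    "eq": operator.eq, "neq": operator.ne,
    "gt": operator.gt, "lt": operator.lt,
    "gte": operator.ge, "lte": operator.le,
}


def as_number(value):
    """Return value as a float if it is a number or a numeric string, else None."""
    if isinstance(value, bool):
        return None  # bool is an int subclass in Python; keep it out of numeric coercion
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        try:
            return float(value)
        except ValueError:
            return None
    return None


def compare(op, left, right):
    """Compare two resolved values, coercing numeric strings to numbers."""
    left_num, right_num = as_number(left), as_number(right)
    if left_num is not None and right_num is not None:
        return OPS[op](left_num, right_num)
    if type(left) is type(right):
        return OPS[op](left, right)
    return op == "neq"  # assumption: mismatched types are simply "not equal"
```

With these assumptions, compare("gt", 42, "40") treats the string "40" as the number 40, matching the second example above.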
Common Use Cases¶
Dynamic Thresholds:
# Threshold stored in KV, different per sensor
- field: "{temperature}"
  operator: gt
  value: "{@kv.sensor_config.{sensor_id}:max_temp}"
Cross-Field Validation:
# Ensure end time is after start time
- field: "{end_timestamp}"
  operator: gt
  value: "{start_timestamp}"
Range Checks:
# Value must be within KV-defined range
conditions:
  operator: and
  items:
    - field: "{value}"
      operator: gte
      value: "{@kv.ranges.{sensor_type}:min}"
    - field: "{value}"
      operator: lte
      value: "{@kv.ranges.{sensor_type}:max}"
Access Control:
# User level must meet or exceed required level
- field: "{user_level}"
  operator: gte
  value: "{@kv.permissions.{resource_id}:required_level}"
Rate Limiting:
# Current usage must be below limit
- field: "{current_requests}"
  operator: lt
  value: "{@kv.rate_limits.{user_id}:requests_per_hour}"
Available Operators¶
Comparison:
- eq - Equals
- neq - Not equals
- gt - Greater than
- lt - Less than
- gte - Greater than or equal
- lte - Less than or equal
- exists - Field exists (not null)
String/Array:
- contains - String contains substring or array contains element
- not_contains - Inverse of contains
- in - Value is in array
- not_in - Value is not in array
Array Operators:
- any - At least one array element matches nested conditions
- all - All array elements match nested conditions
- none - No array elements match nested conditions
Time-Based:
- recent - Timestamp is within time window (e.g., "5s", "1m", "1h")
3. Actions (The "Then")¶
An action defines the work to be done when a rule's conditions are met.
NATS Action: Publishes a new message to a NATS subject.
action:
  nats:
    subject: "alerts.high_temp.{device_id}"
    mode: core # optional: "core" or "jetstream" (overrides global nats.publish.mode)
    payload: |
      {
        "alert": "High temperature detected!",
        "temp": {temperature},
        "device": "{device_id}",
        "timestamp": "{@timestamp()}"
      }
The optional mode field overrides the global nats.publish.mode for this action only. Use core for fire-and-forget (notifications, dashboards) or jetstream for durable delivery (audit logs, safety alerts). When omitted, the global setting applies. See Per-Rule Publish Mode Override for examples.
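The precedence described above amounts to "the action's mode wins, otherwise the global setting applies". A minimal Python sketch of that rule follows; the function name and the dict representation of the action are assumptions for illustration, not the engine's API.

```python
VALID_MODES = ("core", "jetstream")


def effective_publish_mode(nats_action: dict, global_mode: str) -> str:
    """Pick the publish mode for one NATS action (illustrative only)."""
    mode = nats_action.get("mode", global_mode)  # per-action value overrides the global one
    if mode not in VALID_MODES:
        raise ValueError(f"unknown publish mode: {mode!r}")
    return mode
```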
HTTP Action: Makes an outbound HTTP request to an external service.
action:
  http:
    url: "https://api.pagerduty.com/incidents"
    method: "POST"
    headers:
      Authorization: "Token ${PAGERDUTY_TOKEN}" # Env vars supported
    payload: '{"service": "app-alerts", "message": "{alert_message}"}'
    retry:
      maxAttempts: 3
      initialDelay: "1s"
4. Debounce¶
Debounce suppresses rapid re-fires of a rule within a configurable time window. When a rule has a debounce field, only the first matching message fires the action — subsequent matches within the window are silently dropped.
- trigger:
    nats:
      subject: "sensors.temperature.>"
  debounce: "30s"
  conditions:
    operator: and
    items:
      - field: "{temperature}"
        operator: gt
        value: 45
  action:
    nats:
      subject: "alerts.high_temp"
      payload: '{"temp": {temperature}, "device": "{device_id}"}'
The debounce field is set at the rule level (alongside trigger, conditions, and action) and accepts a Go duration string.
How It Works¶
- A message matches the rule's trigger (and optionally passes conditions).
- Shunt builds a debounce key from the trigger and action subjects: trigger_subject::action_subject.
- If this key has not fired within the debounce window, the action executes and the timestamp is recorded.
- If the key has fired within the window, the message is dropped and the messages_debounced_total metric increments.
Debounce state is in-memory only — it resets on process restart. The first message after a restart always fires.
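The steps above can be sketched in Python as a leading-edge debouncer. This is an illustration, not Shunt's code. The class name, the injectable clock, and the plain counter standing in for the messages_debounced_total metric are all assumptions. The sketch also assumes that a message arriving exactly when the window has elapsed is allowed to fire.

```python
import time


class Debouncer:
    """In-memory, leading-edge debounce keyed by trigger and action subject."""

    def __init__(self, window_seconds, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock
        self.last_fired = {}  # key -> time of the last fire; lost on restart
        self.debounced_total = 0  # stand-in for the messages_debounced_total metric

    def should_fire(self, trigger_subject, action_subject):
        key = f"{trigger_subject}::{action_subject}"
        now = self.clock()
        last = self.last_fired.get(key)
        if last is not None and now - last < self.window:
            self.debounced_total += 1  # dropped: still inside the window
            return False
        self.last_fired[key] = now  # record the timestamp only when the action fires
        return True
```

Because the timestamp is recorded only when the action fires, dropped messages do not extend the window in this sketch.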
Valid Duration Formats¶
Any Go time.Duration string: "5s", "1m", "1m30s", "2h", "500ms".
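For readers who want to sanity-check a value outside Go, the Python sketch below parses a subset of the Go duration format into seconds. It handles unsigned number and unit pairs such as the ones listed above, and it deliberately does not cover everything Go's parser accepts (for example signs or a bare "0"). The function name parse_duration is hypothetical.

```python
import re

UNIT_SECONDS = {
    "ns": 1e-9, "us": 1e-6, "µs": 1e-6, "ms": 1e-3,
    "s": 1.0, "m": 60.0, "h": 3600.0,
}
# Two-letter units are listed before "s" and "m" so that "ms" is not read as "m".
TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)")


def parse_duration(text):
    """Parse a subset of Go duration strings (e.g. "1m30s") into seconds."""
    if not text:
        raise ValueError("empty duration")
    position, total = 0, 0.0
    while position < len(text):
        match = TOKEN.match(text, position)
        if match is None:
            raise ValueError(f"invalid duration: {text!r}")
        total += float(match.group(1)) * UNIT_SECONDS[match.group(2)]
        position = match.end()
    return total
```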
Examples¶
Sensor flood protection — only alert once per minute regardless of how many readings arrive:
- trigger:
    nats:
      subject: "sensors.temperature.>"
  debounce: "1m"
  conditions:
    operator: and
    items:
      - field: "{temperature}"
        operator: gt
        value: 45
  action:
    nats:
      subject: "alerts.high_temp.{device_id}"
      payload: '{"temp": {temperature}, "device": "{device_id}"}'
Notification deduplication — suppress duplicate webhook deliveries for 5 minutes:
- trigger:
    nats:
      subject: "events.deploy.>"
  debounce: "5m"
  action:
    http:
      url: "https://hooks.slack.com/services/T00/B00/xxx"
      method: "POST"
      payload: '{"text": "Deploy event on {service}"}'
Environment Variables¶
The rule engine supports environment variable expansion for static configuration values using ${VAR_NAME} syntax. This enables secure secret management and environment-specific configuration without hardcoding values in rule files.
How It Works¶
Environment variables are expanded at load time (when rules are loaded from the NATS KV bucket), not at runtime. This means:
- ✅ Performance: Zero runtime overhead - values are substituted once during startup
- ✅ Security: Secrets are never stored in rule files
- ✅ Simplicity: Standard environment variable management (Docker, K8s, systemd, etc.)
- ✅ Validation: Expanded values are validated along with the rest of the rule
Important: Environment variable expansion is completely separate from template variable substitution:
- ${ENV_VAR} → Expanded at load time (static configuration)
- {field} or {@system} → Resolved at runtime (per-message templating)
Syntax¶
# Use ${VARIABLE_NAME} anywhere in your rules
action:
  http:
    url: "https://api.example.com"
    headers:
      Authorization: "Bearer ${API_TOKEN}"
Where Can I Use Them?¶
Environment variables can be used in both conditions and actions:
In Actions¶
NATS Actions:
action:
  nats:
    subject: "alerts.${ENVIRONMENT}.critical"
    payload: |
      {
        "apiKey": "${SERVICE_API_KEY}",
        "region": "${AWS_REGION}"
      }
    headers:
      X-Service-Token: "${INTERNAL_TOKEN}"
HTTP Actions:
action:
  http:
    url: "${API_BASE_URL}/incidents"
    method: "POST"
    headers:
      Authorization: "Token ${PAGERDUTY_TOKEN}"
      X-Environment: "${DEPLOY_ENV}"
    payload: '{"service": "${SERVICE_NAME}"}'
In Conditions¶
Environment variables can also be used in condition values:
conditions:
  operator: and
  items:
    # Check if status matches expected value from env
    - field: "{status}"
      operator: eq
      value: "${EXPECTED_STATUS}"
    # Check if environment matches
    - field: "{environment}"
      operator: in
      value: ["${PRIMARY_ENV}", "${SECONDARY_ENV}"]
Missing Variables¶
If an environment variable is not set, the system will:
- Log a warning with details about the missing variable
- Substitute with an empty string
- Continue loading the rule (non-fatal)
Best Practice: Always set required environment variables before starting the application, or the rule may not work as intended.
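The load-time behaviour, including the handling of missing variables, can be approximated with the Python sketch below. It is an illustration rather than the engine's code. The function name expand_env, the logger name, and the accepted variable-name pattern are assumptions. Note that runtime templates such as {device_id} are left untouched because only ${...} references are matched.

```python
import logging
import re

# Matches ${NAME}; a bare {field} template has no leading "$" and is not touched.
ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")
log = logging.getLogger("rules")


def expand_env(text, env):
    """Expand ${VAR} references once, at load time (illustrative only)."""

    def substitute(match):
        name = match.group(1)
        if name not in env:
            # Non-fatal: warn, substitute an empty string, keep loading the rule.
            log.warning("environment variable %s is not set; using empty string", name)
            return ""
        return env[name]

    return ENV_REF.sub(substitute, text)
```

In practice the env argument would be os.environ. A dict is used here so the behaviour is easy to try in isolation.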
Complete Example: Variable Comparisons with KV¶
# Dynamic threshold management with KV store
# KV: sensor_config["temp-001"] = {"max_temp": 35, "critical_temp": 45}
# Message: {"sensor_id": "temp-001", "temperature": 38, "location": "server_room"}
- trigger:
    nats:
      subject: "sensors.temperature"
  conditions:
    operator: and
    items:
      # Temperature exceeds configured max (warning level)
      - field: "{temperature}"
        operator: gt
        value: "{@kv.sensor_config.{sensor_id}:max_temp}"
      # But below critical level
      - field: "{temperature}"
        operator: lt
        value: "{@kv.sensor_config.{sensor_id}:critical_temp}"
      # Only during business hours
      - field: "{@time.hour}"
        operator: gte
        value: 9
      - field: "{@time.hour}"
        operator: lt
        value: 17
  action:
    nats:
      subject: "alerts.temperature.warning.{sensor_id}"
      payload: |
        {
          "alert": "Temperature warning - exceeds threshold",
          "sensor_id": "{sensor_id}",
          "location": "{location}",
          "current_temperature": {temperature},
          "thresholds": {
            "max": "{@kv.sensor_config.{sensor_id}:max_temp}",
            "critical": "{@kv.sensor_config.{sensor_id}:critical_temp}"
          },
          "triggered_at": "{@timestamp.iso}",
          "alert_id": "{@uuid7()}"
        }
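To check the example by hand, the Python sketch below applies the four conditions to the KV entry and message given in the comments at the top of the rule. It is a worked check, not the engine. The function name rule_matches is hypothetical, the KV store is modelled as a nested dict, and the hour is passed in explicitly because the time zone used for {@time.hour} is not stated here.

```python
kv = {"sensor_config": {"temp-001": {"max_temp": 35, "critical_temp": 45}}}
message = {"sensor_id": "temp-001", "temperature": 38, "location": "server_room"}


def rule_matches(message, kv, hour):
    """Apply the example's four conditions (all must hold)."""
    config = kv["sensor_config"][message["sensor_id"]]
    temperature = message["temperature"]
    return (
        temperature > config["max_temp"]  # exceeds the warning threshold
        and temperature < config["critical_temp"]  # but is below critical
        and 9 <= hour < 17  # business hours: gte 9 and lt 17
    )
```

With the sample data, 38 is above 35 and below 45, so the rule fires at any hour from 9 up to but not including 17.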