Live Monitoring: Real-Time Anomaly Detection
What is Live Monitoring?
After training your models, you want to continuously monitor new logs. Live monitoring:
- Reads incoming log entries in real-time
- Encodes each message using the trained text autoencoder
- Scores each sequence using the anomaly detector
- Alerts when anomaly scores exceed a threshold
- Maintains rolling statistics for adaptive thresholding
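In outline, that loop looks like the sketch below. This is a minimal illustration only: encode_message and score_sequence are toy stand-ins for the trained autoencoder and detector, not the actual DeepSentry code.
# Minimal sketch of the live-monitoring loop. encode_message and
# score_sequence are hypothetical stand-ins for the trained models;
# the real system loads them from disk and adapts its threshold.
import time
from collections import deque

SEQUENCE_LENGTH = 10      # must match the training configuration
THRESHOLD = 2.45          # placeholder; see "Adaptive Thresholding" below

def encode_message(msg: str) -> list[float]:
    """Toy embedding standing in for the text autoencoder."""
    return [float(len(msg))]

def score_sequence(vectors) -> float:
    """Toy score standing in for the anomaly detector."""
    avg = sum(v[0] for v in vectors) / len(vectors)
    return abs(vectors[-1][0] - avg)

def follow(path):
    """Yield lines appended to a file, like `tail -f`."""
    with open(path) as f:
        f.seek(0, 2)                      # start at end (tail_mode)
        while True:
            line = f.readline()
            if line:
                yield line.rstrip("\n")
            else:
                time.sleep(0.1)

window = deque(maxlen=SEQUENCE_LENGTH)
for message in follow("/var/log/myapp.log"):
    window.append(encode_message(message))
    if len(window) == SEQUENCE_LENGTH:
        score = score_sequence(window)
        label = "ANOMALY" if score > THRESHOLD else "Normal"
        print(f"{label} - score: {score:.2f} - {message!r}")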
Running Live Monitoring
Basic Execution
bash dockerrun/run_live_monitoring.sh
The live monitor starts and reads logs according to configuration. Output looks like:
[14:23:45] Normal - score: 0.12 - "Got block report from..."
[14:23:46] Normal - score: 0.09 - "Verification complete"
[14:23:47] ANOMALY - score: 3.45 - "Unexpected error" ⚠️
[14:23:48] Normal - score: 0.14 - "Retrying connection..."
Input Sources
Configure in live_monitoring_config.yml:
# From a file
log_file: /var/log/myapp.log
tail_mode: true # Start from end (watch for new entries)
# From stdin (pipe logs in)
log_file: /dev/stdin
# From a network socket: bridge it into a file with socat, then tail it
#   e.g. socat TCP-LISTEN:9999,fork OPEN:/var/log/netstream.log,creat,append
log_file: /var/log/netstream.log
Piping Logs in Real-Time
Forward logs from another source (set log_file: /dev/stdin so the monitor reads the pipe):
# From syslog
tail -f /var/log/syslog | bash dockerrun/run_live_monitoring.sh
# From application stderr
./myapp 2>&1 | bash dockerrun/run_live_monitoring.sh
# From remote server via SSH
ssh user@remote "tail -f /var/log/app.log" | bash dockerrun/run_live_monitoring.sh
Adaptive Thresholding
The live monitor doesn't use a fixed threshold. Instead, it maintains running statistics:
- Window: Last N scores (e.g., 100 most recent)
- Mean: Average of recent scores
- Std Dev: How much scores vary
- Threshold: mean + threshold_multiplier * std_dev (the multiplier defaults to 2.5)
Why this matters: Even if "normal" anomaly scores slowly increase (due to seasonal changes or system evolution), the threshold adapts.
Example
Hour 1: Scores are [0.1, 0.2, 0.15, 0.3, ...]. Mean=0.17, Std=0.08. Threshold = 0.17 + 2.5*0.08 = 0.37
Hour 2: Scores are [0.2, 0.25, 0.22, 0.4, ...]. Mean=0.27, Std=0.09. Threshold = 0.27 + 2.5*0.09 = 0.495
The threshold moved up because baseline scores increased. This prevents "threshold exhaustion", where a drifting baseline pushes every score over a fixed threshold and floods you with alerts.
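A minimal sketch of this rolling computation follows; window_size and the multiplier mirror the tuning parameters described later, and the real monitor's internals may differ (population std dev is an assumption here).
# Hedged sketch of the adaptive threshold described above.
from collections import deque
from statistics import mean, pstdev

class AdaptiveThreshold:
    def __init__(self, window_size=100, multiplier=2.5):
        self.scores = deque(maxlen=window_size)   # last N scores
        self.multiplier = multiplier

    def update(self, score: float) -> bool:
        """Record a score; return True if it exceeds the current threshold."""
        is_anomaly = len(self.scores) > 1 and score > self.threshold()
        self.scores.append(score)
        return is_anomaly

    def threshold(self) -> float:
        return mean(self.scores) + self.multiplier * pstdev(self.scores)

# Reproducing the Hour 1 numbers above (approximately, since the
# original example elides most of the scores):
t = AdaptiveThreshold()
for s in [0.1, 0.2, 0.15, 0.3]:
    t.update(s)
print(round(t.threshold(), 2))   # 0.37, matching the Hour 1 example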
Output and Alerting
Console Output
By default, the live monitor outputs to console:
[14:23:45] Normal - score: 0.21 (threshold: 2.45)
[14:23:46] ANOMALY - score: 3.67 (threshold: 2.45) ⚠️
[14:23:47] Normal - score: 0.18 (threshold: 2.45)
File Output
Configure in live_monitoring_config.yml:
output_file: /var/log/deepsentry-alerts.log
Anomalies are written with full context:
TIMESTAMP=2022-01-15T14:23:47Z
SCORE=3.67
THRESHOLD=2.45
MESSAGE="Unexpected error in database connection"
CONTEXT_WINDOW="Got block report... → Verification complete → Unexpected error"
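The KEY=value layout is straightforward to post-process. Here is a small parsing sketch; the field names are taken from the example above, and real alert files may carry more fields.
# Hypothetical parser for the alert records shown above.
def parse_alert(block: str) -> dict:
    """Turn one KEY=value alert block into a dict."""
    fields = {}
    for line in block.strip().splitlines():
        key, _, value = line.partition("=")
        fields[key] = value.strip('"')   # drop surrounding quotes if present
    return fields

record = parse_alert('''TIMESTAMP=2022-01-15T14:23:47Z
SCORE=3.67
THRESHOLD=2.45
MESSAGE="Unexpected error in database connection"''')
print(record["SCORE"])   # "3.67"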
Integration with Monitoring Systems
Direct integration patterns:
# Send alerts to syslog (--line-buffered keeps grep from sitting on
# buffered output, so alerts flow through in real time)
bash dockerrun/run_live_monitoring.sh | \
  grep --line-buffered ANOMALY | \
  logger -t deepsentry -p user.alert

# Send to Prometheus (custom exporter)
bash dockerrun/run_live_monitoring.sh | \
  ./alert_to_prometheus.py

# Send to a monitoring webhook
bash dockerrun/run_live_monitoring.sh | \
  grep --line-buffered ANOMALY | \
  while read -r line; do
    curl -X POST https://alerts.example.com/webhook \
      -d "$line"
  done
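The alert_to_prometheus.py script referenced above is a custom exporter; a minimal version might look like the sketch below, built on the prometheus_client library (the metric names and port are assumptions, and prometheus_client must be installed).
#!/usr/bin/env python3
# Hypothetical sketch of an alert_to_prometheus.py bridge: reads monitor
# output on stdin and exposes counters for Prometheus to scrape.
import sys
from prometheus_client import Counter, start_http_server

ENTRIES = Counter("deepsentry_entries_total", "Log entries scored")
ANOMALIES = Counter("deepsentry_anomalies_total", "Entries flagged as anomalies")

start_http_server(8000)          # serves /metrics on port 8000
for line in sys.stdin:
    ENTRIES.inc()
    if "ANOMALY" in line:
        ANOMALIES.inc()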
Tuning Live Monitoring
Key Parameters
| Parameter | Default | Impact |
|---|---|---|
| threshold_multiplier | 2.5 | How many std devs above mean triggers alert. Higher = fewer alerts. |
| window_size | 100 | How many recent scores to use for statistics. Larger = more stable threshold. |
| sequence_length | 10 | Must match training. How many vectors in a sequence. |
| batch_interval | 1.0 (seconds) | How often to score new logs. Smaller = more frequent scoring. |
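Together, these map onto live_monitoring_config.yml roughly as follows (assuming the config keys match the parameter names in the table):
# Defaults from the table above, collected in one place
threshold_multiplier: 2.5   # std devs above mean that trigger an alert
window_size: 100            # scores kept for rolling statistics
sequence_length: 10         # must match the training configuration
batch_interval: 1.0         # seconds between scoring passes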
Tuning Strategy
Start with defaults, then adjust based on your alerts:
- Too many false positives: Increase threshold_multiplier (e.g., 3.0 or 3.5)
- Missing real anomalies: Decrease threshold_multiplier (e.g., 2.0 or 1.5)
- Threshold too volatile: Increase window_size (e.g., 200 or 500)
- Slow to adapt to changes: Decrease window_size (e.g., 50)
Production Deployment
Systemd Service
Run as a systemd service for automatic restart:
# /etc/systemd/system/deepsentry-live.service
[Unit]
Description=DeepSentry Live Anomaly Detection
After=network.target

[Service]
Type=simple
User=deepsentry
WorkingDirectory=/opt/deepsentry
# systemd requires an absolute path to the executable;
# the script argument resolves against WorkingDirectory
ExecStart=/bin/bash dockerrun/run_live_monitoring.sh
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target
Enable and start:
sudo systemctl enable deepsentry-live
sudo systemctl start deepsentry-live
sudo systemctl status deepsentry-live
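Because systemd captures the service's stdout/stderr in the journal, you can also follow its output directly:
journalctl -u deepsentry-live -f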
Docker Container
Run live monitoring in Docker:
docker run -d \
  --name deepsentry-live \
  -v /var/log:/var/log:ro \
  -v /data/deepsentry:/data \
  --restart unless-stopped \
  deepsentry:latest \
  python src/live/main.py
Kubernetes Deployment
For cloud deployments:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deepsentry-live
spec:
  replicas: 1
  selector:
    matchLabels:
      app: deepsentry-live
  template:
    metadata:
      labels:
        app: deepsentry-live
    spec:
      containers:
        - name: live
          image: deepsentry:latest
          volumeMounts:
            - name: logs
              mountPath: /var/log
              readOnly: true
            - name: models
              mountPath: /data/models
              readOnly: true
          env:
            - name: LOG_FILE
              value: /var/log/app.log
            - name: ANOMALY_MODEL
              value: /data/models/detector.h5
      volumes:
        - name: logs
          hostPath:
            path: /var/log
        - name: models
          configMap:
            name: deepsentry-models
Monitoring the Monitor
Keep an eye on the live monitoring process itself:
# Check that the process is running (pgrep avoids matching the grep itself)
pgrep -af deepsentry
# Monitor resource usage (streams until interrupted)
docker stats deepsentry-live
# Check for errors in the monitor's own output
tail -f /var/log/deepsentry-alerts.log | grep ERROR
# Count anomaly alerts so far (tail -f never terminates, so piping it
# to wc -l would never print; use a plain grep -c instead)
grep -c ANOMALY /var/log/deepsentry-alerts.log
Common Issues and Solutions
Problem: No alerts even for obvious anomalies
Check:
- Models are loaded correctly (check logs for errors)
- Sequence length matches training config
- Log format is correct (YYMMDD HHMMSS MESSAGE)
- The threshold isn't set too high (try lowering threshold_multiplier)
Problem: Too many false positive alerts
Solutions:
- Increase threshold_multiplier in config
- Increase window_size for more stable baseline
- Check that training data was clean (no anomalies)
- Verify that incoming live logs follow a distribution similar to the training logs
Problem: Memory or CPU usage is high
Optimizations:
- Reduce sequence_length if possible
- Use smaller models (reduce embedding_size)
- Increase batch_interval to score less frequently
- Use GPU acceleration if available
Next Steps
Once live monitoring is running:
- Set up alerting to send anomalies to your on-call team
- Create dashboards showing anomaly detection rate and latency
- Periodically retrain models with new data to stay current
- Investigate flagged anomalies to understand patterns