Skip to content

Signals Outage Runbook

Detection

  • Signal delivery latency > 5 minutes
  • Discord bot offline or not posting
  • Subscriber complaints
  • Monitoring alert: signal_delivery_failures > 3

Response

  1. Check signal service health: curl https://api.qgtm.ai/signals/health
  2. Check Discord bot status: is it online in the server?
  3. Check Redis: are signals being published to the stream?
  4. Check Stripe webhooks: are subscription events processing?

Common Causes

Symptom Likely Cause Fix
Bot offline Token expired or rate limited Restart bot, check token
Signals not generating Strategy service crashed Check logs, restart
Signals generating but not delivering Redis stream consumer lag Check consumer group, restart
Delayed signals API latency Check Fly.io metrics, scale up

Communication

Post in Discord #announcements:

Signal delivery is currently delayed. We are investigating and will update within 30 minutes. Trading signals are still being generated — delivery will catch up once resolved.

Recovery

  1. Fix root cause
  2. Replay any missed signals from Redis stream
  3. Post recovery notice in Discord
  4. Review SLA impact for institutional tier subscribers