Delays in SLO and notification processing
Incident Report for Honeycomb
Resolved
This incident has been resolved.
Posted Oct 03, 2022 - 12:50 PDT
Update
Our systems appear to have stabilized in the short term, and we are monitoring the issue until we return to full capacity. We're also investigating long-term improvements to increase the reliability of the affected services under heavy load.
Posted Oct 03, 2022 - 12:39 PDT
Update
SLOs and trigger evaluations are up to date, but we are continuing to monitor excess load on the database and are still stabilizing the system.
Posted Oct 03, 2022 - 12:04 PDT
Monitoring
We are preparing to run a backfill on SLO notifications for the affected time period. Any SLO burn alerts that would otherwise have triggered during this period may trigger during the backfill.
Posted Oct 03, 2022 - 11:13 PDT
Update
Trigger evaluations and notifications are functional again, but SLO reporting is still delayed.
Posted Oct 03, 2022 - 10:26 PDT
Identified
We have disabled SLO evaluations so that trigger evaluations can catch up. This has had a positive effect so far.
Posted Oct 03, 2022 - 10:23 PDT
Update
Trigger evaluations, notifications, and alerts all appear to be impacted and may not be running.
Posted Oct 03, 2022 - 10:05 PDT
Investigating
We are currently investigating a delay in the evaluation of SLOs and trigger execution.
Posted Oct 03, 2022 - 09:52 PDT
This incident affected: ui.honeycomb.io - US1 Trigger & SLO Alerting.