[MAJOR] Delays in Box Relay

Incident Report for Box

Postmortem

Between April 3, 2023, 4:55 PM PDT and April 4, 2023, 12:47 AM PDT, some customers may have experienced delays when using Box Relay. We would like to take this opportunity to provide an analysis of this issue and the corrective actions we have taken in response.

Analysis

The issue occurred due to one of our services in a specific region experiencing high latency when communicating with a backend cloud database. This caused the service in that region to run in a degraded state. We were able to resolve the issue by routing all of the upstream traffic to another region where the service was healthy. In addition, we are working with the cloud service provider to prevent similar issues from occurring in the future. 
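As a rough illustration of the mitigation described above, the sketch below picks which regions should receive traffic based on their observed database latency. It is a minimal sketch under stated assumptions, not a description of Box's actual routing: the `RegionHealth` type, the `pick_serving_regions` function, the region names, and the 250 ms threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RegionHealth:
    name: str             # hypothetical region identifier
    db_latency_ms: float  # observed latency to the backend database


def pick_serving_regions(regions: List[RegionHealth],
                         max_latency_ms: float = 250.0) -> List[str]:
    """Return the names of regions whose database latency is within the threshold.

    The 250 ms default is an illustrative assumption, not a real setting.
    """
    healthy = [r.name for r in regions if r.db_latency_ms <= max_latency_ms]
    # If every region looks degraded, keep them all rather than dropping all traffic.
    return healthy or [r.name for r in regions]


regions = [RegionHealth("region-a", 1800.0), RegionHealth("region-b", 40.0)]
print(pick_serving_regions(regions))
```

In this example only `region-b` stays in rotation, which mirrors routing all upstream traffic to the region where the service was healthy.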

The logs and metrics available during the incident did not indicate what caused the degradation. Following our cloud provider's recommendation, we temporarily enabled extensive cloud logging for that region, diverted some traffic, and monitored the service for degradation. The service was not degraded during this time. Although the extensive logging has since been disabled, we continue to closely monitor the service in that region and have not observed any anomalies or further signs of degradation.

Corrective Actions

The following corrective actions have been completed or are planned:

  • Our existing metrics promptly alert the on-call engineer when such issues occur. We have additionally made the following changes to the runbook that the on-call engineer follows:

    • Enable the extensive cloud logging and seek immediate assistance from the cloud provider on the issue.
    • Once adequate data is gathered, divert traffic away from the degraded region.
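The ordering in the runbook changes above (capture diagnostics first, then divert traffic) can be sketched as a small decision function. This is only an illustration of the sequence; the function name, its parameters, and the step wording are hypothetical and do not represent Box's actual tooling.

```python
def next_runbook_step(logging_enabled: bool,
                      provider_contacted: bool,
                      data_gathered: bool) -> str:
    """Return the next action for the on-call engineer (hypothetical helper)."""
    if not logging_enabled:
        return "enable extensive cloud logging"
    if not provider_contacted:
        return "seek assistance from the cloud provider"
    if not data_gathered:
        return "keep gathering diagnostic data"
    # Traffic is diverted only after adequate data has been collected,
    # so that evidence of the degradation is not lost.
    return "divert traffic away from the degraded region"


print(next_runbook_step(True, True, False))
```

Placing the diversion step last reflects the point made in the analysis: the logs available during the original incident did not show what caused the degradation.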

We are continuously working to improve Box and want to make sure we are delivering the best product and user experience we can. We hope we have provided some clarity here and we would be happy to answer any questions you may still have regarding this matter. 

Sincerely,

The Box Team

Posted Apr 13, 2023 - 09:25 PDT

Resolved

After further monitoring, this incident is now considered resolved. Box Relay Service has been restored to full functionality. If you continue to experience any issues, please contact Box Support at https://support.box.com.
Posted Apr 04, 2023 - 08:17 PDT

Update

After further monitoring, this incident is now considered resolved. Box Relay Service has been restored to full functionality. If you continue to experience any issues, please contact Box Support at https://support.box.com.
Posted Apr 04, 2023 - 08:16 PDT

Monitoring

Our team has taken steps to remediate this issue and is seeing improvement in the Box Relay service. We are continuing to monitor for any additional impact.
Posted Apr 04, 2023 - 06:48 PDT

Identified

Our team has identified the underlying cause of the delays observed in Box Relay and is working on remediation. We will provide additional updates as they become available.
Posted Apr 04, 2023 - 01:39 PDT

Investigating

Our team is investigating an issue regarding Box Relay. Users may observe delays in their Box Relay workflows. We will provide additional information as it becomes available.
Posted Apr 03, 2023 - 23:39 PDT
This incident affected: Box Relay.