[Critical] Issues with Multiple Box Services
Incident Report for Box
Postmortem

We recently addressed issues affecting Public API, Downloads, Uploads and Sign. We would like to take the opportunity to further explain these issues and the steps we have taken to keep them from happening in the future.

On July 26, 2023, between 5:30 and 6:20 PM PDT, some users may have experienced difficulties while working in Box. During this time, users may have had a degraded experience or difficulty accessing Public API, Downloads, Uploads and Sign. The issue occurred due to an outage of an internal secrets management system, which failed after a rare spike in requests from other systems. We resolved the issue by restarting the front ends of the secrets management system to shed connections. In addition, we scaled up the capacity of the secrets management system and modified internal clients to reduce the magnitude of future spikes, so that similar issues do not recur.

Analysis

The internal secrets management system’s file descriptor and connection limits had been increased two months before this issue occurred. The increases were intended to improve system performance during anomalous conditions, but the system’s TCP ephemeral port range remained too small. A request spike consumed the available ephemeral ports on the hosts, resulting in the system becoming unavailable.
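
To illustrate the port exhaustion condition described above, the following minimal Python sketch compares a connection limit against the size of a Linux host's ephemeral port range. It is not Box's tooling: the function names and the example limit of 65,536 are hypothetical, and it assumes a Linux host that exposes the range at /proc/sys/net/ipv4/ip_local_port_range. The range bounds outbound connections from one source address to a single destination address and port.

```python
from pathlib import Path
from typing import Tuple

# Linux exposes the ephemeral port range here as two integers: "low high".
PORT_RANGE_FILE = Path("/proc/sys/net/ipv4/ip_local_port_range")


def parse_port_range(text: str) -> Tuple[int, int]:
    """Parse the contents of ip_local_port_range into (low, high)."""
    low, high = (int(field) for field in text.split())
    return low, high


def ephemeral_port_count(low: int, high: int) -> int:
    """Number of ports in the inclusive range [low, high]."""
    return high - low + 1


def limit_exceeds_ports(connection_limit: int, low: int, high: int) -> bool:
    """True when the connection limit is larger than the port range,
    so a spike could run out of ports before the limit is reached."""
    return connection_limit > ephemeral_port_count(low, high)


if __name__ == "__main__":
    # Hypothetical connection limit, for illustration only.
    example_limit = 65536
    if PORT_RANGE_FILE.exists():
        low, high = parse_port_range(PORT_RANGE_FILE.read_text())
        print("ephemeral ports:", ephemeral_port_count(low, high))
        print("limit exceeds ports:", limit_exceeds_ports(example_limit, low, high))
```

With the common Linux default range of 32768 to 60999, the range holds 28,232 ports, so a connection limit above that number would not protect the host from running out of ports.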

Corrective Actions

The following corrective actions have been completed or are planned:

  • Tuned the system’s limits to prevent the port exhaustion condition from recurring
  • Made improvements to reduce the recurrence of high-magnitude request spikes
  • The connection limiting front end is being replaced with an improved load balancer
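
The report does not say how internal clients were modified. One common way to reduce the magnitude of request spikes is to retry with capped exponential backoff and random jitter, sketched below in Python. The function names, the retry parameters and the use of ConnectionError are assumptions for illustration only.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def backoff_delay(
    attempt: int,
    base: float = 0.1,
    cap: float = 10.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Return a random delay in [0, min(cap, base * 2**attempt)] seconds.

    Randomizing the delay spreads retries from many clients over time
    instead of letting them arrive together as a new spike.
    """
    source = rng if rng is not None else random
    ceiling = min(cap, base * (2 ** attempt))
    return source.uniform(0.0, ceiling)


def call_with_retries(
    operation: Callable[[], T],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> T:
    """Call operation, retrying on ConnectionError with jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            # Give up after the final attempt and surface the error.
            if attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt, rng=rng))
    raise ValueError("max_attempts must be at least 1")
```

A client would wrap its request to the backing service in call_with_retries, so that a brief outage leads to spread-out retries rather than a synchronized burst.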

We are continuously working to improve Box and want to make sure we are delivering the best product and user experience we can. We hope we have provided some clarity here and we would be happy to answer any questions you may still have regarding this matter. 

Sincerely,

The Box Team

Posted Aug 01, 2023 - 09:03 PDT

Resolved
After further monitoring, this incident is now considered resolved. Box services have been restored to full functionality. Please contact Box Support at https://support.box.com/ if you continue to experience any issues.
Posted Jul 26, 2023 - 19:07 PDT
Monitoring
A fix has been implemented and we are currently monitoring the results.
Posted Jul 26, 2023 - 18:43 PDT
Update
We are continuing to investigate this issue. We are seeing recovery across some of the services, while other Box services are still experiencing latency.
Posted Jul 26, 2023 - 18:14 PDT
Investigating
We are investigating an ongoing issue affecting Login (Web/SSO), All Files Page, Uploads, Downloads, Content API, Box Notes, and Box Sign. Users may see errors or slowness with the affected services. We will provide more information as soon as it is available.
Posted Jul 26, 2023 - 17:49 PDT
This incident affected: Box Platform / API (Content API), Box Web Application (Login/SSO, Uploads/Downloads, Box Sign), Box Notes (Web Application), and Box Website.