[CRITICAL] Customers may experience errors when using Box.com

Incident Report for Box

Postmortem

We recently addressed issues affecting Box. We would like to take the opportunity to further explain these issues and the steps we have taken to keep them from happening in the future.

Between 12:10am PDT and 1:02am PDT on July 03, 2023, some Box users may have experienced increased errors or slowness when accessing Box.com, Public API, Box Sign and SSO services. In connection with a planned migration of certain user traffic to a new system, that system’s connection limits were exceeded under high load, which left the system unavailable for new connections. We resolved the issue by temporarily routing user traffic back to our old system. In addition, we adjusted the configuration of the new system to prevent similar issues from occurring in the future.

Analysis 

The new system, to which we had migrated user traffic the week prior, had a connection limit setting that we were not aware of. Under high load, the system reached that limit and began dropping new connections, which resulted in errors and slowness for some users. We routed user traffic back to the old system to relieve stress on the new system and to allow us to reconfigure it. We have turned off the connection limits on the new system and have been serving traffic from it since.
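As an illustration only, the failure mode described above can be modeled with a toy connection limiter. This is a minimal sketch, not Box's actual system: the class name `ConnectionLimiter`, the helper `simulate`, and the limit values are all hypothetical.

```python
import threading


class ConnectionLimiter:
    """Toy model of a layer with a hard cap on concurrent connections.

    All names and numbers here are hypothetical and chosen for illustration.
    """

    def __init__(self, max_connections):
        # None models a layer whose connection limit is turned off.
        self.max_connections = max_connections
        self.active = 0
        self.rejected = 0
        self._lock = threading.Lock()

    def try_open(self):
        """Return True if a new connection is accepted, False if dropped."""
        with self._lock:
            limited = self.max_connections is not None
            if limited and self.active >= self.max_connections:
                self.rejected += 1  # the new connection is dropped
                return False
            self.active += 1
            return True

    def close(self):
        """Release one active connection."""
        with self._lock:
            if self.active > 0:
                self.active -= 1


def simulate(limit, offered):
    """Offer `offered` simultaneous connections; return (accepted, rejected)."""
    limiter = ConnectionLimiter(limit)
    accepted = sum(1 for _ in range(offered) if limiter.try_open())
    return accepted, limiter.rejected
```

With a hypothetical limit of 100 and 150 simultaneous connection attempts, `simulate(100, 150)` accepts 100 and drops 50, while `simulate(None, 150)` accepts all of them. Counting rejections explicitly, as the `rejected` field does here, is one way such drops could be made visible.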

Additionally, this issue revealed a gap in our monitoring: we lacked visibility into requests being dropped as they passed through this layer of our architecture, which lengthened our time to recovery. Additional monitoring to address this gap is being implemented.
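One common way to detect requests dropped inside a layer is to compare the number of requests entering it with the number leaving it. The sketch below assumes such counters exist; the function names and the 1% threshold are hypothetical and do not describe the monitoring Box is implementing.

```python
def drop_ratio(requests_in, requests_out):
    """Fraction of requests that entered a layer but never left it."""
    if requests_in <= 0:
        return 0.0
    dropped = max(requests_in - requests_out, 0)
    return dropped / requests_in


def should_alert(requests_in, requests_out, threshold=0.01):
    """Alert when the drop ratio exceeds the (hypothetical) threshold."""
    return drop_ratio(requests_in, requests_out) > threshold
```

For example, 1000 requests in and 950 out gives a drop ratio of 0.05, which exceeds the assumed 1% threshold and would raise an alert.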

Corrective Actions

The following corrective actions have been completed or are planned:

  • Turn off connection limits where they are not necessary.
  • Actively monitor and test whether connection limits are turned back on in the future.
  • Add distributed tracing to quickly identify where in our stack a problem lies.
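The second action above, checking whether connection limits have been turned back on, could be approximated by a periodic configuration audit. This is a minimal sketch under stated assumptions: the layer names, the `max_connections` key, and the dictionary layout are hypothetical and not taken from Box's configuration.

```python
def find_enabled_limits(configs, key="max_connections"):
    """Return the sorted names of layers whose connection limit is set.

    `configs` maps a layer name to its settings dictionary. A missing key
    or a value of None is treated as "limit turned off".
    """
    return sorted(
        name for name, settings in configs.items()
        if settings.get(key) is not None
    )


# Hypothetical example configuration.
example_configs = {
    "edge-proxy": {"max_connections": None},
    "new-routing-layer": {"max_connections": 10000},
    "legacy-routing-layer": {},
}
```

Running `find_enabled_limits(example_configs)` on this hypothetical data reports only `new-routing-layer`, which a scheduled check could then flag for review.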

We are continuously working to improve Box and want to make sure we are delivering the best product and user experience we can. We hope we have provided some clarity here and we would be happy to answer any questions you may still have regarding this matter. 

Sincerely,

The Box Team

Posted Jul 16, 2023 - 16:10 PDT

Resolved

After further monitoring, this incident is now considered resolved. Our services have been restored to full functionality. If you continue to experience any issues, please contact Box Support at https://support.box.com.
Posted Jul 03, 2023 - 02:36 PDT

Monitoring

Our team has taken steps to remediate this issue and is seeing improvement for our services. We are continuing to monitor for any additional impact.
Posted Jul 03, 2023 - 01:42 PDT

Identified

Our team has identified the underlying cause of this issue and is working to take remediating steps. We will provide additional updates as they become available.
Posted Jul 03, 2023 - 01:29 PDT

Investigating

Our team is investigating an issue affecting Box.com. Users may see errors or slowness when accessing Box.com, Public API, Box Sign and SSO services. We will provide additional information as it becomes available.
Posted Jul 03, 2023 - 00:50 PDT
This incident affected: Box Platform / API (Content API) and Box Web Application (Login/SSO, Uploads/Downloads, Box Sign).