For more information about our Incident Response and Communications please read this support article.

We also maintain a list of Known Product Issues separate from this site here.

Critical: Issue with Admin Events API
Incident Report for Box
Postmortem

Updated Nov 11, 2021

We recently addressed issues affecting the Box Web Application. We would like to take the opportunity to further explain these issues and the steps we have taken to keep them from happening in the future.

Between 6:30 p.m. PDT on August 29, 2021 and 12:30 p.m. PDT on August 30, 2021, some users may have experienced difficulties while working in Box. During this time, some requests may have taken longer than usual to complete and some uploads may have failed. The issue occurred due to a misconfiguration of a third-party library used to increase performance. We were able to resolve the issue by optimizing the library for use within the Web Application. In addition, we have set up active monitoring of this library to quickly diagnose similar issues going forward.

Analysis 

The Box Web Application has a built-in caching layer that stores frequently used values to improve response times. A change in usage caused the internal cache memory to become fragmented. As a result, some servers took longer to serve requests and placed additional load on the downstream systems that serve cached values. This extra load caused cascading failures and resulted in some uploads failing.

Corrective Actions

The following corrective actions have been completed or are planned:

  • We are now alerted immediately when servers are unable to cache values due to fragmentation.

  • Cache tuning was improved to make fragmentation extremely unlikely.
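
As an illustration of the first corrective action, the following is a minimal sketch of how failed cache stores could be tracked and turned into an alert condition. It is not Box's monitoring implementation: the class name `CacheStoreMonitor`, the sliding-window approach, the window size, and the failure-ratio threshold are all assumptions made for this example.

```python
from collections import deque


class CacheStoreMonitor:
    """Tracks recent cache store attempts and flags a high failure ratio.

    Hypothetical helper: the window size and threshold are illustrative
    values, not figures taken from the incident report.
    """

    def __init__(self, window: int = 100, max_failure_ratio: float = 0.05):
        # Only the most recent `window` results are kept.
        self.results = deque(maxlen=window)  # True = stored, False = failed
        self.max_failure_ratio = max_failure_ratio

    def record(self, stored: bool) -> None:
        """Record whether a cache store attempt succeeded."""
        self.results.append(stored)

    def failure_ratio(self) -> float:
        """Fraction of recent store attempts that failed."""
        if not self.results:
            return 0.0
        failures = sum(1 for ok in self.results if not ok)
        return failures / len(self.results)

    def should_alert(self) -> bool:
        """True when recent failures exceed the configured ratio."""
        return self.failure_ratio() > self.max_failure_ratio
```

A server would call `record()` after each attempt to store a value and check `should_alert()` periodically. A ratio over a sliding window is used here so that a single failed store does not trigger an alert.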

We are continuously working to improve Box and want to make sure we are delivering the best product and user experience we can. We hope we have provided some clarity here and we would be happy to answer any questions you may still have regarding this matter. 

Sincerely,

The Box Team

***************************************************************

We recently addressed issues affecting timeliness of the Events API. We would like to take the opportunity to further explain these issues and the steps we have taken to keep them from happening in the future.

Between 8:43 a.m. PDT and 3:25 p.m. PDT on August 30, 2021, some users may have experienced difficulties while working in Box. During this time, the real-time Events API sporadically experienced unusual latency. The issue occurred as a result of performance issues in the backing store that powers our Events API. We were able to resolve the issue by migrating traffic to our secondary cluster. In addition, we have increased capacity and conducted additional performance tuning on the cluster to prevent similar issues from occurring in the future.

Analysis

On the morning of August 30, 2021, the backing store that powers the Events API experienced performance degradation under the read traffic it was serving. This caused requests to fail intermittently until we migrated traffic to our secondary cluster.
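
The report does not say how the move to the secondary cluster was triggered. As a sketch only, a latency-based routing decision of that kind could look like the following. The function name `choose_cluster`, the use of median latency, and the threshold and sample-count values are all assumptions for illustration, not details of Box's systems.

```python
from statistics import median


def choose_cluster(primary_latencies_ms, threshold_ms=500.0, min_samples=5):
    """Pick which cluster should serve reads.

    Returns "secondary" when the primary's median recent latency exceeds
    the threshold, otherwise "primary". All parameter values are
    hypothetical defaults chosen for this example.
    """
    # With too few samples, stay on the primary rather than react to noise.
    if len(primary_latencies_ms) < min_samples:
        return "primary"
    if median(primary_latencies_ms) > threshold_ms:
        return "secondary"
    return "primary"
```

For example, `choose_cluster([900, 1200, 800, 1000, 950])` selects the secondary cluster, while a single slow sample such as `choose_cluster([2000])` does not. The median is used so that one outlier request does not cause a switch.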

Corrective Actions

The following corrective actions have been completed or are planned:

  • Apply additional performance tuning to our database cluster.
  • Increase capacity for both database clusters to better serve the higher read traffic volume.

We are continuously working to improve Box and want to make sure we are delivering the best product and user experience we can. We hope we have provided some clarity here and we would be happy to answer any questions you may still have regarding this matter.

Sincerely,

The Box Team

Posted Sep 28, 2021 - 15:55 PDT

Resolved
After further monitoring, this incident is now considered resolved. Box services have been restored to full functionality. Please contact Box Support at https://support.box.com/ if you continue to experience any issues.
Posted Aug 31, 2021 - 16:17 PDT
Update
We are continuing to investigate this issue.
Posted Aug 31, 2021 - 15:06 PDT
Update
We are continuing to investigate this issue.
Posted Aug 31, 2021 - 13:09 PDT
Update
We are continuing to investigate this issue.
Posted Aug 31, 2021 - 13:01 PDT
Update
We are continuing to investigate this issue.
Posted Aug 31, 2021 - 12:59 PDT
Update
We are continuing to investigate this issue.
Posted Aug 31, 2021 - 12:51 PDT
Update
Box continues to investigate reports of issues with the Admin Events API endpoint and certain Admin Reports.
Posted Aug 31, 2021 - 11:01 PDT
Investigating
We have identified an issue affecting the Admin Events API endpoint and are actively taking steps to remediate at this time. Some users may experience delayed event notifications. We will provide more information as soon as it is available.
Posted Aug 31, 2021 - 09:20 PDT
This incident affected: Box Platform / API (Content API).