[Major] Issues with Uploads, Downloads, public API
Incident Report for Box
Postmortem

We recently addressed issues affecting Uploads and Downloads. We would like to take the opportunity to further explain these issues and the steps we have taken to keep them from happening in the future.

Between January 23, 2023 at 11:46 am PST and January 26, 2023 at 3:11 pm PST, some users may have experienced difficulties while working in Box. During this time, a newly re-deployed load balancer instance had a faulty network connection that made it unreachable from our internal network. Any traffic routed through this load balancer timed out, so users may have seen upload or download operations time out. We resolved the issue by removing the faulty instance from the pool. In addition, we are improving our health checks for newly deployed instances to prevent similar issues in the future.

Analysis

In our public cloud infrastructure, we use managed instance groups (MIGs) to manage and group the load balancers that function as backends for the external load balancer. The issue started when a scheduled re-deploy task created a load balancer instance with an incorrect internal network configuration. Our existing health checks did not validate the instance’s internal connectivity, so the instance continued to receive traffic. As a result, all traffic sent to this instance timed out, as did any telemetry the instance would have generated. We corrected the issue by removing the instance from the group, and we are improving our health checks and observability to reduce the risk of similar issues in the future.
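To illustrate the kind of gap described above, the following is a minimal sketch of an internal connectivity probe, not a description of our actual health checks. It assumes the check runs on the new instance and that a successful TCP connection to a set of internal endpoints is an acceptable proxy for internal connectivity. The function names and the example endpoints are hypothetical.

```python
import socket
from typing import Iterable, Tuple


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks, and timeouts.
        return False


def internal_connectivity_ok(
    targets: Iterable[Tuple[str, int]], timeout: float = 2.0
) -> bool:
    """Return True only if every internal target is reachable.

    `targets` is a hypothetical list of (host, port) pairs for internal
    services the load balancer must be able to reach.
    """
    return all(can_connect(host, port, timeout) for host, port in targets)
```

A deployment step could call, for example, `internal_connectivity_ok([("internal-service.example", 443)])` and only add the instance to the group when it returns True. The host name here is a placeholder.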

Corrective Actions

The following corrective actions have been completed or are planned:

  • Importing missing external load balancer metrics to observability platforms for alerting and dashboards
  • Adding pre-enable health checks to new load balancer instances
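As a rough illustration of how external load balancer metrics could drive alerting, the sketch below flags backends whose timeout ratio is unusually high. It is an assumption-laden example rather than our implementation: the `BackendStats` shape, the thresholds, and the backend names are all hypothetical, and it assumes the external load balancer reports per-backend request and timeout counts.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class BackendStats:
    """Hypothetical per-backend counters as seen by the external load balancer."""
    requests: int
    timeouts: int


def unhealthy_backends(
    stats: Dict[str, BackendStats],
    max_timeout_ratio: float = 0.5,
    min_requests: int = 20,
) -> List[str]:
    """Return backend names whose timeout ratio exceeds max_timeout_ratio.

    Backends with fewer than min_requests requests are skipped to avoid
    alerting on too little data.
    """
    flagged = []
    for name, s in sorted(stats.items()):
        if s.requests < min_requests:
            continue
        if s.timeouts / s.requests > max_timeout_ratio:
            flagged.append(name)
    return flagged
```

Because the counts come from the external load balancer rather than from the backend itself, a check like this could still fire when a backend cannot deliver its own telemetry.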

We are continuously working to improve Box and want to make sure we are delivering the best product and user experience we can. We hope we have provided some clarity here and we would be happy to answer any questions you may still have regarding this matter. 

Sincerely,

The Box Team

Posted Jan 27, 2023 - 13:56 PST

Resolved
After further monitoring, this incident is now considered resolved.
Please contact Box Support at https://support.box.com/ if you continue to experience any issues.
Posted Jan 27, 2023 - 07:13 PST
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Jan 26, 2023 - 15:20 PST
Update
We are continuing to investigate this issue.
Posted Jan 26, 2023 - 13:16 PST
Update
We are continuing to investigate this issue.
Posted Jan 26, 2023 - 11:45 PST
Investigating
We have observed an impact to our Public API, Upload and Download services. We are actively investigating to identify the cause of these issues and will provide further updates as they become available.
Posted Jan 26, 2023 - 10:24 PST
This incident affected: Box Platform / API (Uploads/Downloads).