This is the official status page for CuByte's global infrastructure. All related incidents will be published here. For questions, contact our network operations center. Alternatively, you can create a ticket in the CuByte Support Center.
Closed | Feb 27, 2025 | 09:00 GMT+01:00
The last node has been successfully replaced and the cluster is fully restored, hence we are closing this incident.
Monitoring | Feb 16, 2025 | 06:00 GMT+01:00
The OS disk of one of our storage nodes stopped working and the operating system remounted its filesystem read-only. There is no impact, as this is a redundant node.
Update #1: We inserted a new disk and the copy job is in progress.
Update #2: During further investigation we found that the HPE storage controller used for the OS and data disks is faulty and the actual cause of the issue. The hardware will be replaced soon - we are aiming to replace the whole server, as the affected hardware is close to the end of its lifespan anyway.
Update #3: Multiple shiny new Ceph nodes have been placed in the data center. The restoration process within Ceph for the first new node is in progress. Due to the size of the cluster this will take several hours; during that time, read and write performance will be reduced (a sketch for tracking the rebalancing progress can be found at the end of this incident).
Update #4: The first new node has successfully taken over its work and most of the PGs have been rebalanced.
Update #5: Another node has been successfully replaced without any service interruption. We will replace the remaining node tomorrow, on the 26th of February. No service interruption is expected.
Update #6: The last node has been successfully replaced and the cluster is fully restored, hence we are closing this incident.
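For those who want to follow along during node replacements like the above: rebalancing progress can be watched with the standard Ceph tooling. Below is a minimal sketch in Python, assuming a working "ceph" CLI with access to the cluster; the exact JSON field names may vary between Ceph releases.

import json
import subprocess

def pg_summary():
    """Summarize placement-group states from 'ceph status --format json'.

    Assumes a working 'ceph' CLI with access to the cluster; the exact JSON
    field names ('pgmap', 'pgs_by_state', ...) may differ between releases.
    """
    out = subprocess.run(
        ["ceph", "status", "--format", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    pgmap = json.loads(out).get("pgmap", {})
    states = {s["state_name"]: s["count"] for s in pgmap.get("pgs_by_state", [])}
    return pgmap.get("num_pgs", 0), states

if __name__ == "__main__":
    total, states = pg_summary()
    print(f"{total} PGs total")
    for name, count in sorted(states.items(), key=lambda kv: -kv[1]):
        print(f"  {name}: {count}")

The same information is also available interactively via "ceph -s" or "ceph pg stat".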
Closed | Feb 20, 2025 | 02:00 GMT+01:00
Unfortunately it was not the CPU; the mainboard appears to be the cause. We decided to remove the whole node from the data center completely.
Resolution in progress | Feb 15, 2025 | 01:05 GMT+01:00
One of the nodes in our compute cluster crashed. We are investigating the root cause and are relocating workloads to other healthy nodes.
Update #1: After a first on-site inspection we assume the CPU on socket two died. There was a short spike in temperature, after which the server crashed. The CPU was removed and the server came back online. We have already ordered a replacement CPU and will install it once delivered. For now, all workloads have been moved to the remaining compute nodes.
Update #2: The CPU was delivered and will be installed on the 20th (Wednesday).
Update #3: Unfortunately it was not the CPU; the mainboard appears to be the cause. We decided to remove the whole node from the data center completely.
Closed | Jan 29, 2025 | 14:50 GMT+01:00
The Ceph monitors (mons) have been restored and Ceph health is back to normal.
Resolved | Jan 13, 2025 | 04:47 GMT+01:00
Our storage cluster is currently facing a partial outage. Investigation is ongoing.
Update #1: Only CephFS is affected, due to an unavailable MDS (metadata server); how to check MDS availability is sketched below.
Update #2: Metadata has been repaired and services are recovering.
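As a side note for CephFS users: whether an MDS is currently serving the filesystem can be checked with "ceph mds stat". A minimal sketch, assuming the "ceph" CLI is available on a client with cluster access:

import subprocess

def mds_active() -> bool:
    """Return True if 'ceph mds stat' reports an active MDS.

    Assumes the 'ceph' CLI is installed and can reach the cluster; a healthy
    CephFS normally shows 'up:active' in the plain-text output.
    """
    out = subprocess.run(
        ["ceph", "mds", "stat"],
        check=True, capture_output=True, text=True,
    ).stdout
    return "up:active" in out

if __name__ == "__main__":
    print("MDS active" if mds_active() else "no active MDS - CephFS may be unavailable")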
During this maintenance we will be updating our gateway routers to a new service release.
Due to the redundant design of our network we do not expect any sustained service interruption; you may see occasional packet loss.
We will update the software on one of our edge routers. During that time frame you may notice increased latency to some destinations, such as DTAG (AS3320).
The CA certificate currently used for our internal VPN infrastructure is about to expire on Fri, 14 Feb 2025 at 16:34:34. This also means that CuByte is celebrating its 10th birthday!
With this change, we are introducing cubyte-ca-v2 and will re-sign all user certificates used to authenticate against our VPN infrastructure. You will therefore need to replace the existing VPN configuration on your client with the new one.
The new configuration for your organization will be sent out directly to affected users.
In case you experience issues connecting to the VPN after Sunday, the 2nd of February, please reach out to our support directly (a sketch for checking your current certificate locally is shown below).
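If you want to verify locally whether your client certificate has already been re-signed and when it expires, you can inspect it yourself. A minimal sketch using the Python "cryptography" package; the file name is only an example and depends on your VPN configuration:

from cryptography import x509

# Example path only - use the certificate referenced in your own VPN configuration.
CERT_PATH = "client.crt"

with open(CERT_PATH, "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())

# The issuer shows which CA signed the certificate (old CA vs. cubyte-ca-v2),
# and not_valid_after shows when it expires.
print("Issuer: ", cert.issuer.rfc4514_string())
print("Subject:", cert.subject.rfc4514_string())
print("Expires:", cert.not_valid_after)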
Mar 13, 2023 22:00 - Mar 14, 2023 01:00 | GMT+01:00
Lumen (AS3356) intends to carry out internal maintenance within its network.
We do not expect any downtime during the process due to the redundant design and connectivity of our network.
The expected duration of the work within this window is 1 hour.