Texas 1 Platform (paas1.tx) not Responding
Incident Report for MODX Cloud

The outage on Sunday, August 28, 2016, which began at 04:00 US Central time, was the result of a hard disk failure that reduced the platform's performance while the rather intensive daily backup operations were running. The combined load overwhelmed the server's resources. MySQL (the database server) could not cope with this state, and the platform stopped responding normally.

MODX Cloud uses RAID arrays to ensure no data loss occurs. In our configuration, when all disks are operating normally, the array performs much faster than a single drive could. When one disk goes down or fails, drive and data input/output (IO) performance is cut in half.
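
As a rough illustration of that arithmetic, here is a minimal sketch. It assumes a two-disk array in which reads are spread evenly across the healthy disks; the report does not state the actual RAID level or disk count, and the 150 MB/s per-disk figure and the function name are hypothetical.

```python
def array_read_throughput(per_disk_mb_s: float, healthy_disks: int) -> float:
    """Idealized aggregate read throughput (MB/s).

    Assumes reads are spread evenly across all healthy disks and ignores
    controller overhead, rebuild traffic, and competing workloads.
    """
    if healthy_disks < 0:
        raise ValueError("healthy_disks must be non-negative")
    return per_disk_mb_s * healthy_disks


PER_DISK_MB_S = 150.0  # hypothetical per-disk throughput, not from the report

normal = array_read_throughput(PER_DISK_MB_S, healthy_disks=2)
degraded = array_read_throughput(PER_DISK_MB_S, healthy_disks=1)
ratio = degraded / normal

print(f"normal: {normal} MB/s, degraded: {degraded} MB/s, ratio: {ratio}")
```

Under these assumptions the degraded array delivers half the normal throughput, which leaves far less headroom when an IO-heavy job such as the daily backup is running.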

To reduce the risk of outages as a result of drive failures (which will invariably happen), we will be migrating the MySQL data (database content) to another location on the configuration which will help the platform operate more normally in such an event. We will perform this migration on all of our other applicable platforms.

Additionally, we will review all drives on our platforms with our upstream provider so that aging drives are replaced as a matter of routine maintenance, before they fail.

Again, we apologize for any inconvenience this has caused. We truly understand how important uptime is and we continually search for ways to improve reliability.

If you have any questions or concerns about this outage or anything else related to MODX Cloud, please send us a support request via the Help button inside the MODX Cloud Dashboard.

Posted 10 months ago. Aug 29, 2016 - 12:43 CDT

Resolved
Everything is working normally at this time. We'll continue to monitor the server's health and ensure it runs as expected.

If you're experiencing any errors with your site(s), please submit a support ticket from the MODX Cloud Dashboard.

We understand the critical nature of uptime for all MODX Cloud customers and we sincerely apologize for the inconvenience.

Thanks for choosing MODX Cloud.
Posted 10 months ago. Aug 28, 2016 - 11:01 CDT
Monitoring
The platform has been restored after MySQL stopped responding. We will continue to monitor the situation and hope to have a permanent solution in place soon. As always, if you require assistance or have questions, please click the Help link in the MODX Cloud Dashboard.
Posted 10 months ago. Aug 28, 2016 - 07:43 CDT
Investigating
We're investigating a recurrence of downtime on the platform. We're resuming our investigation and hope to bring it back online soon.
Posted 10 months ago. Aug 28, 2016 - 06:55 CDT
Monitoring
We have restored functionality to the Texas 1 platform and verified that sites are back online. We will continue our investigation and monitoring to fully understand the root cause of the issue and take steps to prevent such an event in the future, if possible.

If your site is not functioning as expected, be sure to clear the site cache from inside the MODX Revolution Manager, or use Upgrade Product (selecting the same version) from inside the MODX Cloud Dashboard.

We understand the importance of uptime and we sincerely apologize for any inconvenience this may have caused.

If you have any specific questions, please contact us using the Help link inside the MODX Cloud Dashboard.
Posted 10 months ago. Aug 28, 2016 - 06:13 CDT
Update
The server is online; however, we're still working to bring services and sites back up at this time.
Posted 10 months ago. Aug 28, 2016 - 05:50 CDT
Update
We are still investigating the outage with the Texas 1 Platform and are working with our upstream provider to identify and correct the issue.
Posted 10 months ago. Aug 28, 2016 - 05:00 CDT
Investigating
We're currently investigating an outage on our Texas 1 Platform. We apologize for the inconvenience and are working to restore service. We'll post updates when we have additional information.
Posted 10 months ago. Aug 28, 2016 - 04:14 CDT