February 18, 2015
At approximately 5:00am CST on Sunday, February 15, our staff was notified of an outage on our Poseidon server caused by high server load. Because the daily upgrades had completed shortly before, our initial suspicion was that the automated upgrade had caused the issue, so we immediately performed a forced upgrade of all cPanel services to investigate.
When the forced upgrade completed successfully and server load remained high, we began investigating other causes and noticed excessive connections to wp-login.php flooding our web server.
We immediately recognized this as a large-scale brute-force attack against WordPress sites hosted on the server, in which bots rapidly attempt to guess administrative logins for customer websites. Web hosts around the world first encountered this type of attack roughly two years ago, and we have successfully defended against it many times since using web application firewall (ModSecurity) rules developed to block attacks of this nature.
We implemented more aggressive rules to better filter the attack. This allowed services to remain online, but server load remained high, causing performance degradation and slow loading times for our customers. In the past our aggressive rules would typically stop attacks of this nature in their tracks, but unfortunately many requests were still hitting our web server.
As we analyzed the attack traffic we quickly realized that this attack was different from any other we had encountered. The botnet attacking our customers' websites on this server was behind a proxy and obtained a new IP address with nearly every request, so hits to wp-login.php rarely came from the same IP address more than once. Because failed login attempts were so rarely repeated from a single address, our web application firewall was unable to block the attack efficiently. Instead, we identified hits from hundreds of thousands of unique IPs.
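To illustrate why per-IP blocking struggles against this pattern, here is a minimal Python sketch. It is not our firewall ruleset; the function name `ips_over_threshold`, the threshold of 5, and the sample addresses (taken from documentation-reserved ranges) are all assumptions made for illustration.

```python
from collections import Counter


def ips_over_threshold(request_ips, threshold):
    """Return the set of IPs that made at least `threshold` requests."""
    counts = Counter(request_ips)
    return {ip for ip, hits in counts.items() if hits >= threshold}


# Hypothetical traffic: one client hammering the login page 50 times.
single_source = ["203.0.113.7"] * 50

# Hypothetical traffic: 50 requests, each from a different address.
rotating = [f"198.51.100.{i}" for i in range(50)]

print(ips_over_threshold(single_source, threshold=5))  # the one IP is flagged
print(ips_over_threshold(rotating, threshold=5))       # nothing is flagged
```

Both samples contain the same number of requests, but only the first ever crosses a per-IP threshold, which is roughly the situation described above.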
On Sunday afternoon we made the decision to disable WordPress logins and fully block wp-login.php, both for our customers' protection and to help keep server load manageable. Once we did so, server load dropped to normal levels, as requests to that page were no longer being processed by PHP and were instead rejected immediately by our web application firewall. Server load remained stable throughout the evening and overnight.
At approximately 8:00am on Monday morning server load returned to high levels due to increased customer activity. We spent most of the day identifying and blocking large IP ranges taking part in the attack and testing alternate web application firewall rules in an effort to mitigate the attack more efficiently. While we were successful in keeping services online for most of the day, server load remained high, causing performance degradation and slow connections. We disabled non-essential services such as spam scanning, malware scanning, log processing and backups to further reduce server load and give our web server and firewall as many resources as possible to keep services online and filter the attack. Server load returned to low levels in the evening and remained low throughout the night.
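One way to look for IP ranges worth blocking is to group attacking addresses by network and count hits per network. The Python sketch below assumes IPv4 addresses and a /24 grouping; the function name `busiest_networks`, the `min_hits` cutoff, and the sample addresses are hypothetical and do not reflect the tooling we actually used.

```python
import ipaddress
from collections import Counter


def busiest_networks(ips, prefix=24, min_hits=3):
    """Group IPv4 addresses by /prefix network and return (network, hits)
    pairs with at least `min_hits` hits, busiest first."""
    counts = Counter(
        # strict=False lets us pass a host address and get its enclosing network
        ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        for ip in ips
    )
    return [(str(net), hits) for net, hits in counts.most_common() if hits >= min_hits]


# Hypothetical sample of attacking addresses (documentation-reserved ranges).
sample = ["198.51.100.4", "198.51.100.77", "198.51.100.201", "203.0.113.9"]
print(busiest_networks(sample))
```

Even when no single address repeats, addresses that cluster in the same network can show up this way, although a widely distributed botnet may not cluster at all.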
Server load spiked back up on Tuesday morning, after the long weekend, and we received many reports of slow websites, outages, and questions about disabled WordPress logins from our customers hosted on this server. We had never before experienced an attack lasting longer than a day, so as this attack entered its third day we began examining every aspect of our filtering solution to make it efficient enough to handle an attack of this size. We reconfigured nearly every aspect of our firewall to employ aggressive banning, tested public blocklists of identified attackers and even attempted to block all traffic to the server from outside North America. Unfortunately none of these actions reduced server load to acceptable levels.
In the late afternoon we began examining cPanel's new implementation of ModSecurity, which was introduced in cPanel 11.48 and had been deployed across all of our servers only weeks earlier. Here we identified changes cPanel had introduced that decreased the efficiency of our web application firewall ruleset. Namely, cPanel had moved the location of ModSecurity's persistent IP storage, so that file was no longer being flushed, which vastly decreased performance.
We wrote a custom script to optimize cPanel's new implementation of ModSecurity and to regularly clean up its persistent IP storage, which was the biggest bottleneck in efficiently mitigating the attack. At approximately 6:00pm on Tuesday we implemented these changes and immediately noticed a substantial drop in server load. Our web application firewall was now able to filter and keep up with the massively aggressive attack.
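The general idea behind a periodic cleanup can be sketched in Python as follows. This is not our actual script and it does not read ModSecurity's real on-disk storage format; it assumes a hypothetical in-memory store that maps each IP address to an expiry timestamp, and the name `prune_expired` is ours for illustration.

```python
import time


def prune_expired(store, now=None):
    """Delete entries whose expiry time has passed; return how many were removed.

    `store` is a hypothetical dict mapping IP address -> expiry timestamp
    (seconds since the epoch). It stands in for a persistent IP collection.
    """
    now = time.time() if now is None else now
    expired = [ip for ip, expires_at in store.items() if expires_at <= now]
    for ip in expired:
        del store[ip]
    return len(expired)


# Hypothetical store: one stale entry and one that is still valid at t=200.
store = {"198.51.100.4": 100.0, "203.0.113.9": 500.0}
removed = prune_expired(store, now=200.0)
print(removed, sorted(store))
```

Without regular pruning, a store like this grows with every new address seen, which matters a great deal when an attack arrives from hundreds of thousands of unique IPs.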
After server load had remained low for an hour, at roughly 7:00pm, we re-enabled WordPress logins and all of the non-essential services we had disabled, and noticed little to no change in server load. We have now implemented these changes across all of our servers, and all of them are running with lower server load thanks to the improved filtering.
We'd like to thank all those affected by this rare and unfortunate incident for their patience and understanding. We ventured into new territory both with the nature of the attack and its resolution. We highly recommend that all WordPress users read WordPress's official documentation on protecting against brute force attacks, which offers many suggestions that website owners can implement to protect their websites and minimize the impact of these types of attacks.