My Server Crash: What the Hell Happened? Part 2 of 2

April 29, 2008 Posted by Tyler Cruz

After playing around with DiskSync for a while, I managed to successfully restore my data. So after a week of nonsense, I managed to restore the data myself in a matter of a couple hours. I fired the administration company shortly after this, as they basically proved that they only wasted my time and money. I’ve since found a very reputable replacement for them called TouchSupport.com. They charge a lot more (the plan I’m using is $99 a month), but are based in the US and have been a lot more helpful than my old company so far.

Even though I managed to restore my data from DiskSync, it wasn’t a quick and natural transition to having all my sites up. This was mainly because the new server, Abby, was installed with CPanel. Trinity was a very heavily modified server, and, believe it or not, never had CPanel on it. I basically administered everything on it directly through the console via SSH. While this was certainly very antiquated and made easy things hard to do, I did learn a fair number of commands over the years and still use SSH today even on the new server for many things, so it was certainly a good learning experience.

Anyhow, Trinity’s Apache httpd.conf file was very heavily modified with many things such as mod_rewrite expressions, which made the transition over to Abby difficult and very time consuming. I had many backups of the latest httpd.conf file changes on my PC, so that wasn’t the problem. The issue was that apparently CPanel will not let you customize the httpd.conf file. So, I had to go in and create a custom .htaccess file for each of my sites, and transfer over the corresponding details.
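To give a rough idea of what moving rules out of httpd.conf involves, here is a minimal Python sketch. It assumes the rules are simple one-line RewriteRule directives; the function names and the sample rule are hypothetical and not taken from Trinity's actual configuration. The one mod_rewrite detail it relies on is that in a per-directory .htaccess file the path is matched without its leading slash, so a pattern that began with ^/ in the server config has to lose that slash.

```python
def to_htaccess_rule(line: str) -> str:
    """Adapt one server-context RewriteRule line for use in a .htaccess file.

    In per-directory context mod_rewrite matches the path without its
    leading slash, so a pattern beginning with "^/" would never match.
    Any other line is returned with surrounding whitespace trimmed.
    """
    parts = line.split()
    if len(parts) >= 3 and parts[0] == "RewriteRule" and parts[1].startswith("^/"):
        # Drop the slash right after the anchor: "^/forum" becomes "^forum".
        parts[1] = "^" + parts[1][2:]
        return " ".join(parts)
    return line.strip()


def build_htaccess(rules: list[str]) -> str:
    """Build the text of a per-site .htaccess file from server-context rules."""
    lines = ["RewriteEngine On"]
    lines.extend(to_htaccess_rule(rule) for rule in rules)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical rule, for illustration only.
    sample = ["RewriteRule ^/forum/([0-9]+)$ /showthread.php?t=$1 [L]"]
    print(build_htaccess(sample), end="")
```

This only handles the simplest case. Rules with quoted arguments, RewriteCond lines that reference server-level variables, or directives that are not allowed in .htaccess at all would still need to be checked by hand.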

In addition to that, there were many other small issues. For one, all the usernames and paths of the home directory had to be redone. And many PHP and Perl modules had to be installed after it was discovered that many scripts were not working properly. I also set up e-mail for all my sites. I was actually more than glad to do this as it was a snap with CPanel. Previously, it was a MASSIVE pain having to try to set up POP3 accounts and redirects by hand within SSH. Thanks to Gyutae Park’s help (of WinningTheWeb.com), I can now send e-mails from all of my sites from GMail. Previously, I just had my e-mails such as webmaster@pokerforums.org redirecting to tylercruz@gmail.com, which meant that I always had to send e-mails from either twcruz@hotmail.com or tylercruz@gmail.com, which could make things seem a bit less professional at times. I have now set up tyler@merendi.com which I’ll be using as my official/corporate e-mail, and it’s perfect timing as I’m setting up PayPal for my corporation and need an appropriate e-mail to use.

To give an idea of how out-of-date Trinity was, it was using Red Hat 3.

Current Server Status

As I write this post, everything is back up and running smoothly except for a few scripts on a few sites which I have my programmers looking at and hope to have fixed in a day or two.

I finished setting DiskSync up on Abby about an hour ago. I had a lot of difficulty getting it set up, but it turned out to be because APF was automatically flushing the custom iptables rules that were set up to let the Agent and Vault connect. Fortunately, the backup technician at ThePlanet was very helpful in fixing it for me.

I currently have DiskSync set to back up the following directories on Abby (recursive):

  • /backup/cpbackup
  • /etc
  • /home
  • /usr/local/apache
  • /usr/local/cpanel
  • /var/cpanel
  • /var/lib/mysql
  • /var/named

These directories should contain all the vital files. Since I just set up DiskSync on Abby, it only has the initial seed/safeset/backup currently, but will soon have its 10 days of backups.

Once the initial seed backup was finished, I tested the system by deleting one of my files. I then restored the file from the DiskSync Console. Works perfectly 🙂
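A delete-and-restore test like this can be made a little more rigorous by recording a checksum of the file before deleting it and comparing it against the restored copy. The sketch below is a generic Python illustration of that idea, not something DiskSync provides; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(expected_digest: str, restored: Path) -> bool:
    """True if the restored file exists and has the digest recorded earlier."""
    return restored.is_file() and sha256_of(restored) == expected_digest
```

The idea is to call sha256_of on the file before deleting it, run the restore from the backup console, then call restore_matches with the saved digest and the restored path.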


I also increased Bertha and Abby’s retention period from keeping the last 8 days of backups to the last 10 days. Ideally, I’d want the past month or two, but due to DiskSync bandwidth issues, I can only keep so much data backed up. The extra two days will give me more “breathing room” in the event of some future “perfect storm”, and also give me a longer period of time to revert old files/changes.

As for Trinity… well… she served me well…

[Image]

(Revised image courtesy of PublicRecordsGuy)

She was only five.. why must they die so young? I had Trinity back when I was only making a few bucks a month, so she’s brought me a very long way.

She’s actually still with me as I write this, but I’ll be pulling the plug shortly after I’m 100% sure I have everything copied over and working on Abby. I don’t want to let her go until I’m sure I have everything transferred over properly.

So what happened in the first place?

There’s basically no way to know for sure. It is possible that the damage was done by a hacker: in the week or two prior to the crash, I had discovered hidden warez on Trinity, but this was isolated to the /home directory and was only a very small bit of warez. Nevertheless it was still a breach. The crash seemed to happen shortly after the compromised site accounts were deleted, so it is certainly possible that the server was wiped in reaction – but it would make more sense for the hacker to change my AdSense code to theirs or to keep using the server as storage for their warez.

The possibility that it was a hacker might also be reinforced by the fact that for a couple weeks after I got the sites back up, there was a constant heavy DoS attack aimed at Bertha. ThePlanet kept a close eye on this and ‘blocked’ it for two weeks. Since it continued for so long, they finally decided to take legal action and contacted the ISP it was originating from. One hour after phoning the ISP, the attacks stopped.

It could very well have been a hacker, but I would not be surprised if the server just finally let go from old age and very antiquated and heavily customized software.

Whatever the reason, it doesn’t really matter. The fact was that the server was completely wiped and that it took so long to get back up. The real question is not how it happened, but why it took so long to repair it.

Fortunately, I know the answer already. It was just a really bad set of circumstances. I had many backups in place, but a bunch of unfortunate things just all hit at once. And the server administration company that I had trusted completely wasted 6 days.

If Abby gets wiped out in the future, it should take 4-5 hours tops to get her back up and running. The procedure would be much simpler now that it’s running CPanel, basically involving only two steps:

1. Order an OS reload from ThePlanet for $20 (1-2 hours)
2. Restore the server with DiskSync (1-2 hours; all data has to be transferred over)

Effects of the Downtime

I’m sure a lot of readers are interested in what effect the downtime had for me and my two dozen sites.

Negative Results of the Crash

The worst part by far was all the stress caused, and all the hours it took me to restore everything.

After that, it was the loss of advertiser income that hurt. Since I sell most of my ad inventory privately, I actually saw no direct loss of income. However, since my sites were down for 10-12 days (the sites actually were back up in a week, but it took me a few days to fix them as not all the pages were working correctly due to the httpd.conf/CPanel issue), I of course contacted each and every one of my advertisers and extended their campaigns by 10-12 days.

That may not sound like a lot, but that’s a $2,000 loss right there just for PokerForums.org. I also had one or two advertisers cancel their banner ad subscription on my blog when they saw it down for a few days (I still extended their campaign of course). The downtime also meant that potential new advertisers would not have the opportunity to purchase ads.

I estimated in an earlier post that the downtime would probably cost me around $9,000. Now that I’m at the end stages of the disaster I can see that I overestimated the costs, but it’s still not pretty:

  • Estimated lost advertising revenue due to ‘free’ campaign extensions: $4,000
  • ThePlanet Hardware/Service Fees: $222
  • Estimated loss of potential new advertisers/campaigns: $1,000
  • Old Server Administration Cancellation: $250
  • New Server Administration Monthly Price (Based on days left in the month): $60

Total: $5,532
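As a quick check of the arithmetic, the figures in the list above can be added up in a few lines of Python. The split into estimated losses and actual fees paid is my own grouping for illustration; the dollar amounts are the ones listed.

```python
# Dollar figures from the list above; the two groupings are an assumption
# made for illustration, not a breakdown given in the post.
estimated_losses = {
    "Lost advertising revenue (free campaign extensions)": 4000,
    "Lost potential new advertisers/campaigns": 1000,
}
fees_paid = {
    "ThePlanet hardware/service fees": 222,
    "Old server administration cancellation": 250,
    "New server administration (rest of month)": 60,
}

estimated_total = sum(estimated_losses.values())
fees_total = sum(fees_paid.values())
grand_total = estimated_total + fees_total

print(f"Estimated losses: ${estimated_total:,}")  # Estimated losses: $5,000
print(f"Fees paid:        ${fees_total:,}")       # Fees paid:        $532
print(f"Total:            ${grand_total:,}")      # Total:            $5,532
```

Most of the total is estimated lost revenue; the fees actually paid come to a little over $500.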

And, of course, a negative impact of the downtime was a loss in traffic, although that wasn’t too bad as I’ll explain in the next section. However, it did hurt my RSS a bit (basically, just the growth), and it prevented potential new readers and subscribers. For instance, John Chow had written a post about meeting me and posted a picture of me and him and a link back to my site. Unfortunately, I lost out on all the traffic that would have sent me.

There are also SEO penalties for the downtime. For example, Google is still showing a description of PublisherSpot’s directory from when it was broken. This appears to be because it has a blank META description on the index page – none of my other sites do and they didn’t have this issue.

I also received lower metric scores such as Alexa due to the drop. However, these are all temporary things and are not a big deal.

Positive Results of the Crash

The main thing to realize when looking back at all this mess is that this was basically the worst-case scenario and I survived. The crash occurred the day before I left for vacation; I put my trust in a server administration company which basically did nothing for a week and gave me a false sense that they were actually fixing it; ThePlanet misled me and cost me money and time by telling me I could do a hard drive swap between machines; I got sick when I returned from the vacation; and once I restored the data from DiskSync, I learned that everything had to be hand-fixed and modified to make it compatible with CPanel because of my previously heavily modified httpd.conf file.

So, the best news is knowing that this was basically one of the worst-case scenarios and that the world didn’t end.

The effect of the downtime in terms of traffic also hasn’t been too severe, as shown in the graphs below:

[Traffic graphs for my three largest sites]

As you can see, my three largest sites basically returned to normal levels immediately, although PokerForums.org and Movie-Vault.com are still a bit lower than before, but that is to be expected for a little while.

One thing I’m very happy with, though, is my new Abby server. She is brand spanking new and is perfectly ‘clean’ with a fresh OS install and everything such as PHP and Perl up to the latest version. She also has WHM/CPanel, and from what I’ve used of it so far, I absolutely love it! I can’t believe how much simpler things are in CPanel than doing them over SSH or in Webmin.

I have also set up e-mails for all my sites with Abby and GMail so that I can send and receive them from GMail, which is a major improvement for me as well.

Lastly, Abby is an absolute beast, at least when compared to Trinity. She’s over twice as powerful, has 4x as much RAM, 1GB more bandwidth, and has up-to-date software. I’ve already noticed the significant improvements in speed when sending out mass e-mails lately. It used to take Trinity around 3 minutes to send out batches of 500 e-mails in vBulletin. Abby does it in around 10 seconds.
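To put those mass e-mail numbers in perspective, here is the rough arithmetic in Python. The timings are the approximate ones quoted above ("around 3 minutes" and "around 10 seconds"), so the result is only a ballpark figure.

```python
BATCH_SIZE = 500            # e-mails per vBulletin batch
TRINITY_SECONDS = 3 * 60    # "around 3 minutes" on the old server
ABBY_SECONDS = 10           # "around 10 seconds" on the new server

trinity_rate = BATCH_SIZE / TRINITY_SECONDS   # e-mails per second
abby_rate = BATCH_SIZE / ABBY_SECONDS
speedup = TRINITY_SECONDS / ABBY_SECONDS

print(f"Trinity: {trinity_rate:.2f} e-mails/s")  # Trinity: 2.78 e-mails/s
print(f"Abby:    {abby_rate:.2f} e-mails/s")     # Abby:    50.00 e-mails/s
print(f"Speedup: {speedup:.0f}x")                # Speedup: 18x
```

That works out to roughly an 18x improvement for this one task, well beyond the "twice as powerful" hardware figure, which suggests the up-to-date software plays a part too.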

Here is my old set-up, from before the crash:

[Image: old server set-up]

And now my new set-up:

[Image: new server set-up]

So that’s the story of my big crash. If you don’t have multiple backup measures in place, I strongly urge you to take the time to implement a plan. I faced the worst-case scenario and survived only because I had my own plan in place – don’t take the chance and think a data-wipe can’t happen to you, as it most certainly can.

Posted: April 29th, 2008 under My Websites  

31 Responses to “My Server Crash: What the Hell Happened? Part 2 of 2”

  1. Clog Money says:

    Well I cannot say I am surprised that you fired the administration company. I myself would have done the same and also try and recover some loss of earnings from them as well.

    I wonder do these servers have remote management cards. All the servers we have are HP’s which come with iLo cards which basically allow complete control over the server remotely. E.g turning off and on, console access from boot, meaning you could do any os re-install yourself remotely.

  2. Jason says:

    Glad to see you back mate.

    I had similar levels of downtime – not fun at all.

  3. Mike Huang says:

    Because of these problems, I’m going to start backing up the whole HD rather than just the WP database 🙂

    -Mike

  4. Wade says:

    Outside of the expenses of fixing and changing over your server, did you estimate how much money you lost that the sites bring in? This should be a warning to everyone. BACKUP OFTEN AND EARLY! If it isn’t backed up, you would be seeing the original wordpress theme right now. lol

    Shudogg Dot Com – Make Money Online Blogging

    • Tyler Cruz says:

      Yes, I mentioned this in there:

      – Estimated lost advertising revenue due to ‘free’ campaign extensions: $4,000
      – Estimated loss of potential new advertisers/campaigns: $1,000

  5. You should have your web sites replicated on servers at home. If anything happens to the outside servers, you’ll have your home servers to do a quick restore to the outside ones, or even temporarily run the sites from home even though it will be much slower connection.

    • Exactly!

      Even I have a backup server and backup hard drives for my little operation.

      But you have inspired me to use IDrive.com and backup all my data there. 12GB of free hard drive space is nice.

      • If you’re running your server on Windows, DriveImageXML is a great way to make an image of the Windows drive. If you have a server crash, you can restore the image in about 20 minutes. Reinstalling Windows manually takes 2-3 hours otherwise.

  6. Jon says:

    Since you only had to pay little over $500 in fees, I wonder why you were asking for donations?

  7. Andreas says:

    Just curious: Which Mailserver are you using now or is CPanel using? Postfix?

  8. Tyler,

    Why didn’t you replace that second server as well? It’s got no RAID-1 on it and a celeron with 1gb RAM is not a great config to be running your mysql — you did say that you use that server for your mysql databases right?

    Regards,
    Richard.

    • shafi says:

      “Trinity was the name of the server that ran Apache and Bertha the name of the one which runs MySQL.”

      It’s kind of amazing that you are running MySQL on a slow machine. I suggest you do the same as with “Abby”, or you can do daily backups of .sql files. I guess your other machine does that for you, right?

      you used raid 1?

      I had 16.0ghz machine with over 400k rows mysql struggles.

    • Tyler Cruz says:

      Yeah, I am considering upgrading Bertha now that I did Trinity.

      How much of a performance improvement would I notice in MySQL executions if I upgraded a server with Bertha’s specs to something along the lines of Abby?

      • shafi says:

        A lot, since you are running Linux and MySQL.

        blazing fast..

        Maybe you should go for a 64-bit Linux OS. The more CPU, the more RAM (2-4GB).

  9. Lyndon says:

    Thanks for the great posts Tyler. Good to see the entrepreneur is still thriving.

    A great learning experience for everyone, most of all you.

    How about a detailed post/review of DiskSync as it was a butt saver???

  10. Tom says:

    you lost like 5000 in a week, and you just make like 1200 a month… you overvalue some things here..

  11. DD says:

    Just wondering why you solicited donations when your server was down?

    It seemed borderline ridiculous to me – this blog is basically an exercise in telling the world how much money you make, yet you still feel the need to ask for $ when your site goes down? Doesn’t make sense to me.

    Can you imagine Google or Amazon doing something similar? You are all Internet businesses .. that made you look so bad IMO

    Didn’t you make $100k last year?

    • john says:

      the only thing he cares about is his money, hell he probably dresses up like a bum 5 days out of the week and sits on the street begging for money too.

    • Tyler Cruz says:

      That file was actually directed to some of my smaller sites which I still maintain even though they don’t bring much money in.

      I made the temporary index.html file while in my hotel room, as a temporary measure until the sites were restored. As the default page, all my sites were shown that file, including this blog.

      I also mentioned in that temporary page that there was no need to feel obligated to donate anything as I make good money from my websites. I’ve had people want to donate before, and so gave them the option of doing so to help.

      I do agree that it probably didn’t look the most professional on PokerForums.org and my blog, but it shouldn’t be anything to get worked up about…

      I appreciate the comparison to Google and Amazon though 🙂

  12. Wade says:

    Tyler, post more often, I need more places to post comments to keep top commentator. Looks like it reset on me! No wonder why I had no hits 😛

    Shudogg Dot Com – Make Money Online Blogging

  13. Zach says:

    talk about having a server crash at the worst possible time…

  14. Nishu says:

    just imagine this happening in life of a person totally dependent on E-Marketing for money ..

  15. Glad to see your site up sorry to hear about Trinity

  16. Ben says:

    Sorry to hear about the crash. You mention that: “I can now send e-mails from all of my sites from GMail”. I have the same problem you described in terms of multiple emails from multiple domains forwarding to one gmail account. When I go to reply to them from Gmail it can be confusing to the person on the other end. How did you resolve that problem? Thanks!
