Where's the fire? AKA: My site is down... now what?
Anyone who's ever supported a website dreads the call or text (or pager alert... wow, that's old school) saying the site is down. Your heart starts pounding. You worry that it's some code you wrote or some config you changed. You wonder if you will figure it out fast. You wonder if you will get to sleep tonight.
I am not a devops person. In general, I try to avoid system administration (though I've done plenty of it over the years). I like hosting services like Pantheon and Acquia since they tend to take care of all that stuff for me. But I still have to know what to do when the site goes down or is slow as molasses (which includes crossing my fingers that it's the hosting company and not my fault!).
This session will cover a process for seeing what might be going wrong and how to recover. We'll go over:
- Emergency planning
- Monitoring
- Traceroutes
- Status pages for hosting, CDNs, and 3rd party services
- What logs to look at
- Analyzing New Relic data
- Support tickets
- Prevention when possible
The goal is to follow a documented process that will get you back online faster.
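As a rough illustration of the monitoring step, here is a minimal Python sketch that fetches a URL and sorts the result into "down", "slow", or "ok". The names `CheckResult`, `check`, and `classify`, and the 3-second threshold, are assumptions made up for this example; they are not from the session, and a real setup would more likely rely on a hosted monitoring service.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional

# Assumed threshold for "slow as molasses"; tune it for your own site.
SLOW_THRESHOLD_SECONDS = 3.0


@dataclass
class CheckResult:
    status_code: Optional[int]  # None means no HTTP response at all
    elapsed_seconds: float


def check(url: str, timeout: float = 10.0) -> CheckResult:
    """Fetch the URL once and record the status code and elapsed time."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status: Optional[int] = response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with a 4xx/5xx status.
        status = err.code
    except OSError:
        # DNS failure, refused connection, timeout, and similar.
        status = None
    return CheckResult(status, time.monotonic() - start)


def classify(result: CheckResult) -> str:
    """Label a check as 'down', 'slow', or 'ok'.

    In this sketch only a missing response or a 5xx status counts as down;
    4xx responses are treated as the site being up.
    """
    if result.status_code is None or result.status_code >= 500:
        return "down"
    if result.elapsed_seconds > SLOW_THRESHOLD_SECONDS:
        return "slow"
    return "ok"
```

Calling `classify(check("https://example.com/"))` from a scheduled job would give a crude first signal; the URL is a placeholder, and the other steps in the list (status pages, logs, New Relic) are still needed to find the actual cause.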
Intended Audience
Anyone building or managing sites, especially those on the hook when things go south. You will walk away with strategies for dealing with site outages and slowdowns more efficiently, plus tips for preventing some problems ahead of time.
Skill Levels
This session is suitable for beginners and intermediates. If you are an expert, feel free to send your favorite disaster recovery tools, tips, or stories to @kristen_pol on Twitter.
About the Speaker
Kristen has been working with Drupal since 2004 as a developer and architect, specializing in multilingual, migrations, and SEO. She has presented at many DrupalCons, BADCamps, Stanford camps, and other Drupal camps and user group meetings. Check out her drupal.org page for a partial list of talks, and find more info on Kristen's Hook 42 page.