Understanding your platform: “profiling” for site reliability

rbayliss

The day before Black Friday, a large online retailer discovered that its product prices, which needed to change several times over the holiday weekend to support revenue goals, were not updating on any of its five Drupal 8 sites without manual intervention. We had a permanent solution at hand but chose not to deploy it; instead, we pushed through the weekend with more targeted manual cache clearing. Although this was time-consuming, the sites survived the holiday and the retailer exceeded its revenue goals, which many others did not. In short, we weighed the risk of each option and chose the one that fit the situation.

In the course of making decisions that will affect platform reliability, you may have to make choices that are not straightforward or “by the book.” To make these decisions effectively, you need to look at all of the “profiles” that matter:

  • Traffic profile - When are visitors on the site, and how many at once?

  • Visitor profile - Who are your visitors? What kind of browser/device are they using, and what are they trying to do?

  • Editor profile - How many editors, what time of day, editing what?

  • Content profile - What content is time-sensitive, and what is the impact of it not showing up immediately?

  • Development profile - How often are code-level changes happening and what impact do we expect?

In this session, we’ll present two enterprise “platform support” projects: one in government (Mass.gov) and one in ecommerce. We’ll walk through the different profiles for each platform. Then we’ll talk about the optimizations and changes that were made based on those profiles, with actual charts and graphs to show the impact. Finally, we’ll give some guidance on how you can use the “profile” idea to make better decisions about your own site’s performance and reliability.

Learning Objectives

At the end of this session, attendees will be able to:

  • Build a profile that models the risks their site faces.

  • Make reliability and performance choices based on data.

  • Survive a Black Friday-style traffic event without losing their minds (hopefully!)

Target Audience

This session will be useful for anyone who makes decisions about performance and reliability as a stakeholder, architect, or implementor. Parts of the presentation will be technical, but the decision-making strategies presented should be useful to most.

Prerequisites

A basic understanding of the factors that impact site reliability (e.g., traffic volume, response time).

Track

DevOps & Infrastructure

Tags

devops
performance
scaling

Experience Level

Intermediate
