Content as Code: using git for user content

August1914

We'll illustrate three concepts in this case study: the two-stack CMS pattern, how to think of content as a build material, and how to get your nodes into git as the system of record for content.

Content management systems, including Drupal, present unique challenges when trying to safely deliver code changes from development and content changes from marketing to a production system. Metadata and user content are stored together in a relational database: menu router tables, taxonomies and content structures are examples of metadata that can be defined in code and written to the database during the build process. We can run automated tests against bare-bones systems, but we need tests that run against a system populated with production content before we have any real confidence that a build is release-ready.

The development team is usually not in control of content, but their ability to safely deliver useful feature updates depends on it. They need to take ownership of content or face problems like publishing test content live, or inadvertently unpublishing important material, or breaking relationships between content, or breaking dependencies between content and code, when both are changing independently. The choice of how to integrate content in the delivery process will have big implications for both usability and quality.

Instead of making content updates directly to production, a two-stack CMS design can be used: http://martinfowler.com/articles/two-stack-cms. Content editors are given a non-public, production-like site instance for content updates, and the dev team effectively takes ownership of content at the point where it is exported from the staging site, after it has cleared editorial quality checks. Content editors still get the benefits of self-service, while the dev team gets all the control it needs in order to avoid being blindsided by unanticipated side effects of code/content interactions. Product owners will also see releases that are reliably consistent with what they approved in a pre-release showcase. Exported content in JSON format can be committed to git, making git the system of record for content.
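As a rough illustration of the "export, then commit" step, here is a minimal Python sketch that splits an exported JSON array into one file per node so that each node gets its own history in git. It is not the node_export format itself: the assumption that the export is a JSON array of objects, the `uuid` field name, and the `write_nodes` helper are all hypothetical and chosen only for illustration.

```python
import json
from pathlib import Path


def write_nodes(export_json: str, content_dir: Path) -> list[Path]:
    """Split an exported JSON array of nodes into one file per node.

    Assumption: the export is a JSON array of objects, each carrying a
    stable "uuid" field. Both are hypothetical, not the node_export format.
    """
    content_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for node in json.loads(export_json):
        # One file per node, named by its (assumed) stable identifier.
        path = content_dir / f"{node['uuid']}.json"
        # Sorted keys and fixed indentation keep the output byte-stable,
        # so a git diff shows only real content changes.
        text = json.dumps(node, sort_keys=True, indent=2, ensure_ascii=False)
        path.write_text(text + "\n", encoding="utf-8")
        written.append(path)
    return sorted(written)
```

Once the files are written, an ordinary `git add` and `git commit` on the content directory records the export, and `git log -p -- <file>` shows the change history of a single node.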

Git gives us all of the advantages of being able to commit, version, diff, grep, branch, and push to a remote. We can run forensics on individual nodes to show detailed change history with certainty. Custom code will be required to manage dependencies, but we were going to have to deal with this one way or another anyway. Moving the details of code/content dependencies out into the open makes them explicit and thus manageable. More importantly, a git repo storing our content becomes the source for rebuilding the system, replacing the low-fidelity database dump: drawing on the content repository, we can fully reconstruct the site from scratch without recourse to a database dump as a build material.
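To give a flavor of the custom dependency code mentioned above, here is a minimal Python sketch that works out a safe import order for a rebuild, so that a node is only imported after everything it refers to. It assumes each exported node lists the identifiers it depends on under a `references` key; that key and the `import_order` helper are hypothetical and not part of node_export. The ordering uses `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter


def import_order(nodes: dict[str, dict]) -> list[str]:
    """Return node ids ordered so that referenced content comes first.

    Assumption: each node lists the ids it depends on under a
    hypothetical "references" key.
    """
    sorter = TopologicalSorter()
    for uuid in sorted(nodes):
        refs = nodes[uuid].get("references", [])
        # Fail early if content points at something not in the repository.
        missing = [r for r in refs if r not in nodes]
        if missing:
            raise KeyError(f"{uuid} references missing content: {missing}")
        # Everything in refs must be imported before this node.
        sorter.add(uuid, *sorted(refs))
    # static_order() raises graphlib.CycleError on circular references.
    return list(sorter.static_order())
```

Failing loudly on a missing or circular reference is a deliberate choice in this sketch: it surfaces a broken content relationship at build time rather than on the live site.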

Technically, what we're talking about is not that difficult. We are relying on the node_export contrib module, so this session will not be Drupal 8 ready, although the concepts will all apply once those export features are available.

Session Track

Coding and Development

Experience Level

Intermediate

Drupal Version