Drupal Content Syncing to Ease Code Reviews and Deployments

Dan Murphy

Introduction

One of our core values here at Savas Labs is to excel:

"We’re ambitious optimists who take pride in our work. We endeavor to excel while embracing the need to constantly learn and improve."

As such, peer-reviewing code via a pull request process is an important part of our workflow for maintaining quality, providing feedback, and collaboratively learning and improving.

However, performing constructive code reviews takes time and effort. Depending on the complexity of the change and the reviewer's other demands, this can place a significant burden on reviewers.

Further, in addition to a code review, our pull request process typically includes an initial manual quality assurance (QA) check. This helps ensure that not only is the business logic sound but that features also function as expected and meet the overall requirements. Given that many sites we work on are content management systems, this QA check often requires the creation of some test content.

During a recent internal retrospective to discuss improving our processes, our team observed the following pain points while performing code reviews.

The cost of context switching: Code reviews are most often performed by team members working on the same project, and they usually require the reviewer to temporarily pause what they're working on to perform the review. However, the QA portion of the review often requires a fresh copy of the environment (including database), forcing the reviewer to abandon any of the test content or the (yet to be exported) configuration they have locally. This makes it harder for the reviewer to return to their original task, slowing them down as they try to return their environment to its pre-review state.

The burden of content creation: When creating a pull request, it is often time-consuming for the author to describe in detail all of the content necessary to adequately QA the feature. Further, it is time-consuming for the reviewer to recreate this content locally to a sufficient degree in order to QA. This creates a lot of inefficiency in the review process. The burden of content creation also goes beyond just the code review process. It is also a burden when staging content for the client's review or when deploying content.

Content vs. Configuration

If you're familiar with the differences between content and configuration in Drupal, feel free to skip to the next section. If not, let me provide some background to help you better understand the problem developers face when performing QA that depends on content.

Per Drupal's documentation, content is the "information meant to be displayed on your site, and edited by users". Examples include articles, basic pages, media, taxonomy terms, menu items, and blocks. Configuration, on the other hand, is "information about your site that is not content and changes infrequently". Examples include content types, views, media types, taxonomy vocabularies, menus, block types, and block placement.

One of the great features of Drupal 8 is its robust configuration management system, which makes it easy to export configuration stored in the database to YAML files that can be version controlled. This YAML-stored configuration can then be imported into any environment.

However, content lives exclusively in the database. The most common way to sync content between environments is to export the database from one environment and import it into the other. While some options exist to get around this, most require significant effort to implement. (Spoiler alert: in the next section we'll explain our workflow to greatly simplify this!)

Tools and workflows to reduce friction

The cost of context switching

To help reduce the cost of context switching, we decided to update our local development environment to include helper commands that allow code reviewers to quickly and easily "stash" a copy of their database prior to review, and then import the "stashed" copy after review.

While some of our Drupal veterans were already doing this locally with drush in their own workflows, we wanted to standardize the process to make it easier and more obvious for all developers (including junior and front-end developers) to use.

To implement this, we added the following commands to our Docker-based local stack template that we use for all projects. All of the popular Drupal local development stacks (Docksal, Lando, DDEV, etc.) provide a way to add your own custom commands.

  • stash-db - exports a compressed copy of the database into a local, git-ignored db/stash directory with a timestamp appended to the filename.
  • stash-db-import - imports a copy of the stashed DB, defaulting to the most recent copy in the db/stash directory unless a specific filename is passed as an argument.
  • stash-db-list - lists the contents of the db/stash directory.
  • stash-db-clean - removes all stashed databases from db/stash.
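
As a rough illustration of how helpers like these could be written, here is a minimal shell sketch. It is not our actual implementation: the function names, the STASH_DIR variable, and the stash-<timestamp>.sql.gz naming are assumptions made for the example, and it assumes drush is available on the path inside your stack.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the stash helpers described above.
# Function names, STASH_DIR, and the file naming scheme are assumptions.

STASH_DIR="${STASH_DIR:-db/stash}"

# Build a timestamped dump path, e.g. db/stash/stash-20240101-120000.sql.gz.
stash_filename() {
  printf '%s/stash-%s.sql.gz\n' "$STASH_DIR" "$(date +%Y%m%d-%H%M%S)"
}

# Print the most recent stash. The timestamp format sorts chronologically,
# so a plain lexical sort is enough. Prints nothing if no stash exists.
latest_stash() {
  ls -1 "$STASH_DIR"/stash-*.sql.gz 2>/dev/null | sort | tail -n 1
}

# stash-db: export a compressed copy of the current database.
stash_db() {
  mkdir -p "$STASH_DIR"
  local target
  target="$(stash_filename)"
  drush sql-dump | gzip > "$target"
  echo "Stashed database to $target"
}

# stash-db-import: import the stash at the given path, or the most recent one.
stash_db_import() {
  local file="${1:-$(latest_stash)}"
  if [ -z "$file" ] || [ ! -f "$file" ]; then
    echo "No stashed database found in $STASH_DIR" >&2
    return 1
  fi
  # Drop the existing tables first so nothing from the review lingers.
  drush sql-drop -y && gunzip -c "$file" | drush sql-cli
}

# stash-db-list: list the stashed databases.
stash_db_list() {
  ls -1 "$STASH_DIR" 2>/dev/null
}

# stash-db-clean: remove all stashed databases.
stash_db_clean() {
  rm -f "$STASH_DIR"/stash-*.sql.gz
}
```

A reviewer would run stash_db before checking out the branch under review and stash_db_import after returning to their own branch. How these functions get exposed as commands depends on the stack (Docksal, Lando, and DDEV each have their own mechanism for custom commands).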

Our developers have really appreciated the ease of use of these standardized commands and the benefits they provide when switching between contexts.

The burden of content creation

To help reduce the burden of content creation, we first identified our need and then investigated the existing tools that were out there.

Our need

We determined that we needed a way to easily sync content between environments. We also needed a way to designate content being synced as either (1) test content intended for QA and review, or (2) content intended for eventual deployment to production. An example of test content might be an article featuring a new dynamic paragraph type being added to the site. An example of content intended for production deployment might be a footer block instance, whose placement is exported to configuration.

Investigating solutions

Our search revealed that syncing content between environments for review and/or deployment was a common desire with many different approaches for dealing with it, some of which we were already familiar with.

Creating branch-specific staging environments specifically for review is a popular approach to expedite QA. There are a few hosting environments (like Pantheon Multidev, Platform.sh, and Tugboat.qa) that provide this feature. However, this approach doesn't reduce the burden of content creation, since the branch-specific content still has to be created manually in each of these environments.

Writing update hooks that create content is an approach our team is very familiar with. However, it can be difficult and time-consuming to write update hooks, especially when accounting for all entity references. Additionally, this approach is typically intended for content that should be deployed to production and doesn't fit well with providing code review specific content.

Sharing an exported database allows developers to sync environments, but it doesn't help with code review as the database dump could include configuration that hasn't been exported. The QA portion of our code review is intended to ensure a feature will work on a fresh installation that matches production, not on a developer's specific instance.

The Deploy module is the most robust contributed module that attempts to address this need. However, it requires some pretty significant changes to the underlying way in which Drupal stores data (due to its Multiversion requirement). We were apprehensive about introducing these major changes to a project for a non-client need.

While the Content Synchronization contributed module attempts to solve our problem, we encountered issues with the version we tested. We found that it didn't seem to work with Paragraphs (which we use on most projects) and lacked good tools for export and import.

Recreate Block Content, Fixed Block Content, and Simple Block all try to address this issue specifically for blocks, as broken/missing block instance errors are quite common. However, our need extends beyond blocks.

Default Content is another robust contributed module that allows other modules and install profiles a way to ship with default content. Unfortunately, it lacks the tooling to support our needs.

Default Content Deploy, on the other hand, expands upon the Default Content module by simplifying content export and import and providing useful drush commands for the job. The module exports content including referenced entities, and even supports syncing files when the Better Normalizers module is installed. After testing this module out, we chose it as the backbone for our content syncing needs.

Our solution and workflow

Leveraging the Default Content Deploy module, for any given Drupal project we implement the following to achieve our content syncing goals.

We install the following contributed modules:

  • Default Content Deploy - this module is actively being maintained, and we're currently using the dev release (commit 3adee3f).
  • Better Normalizers - this allows us to export and import files, which is helpful when the content includes media or file fields.

We create version-controlled /content/deploy and /content/review directories in the root of the project repo. The former is for exported content intended for deployment to staging and production environments. The latter includes subdirectories named after specific branches. These /content/review/{branch} directories are for exported content intended specifically for review of that particular branch, and should not be deployed to staging or production.

In practice, a developer working on a feature branch may choose to export content via one of the following Default Content Deploy drush commands and then commit that exported content on their branch:

  • drush dcder block_content --entity_id=1 --folder='../content/deploy' - to export a single block for production deployment.
  • drush dcder taxonomy_term --bundle=test_vocab_import --folder='../content/deploy' - to export all the taxonomy terms in a vocabulary for production deployment.
  • drush dcder menu_link_content --bundle=main --folder='../content/deploy' - to export all the menu items in the main menu for production deployment.
  • drush dcder node --entity_id=81 --folder='../content/review/feature/1234-default_content_deploy' - to export a single node with its referenced entities for review of the feature/1234-default_content_deploy branch.

We also add the following custom commands to our stack to simplify the import process both on staging/production and locally:

  • import-content-deploy - a target for the drush command drush dcdi --folder='../content/deploy' -y. This command imports content intended for deployment to staging and production environments and should be run in those environments during deployments.
  • import-content-review - a target for the drush command drush dcdi --folder='../content/review/{current-branch}' --force-override -y. This command imports content specifically for review of the current branch. It imports content from a directory named after the branch you're currently on and should be run during code review to QA the branch.
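
To illustrate how the {current-branch} placeholder might be resolved, here is a minimal shell sketch of the two import commands. The function names and the CONTENT_ROOT variable are assumptions made for this example rather than our exact implementation, and it assumes the commands run from the Drupal web root, one level below the project root, so that ../content points at the directories described above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the import commands described above.
# Function names and CONTENT_ROOT are assumptions; run from the Drupal web root.

CONTENT_ROOT="${CONTENT_ROOT:-../content}"

# Resolve the checked-out branch, e.g. feature/1234-default_content_deploy.
current_branch() {
  git rev-parse --abbrev-ref HEAD
}

# Map a branch name to its review content folder.
review_folder() {
  printf '%s/review/%s\n' "$CONTENT_ROOT" "$1"
}

# import-content-deploy: import content intended for staging and production.
import_content_deploy() {
  drush dcdi --folder="$CONTENT_ROOT/deploy" -y
}

# import-content-review: import test content exported for the current branch.
import_content_review() {
  local branch folder
  branch="$(current_branch)" || return 1
  # git prints "HEAD" when no branch is checked out (detached HEAD).
  if [ "$branch" = "HEAD" ]; then
    echo "Not on a branch; cannot locate review content" >&2
    return 1
  fi
  folder="$(review_folder "$branch")"
  if [ ! -d "$folder" ]; then
    echo "No review content exported for branch $branch" >&2
    return 0
  fi
  drush dcdi --folder="$folder" --force-override -y
}
```

Because branch names such as feature/1234-default_content_deploy contain a slash, the review folder ends up nested (review/feature/1234-default_content_deploy), which matches the export path used in the drush dcder example above.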

These tools and workflows allow us to sync content between environments while designating whether the content is either (1) test content intended for QA and review, or (2) intended for eventual deployment to production.

Considerations

While we feel that this workflow brings many benefits to our team, it's worth acknowledging a few of the trade-offs:

  • Extra modules - this setup requires installing a few additional modules, making your codebase less lean.
  • Extra files in version control and potentially large objects in git history - obsolete files can be removed by occasionally deleting the contents of the /content/review/ directory. Removing large objects from git history takes a bit more effort.
  • Discourages "outside the box" QA - by providing test content, a reviewer may be less likely to test the edges by creating content in a way the developer didn't anticipate.
  • Setup - this workflow requires a bit of setup time installing modules and adding some custom commands to your stack.

Conclusion

In summary, we've found that a combination of contributed modules, custom commands added to our local stack, and a few standardized workflows has lowered the cost of context switching and the burden of content creation during code review. It has also simplified deploying content to staging and production environments.

This has helped make us more efficient while maintaining quality and overall has made the team happier. We recommend you try this out for yourself if you collaborate closely with team members and are interested in achieving the same results!