When Drupal Database Sanitization Isn't Enough

Proceed with caution

After enough whining, my patch (discussed later in the post) did make it into the user_revision module. Thanks, Peter.

The general problem

One of the most embarrassing and potentially costly things we can do as developers is to unintentionally send emails to real people from a development environment. It happens, and oftentimes we aren’t even aware of it until the damage is done and a background process has sent out, say, 11,000 automated emails to existing customers (this actually happened to a former employer recently). In the Drupal world, there are myriad ways to attempt to address this problem.

General solutions to the general problem

maillog - A Drupal module that logs mails to the database and optionally allows you to “not send” them
reroute email - A Drupal module that intercepts email and routes it to a configurable address
devel mail - An option of the beloved devel module which writes emails to local files instead of sending
mailcatcher (not Drupal-specific) - Configure your local mail server to not send mail through PHP

The ultimate solution to the problem?

Never store real email addresses in your development environment. In the Drupal world, we do that by using the drush sql-sanitize command. With no arguments, which is how I typically execute it, the command sets every user’s email address to a phony address that looks like this: user+1@localhost.localdomain. This is a good thing. Then, even if you do accidentally send out emails in an automated way, often from cron, sending to phony addresses is a livable mistake; no end user receives an email that confuses her or makes her lose confidence in your organization.
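To make the effect concrete, here is a minimal Python sketch that models the sanitize step against an in-memory SQLite table. It is not drush’s own code: the two-column users table, the sanitize_users helper, and the choice to skip uid 0 (Drupal’s anonymous user) are assumptions made for illustration. Only the placeholder format comes from the description above.

```python
import sqlite3


def sanitize_users(conn):
    """Replace every stored address with a uid-based placeholder."""
    # SQLite concatenates with ||. Skipping uid 0 (the anonymous user)
    # is an assumption of this sketch.
    conn.execute(
        "UPDATE users "
        "SET mail = 'user+' || uid || '@localhost.localdomain' "
        "WHERE uid > 0"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (uid INTEGER PRIMARY KEY, mail TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(0, ""), (1, "alice@example.com"), (2, "bob@example.com")],
)

sanitize_users(conn)
mails = [row[0] for row in conn.execute("SELECT mail FROM users ORDER BY uid")]
print(mails)
```

After the update, no row in this model holds a deliverable address, which is the property that makes an accidental cron mailing harmless.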

So, in most cases, drush sqlsan (alias) is enough, and the mail redirection options listed above are additional safety measures. Of course, I’m not writing about most scenarios now, am I? Sadly, I’m not yet aware of a comprehensive solution that ensures no email will be sent from a development environment. Please reach out if you know of one!

The specific problem with user_revision module

One pernicious case in which drush sqlsan is insufficient to sanitize your database is when the user_revision module is enabled on a Drupal 7 site, at least without my patch applied. The user_revision module extends the UserController class, and because of the way entity_load() works, values from the “revision” table (user_revision) overwrite those from the “base” table (users, in this case). Therefore, when a user entity is loaded, it receives the mail field from the user_revision table. Without the patch applied, this table is not affected by drush sqlsan.
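The leak can be reproduced in a small model. The Python sketch below is an illustration, not Drupal’s entity loading code: the simplified schema (a vid column linking users to user_revision) and the load_user_mail helper are assumptions that stand in for a revision-aware controller where revision values win.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (uid INTEGER PRIMARY KEY, vid INTEGER, mail TEXT);
    CREATE TABLE user_revision (vid INTEGER PRIMARY KEY, uid INTEGER, mail TEXT);
    INSERT INTO users VALUES (1, 10, 'alice@example.com');
    INSERT INTO user_revision VALUES (10, 1, 'alice@example.com');
    """
)

# Sanitize only the base table, which is all the unpatched run touches.
conn.execute("UPDATE users SET mail = 'user+' || uid || '@localhost.localdomain'")


def load_user_mail(conn, uid):
    """Hypothetical loader: the current revision's mail overrides the base row."""
    row = conn.execute(
        "SELECT COALESCE(r.mail, u.mail) "
        "FROM users u LEFT JOIN user_revision r ON r.vid = u.vid "
        "WHERE u.uid = ?",
        (uid,),
    ).fetchone()
    return row[0]


base_mail = conn.execute("SELECT mail FROM users WHERE uid = 1").fetchone()[0]
loaded_mail = load_user_mail(conn, 1)
print(base_mail)    # the sanitized placeholder
print(loaded_mail)  # still the real address, taken from user_revision
```

The base row looks clean if you inspect the users table directly, which is why the problem is easy to miss: only the loaded entity carries the real address.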

How did I discover this?

I discovered this when adding new cron-driven notification functionality to the Tilthy Rich Compost website, which we maintain. We began using the user_revision module in 2013 because we were losing information we still needed when users canceled. After sending emails to subscribers from my development environment for the 10th time in 2 years, even after sanitizing, I was determined to figure out once and for all what was going on. So, as with any deep dive, I fired up the trusty ol’ debugger and discovered the aforementioned culprit.

The solution to the specific problem

After consulting the team, we agreed that the solution would be to write a drush hook for the user_revision module. This code would need to sanitize the mail field in the user_revision table and would be invoked when the drush sqlsan command is executed in the presence of the user_revision module. However, to write this code efficiently and effectively, I would need to debug drush commands during execution, which I had never done.
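The shape of that fix can be sketched in Python as a registry of sanitize operations that each module contributes to. This models the idea only; it is not the drush hook API or the actual patch. SANITIZE_OPS, register_sanitize_op, and run_sanitize are hypothetical names, and the schema is the same simplified one assumed for illustration.

```python
import sqlite3

# Hypothetical registry: each enabled "module" adds SQL to the sanitize run.
SANITIZE_OPS = {
    "user": (
        "UPDATE users "
        "SET mail = 'user+' || uid || '@localhost.localdomain' WHERE uid > 0"
    ),
}


def register_sanitize_op(name, sql):
    """Stand-in for a module hooking into the sanitize command."""
    SANITIZE_OPS[name] = sql


def run_sanitize(conn):
    """Run every registered operation, as the sanitize command would."""
    for sql in SANITIZE_OPS.values():
        conn.execute(sql)


# What the user_revision hook would contribute: the same rewrite,
# applied to every row of the revision table.
register_sanitize_op(
    "user_revision",
    "UPDATE user_revision "
    "SET mail = 'user+' || uid || '@localhost.localdomain' WHERE uid > 0",
)

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (uid INTEGER PRIMARY KEY, vid INTEGER, mail TEXT);
    CREATE TABLE user_revision (vid INTEGER PRIMARY KEY, uid INTEGER, mail TEXT);
    INSERT INTO users VALUES (1, 11, 'alice@example.com');
    INSERT INTO user_revision VALUES (10, 1, 'alice.old@example.com');
    INSERT INTO user_revision VALUES (11, 1, 'alice@example.com');
    """
)

run_sanitize(conn)
leftover = conn.execute(
    "SELECT (SELECT COUNT(*) FROM users WHERE mail LIKE '%@example.com') "
    "+ (SELECT COUNT(*) FROM user_revision WHERE mail LIKE '%@example.com')"
).fetchone()[0]
print(leftover)
```

Registering the extra operation from the module itself keeps the core sanitize step unaware of the revision table, and older revisions are rewritten along with the current one.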

How to debug drush (or other CLI scripts) with PHPStorm

Set up xdebug (Mac only)

I first installed xdebug with homebrew via this method. NB: Changing the port to 10000 was necessary for me.

Upgrade to latest drush

I made sure I was using the most recent version of drush so that the code I wrote would apply to current drush development.

Getting breakpoints in PHPStorm to listen to drush

Several people have blogged about this before, so I’ll just point to their posts. Generally, I followed these instructions, but I trust that my mentor and friend Randy Fay’s article is excellent as well. They all seemed to use xdebug and PHPStorm, and though I use PHPStorm, I had been using ZendDebugger for years, with reasonable success. But I had been dissatisfied of late, and the rest of the team uses xdebug anyway, so I figured it would be a safe switch, which proved true. Once xdebug is properly installed, you can add a line to your .bashrc file to always make PHPStorm ready to listen for drush commands.

The solution in action

So now, when running drush sqlsan, we can truly feel safe that we won’t send emails to anyone we didn’t mean to. You’re welcome, community.

Will user_revision exist in D8?

It’s not clear, though some think so. Perhaps mature D8 entities and revisioning on all entities will render a contrib module unnecessary. Time will tell.