Testing Changes
As you scale your website and its popularity grows, it becomes increasingly important to properly test all changes prior to updating your production website. At minimum, you should have a separate testing server which duplicates your production environment. Your development environment should use exactly the same operating system as your production servers, with the same extensions installed and updates applied. If you instead use, for example, CentOS on your production servers and Fedora on your development servers, you may find that code which works perfectly on your development server fails in production due to issues such as failed dependencies.
The more similar you make your development environment to your production environment, the more valid your testing will be. That said, very often while your production environment may be comprised of numerous servers, your development environment may be limited to a single server. In this case, you should do what you can to simulate your production infrastructure.
In this final section of chapter one, I offer best practices for testing changes and pushing these changes to your production servers.
Revision Control
It's often tempting to maintain your website one file at a time, manually copying individual files into place. Usually this involves first making a backup copy of the file you wish to change, over time resulting in dozens of old backups cluttering the directories of your production servers. Often this can also involve editing files directly on a production server and hoping for the best. Unfortunately, even the most trivial-seeming change can have unforeseen consequences. It can also quickly become unclear which files have been updated and which are still an older version. This can result in bug fixes never actually making it into production, or new and bigger bugs being created while trying to fix old bugs.
A simple yet extremely effective solution to this problem is to utilize a revision control system. Revision control is one of many phrases used to describe the management of multiple versions of the same information. Other popular phrases often used to describe this functionality include version control, source control, and source code management. There are a great many open source and proprietary revision control tools available to you. Popular examples include CVS, Subversion (SVN), Perforce, and Git.
For the example contained in this book, we have chosen to use Git, a fast and flexible distributed source control system originally designed by Linus Torvalds for managing the Linux kernel. Git was selected because of its distributed design, its growing popularity, its flexibility, its applicability to what we are trying to solve, and its free availability. However, this does not mean that you also need to use Git to manage your website. It is possible to apply the tips and best practices we explain here to your favorite source control system.
Tracking File Changes
The basic steps required for managing files with Git were briefly discussed in the previous section on backups. In this section, we build upon our previous examples, showing you how Git can offer you much more than a versioned backup of your website.
Managing Drupal Core With Git
In this first example you will learn how you can manage a website built from Drupal's core files. Start with an older version of Drupal, which you will manually patch. You will then use Git to easily upgrade your website to a newer version of Drupal. Start by checking Drupal 6.2 out of CVS:
$ cvs -z6 -d:pserver:anonymous:anonymous@cvs.drupal.org:/cvs/drupal \
    co -d html -r DRUPAL-6-2 drupal
Next, create a Git repository in your website's directory, and in it store all the files you checked out from CVS, including the CVS files themselves. Then create a "drupal-core" branch where you'll keep an unmodified copy of Drupal for use in upgrading the site later. Finally, tag your release with the same tag found in CVS for simplified reference, and switch back to the "master" branch:
$ cd html/
$ git init
Initialized empty Git repository in .git/
$ git add .
$ git commit -a -m "Drupal 6.2"
$ git checkout -b drupal-core
Switched to a new branch "drupal-core"
$ git tag DRUPAL-6-2
$ git checkout master
Switched to branch "master"
Now use your web browser to configure your Drupal installation, which will create and configure settings.php. Once completed, add your new settings.php file to the master branch of your Git repository. Throughout these examples, you will be using many branches for merging and website development, but the "master" branch will always contain your actual website:
$ git add sites/default/settings.php
$ git commit -a -m "Add configured settings.php"
At this point you are ready to patch your new Drupal website. For this example, you will apply a very simple patch to bootstrap.inc that is intentionally a slightly different version of a change made to the file in Drupal 6.3. You do this to cause a conflict when you upgrade the website to Drupal 6.3:
$ cat bootstrap.inc.patch
index 44cd0d7..d45cf5d 100644
--- a/includes/bootstrap.inc
+++ b/includes/bootstrap.inc
@@ -283,0 +284,7 @@ function conf_init() {
+ // Do not use the placeholder url from default.settings.php.
+ if (isset($db_url)) {
+ if ($db_url == 'mysql://username:password@localhost/databasename') {
+ $db_url = '';
+ }
+ }
+
Manually apply this patch to your master branch, checking in the modified bootstrap.inc include file:
$ patch -p1 < bootstrap.inc.patch
$ git commit -a -m "custom bootstrap patch"
Now, upgrade your website to Drupal 6.3. First, update the version in your "drupal-core" branch from CVS. You update the "drupal-core" branch so that CVS won't run into any conflicts. If you instead update your "master" branch, CVS will corrupt the bootstrap.inc include file due to our patch. We will later rely on Git to more intelligently help us resolve the merge conflict:
$ git checkout drupal-core
Switched to branch "drupal-core"
$ cvs update -r DRUPAL-6-3
With your "drupal-core" branch updated to Drupal 6.3, commit the updated files to your Git repository and tag them for possible future reference:
$ git commit -a -m "Drupal 6.3"
$ git tag DRUPAL-6-3
Now use this updated "drupal-core" branch to upgrade your website. You will perform the merge in a temporary branch, though it would be just as easy to perform the merge in the "master" branch. Either way, Git provides easy mechanisms for undoing a merge if you make a mistake or change your mind. In this case, you should test the merge in your temporary branch before merging it into your official "master" branch:
$ git checkout -b temporary master
Switched to a new branch "temporary"
$ git merge drupal-core
Auto-merged includes/bootstrap.inc
CONFLICT (content): Merge conflict in includes/bootstrap.inc
Automatic merge failed; fix conflicts and then commit the result.
Git was able to automatically merge all files except for includes/bootstrap.inc, which failed because of the custom changes which modified the file in the exact same lines as Drupal 6.3. You will quickly resolve this conflict using a graphical tool, verify that the changes look sane, then check in all the merged results:
$ git mergetool
$ git diff --color master includes/bootstrap.inc
$ git commit -m "Upgrade to Drupal 6.3"
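If you make a mistake during the merge, you can easily and safely throw the attempt away: discard any unfinished merge ("git reset --hard"), switch back to the "master" branch, and delete the temporary branch ("git branch -D temporary"; the capital "-D" is needed because Git refuses to delete a branch that has not been merged when given "-d"). Then recreate the branch and try the above steps again, fixing your mistake. Once you've confirmed that the website is working as expected, merge the temporary branch into your master branch, and delete the temporary branch: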
$ git checkout master
Switched to branch "master"
$ git merge temporary
$ git branch -d temporary
Deleted branch temporary.
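If you would like to rehearse this upgrade cycle before trying it on a real site, the following is a minimal sketch that reproduces it in a throwaway repository. The file contents are toy stand-ins rather than real Drupal code, the temporary directory and author identity are assumptions made only for illustration, and "git merge --abort" assumes a reasonably recent Git (1.7.4 or later):

```shell
#!/bin/sh
# Rehearse the pristine-branch upgrade in a throwaway repository.
# All file contents are toy stand-ins, not real Drupal code.
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)
cd "$work"
git init -q .
git symbolic-ref HEAD refs/heads/master   # use the branch name from the text

# "Drupal 6.2": the pristine upstream release.
mkdir includes
printf 'line one\nline two\nline three\n' > includes/bootstrap.inc
git add .
git commit -q -m "Drupal 6.2"
git branch drupal-core

# A local patch on master.
printf 'line one\nline two (local patch)\nline three\n' > includes/bootstrap.inc
git commit -q -a -m "custom bootstrap patch"

# "Drupal 6.3" arrives on the pristine branch and touches the same line.
git checkout -q drupal-core
printf 'line one\nline two (upstream 6.3)\nline three\n' > includes/bootstrap.inc
git commit -q -a -m "Drupal 6.3"

# Try the merge in a temporary branch, leaving master untouched.
git checkout -q -b temporary master
if git merge drupal-core >/dev/null 2>&1; then
  merge_status=clean
else
  merge_status=conflict
fi

# List the files Git could not merge automatically.
conflicts=$(git diff --name-only --diff-filter=U | sort -u)
echo "merge: $merge_status; unresolved: $conflicts"

# Abandon the attempt; the working tree returns to its pre-merge state.
git merge --abort
after_abort=$(git status --porcelain)
```

On a real site you would resolve the listed files with git mergetool instead of aborting. The sketch only shows that a failed attempt leaves nothing behind.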
Managing Contributed Themes And Modules With Git
Managing contributed themes and modules is best done by using another branch. It is helpful to create one branch for each remote source for the files you use to build your website. You can use a single branch for all your contributed modules and themes, as they all come from Drupal's "contrib" CVS repository.
In this example, we'll add the devel module to our website, checking it out of CVS:
$ git checkout -b drupal-contrib master
Switched to a new branch "drupal-contrib"
$ cd sites/default
$ mkdir modules
$ cvs -z6 \
    -d:pserver:anonymous:anonymous@cvs.drupal.org:/cvs/drupal-contrib \
    checkout -r DRUPAL-6--1-10 -d modules/devel contributions/modules/devel
$ git add modules
$ git commit -a -m "Devel module version 6.1.10"
You can repeat this process to check out additional contributed modules or themes from CVS, checking them in to your local "drupal-contrib" Git branch. Once you've checked out all the modules and themes you need for your website, merge them into your master branch:
$ git checkout master
Switched to branch "master"
$ git merge drupal-contrib
When you need to upgrade any of your contributed modules or themes, follow the same steps described above for updating Drupal core. Switch to the "drupal-contrib" branch to check out the updated version from CVS. Commit the changes to your "drupal-contrib" branch, then use Git to merge the changed files into your "master" branch.
The important thing is to keep the files in your "drupal-contrib" branch unmodified so that CVS can update the files without any conflicts. If you need to modify any of the contributed modules or themes, do it in the "master" branch, or in another development branch. If your changes conflict with future upgrades, you can easily resolve these conflicts in the same way that you did in our previous example with a conflict in bootstrap.inc.
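One way to see which contributed files you have modified locally, and which may therefore conflict on the next upgrade, is to compare the pristine branch against "master". The following is a small sketch in a throwaway repository. The module contents are toy stand-ins and the author identity is an assumption made only for illustration:

```shell
#!/bin/sh
# Show which contributed files differ between the pristine branch and master.
# Module contents are toy stand-ins, not the real devel module.
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)
cd "$work"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "site" > index.php
git add .
git commit -q -m "Initial site"

# Pristine copy of the contributed module on its own branch.
git checkout -q -b drupal-contrib master
mkdir -p sites/default/modules/devel
echo "pristine devel code" > sites/default/modules/devel/devel.module
git add sites
git commit -q -m "Devel module"

# Merge it into master, then make a local change on master only.
git checkout -q master
git merge -q drupal-contrib
echo "local tweak" >> sites/default/modules/devel/devel.module
git commit -q -a -m "Local change to devel"

# Files under the modules directory that no longer match the pristine branch.
modified=$(git diff --name-only drupal-contrib master -- sites/default/modules)
echo "locally modified: $modified"
```

An empty result would mean that every contributed file on "master" still matches the pristine branch.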
Managing And Upgrading An Existing Website With Git
The previous examples assumed that you were creating a new website with Drupal. In this example, we will show you how Git can also help you to manage and upgrade an existing website, even if you've not been using revision control up to this point.
The first step is to create a new Git repository within your website directory, and to add your existing website files to this new repository. This first step is identical to the example provided in the previous section for backing up your website files:
$ cd /var/www/html
$ git init
Initialized empty Git repository in .git/
$ git add .
$ git commit -a -m "Initial commit."
When you're ready to upgrade your website, check out the version of Drupal that you wish to upgrade your website to, creating a new Git repository with this new version of Drupal. In this example, you'll upgrade your website to Drupal 6.3:
$ cvs -z6 -d:pserver:anonymous:anonymous@cvs.drupal.org:/cvs/drupal \
    co -r DRUPAL-6-3 drupal
$ cd drupal
$ git init
Initialized empty Git repository in .git/
$ git add .
$ git commit -a -m "Drupal 6.3"
In previous examples, you've always kept all your files in different branches of the same Git repository. In this example, you take advantage of Git's distributed design to merge code from two different repositories. To upgrade your website to Drupal 6.3, switch back to your website repository and create a "drupal-core" branch. Now, "pull" the updated version of Drupal from the second repository you just created. (Git 2.9 and later refuse to merge two repositories that share no history unless you add the --allow-unrelated-histories option to the git pull command.) Finally, merge the "drupal-core" branch into your "master" branch and manually resolve any conflicts that Git is unable to automatically merge:
$ cd ../html
$ git checkout -b drupal-core
$ git pull ../drupal
$ git mergetool
$ git commit -a -m "Drupal 6.3, resolved conflicts."
$ git checkout master
$ git merge drupal-core
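The two-repository merge can be rehearsed with toy files as well. The sketch below is an illustration under stated assumptions: the file contents are stand-ins, the author identity is made up, the conflict is resolved by simply taking the incoming version (where on a real site you would use git mergetool), and the --allow-unrelated-histories and --no-rebase options assume Git 2.9 or later:

```shell
#!/bin/sh
# Merge a pristine release kept in a separate repository into a website
# repository that shares no history with it. Contents are toy stand-ins.
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)
cd "$work"

# Existing website, newly placed under revision control.
mkdir html
cd html
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "site code, old release" > index.php
echo "local settings" > settings.php
git add .
git commit -q -m "Initial commit."

# Separate repository holding the pristine new release.
mkdir "$work/drupal"
cd "$work/drupal"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "site code, new release" > index.php
git add .
git commit -q -m "Drupal 6.3"

# Pull the new release into a drupal-core branch of the website repository.
cd "$work/html"
git checkout -q -b drupal-core
if git pull -q --no-rebase --allow-unrelated-histories \
    "$work/drupal" master >/dev/null 2>&1; then
  pull_status=clean
else
  pull_status=conflict
fi

# Stand-in for "git mergetool": accept the incoming version of the file.
if [ "$pull_status" = conflict ]; then
  git checkout --theirs index.php
  git add index.php
  git commit -q -m "Drupal 6.3, resolved conflicts."
fi

# Bring the upgraded code into master.
git checkout -q master
git merge -q drupal-core
final_index=$(cat index.php)
kept_settings=$(cat settings.php)
echo "index.php: $final_index; settings.php: $kept_settings"
```

Because both repositories added index.php independently, Git reports a conflict for it, while settings.php, which exists only in the website repository, is carried through untouched.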
At this point, you can either continue tracking Drupal core in the "drupal-core" branch of your website repository, or you can instead continue tracking Drupal core in the external "drupal" repository, deleting your local "drupal-core" branch until you need it again. There is no technical reason to favor one solution over the other, so it is left to you to decide which method works best for you.
Apply this same technique when you wish to upgrade contributed themes or modules. Once again, check out the new version of the module and create a new repository with it. Then, merge this repository into a new branch of your website repository. Once you are happy that the upgrade has gone smoothly, merge the update into your "master" branch.
Finally, if multiple people are involved in the ongoing development of your website, each developer can use Git to "clone" your repository and implement their own custom changes. When they finish, you can then "pull" their changes back into your repository. It is such distributed development that Git excels at, providing you with powerful tools allowing you to pick and choose which changes to merge from another repository, and to undo commits if they later prove to be problematic. There is much documentation found online to help you master Git, thereby greatly increasing your productivity.
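As a rough illustration of that multi-developer workflow, the sketch below clones a toy repository, commits a change in the clone, pulls it back, and then undoes it with git revert. The directory names, file names, and author identity are hypothetical, and --no-rebase assumes a reasonably recent Git:

```shell
#!/bin/sh
# One developer clones the site repository, commits a change, and the
# maintainer pulls it back and later reverts it. Names are hypothetical.
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)
cd "$work"

# The maintainer's repository.
mkdir site
cd site
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "v1" > index.php
git add .
git commit -q -m "Initial commit."

# A developer clones it and commits a custom change.
cd "$work"
git clone -q site developer
cd developer
echo "new feature" > feature.php
git add feature.php
git commit -q -m "Add feature"

# The maintainer pulls the developer's work into the main repository.
cd "$work/site"
git pull -q --no-rebase "$work/developer" master
pulled=$(git log -1 --format=%s)

# The change later proves problematic: undo it with a new commit.
git revert --no-edit HEAD >/dev/null
if [ -e feature.php ]; then feature_present=yes; else feature_present=no; fi
echo "pulled: $pulled; feature.php present after revert: $feature_present"
```

Note that git revert records a new commit that undoes the change, so the history of what was pulled and why it was backed out is preserved.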
Tracking Database Schema Changes
As your website evolves, you will find that you sometimes need to update your database schema. Fortunately, Drupal provides a method of tracking and automating such schema changes. When developing custom modules for Drupal, you can define various "hooks" in the .install file. For example, the _install hook is called when your custom module is first enabled, and should be used to create custom database tables. If you need to modify your schema in the future, you define an _update_N hook in your module's .install file, then run update.php on your website. Drupal will track which updates have already been installed on your website, and will alert you as new updates become available. As you update your .install file, be sure to commit your changes to your Git repository. Read more about .install files, _install hooks and _update_N hooks in the official Drupal API documentation at the following URLs:
- http://api.drupal.org/api/function/hook_install/6
- http://api.drupal.org/api/function/hook_update_N/6
Staging Configuration Changes From Development Servers
It is best to test all configuration changes on your development server, before attempting to make changes on your production server. Once you have made all your desired changes, you have to decide the best method for duplicating these changes in production. Many tediously take notes as they make changes on their development servers, then manually repeat the same steps on their production servers.
It is far preferable to automate some of this process, allowing you to test for consistency and to track all configuration changes in the database. What follows is a recipe for partially automating and tracking this process using Git. It does require a solid knowledge of SQL.
To begin, first configure an exact copy of your production website on a development server by restoring an up-to-date backup. Do not attempt to work from an outdated backup or the following steps may have unexpected results.
Next, create an empty sub-directory and capture a baseline database backup from your new development server. This will contain the same exact data as is in the backup you used to create this development server, however the backup will be formatted differently as you will be using different mysqldump options. In most cases, it will make sense to use the --no-create-info option as you will not be adding new tables or altering table definitions. It can also be very helpful to use the --skip-extended-insert option so that each change is on its own line, simplifying patch generation. Finally, the --complete-insert option can prove helpful for generating database queries to use when data in your database is being updated rather than simply inserted.
Once you have created your baseline snapshot with the appropriate mysqldump options, initialize a new git repository, and commit your database snapshot into your new empty repository:
$ mkdir snapshot
$ cd snapshot/
$ mysqldump -uUSERNAME -p --no-create-info --skip-extended-insert \
    --complete-insert DATABASE > snapshot.sql
$ git init
$ git add snapshot.sql
$ git commit -m "initial database snapshot"
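To see why one row per line matters for patch generation, consider the following sketch. It does not run mysqldump. The four SQL files are hand-written, simplified stand-ins for the two dump styles (the site_name row is an invented example), and only the standard diff tool is used:

```shell
#!/bin/sh
# Compare how a one-row change appears in a diff for the two dump styles.
# The SQL files are hand-written stand-ins, not real mysqldump output.
set -eu
work=$(mktemp -d)
cd "$work"

# Extended inserts (the mysqldump default): every row shares one line.
cat > extended_before.sql <<'EOF'
INSERT INTO `variable` VALUES ('site_name','s:7:\"Example\";'),('site_slogan','s:18:\"This is my slogan.\";');
EOF
cat > extended_after.sql <<'EOF'
INSERT INTO `variable` VALUES ('site_name','s:7:\"Example\";'),('site_slogan','s:26:\"This is my updated slogan.\";');
EOF

# --skip-extended-insert style: one row per line.
cat > single_before.sql <<'EOF'
INSERT INTO `variable` VALUES ('site_name','s:7:\"Example\";');
INSERT INTO `variable` VALUES ('site_slogan','s:18:\"This is my slogan.\";');
EOF
cat > single_after.sql <<'EOF'
INSERT INTO `variable` VALUES ('site_name','s:7:\"Example\";');
INSERT INTO `variable` VALUES ('site_slogan','s:26:\"This is my updated slogan.\";');
EOF

# diff exits non-zero when the files differ, hence "|| true".
extended_diff=$(diff extended_before.sql extended_after.sql || true)
single_diff=$(diff single_before.sql single_after.sql || true)

echo "--- extended inserts ---"
echo "$extended_diff"
echo "--- one row per line ---"
echo "$single_diff"
```

In the first diff the unchanged site_name row is dragged along with the change, because the whole line differs. In the second, only the site_slogan row appears.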
Now, log in to your development website and make the necessary configuration changes. Do not attempt to make too many changes at one time, or it may prove too difficult to later merge these changes into your production website.
In our example, we will visit the Site configuration section in the Drupal Administration pages and make the following changes:
- On the date and time page, disable user-configurable time zones.
- On the site information page, configure a new slogan.
You're now ready to extract the changes you've made on your development server, preparing to push them into production. First, get a new database snapshot using the same mysqldump flags that you used previously. Now, utilize a handy Git feature to only commit the relevant changes into your temporary development repository. Finally, use Git to generate a patch from this commit.
To commit only the relevant changes, you will use the git add --patch command. It splits your changes into "hunks", groups of adjacent changed lines that in a dump like this one generally correspond to a single table, asking you for each whether or not you wish to "stage this hunk". In this example, you will answer "n" to all changes affecting the cache* tables, the sessions and users tables, and the watchdog table. You will only answer "y" to the changes affecting the variable table. You do not stage the changes for the many cache tables because these will be automatically generated on your production server as needed. You do not stage the changes to the sessions or users tables, because these are specific to your current session on your development server and unrelated to your configuration changes. You also do not stage the changes to the watchdog table as this is only internal logging information and not relevant to updating the configuration of your website:
$ mysqldump -uUSERNAME -p --no-create-info --skip-extended-insert \
    --complete-insert DATABASE > snapshot.sql
$ git add --patch snapshot.sql
$ git commit -m "example configuration changes"
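Before answering the interactive prompts, it can help to preview just the lines that touch the variable table. The following sketch filters the output of git diff with grep. The snapshot contents are simplified, hand-written stand-ins with invented column lists, and the author identity is an assumption made only for illustration:

```shell
#!/bin/sh
# Preview only the changes that affect the variable table.
# Snapshot contents are simplified stand-ins for a real dump.
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)
cd "$work"
git init -q .

# Baseline snapshot.
cat > snapshot.sql <<'EOF'
INSERT INTO `cache` (`cid`, `data`) VALUES ('menu','old');
INSERT INTO `sessions` (`sid`, `uid`) VALUES ('abc','1');
INSERT INTO `variable` (`name`, `value`) VALUES ('site_slogan','s:18:\"This is my slogan.\";');
EOF
git add snapshot.sql
git commit -q -m "initial database snapshot"

# New snapshot taken after the configuration change.
cat > snapshot.sql <<'EOF'
INSERT INTO `cache` (`cid`, `data`) VALUES ('menu','new');
INSERT INTO `sessions` (`sid`, `uid`) VALUES ('def','1');
INSERT INTO `variable` (`name`, `value`) VALUES ('site_slogan','s:26:\"This is my updated slogan.\";');
EOF

# -U0 drops context lines; grep keeps added or removed variable rows only.
variable_changes=$(git diff -U0 snapshot.sql \
  | grep -E '^[-+]INSERT INTO `variable`' || true)
echo "$variable_changes"
```

The cache and sessions rows change too, but they are filtered out, leaving only the rows you would answer "y" to.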
You can now generate a patch from your partial commit. First, use git log to find the previous commit against which a patch will be generated. In our example this is the initial database snapshot with an ID of 908f027ba0077baad4b7c52ebbe986fb89b40f41. Second, call git format-patch to generate the actual patch, passing in enough unique characters of the commit ID:
$ git log
commit 968fe8271ed7ff08fa46d789371b626b80c46ac6
Author: Jeremy Andrews
Date:   Fri Aug 22 16:20:54 2008 -0700
    example configuration changes
commit 908f027ba0077baad4b7c52ebbe986fb89b40f41
Author: Jeremy Andrews
Date:   Fri Aug 22 16:06:22 2008 -0700
    initial database snapshot
$ git format-patch 908f02
0001-example-configuration-changes.patch
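If the commit you want to export is the most recent one, you can avoid copying an ID out of git log by naming its parent as HEAD~1. The sketch below demonstrates this in a throwaway repository. The snapshot contents are placeholders and the author identity is an assumption made only for illustration:

```shell
#!/bin/sh
# Generate a patch for the latest commit without looking up a commit ID.
# Snapshot contents are placeholders.
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)
cd "$work"
git init -q .

echo "baseline row" > snapshot.sql
git add snapshot.sql
git commit -q -m "initial database snapshot"

echo "changed row" > snapshot.sql
git commit -q -a -m "example configuration changes"

# HEAD~1 is the parent of the current commit, so this exports every
# commit made after it: here, just the configuration change.
patch_file=$(git format-patch HEAD~1)
echo "wrote $patch_file"
```

The patch file is named after the commit's subject line, so a descriptive commit message makes the resulting file easier to identify later.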
Next, use this automatically generated patch file to create an appropriate _update_N hook for a custom .install file. This is done by first opening the patch file with a text editor. Reviewing the patch, note that any pre-existing configuration options which you have updated involve two lines in the patch, one starting with a "-", and one starting with a "+". All lines starting with a "-" are being removed from your database, while all lines starting with a "+" are being added to your database. On our example website the site slogan was previously defined, so in our patch file we see a "-" line removing the old slogan, and a "+" line adding the new slogan:
-INSERT INTO `variable` (`name`, `value`) VALUES \
('site_slogan','s:18:\"This is my slogan.\";');
+INSERT INTO `variable` (`name`, `value`) VALUES \
('site_slogan','s:26:\"This is my updated slogan.\";');
Using our knowledge of SQL, we manually convert this into a single update as follows:
UPDATE `variable` SET `value` = \
's:26:\"This is my updated slogan.\";' WHERE `name` = 'site_slogan';
Our other change was to disable user configurable time zones, and as this had never been updated on our website before we only find a single relevant line in our patch starting with a "+", and none starting with a "-":
+INSERT INTO `variable` (`name`, `value`) VALUES \
('configurable_timezones','s:1:\"0\";');
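For patches that touch many variables, the manual conversion described above can be partly mechanized. The following awk sketch is an illustration rather than a robust SQL parser: it assumes each row sits on a single line in exactly the format shown above, that names contain no quotes, and that values do not themselves contain the text "VALUES (". The file names changes.patch and update.sql are hypothetical:

```shell
#!/bin/sh
# Turn "-"/"+" variable rows from a patch into UPDATE or INSERT queries.
# A sketch only: assumes one row per line in the format shown in the text.
set -eu
work=$(mktemp -d)
cd "$work"

# Stand-in for the relevant lines of the generated patch file.
cat > changes.patch <<'EOF'
-INSERT INTO `variable` (`name`, `value`) VALUES ('site_slogan','s:18:\"This is my slogan.\";');
+INSERT INTO `variable` (`name`, `value`) VALUES ('site_slogan','s:26:\"This is my updated slogan.\";');
+INSERT INTO `variable` (`name`, `value`) VALUES ('configurable_timezones','s:1:\"0\";');
EOF

# q holds a single quote so the awk program can stay inside shell quotes.
awk -v q="'" '
  /^[-+]INSERT INTO `variable`/ {
    sign = substr($0, 1, 1)
    row = $0
    sub("^.*VALUES \\(" q, "", row)   # strip everything before the name
    sub(q "\\);$", "", row)           # strip the closing quote, ")" and ";"
    sep = index(row, q "," q)         # the quote-comma-quote between fields
    name = substr(row, 1, sep - 1)
    value = substr(row, sep + 3)
    if (sign == "-") {
      removed[name] = 1
    } else {
      added[name] = value
      order[++n] = name
    }
  }
  END {
    for (i = 1; i <= n; i++) {
      name = order[i]
      if (name in removed)
        # Row existed before: rewrite the -/+ pair as a single UPDATE.
        printf "UPDATE `variable` SET `value` = %s%s%s WHERE `name` = %s%s%s;\n", q, added[name], q, q, name, q
      else
        # Row is new: keep it as an INSERT.
        printf "INSERT INTO `variable` (`name`, `value`) VALUES (%s%s%s,%s%s%s);\n", q, name, q, q, added[name], q
    }
  }
' changes.patch > update.sql

generated_sql=$(cat update.sql)
echo "$generated_sql"
```

Review the generated queries by hand before using them. The script only rearranges text and knows nothing about what the values mean.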
Finally, we use the queries we collected above and create a new _update_N hook in a custom module used for pushing database updates to our website. If you are not already using a custom module, you can create an empty custom.module file, a proper custom.info file, and a custom.install file. In the custom.install file, you will add a new _update_N hook. Refer to the links provided at the end of the previous subsection, on tracking database schema changes, for a more in-depth description of how these Drupal hooks work. In our example, we add the following function to our custom.install file. In your own usage, be sure to increment N in your new _update_N hook:
function custom_update_6001() {
  $ret = array();
  // Curly braces let Drupal apply any configured table prefix.
  $ret[] = update_sql("UPDATE {variable} SET value = "
    . "'s:26:\"This is my updated slogan.\";' "
    . "WHERE name = 'site_slogan'");
  $ret[] = update_sql("INSERT INTO {variable} (name, value) "
    . "VALUES ('configurable_timezones', 's:1:\"0\";')");
  return $ret;
}
You should commit the changes you have made to your custom module files into your website source code repository. You can then push these changes to your production website as explained below. Note that it is highly recommended that you first push these changes to a staging server, testing the update process and verifying that you have properly written your update hook. To have your actual updates performed on your staging and production servers, you will need to point your browser to yoursite/update.php and follow the directions.
The same principles that have been documented in this simplistic example can be applied to more complex configuration changes. You are not limited to just calling UPDATE and INSERT in your _update_N hooks; you can also call DELETE, CREATE, ALTER, and any other appropriate SQL command. When making more complex configuration changes, you should dump your database regularly without actually committing each individual change. After each database dump, you can use git diff --color to view how your changes are affecting the database. The more you do this, and the more familiar you get with how Drupal works under the hood, the quicker the process will become.
There has been much discussion about how these processes can be further automated in Drupal 7 and beyond. There are also existing projects attempting to further automate the process for earlier versions of Drupal, such as the Database Scripts project found at http://drupal.org/project/dbscripts.
Pushing Changes To Production
In previous examples, you've learned how you can use Git to manage your website, simplifying many processes including upgrading to a newer release of Drupal, and making configuration changes to your website. This final section discusses using Git to push changes to your production server. In an earlier example dealing with backups, we configured a Git repository on a backup server with the IP address 10.10.10.10. We will use this previously configured backup server again in this example.
At this point, you have updated your website to Drupal 6.3, and merged all of your changes into the master branch of your Git repository. You have tested all your changes, and are now ready to push them to your live web server. You should first tag your release for easy reference in the future. As you're working in a different repository than you used in the backup example, you need to configure the remote backup server. Then, push your current code to the remote server, naming the new tag as well, since Git does not push tags unless asked to:
$ git tag RELEASE-2008-07-002
$ git remote add backup-server user@10.10.10.10:backup.git
$ git push backup-server master RELEASE-2008-07-002
This process is greatly simplified if only one person (or one Git repository) is pushing changes to the backup server. This one person can be responsible for merging together everyone else's work, and testing all the changes. Once the code is pushed to the backup server, it is available to be pulled to your website. When using this workflow, it's important that you don't edit files directly on your web server, but instead that you always pull changes to files via your Git repository. On the production web server:
$ git pull user@10.10.10.10:backup.git master
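If for any reason you want to revert to an earlier version of your website, this can be easily done using tags. We'll assume that your previous release was tagged as "RELEASE-2008-07-001". We use the "--hard" option so that both the index and the working files are reset to match the tagged release, discarding any uncommitted changes: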
$ git reset --hard RELEASE-2008-07-001
You can now fix whatever problems you ran into by making changes to your local repository. Once things are fixed and tested, add a new tag and again push your changes to the backup server. Finally, pull these changes to your production server.
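The whole push, pull, and roll-back cycle can be rehearsed locally. The sketch below stands in for the three machines with three directories under a temporary directory. The bare repository plays the backup server, and the production copy is created with git clone (an assumption of this sketch, chosen so that the release tags are fetched along with the code). File contents and the author identity are likewise made up for illustration:

```shell
#!/bin/sh
# Rehearse the release cycle: push from development to a bare "backup
# server" repository, pull on "production", then roll back to a tag.
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in for the backup server.
git init -q --bare backup.git
git --git-dir=backup.git symbolic-ref HEAD refs/heads/master

# Development repository: first release.
mkdir dev
cd dev
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "release 1" > index.php
git add .
git commit -q -m "First release"
git tag RELEASE-2008-07-001
git remote add backup-server "$work/backup.git"
git push -q backup-server master RELEASE-2008-07-001

# Production: initial deployment.
cd "$work"
git clone -q backup.git production

# Development: a second release that turns out to be broken.
cd "$work/dev"
echo "release 2 (broken)" > index.php
git commit -q -a -m "Second release"
git tag RELEASE-2008-07-002
git push -q backup-server master RELEASE-2008-07-002

# Production: deploy the second release.
cd "$work/production"
git pull -q --no-rebase
live=$(cat index.php)

# Production: roll back to the previous tagged release.
git reset -q --hard RELEASE-2008-07-001
rolled_back=$(cat index.php)
echo "deployed: $live; after rollback: $rolled_back"
```

After the rollback, a later git pull on production fast-forwards again to whatever fixed release has been pushed to the backup server.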
With this strategy, you always know exactly what version of your website is currently being used in production. It also becomes possible to quickly back out any changes if something goes wrong. Finally, if you have multiple web servers, it is now trivial to keep them all in sync by checking out files from the same remote Git repository.
