Refining our UI Regression Tests
Our previous process
In retrospect, this was more complicated than it needed to be. We were essentially running regression tests against images we generated ourselves, when there was really no reason not to compare against previous screenshots instead.
We were also using Jenkins to run our tests. We chose Travis as a more modern (and public) replacement.
The new process
We are now simply performing regression tests on a series of screenshots. A Protractor script builds a set of images, one for each widget state on each browser. These images are then uploaded to a GitHub repository, which is separate from our main repository so that the main repository stays small.
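To make the "one image per widget state per browser" idea concrete, here is a minimal sketch of how such a set of screenshots could be laid out on disk. The directory layout, the `screenshot_paths` helper, and the browser and state names are all hypothetical, chosen for illustration rather than taken from our actual script.

```python
from itertools import product
from pathlib import PurePosixPath
from typing import Dict, Iterable, Tuple


def screenshot_paths(
    browsers: Iterable[str],
    states: Iterable[str],
    root: str = "screenshots",  # hypothetical root directory of the image repository
) -> Dict[Tuple[str, str], PurePosixPath]:
    """Map every (browser, state) pair to the path of its screenshot."""
    return {
        (browser, state): PurePosixPath(root) / browser / f"{state}.png"
        for browser, state in product(browsers, states)
    }


if __name__ == "__main__":
    # Hypothetical browser and widget state names.
    for key, path in screenshot_paths(["chrome", "ie9"], ["closed", "open"]).items():
        print(key, path)
```

A predictable layout like this means the ground-truth image and the freshly taken screenshot for a given browser and state can be found by name alone.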
When we commit to the dev or production branch of our main repository, Travis picks up the commit and runs a series of tests comparing those ground truth images to the current screenshots. On failure, the images are uploaded to a private server where they can be reviewed later. On success, if we’re testing in production, we upload our viewer to S3.
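As a rough illustration of what comparing a ground-truth image to a current screenshot can involve, here is a minimal sketch that counts differing pixels in two raw RGBA buffers. It assumes both images have already been decoded to bytes of the same dimensions; the function names, the per-channel tolerance, and the pass threshold are assumptions for this example and not necessarily how our tests do it.

```python
from dataclasses import dataclass


@dataclass
class DiffResult:
    mismatched: int  # number of pixels that differ
    total: int       # number of pixels compared

    @property
    def ratio(self) -> float:
        """Fraction of pixels that differ (0.0 when there are no pixels)."""
        return self.mismatched / self.total if self.total else 0.0


def compare_rgba(
    baseline: bytes,
    current: bytes,
    width: int,
    height: int,
    channel_tolerance: int = 0,  # assumed knob: allowed difference per channel
) -> DiffResult:
    """Compare two RGBA buffers pixel by pixel."""
    expected_len = width * height * 4
    if len(baseline) != expected_len or len(current) != expected_len:
        raise ValueError("buffer size does not match width * height * 4")

    mismatched = 0
    for i in range(0, expected_len, 4):
        # A pixel counts as mismatched if any of its four channels differs
        # by more than the tolerance.
        if any(abs(baseline[i + c] - current[i + c]) > channel_tolerance for c in range(4)):
            mismatched += 1
    return DiffResult(mismatched=mismatched, total=width * height)


def screenshot_passes(result: DiffResult, max_ratio: float = 0.0) -> bool:
    """Pass when the fraction of differing pixels is within the threshold."""
    return result.ratio <= max_ratio
```

With the default threshold of zero, any differing pixel fails the test. A small non-zero `max_ratio` or `channel_tolerance` would trade strictness for resilience to rendering noise.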
Travis has a lot of really cool features: Sauce Labs is very well integrated, Travis picks up our commits on GitHub, and it sends us Slack notifications when the tests complete.
Building everything in public ended up being easier than expected on Travis. They provide an encryption tool that lets you keep things like credentials, private keys and API keys secure. In our case this was essential, since the code we use to deploy the widget is stored in a private repository. We simply created a GitHub deploy key and encrypted it with Travis, allowing us to clone the private repo at will.
One issue we’ve had with Travis is debugging shell commands: it can be quite difficult to diagnose what’s wrong with a script when there’s no way of running ad hoc commands inside the environment. Some other CI environments allow interactive debugging, so it would be great to see Travis implement this!
Here’s a copy of our travis.yml file:
We’ve found that taking screenshots can be somewhat fraught with bugs. This does not seem to be the most heavily used part of Selenium, and bugs do crop up occasionally. So far, the major bugs have all been fixed within a few days, but there are still occasional issues, such as screenshots on old versions of IE having inconsistent offsets (which makes comparisons difficult). This is definitely improving over time, but there’s still some work to do here.
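One possible way to cope with inconsistent offsets, shown here only as a sketch and not something we have described doing, is to try a few small shifts and keep the one where the two images agree best. The example below works on plain grids of pixel values (lists of rows) and assumes both grids have the same dimensions; `mismatch_ratio`, `best_offset`, and `max_shift` are hypothetical names.

```python
from typing import List, Tuple

Grid = List[List[int]]  # rows of pixel values, e.g. grayscale intensities


def mismatch_ratio(baseline: Grid, current: Grid, dx: int, dy: int) -> float:
    """Fraction of overlapping pixels that differ when baseline[y][x]
    is compared with current[y + dy][x + dx]."""
    height, width = len(baseline), len(baseline[0])
    compared = 0
    mismatched = 0
    for y in range(height):
        cy = y + dy
        if not 0 <= cy < height:
            continue
        for x in range(width):
            cx = x + dx
            if not 0 <= cx < width:
                continue
            compared += 1
            if baseline[y][x] != current[cy][cx]:
                mismatched += 1
    # With no overlap at all, treat the shift as a complete mismatch.
    return mismatched / compared if compared else 1.0


def best_offset(baseline: Grid, current: Grid, max_shift: int = 2) -> Tuple[int, int, float]:
    """Search shifts within +/- max_shift and return (dx, dy, ratio) for the
    lowest mismatch ratio, preferring smaller shifts when ratios tie."""
    candidates = []
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            ratio = mismatch_ratio(baseline, current, dx, dy)
            candidates.append((ratio, abs(dx) + abs(dy), dx, dy))
    ratio, _, dx, dy = min(candidates)
    return dx, dy, ratio
```

The search is brute force and only compares the overlapping region, so `max_shift` should stay small relative to the image size. Otherwise a tiny overlap could look like a perfect match.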
We’ve now created a relatively robust system to test our widget’s UI on multiple browsers with very little fuss. Our code was hugely simplified by removing image generation and simply testing for regression against old screenshots. Using Travis CI to automate those tests was simpler than expected and we were particularly impressed with the simplicity of the encryption scheme.
Image by Dean Hochman licensed under Creative Commons.