How to run an A/B test for your design clients - part 1

Updated: Oct 6


As a scientist turned designer, I am always looking for ways to validate my ideas. Sometimes that is just for me, like speaking to the target audience to make sure my ideas actually appeal to them. But it can also help when presenting an idea to a client. This is where A/B testing comes in. A/B testing, also known as split testing, is a great way to quickly understand how users or customers react to your choices. Just like in science, you enter with a focused question and exit with data that answers that question. However, fail to clearly define the parameters and you end up with a lot of data and no meaning. I will outline, step by step, how you can set up, run and analyse the results from an A/B test to get information that makes your designs better.



OVERVIEW

  • Before you get started

  • Create your designs

  • Set up the test

  • How much data do you need?

  • Is A/B testing right for you?


Before you get started


Before you start anything, decide exactly what question you want to answer. It should be based on the client’s goals. If the goal is to increase sales from the website, your question is “which version generates more sales?”. Any test produces many different kinds of data, so it is easy to get distracted by less important facts and vanity metrics, like the average time spent on the site. If a metric does not relate to the goal, it should not be presented to the client.


Next, decide what constitutes success. Is a design more successful if it generates 1% more sales? Consider the cost of implementing the new design and the risk involved. This is a great time to break out the old cost/benefit analysis. Be realistic and weigh both the risk and the potential reward of switching designs. If you are changing something small, such as a button, the risk is lower, so you might not need to show as drastic a difference to warrant a new design.
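To make the cost/benefit analysis concrete, here is a minimal sketch in Python. All the numbers (implementation cost, profit per sale, monthly sales, payback horizon) are hypothetical placeholders; swap in your client’s real figures.

```python
# Hypothetical numbers: when does a redesign pay for itself?
implementation_cost = 2000.0   # one-off cost of building the new design
profit_per_sale = 50.0         # average profit per sale
monthly_sales = 400            # current sales per month
horizon_months = 12            # period over which the redesign should pay off

def extra_profit(uplift):
    """Extra profit over the horizon from a given relative uplift in sales."""
    return monthly_sales * uplift * profit_per_sale * horizon_months

# Smallest uplift at which the redesign covers its own cost
break_even_uplift = implementation_cost / (monthly_sales * profit_per_sale * horizon_months)

print(f"Break-even uplift: {break_even_uplift:.1%}")           # ≈ 0.8%
print(f"Extra profit at 10% uplift: {extra_profit(0.10):.0f}") # 24000
```

Anything above the break-even uplift is profit; anything below means the redesign costs more than it earns over the chosen horizon.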


Create your designs


Sometimes, it is better to use a show kitchen than to build a mansion nobody wants. By this I mean that if you can test something without actually building it, that is a much more cost-effective way to go about things. For websites, this means wireframes rather than fully coded sites. For packaging, it could be a mockup instead of a printed end product. If you are interested in how to do this, listen to episode 13 of the podcast Startup. In that episode, Google Ventures joins the podcast team to test an imaginary app and decide whether it is a good idea to build.


Finally, make sure there is only one difference between your designs. This can be different copy, button colour, packaging, hero image, you name it, but there can only be one. If you have multiple differences, there is no way of knowing which aspect gave the result you see.


Set up the test


There are a number of software tools that can help you run a test and collect key information.


Google Optimize

Optimize is the free version of Google’s A/B testing software and offers most, if not all, of the functionality you will need for a small to medium-sized project. If you need more advanced features, you can get a custom quote for Optimize 360.


Optimizely

If you are looking to test a high traffic site over time, Optimizely has more advanced functions that make it easier to draw the right conclusion.


HubSpot

If you are testing more marketing-related aspects such as brand messaging, HubSpot has tools that give you a lot of control over your test.


VWO

If you think visually, VWO can feel more intuitive than other options. They pride themselves on being easy to use and have different plans to suit your needs.



Facebook ads

If you have a limited budget and need to decide between two designs rather than measure a certain action, Facebook ads are a great way to collect targeted data quickly. Since Facebook allows you to specify your target audience in detail, you can set up two ads that are identical except for the variable you are testing. By using the same budget but varying, for example, the image shown, you can see how many people click or interact with each version.


There are a few key things to keep in mind when you set up your test. One is timing: it is crucial that you test all your versions at the same time. If you instead show website A in October and website B in November, any extra sales in November may simply be because it is closer to Christmas, not because website B has a better design.


How much data do you need?


The amount of data and the method you use to test your result depend on how accurate you want to be. To help you decide which approach is best for you, I will present three scenarios.


Scenario 1. Low stakes. Decide on the % difference for success and a minimum test size

Let’s say you want to find out which package design customers prefer. You decide beforehand that, for a design to be declared successful, it has to generate at least 10% more sales than the other option. This is a good way to decide whether the investment will pay off. In addition to the percentage, you also need to set a minimum sample size so that your results are reliable. If 20 people visit your site and 2 more buy product A, that is not enough to say design A is the better option. If the numbers are instead 220 sales compared to 200, you can feel more confident in your results.


Scenario 2. Use a t-test to have more certainty

T-tests are used by scientists to decide whether two options differ from each other, but you do not need to be a scientist to perform one. Many software tools can run the test for you, and you can even run one yourself in Excel for free. A t-test looks at your data, e.g. how many sales you had each day, and determines whether the average values are actually different or whether any difference you see is due to chance. You decide how sure you want to be by choosing a p-value threshold. If you want to be 90% sure, your threshold is 0.1, meaning there is a 10% chance that any difference you found is down to chance. You can never be 100% sure, so scientists typically use a p-value of 0.1 or 0.05, which is strict enough to support a judgement.
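If you want to see what the software is doing under the hood, here is a sketch of the t statistic in plain Python. The daily sales figures are made up, and the 1.96 cutoff is a normal approximation to the exact t threshold for p < 0.05; a library such as SciPy (`scipy.stats.ttest_ind`) would give you the exact p-value.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples (e.g. daily sales)."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)
    se = math.sqrt(va / na + vb / nb)   # standard error of the difference in means
    return (mean(sample_b) - mean(sample_a)) / se

# Hypothetical daily sales for two weeks of testing each design
sales_a = [18, 22, 19, 25, 21, 17, 23, 20, 24, 19, 21, 18, 22, 20]
sales_b = [24, 27, 22, 29, 25, 23, 28, 26, 24, 27, 25, 23, 26, 28]

t = welch_t(sales_a, sales_b)
# Normal approximation: |t| > 1.96 roughly corresponds to p < 0.05
print(f"t = {t:.2f}, significant at p < 0.05: {abs(t) > 1.96}")
```

The bigger the gap between the two averages relative to the day-to-day noise, the larger the t statistic, and the less likely the difference is down to chance.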


Scenario 3. You are hired to analyse large amounts of ongoing data

The scenarios above work well for relatively short tests with a clear end point, but if you are running many tests over a long time, you need a different method. This is because if you run enough tests, you will eventually find a result by pure chance that is not actually true. For example, if you test whether site A or B appeals more to men or to women and you run 1000 tests, some will show that more men clicked even if the majority of site visitors are women. This is called a false positive, and tools such as Optimizely now use sequential testing to address the problem. Don’t panic if this sounds intimidating: the software will run these tests for you.
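A small simulation makes the false-positive problem tangible. It runs 1000 “A/A tests” with made-up numbers, where both variants have the exact same 5% conversion rate, so any test that declares a winner is wrong by construction. Roughly 5% of them will still look significant at p < 0.05.

```python
import random

random.seed(42)

def run_aa_test(visitors=2000, rate=0.05, z_crit=1.96):
    """One A/A test: both variants convert at the SAME rate, so a
    'significant' difference here is a false positive by construction."""
    conv_a = sum(random.random() < rate for _ in range(visitors))
    conv_b = sum(random.random() < rate for _ in range(visitors))
    pa, pb = conv_a / visitors, conv_b / visitors
    pooled = (conv_a + conv_b) / (2 * visitors)
    se = (2 * pooled * (1 - pooled) / visitors) ** 0.5
    if se == 0:
        return False
    return abs(pa - pb) / se > z_crit   # two-proportion z-test at p < 0.05

false_positives = sum(run_aa_test() for _ in range(1000))
print(f"{false_positives} of 1000 identical-variant tests looked significant")
# Expect roughly 50 (about 5%), even though A and B are identical
```

This is exactly why long-running programmes of many tests need corrections such as sequential testing rather than a plain p < 0.05 cutoff applied over and over.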

Learn more about statistical significance.


Is A/B testing right for you?


If your client does not have a clear image of what appeals to their target audience, or if you find it hard to justify a design choice to a client, A/B testing can save you a lot of time.


There are three main ways I have found A/B testing useful: macro testing, such as asking which packaging design your customers prefer; micro testing, such as choosing the ideal button colour on your website; and finding the perfect target audience. That last one is less conventional, but since most platforms are very flexible, you can actually test the same design with different audiences and see which one performs best. This is great if your client wants to find their niche but is not sure which segment of their customers is the most profitable.

In part 2, I will show how you can present the results you find in a way that is easy for your client to take on board.


Do you have a favorite tip we missed? Join our Facebook community and spread the wisdom!