Over the course of 12 months we used the Google Ads Experiments tool (also known as ‘Campaign Drafts & Experiments’) and ran close to 100 experiments for a single client. In this article we’ve categorised the experiments into subgroups and provided an overview of where we found success and where we failed. It should be a useful starting point for anyone picking up the experiments tool and looking for ideas on what to test.
As an agency committed to optimising and making decisions for our clients based on data, the experiments tool is one of our favourites. Google lets us take an existing campaign and duplicate it as a ‘draft’. We can then make changes to the draft and split traffic between the draft and the original. The tool also provides a reporting interface for reviewing performance in real time. Once the experiment has run its course, the advertiser can either reject or apply it. Rejecting the experiment reverts the campaign to how it was with one click; applying it rolls the changes into the original campaign (or creates a new second campaign), also with one click.
With that out of the way, let’s jump into the experiments we performed. We ran 93 experiments in total over the course of 12 months. Most used CPA as the success metric, where the goal was to achieve a significantly lower CPA; some of the ad copy experiments used CTR instead. The experiments covered:
- Landing Pages: changes to landing page design and copy
- Ad Copy: changing key messages in the ad copy headlines
- Campaign Structure: adjusting keyword groupings within ad groups
- Locations: adjusting investment in different locations
- Bid Strategies: testing manual bid changes and new auto-bidding strategies
- Audiences: adjusting bidding based on audience demographics
- Devices: adjusting bidding based on devices
Of the 93 experiments we ran, 51% were successful and 49% were not. By successful, we mean the result was both statistically significant and a positive effect on performance.
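To illustrate what ‘statistically significant’ means here, a simple two-proportion z-test on conversion rates can be used to compare the original campaign against the draft. This is a generic sketch with made-up numbers, not data from our experiments:

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test comparing conversion rates of two campaign arms."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled conversion rate under the null hypothesis (no difference)
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical 50/50 split: draft converted 160/2000, original 120/2000
z, p = two_proportion_z_test(160, 2000, 120, 2000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With these illustrative numbers the p-value comes in below 0.05, so the draft’s lift would count as significant; with smaller samples or a smaller gap it often would not, which is why we let experiments run their full course.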
The results are shown below:

| Category | Successful | Unsuccessful |
|---|---|---|
| Landing Pages | 45% (13) | 55% (16) |
| Ad Copy | 47% (7) | 53% (8) |
| Structure | 50% (2) | 50% (2) |
| Location | 100% (1) | 0% (0) |
| Bid Strategy | 53% (18) | 47% (16) |
| Audience | 63% (5) | 38% (3) |
| Devices | 100% (1) | 0% (0) |
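The overall split can be recovered by tallying the per-category counts from the table; a minimal Python sketch:

```python
# Per-category (successful, unsuccessful) counts from the results table.
results = {
    "Landing Pages": (13, 16),
    "Ad Copy": (7, 8),
    "Structure": (2, 2),
    "Location": (1, 0),
    "Bid Strategy": (18, 16),
    "Audience": (5, 3),
    "Devices": (1, 0),
}

wins = sum(w for w, _ in results.values())
losses = sum(l for _, l in results.values())
total = wins + losses
print(f"{wins}/{total} successful ({wins / total:.0%})")
```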
Note that ‘Location’ and ‘Devices’ had only one experiment each, so those sample sizes are too small to draw conclusions from.
In general, the most successful experiments came from testing changes to audience demographics and bid strategies.
We ran many landing page copy and ad copy experiments, but often found that our changes performed worse than the original.
When it came to audience demographics, most of the experiments focused on adjusting bids for age groups and income levels. Before each experiment we would review the audience insights and form a hypothesis about which audience to bid down on or exclude, then run the experiment. In most cases our hypothesis was proven correct.
For the bid strategy testing we ran a mix of experiments, with decreasing CPA as the success metric in every case. Many of the successful experiments were manual bid changes, such as decreasing bids by 20% across the whole account. This is intuitive: lower bids generally mean cheaper clicks, and if conversion rates hold, a lower CPA.
When we tested auto-bidding strategies against manual bidding, the results were not as good. Auto-bidding strategies such as Target CPA only outperformed manual bidding approximately 25% of the time; in other words, the machine failed to achieve a better CPA three quarters of the time. CPA targets were set based on past performance.
When testing ad copy changes we found the results were often very close. For example, responsive ads achieved a better result in some campaigns but not in others; it was split roughly 50/50.
When it came to testing landing pages, we ran quite a few different variations. A few themes stood out. Removing the menu bar from the top of the landing page consistently increased the conversion rate. Taking pricing off the landing pages increased lead volume, though it obviously might generate lower-quality leads. Presenting tabled data rather than plain text did not make a significant difference.
Many of the findings we came across were not intuitive. If we had relied on a hunch alone, we would likely have made optimisations that did not help achieve the business goals. In the end we had a high success rate, with over 40 experiments applied.
Experiments are the best way to make incremental changes to your account with a higher degree of certainty that the optimisations you are making will advance your business goals. If you are looking for ideas, the examples in the case above are a good place to start.