Effective run in Meta Ads: from testing to scaling

Sasha Zhukovskaya

Aug 15, 2024 • 20 min read

Evgeny Mokin, founder/CEO of Median Ads and lecturer at the WebPromoExperts Academy, conducted a seminar where he discussed the entire cycle of advertising campaigns on Meta.

Ad campaigns testing

Let's start our conversation with testing advertising campaigns. The first thing we begin with is defining the target audience. My personal approach, as well as that of my team, is that the first step is a detailed description of the target groups, into which we immediately divide potential clients.

Target groups

You can't simply describe the target audience as a whole. In reality, it needs to be divided into specific target groups. When we talk about target groups, we mean non-overlapping audiences, which is very important. For example, if we say that our target audience is parents, then the target groups would be moms and dads: two non-overlapping target groups.

Why is this important? When you start building a communication strategy, you'll realize that it needs to be tailored to a specific target group. For moms, for instance, it's very important that the toy is of high quality and safe for their child. For dads, this may not be as crucial, but they are concerned about costs, so the communication here should emphasize that our products are not only high-quality and appealing to children but also competitively priced.

Dividing the audience into target groups at the outset is fundamental. In fact, one could say that the foundation for testing consists of two things: precise segmentation of the target audience into groups and the structure of advertising campaigns. We launch thousands of projects and each time we start with the target audience, its division into target groups, and the structure of advertising campaigns. After that comes the development of the communication strategy. When we define the target audience, we outline the groups based on various parameters. An example of the main parameters can be seen in the table.

Saved audiences

Saving audiences is essential. When you save an audience, you can track analytics not only by campaigns, groups, and audiences but also by saved audiences.

How do we promote our own agency using saved audiences? For example, we have 6 saved audiences. We targeted and saved the audience based on behavior – who has a business, who owns a small business, and so on.

We saved these audiences, but note that on the left we have Persona A and Persona B. Persona A has three saved audiences, and we can configure them differently.

It's important to understand that these are the same people, but the targeting settings are different. Each of the configured saved audiences has a different CPM, so you can test where the audience is cheaper, even though it's the same target group.

Therefore, once we have defined the target groups, we save them. This way, we establish with whom we will be working. Once you have maximized results from these target groups, you can start adding new ones. Look for new offers and products for them, but first, they need to be established.

Structure of ad campaign

Typically, most agencies and specialists structure their advertising campaigns as follows. First, there is an acquisition campaign targeting cold audiences. Next come retargeting ads for those who visited the website and looked at something but didn't make a purchase; this is called warm retargeting. Finally, there is advertising aimed at hot audiences (hot retargeting), and it is with the hot audience that significant profits are made, because selling a product to someone who has already purchased something is much easier.

Most of the budget is spent on cold tests, searching for new audiences and creatives to attract them, yet profitability is highest within the hot audience. Therefore, when building advertising campaigns, it's essential to remember that we work with cold audiences on one principle and with warm audiences on another, specifically according to the stages of the sales funnel.

Once an audience has made a purchase, we operate under a third logic. For example, cross-selling, repeat sales, additional incentives, and alternative directions. If a business has another line of products, this audience can be effectively directed there as well.

If this audience is relevant to another product, when returning to testing and saving the audience, it’s necessary to categorize it into completely different "baskets." We start testing through retargeting, optimizing to ensure our conversion rate is as high as possible, analyzing behavior in the hot audience, and understanding what can be upsold.

For instance, if someone bought an iPhone, we can sell them headphones, a case, and so on. For those who have already made a purchase, we need to build a sales chain and maximize profitability. Maximizing profitability is exclusively about the hot audience.

Ways of testing

What testing methods exist? Those with more experience surely know about these two methods, but let's take a closer look at them. There are two main options for testing: A/B testing and multivariate testing. For example, we have an image featuring a product, people, and some abstraction, and let's assume that the people show better results, a higher conversion rate, and so on.

What should we do next? We need to select the people and exclude other options accordingly, as they showed poorer results, had higher customer acquisition costs, and so on. For instance, if young people performed well, we can create alternative versions featuring middle-aged individuals, professionals, and so forth. It's important for us to understand what works best and "dig deeper." This is the logic behind the A/B approach. If option A worked, option B might work as well, and so on. If, for example, it turns out that not young people but those depicted in the images as professionals performed better, we test a business style, showing women separately from men. This way, we dig deeper and truly understand what works best. We not only reduce the cost of results but also begin to grasp the overall mindset and interests of the audience through A/B testing.

This cannot be said about multivariate testing (MVT), which operates on a different principle. In this type of testing, we create a combination of two or more elements: using an image, text, and other design objects. We then observe what works. MVT can be conducted in dynamic creatives: we throw everything together, see what performs well, analyze the breakdown by design, template, and draw conclusions.

This approach is fundamentally different from A/B testing because in this format we cannot investigate what specifically worked while two or more options are still running. That being said, MVT has its advantages. It allows us to test more hypotheses in a short period of time, but we won't understand exactly which element of a given hypothesis worked.

When we have little time and need to find what works more cost-effectively, multivariate testing is used. For example, when promoting events, concerts, etc., that are scheduled to take place in a month or six months, and there is no time for detailed testing. In such cases, a specific scenario is written along with a unique communication strategy.

If we have a systematic business and time on our side, it is always better to use A/B testing. This type is most commonly used and is synonymous with systematic testing.

What is more important to test first: audiences or creatives? The answer is simple, and again it comes from the structure of advertising campaigns. Each part of the structure follows its own logic, so the testing approach differs too. In acquisition campaigns, it is better to test in a multivariate format. The audience should be tested together with the communication strategy, not separately from the creatives.

What does an advertising creative consist of? It consists of two parts: the strategic part, which is the communication strategy, and the tactical part, which is the design. Many specialists skip the first part: they immediately dive into creating various formats (videos, static banners, etc.) and start testing them right away.

This is not bad, but in this format, you are immediately testing the tactical part without having thought through the strategy. You should start with what we discussed earlier: define the audience, then the structure of the advertising campaigns, and after that, the communication strategy for each target group separately. Remember the example about parents: what do we need for moms and dads? The message for moms will be "you give your child the best," while for dads it will be "this is quality and maximally beneficial, not more expensive than competitors," and so on. Of course, we will communicate to parents in general that this is the best product because they want the best for their child, but still, the communication strategy is divided into target groups, each with different insights.

It is important to understand that when we test the audience, we can take one set of audiences with one set of communication insights and another set with different ones; some can be combined. Testing should take place across the entire list of audiences and accordingly in combination.
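As a rough illustration of testing "in combination", the list of ad sets to launch can be built by pairing each saved audience with each communication insight. This is only a sketch: the audience and insight names below are made up from the parents example, and `build_test_matrix` is a hypothetical helper, not part of any Meta tool.

```python
from itertools import product

# Illustrative target groups and insights, borrowed from the parents example.
audiences = ["moms", "dads"]
insights = ["safe and high quality", "competitively priced"]


def build_test_matrix(audiences, insights):
    """Return every audience/insight pair as one candidate ad set."""
    return [
        {"audience": audience, "insight": insight}
        for audience, insight in product(audiences, insights)
    ]


matrix = build_test_matrix(audiences, insights)
for row in matrix:
    print(f"{row['audience']:>5} | {row['insight']}")
```

In practice not every pair makes sense, so irrelevant pairs would be removed from the matrix before launch.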

We also need to understand what audience testing is: when we have a list of target groups, we save them. For example, out of all our hypotheses, approximately 15 audiences fit, plus or minus 5, which is normal and depends on the product and audience. We save these audiences, and if we have worked through everything and there are a maximum of 15, then what would the 16th one be? There isn't one. You have already tried to create hypotheses for all audiences, so they all need to be tested. Therefore, I would call this not audience testing but validation.

When we take an audience, we are simply validating it based on cost per result. We end up with roughly 10-30% of the audiences that we can continue to work with. All others are excluded because they are not profitable. So what happens with the audience is essentially validation rather than testing. What needs to be tested is the combination of communication strategy and audience, specifically the relevance between them, because communication must always be relevant to the audience.
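Validation by cost per result can be sketched as a simple filter. The numbers below are invented for illustration, `validate_combinations` is a hypothetical helper, and treating a cost exactly equal to the KPI as acceptable is an assumption.

```python
def validate_combinations(results, kpi):
    """Split tested combinations into kept (within KPI) and excluded.

    `results` maps a combination name to (spend, conversions).
    A combination with zero conversions has no cost per result and is excluded.
    """
    kept, excluded = {}, []
    for name, (spend, conversions) in results.items():
        if conversions == 0:
            excluded.append(name)
            continue
        cost_per_result = spend / conversions
        if cost_per_result <= kpi:  # assumption: equal to the KPI still counts
            kept[name] = round(cost_per_result, 2)
        else:
            excluded.append(name)
    return kept, excluded


# Made-up test results: (spend in dollars, conversions).
results = {
    "moms + safety": (120.0, 15),
    "moms + price": (120.0, 8),
    "dads + price": (100.0, 11),
    "dads + safety": (90.0, 0),
}
kept, excluded = validate_combinations(results, kpi=10.0)
print(kept)
print(excluded)
```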

Retargeting campaigns

Now let's talk about retargeting. It helps ensure that the maximum number of people who visit a website or mobile app complete the desired final action. The goal of retargeting is to achieve the highest possible conversion rate, which means we need to work with creatives and segments in this format.

We won't test women who viewed products separately from men. Our aim is to guide everyone to the final action, so it makes more sense to test the creatives. This is better done in an A/B format to gain a deeper understanding of the audience. Understanding the audience will lead to a better conversion rate. Thus, retargeting is no longer about combinations but about driving the audience to complete the desired action. Accordingly, we need to ensure that all target groups reach the final action.

CRO

So, once we have tested and found combinations (audience plus insights from the communication strategy), we can move on to the so-called CRO process – conversion rate optimization. This is a process aimed at further increasing the conversion rate. To achieve this, we need to work on the design of the creatives because people make the decision to apply or purchase something, that is, to convert, based on insights from the communication strategy.

When we identify these insights, we understand that context is key: the insight is what determines the baseline cost per result, whether that is $5, $10, or $15 per conversion. The design is what conversion optimization works on, and it can lower or raise that cost by approximately 20%.

For some, it might be 10%, for others, 30%, depending on your personal expertise. In my experience, the average effect of conversion optimization is around 20%. First, we find effective combinations, identify insights that resonate with the audience, and make decisions based on them. People do not make decisions based solely on how well you set up targeting. Therefore, we look for additional actions in CRO to increase conversion and lower the cost per result.

This is a very important stage in preparing for scaling. I explain these stages not just to outline the steps involved but to convey that scaling begins with testing. If a solid foundation is laid in the test, you will be able to scale. If the test is conducted poorly and the cost per result is not minimized, scaling will be limited or even impossible.

Testing period

What do you think should be the testing period? Some say a month, others a week, and for some, even two months may not be enough. Everyone has their own solution regarding the testing period. That's normal, and I will share my approach.

A single correct testing period simply doesn't exist. Why? Because in some niches, the test is launched, and after a week everything is already within KPIs; everything is clear, and there's no need to test further. In other cases, such as B2B targeting, testing can take two months. I had a client with a small budget, a couple of hundred dollars, and we tested for two months. My friend was running contextual ads, and after a week, everything was already working great. We, by contrast, spent two months on tests. It was very challenging, and I almost didn't believe it myself, but in the end we achieved results.

Therefore, there are no strict deadlines, like three weeks or two months.

Testing ends when you achieve results, no matter how trivial that may sound. So, it's important to test correctly. I'll share my benchmarks and personal experience. If we're talking about B2C, I usually plan a maximum of one month. If we're talking about B2B, I plan up to two months because you really need to wait for results there.

Within that period, you can test in short cycles of five to seven days. My own benchmarks are as follows: I test for 3, 5, or 7 days, make changes, and then give the system about 2 days to return to optimization, because the first 24 hours after any change always bring shifts in the auction. I make changes after 3, 5, or 7 days and monitor them. There is also another important aspect to keep in mind: the decision-making cycle.

Goal of testing

When it comes to testing, everyone imagines visuals, design, audiences, etc. But if we look at testing in the context of preparing for scaling, it’s a completely different story. When we prepare for scaling, we need to test combinations in such a way as to solve two tasks.

The first task is to find as many combinations as possible that work within the KPIs, because the more combinations there are, the more directions for budget spending and, consequently, more opportunities for scaling.

The second task is to find combinations whose cost per result is as far below the KPI as possible. Why? Because when you scale, the cost per result increases, and this is normal. It happens because we start taking audience away from competitors. To do this, according to the logic of any auction, you need to pay more or create higher-quality creatives, meaning you need to improve quality metrics. Therefore, the more combinations we have tested, the better and more solid our foundation will be. If they are well below the KPI, we can increase the budget more comfortably and with lower risks of losing the audience. The lifecycle of each audience will also be longer the further below the KPI it starts.

Personal formula

I am often asked how I calculate the budget for testing. I want to share my personal formula. I don't know who else might use it, but you can take advantage of it.

The budgeting formula for the testing period (estimate):

Number of ad sets x daily budget x testing period

The logic is as follows: there are combinations, meaning the number of ad sets we have. I multiply this by the daily budget, which varies depending on the country. If we're talking about Ukraine, it might be $5-15 per day. If it's the USA or Europe, it could be $10-30 per day. I take the maximum period, which for B2C is one month, as I mentioned earlier.

For example, if we have 10 ad sets with a budget of $10 each, that totals $100 per day. Multiplying by 30 days gives us $3,000. This is a small budget for testing, and it is also the maximum, because we won't keep audiences that are not performing within KPIs for 30 days; we will exclude them immediately. In reality, therefore, less is spent. This is an approach for calculating a budget that can be presented to a client.
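The formula and the example above can be written out as a short calculation. This is a minimal sketch: the function names are hypothetical, and the "realistic" scenario (three ad sets running the full 30 days, seven stopped after a week) is an invented illustration of why actual spend ends up lower than the maximum.

```python
def max_test_budget(ad_sets, daily_budget, days):
    """Upper bound on test spend: ad sets x daily budget per ad set x days."""
    return ad_sets * daily_budget * days


def estimated_spend(daily_budget, days_active):
    """Spend when each ad set runs only for its own number of days."""
    return daily_budget * sum(days_active)


# The example from the text: 10 ad sets, $10 per day each, 30 days.
print(max_test_budget(ad_sets=10, daily_budget=10, days=30))  # 3000

# Hypothetical scenario: 3 ad sets run all 30 days, 7 are stopped after 7 days.
print(estimated_spend(daily_budget=10, days_active=[30] * 3 + [7] * 7))  # 1390
```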

Always have plan B

A question may arise: "Could it be that no working combinations will be found in the KPIs?" Yes, there have been cases where combinations yielded no results at all, and that was surprising. One of them was for B2B, and another was for a product that had long since lost its appeal. We created a unique approach that didn’t work. But what should we do if no combination works within the KPIs? It’s important to understand that marketing is divided into two components: inbound and outbound. Outbound is when we approach the audience and say, "Here’s our offer, take it." Inbound is when we generate demand and interest through content.

This can also be referred to as warming up the audience. If we warm them up, we will find our KPIs. Therefore, Plan B is to switch from direct sales to working through content, warming up the audience.

Conclusion

The conclusion is very simple: the testing period is not just for testing something or checking how an image or video will perform. That is already optimization, the second stage that we will go through. The testing period and testing, in essence, as a company vision, is more about finding those combinations that work within the KPIs in order to prepare for scaling.

Optimization and analytics

What needs to be done before scaling? You should start the optimization. Pay attention to two areas: targeting settings and analytics.

Common mistakes

What mistakes can be made during optimization? When adjusting targeting settings, you change something, look at breakdowns, exclude placements, etc. This certainly affects the results, but only by 10-30%. Analytics has a greater impact. However, before moving on to analytics, it's important to note the following: at the start of optimization, it's crucial to lock in the audience, that is, exclude what is not within the KPI and keep what is.

If you have already completed testing, excluded audiences not in the KPI, and retained those that are in the KPI, then we work with those. We don’t test further, don’t change images, and don’t compare where it’s cheaper or more expensive. We have identified winning combinations, and now we focus on reducing the cost per result based on them.

Analytics

Cost per result (CPA) essentially depends on four main metrics

If we look at all the metrics, the cost per result (CPA) depends on four main metrics: CPM, CTR, conversion rate, and cumulative frequency.
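The relationship between three of these metrics and CPA follows directly from their standard definitions, and can be sketched as below. The numbers are illustrative. Cumulative frequency does not appear in the formula itself; the assumption here is that it acts indirectly, by changing CTR and conversion rate over time.

```python
def cpa(cpm, ctr, cvr):
    """Cost per result from standard metric definitions.

    cpm: cost per 1,000 impressions
    ctr: clicks / impressions
    cvr: conversions / clicks
    """
    cost_per_click = cpm / 1000 / ctr
    return cost_per_click / cvr


# Illustrative numbers: $8 CPM, 1% CTR, 10% conversion rate.
print(round(cpa(cpm=8.0, ctr=0.01, cvr=0.10), 2))  # 8.0

# Doubling CTR at the same CPM and conversion rate halves the cost per result.
print(round(cpa(cpm=8.0, ctr=0.02, cvr=0.10), 2))  # 4.0
```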

CPM

CPM is the base cost per mille, that is, the cost per 1,000 impressions, and it depends on how you set your targeting. A broader audience is cheaper in terms of CPM. The narrower the audience and the more targeted it is, the more expensive it becomes.

If you look at the auction formula, it includes your bid multiplied by estimated metrics such as conversion rate, engagement, etc. If you have a great bid but low metrics, you will be placed lower in the auction. Therefore, these characteristics matter a lot. They include positive and negative feedback, conversion rate, and engagement. You need to ensure that all metrics are at least above average.
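To make the "bid multiplied by estimated metrics" idea concrete, here is a deliberately simplified toy model. It is not Meta's actual auction formula: the single `estimated_action_rate` factor and both ads are assumptions made purely for illustration.

```python
def auction_score(bid, estimated_action_rate):
    """Toy ranking score: bid multiplied by one estimated quality metric."""
    return bid * estimated_action_rate


# Hypothetical competitors: (bid in dollars, estimated action rate).
ads = {
    "high bid, weak ad": (12.0, 0.01),
    "lower bid, strong ad": (8.0, 0.02),
}

ranking = sorted(ads, key=lambda name: auction_score(*ads[name]), reverse=True)
for name in ranking:
    print(name, round(auction_score(*ads[name]), 2))
```

Even in this toy version, the ad with the lower bid ranks first because its estimated metric is twice as high.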

It's essential to test all possible audiences and look at the base CPM. It's very important to take a snapshot in the first few days to see the initial price. Then optimization begins, and we can compare the differences, determine the range we are working within, and reduce CPM through quality metrics.

This is already CPA optimization because if we improve all these metrics (except for cumulative frequency, which needs to be monitored), our CPM will also decrease. If we improve CPM, CTR, and other metrics, our results will also improve.

CTR

What does CTR depend on?

First of all, it depends on the insights we discussed: the messages, in other words, the context, what we put inside. Here is a simple example: compare an ad that sells a phone with one that sells the same phone at a 50% discount. You can see how that will affect the click-through rate. When we convey a message, people look at the context, and if it's interesting, they click.

The conversion rate is also influenced by the offer, message, and entry point.

I always tell my colleagues: when the cost per result has increased, show which metrics have changed. For example, if the conversion rate has worsened but the click-through rate hasn't changed, then the issue is not primarily with the audience but with the entry point. Something might have broken, or there could be another reason. In other words, all four basic metrics are analyzed, and only then does it become clear where to look and what needs to be fixed.
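This kind of diagnosis can be sketched as a comparison of two periods. The raw numbers are invented, and the function names are hypothetical; only the metric definitions (CPM, CTR, conversion rate, frequency as impressions divided by reach) are standard.

```python
def metrics(spend, impressions, reach, clicks, conversions):
    """Compute the basic metrics plus CPA from raw campaign totals."""
    return {
        "CPM": spend / impressions * 1000,
        "CTR": clicks / impressions,
        "CVR": conversions / clicks,
        "Frequency": impressions / reach,
        "CPA": spend / conversions,
    }


def relative_changes(before, after):
    """Relative change of every metric between two periods."""
    return {name: (after[name] - before[name]) / before[name] for name in before}


# Made-up totals for two periods: same spend and clicks, half the conversions.
before = metrics(spend=500.0, impressions=50_000, reach=25_000, clicks=500, conversions=50)
after = metrics(spend=500.0, impressions=50_000, reach=25_000, clicks=500, conversions=25)

changes = relative_changes(before, after)
for name, change in changes.items():
    print(f"{name:>9}: {change:+.0%}")
```

Here CPA doubles while CPM, CTR, and frequency are unchanged and only the conversion rate drops, which points at the entry point rather than the audience.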

Cumulative frequency

When it comes to cumulative frequency, everything is quite straightforward. If we do not take branding into account and focus solely on performance, we need to approach cumulative frequency differently. The logic of performance cumulative frequency is as follows: pay attention to the actions marked in blue, which can be conversions, clicks, etc. I want to explain this with a very clear example that often repeats itself. You can observe your campaign when you make changes to either broaden or narrow your audience. What happens? The cumulative frequency, meaning the accumulating frequency, starts to grow. Initially, it is one, then it goes to eight, and continues to rise.

In the screenshot, it can be seen that from February 7 to 9, there is an increase in both frequency and actions. This indicates saturation with creatives. After that, there is over-saturation: people have seen the creative enough times, they understand it, and there’s no need to show this creative 50 times. Consequently, if actions begin to decline, conversions will also start to drop, and the cost of the result will begin to rise. Therefore, it is essential to monitor and find the optimal frequency. For example, the optimal cumulative frequency in Ukraine will differ from that in the USA because of different advertising loads.

It is crucial to control cumulative frequency so that it does not exceed limits, and we do not end up overpaying for repeated impressions.
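One simple way to look for the optimal frequency is to find the point where daily actions peaked. The sketch below uses invented daily rows and hypothetical helper names; frequency is taken as impressions divided by reach.

```python
def frequency(impressions, reach):
    """Average number of times each reached person saw the ad."""
    return impressions / reach


def saturation_point(daily):
    """Return the (cumulative_frequency, actions) row with the most actions."""
    return max(daily, key=lambda row: row[1])


# Made-up rows: (cumulative frequency at end of day, actions that day).
daily = [(1.2, 14), (1.9, 21), (2.7, 26), (3.6, 22), (4.8, 15)]

peak_frequency, peak_actions = saturation_point(daily)
print(peak_frequency, peak_actions)

# Days after the peak where actions keep falling suggest over-saturation.
past_peak = [row for row in daily if row[0] > peak_frequency]
print(past_peak)
```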

Audience breakdown

How to create a breakdown? You can look at age, gender, and geography. This also affects placements.

Conclusion

So, at the testing stage, we identified the working combinations and kept them. For example, if the KPI is $10, we keep everything below $10 and exclude anything above it. Next, we focus on optimization, meaning we reduce the cost per result. For instance, if the average cost per result was $9, we brought it down to $7. In this way, you’ve created a solid foundation before scaling, because if you double the budget, going from $9 to $10 will quickly end your scaling, while going from $7 to $10 will happen more slowly.
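The difference between starting at $9 and starting at $7 can be expressed as headroom: how much the cost per result may grow before it hits the KPI. A minimal sketch, with a hypothetical `headroom` helper:

```python
def headroom(current_cpa, kpi):
    """Relative increase in cost per result that still fits within the KPI."""
    return (kpi - current_cpa) / current_cpa


print(f"{headroom(9, 10):.0%}")  # 11%
print(f"{headroom(7, 10):.0%}")  # 43%
```

With the numbers from the text, optimization from $9 to $7 roughly quadruples the room available for scaling.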

Scaling

Let's consider the scaling stage. How does it happen, and what does it consist of?

Scaling formula

If we consider scaling in general, it involves increasing the budget and expanding the audience. First and foremost, you should increase the budget, and only then expand the audience, because you have an audience that is meeting your KPIs. If you expand it, you might attract people who are not as relevant, which can lead to increased cost per result or even a complete failure.

So, first, increase the budget and make the most of your existing audience. Once you've maximized that and the cost per result starts to worsen, you should then expand the audience. Work with the current groups, accumulating conversions and results. If you have enough experience and know that effective optimization on Facebook requires 50 conversions per week, stick to this approach. In other words, first maximize your current audience, gather as many conversions as possible, let the optimization work at full capacity, and only then expand the audience.
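The 50-conversions-per-week guideline implies a minimum budget for each ad set, which is easy to estimate from the current cost per result. The helper name below is hypothetical, and the $10 cost per result is only an example.

```python
def min_daily_budget(cost_per_result, conversions_per_week=50):
    """Daily budget needed to reach the weekly conversion target at a given cost."""
    return cost_per_result * conversions_per_week / 7


# At $10 per result, 50 conversions a week cost $500, or about $71 per day.
print(round(min_daily_budget(10.0), 2))  # 71.43
```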

Pace of budget raise

How much should the budget be increased and how often? There are some interesting opinions on this matter.

First of all, there should be a certain pace for increasing the budget. This is a fact. The budget should be increased gradually, not all at once.

For example, we increase the budget by 50%. After the increase, the cost per result goes up slightly, but it’s still within the KPI. Then we increase it by another 50% – still within KPI. Next, we increase it by another 50% – now it’s no longer within KPI. This way, we determine the maximum budget limit that can be used for a specific group. For another group, it might be possible to increase the budget five times before reaching the limit. The budget increase should be step-by-step to monitor when you reach that limit.
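The step-by-step search for a group's budget limit can be sketched as a loop. Everything here is an assumption for illustration: `find_budget_ceiling` is a hypothetical helper, and `fake_observe_cpa` stands in for the real process of raising the budget, waiting a few days, and reading the cost per result from reporting.

```python
def find_budget_ceiling(start_budget, kpi, observe_cpa, step=0.5, max_steps=10):
    """Raise the budget by `step` each round until the observed CPA exceeds the KPI.

    Returns the last budget that stayed within the KPI.
    """
    budget = start_budget
    for _ in range(max_steps):
        candidate = budget * (1 + step)
        if observe_cpa(candidate) > kpi:
            return budget
        budget = candidate
    return budget


def fake_observe_cpa(budget):
    """Invented stand-in for real results: cost per result grows with budget."""
    return 7.0 + budget / 100


ceiling = find_budget_ceiling(start_budget=100.0, kpi=10.0, observe_cpa=fake_observe_cpa)
print(ceiling)  # 225.0
```

With these made-up numbers, the first two 50% increases stay within the $10 KPI and the third does not, so the limit for this group is $225 per day.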

The Case

Look at this case of budget increase. The red color shows the number of leads obtained per day, while the blue indicates the cost per result. The graph illustrates the testing process. The graph goes up, then slightly decreases, which means we tested many audiences, stayed within limits, made adjustments, tested hypotheses, increased the budget, then optimized, and that’s it. From there, the price should remain roughly equal.

In fact, the percentage depends on your chosen pace. If you realize that you are close to the KPI, then you should take small steps. If you are further from the KPI, then you can increase the budget more sharply.

How to expand the audience

If you already have a broad audience, there's nothing more to expand. Therefore, in this case, we continue to squeeze the maximum results. Our only solution is to duplicate the audiences.

Often, duplication is used incorrectly. You launch a duplicate, the results worsen, you duplicate again, launch it, and you're already working at 80-90% of the previous performance. Then you duplicate again.

From personal experience, duplication can be done in two ways. The first option is when you have a broad audience and have maximized your results; you can duplicate to achieve even more results.

The second format is when we launch an audience, fully understanding that this audience should work, but for some reason, it doesn't perform, and we duplicate it. If it still doesn’t perform or there’s some technical issue, duplicate it up to three times at most; beyond that, it’s pointless.

A narrow audience needs to be gradually developed and nurtured by slowly adding new parameters based on what has worked before. At some point, we completely remove targeting and maximize performance because we need to understand that scaling makes the audience as broad as possible. Without a broad audience, there will be no scaling.

Manual bidding

I conducted a lot of tests. For example, I achieved a cost per result of $10. I set the result at $11 and also set a manual bid. However, I didn't get any impressions. Technically, what does this indicate? It suggests that there is a certain hierarchy in auctions, where automatic bids sit at a higher level than manual bids. It's quite simple. For instance, in Ukraine, these manual bids simply set a certain upper limit for the system. Manual bids are a great tool. There are bid floors, bid caps, and so on. It's a fantastic approach, especially in branding campaigns where manual bids are a must-have. However, in performance marketing, Facebook has decided to do everything for you, so manual bids are inconvenient and won't work effectively.

In addition, manual bids are not conducive to scaling. Why? Because you limit your audience and the number of results you can achieve.

Furthermore, if performance starts to decline, most people decide to copy the audience. This may buy some time, but if results continue to worsen, the client will eventually leave you. Therefore, this solution is not suitable for scaling.

Did you accumulate so much conversion data just to restart the campaign and lose everything? If performance is declining, the first thing to focus on is communication strategies. The second is entry points, and of course, analytics.

Scaling opportunities

Is everything scalable? Definitely not.

For example, if you have a local beauty salon, what can you scale here?

Therefore, if you really want to scale your business and earn more, you need to choose great niches where scaling is possible.

Scaling cycle

How long does a trend last? Overall, it depends on the audience's reaction to the product and the geographical area, as demand levels vary in different regions. For example, if a new product is well-received in one country and everything works there, it may not be of interest at all in another country. Scaling lasts only as long as you stay within your KPIs.

That’s why it’s important to optimize, analyze all the data, minimize the cost per result as much as possible, and only then scale. The longer it lasts, the more budget you spend, and the more results you achieve and money you earn.

Summing up. New horizons

When the audience is maximally expanded, when you have worked to the fullest and achieved results, I congratulate you! You have gone through this entire journey, all the cycles I talked about today. Now you can enter new markets, and your client or employer will surely say 'Thank you'.