Why Split-Testing is Like Sex in High School

Reader Comments (88)

  1. I just want to point out one thing. Yes, to have accurate results you need to test one variable at a time, but don’t hesitate to test a completely different page. Sometimes you’re just working with a bad sales or conversion page and you can tweak all the colors and headlines you want and you’ll never get insane results.

    • Derek

      How would you know you’re working with a bad sales or conversion page? Is there a benchmark you should use – and if you’re way below the benchmark you should try changing the complete page – is it a ‘gut’ thing, or scientific?

      Very curious.

      Paul

      • Hey Paul, it’s hard to know, because norms are so specific to industries and offers. Ideally, you can work off of comparables, but realistically speaking, you almost never have access to representative data… the best thing you can do is test and test and test…

    • That’s a really good point, Derek – the first thing that should be tested is the general concept and layout – first arrive at the best “big picture” that works, and then tweak the details. 🙂

    • Absolutely right, Derek. Single-testing will get you to a local maximum (“local peak”); you’ll need multi-variate if you’re aiming for the global maximum (“choosing the peak”). Good to start with the multi and move to single as you gain more confidence that you’re scaling the right peak.

      Also, Danny, users will want to control for differences in traffic characteristics, most glaringly date/time and traffic sources. If they don’t control for these, they can get spurious correlations.

      Great article, Danny!

  2. As I read through that I felt like it was written specifically for me. I have made all of those errors . . . in fact I still am! I tend to come up with ideas and then change a ton of stuff and then leave it a month to see what happens. Whether the site made more or less money is the main metric I tend to go by.

    However, your article has hit me like a kick in the nuts! Don’t just talk the talk, walk the walk. How can I pass on killer strategies to my list if I still haven’t mastered this KEY marketing fundamental?

    Great stuff!

    • You bet, Jimmy, I’m glad it was helpful. The key is to test one variable at a time; that means only changing one thing at a time, but it also means controlling for other changes – for example, if one month you ran three guest posts and the other you didn’t run any, then you’ve got to account for that in your test results!

  3. Most don’t split test because of laziness. They are in this game because they want to do the least amount of work for the most profit. What they don’t realize is this industry is a full time business. Not a hobby. To make real money means you need to put in real effort and time. That’s exactly what the wealthiest online marketers understood.

    • Hey Dave, sometimes it’s laziness, and sometimes it’s a sense of overwhelm – often it’s both.

      Every industry is a full-time business – that’s just the nature of any serious money-making venture that is past the “gold rush” stage.

      The thing with split-testing is that it is just complicated enough to be intimidating, and hard to get started with. That’s why tools like Premise are so important… 😉

  4. Wow, that is a really useful post!! The change per impressions percentages were really useful. I have done split testing before, testing something over a month. When the number of sales went up (even by one) I would sit back and congratulate myself on being a killer marketer. However, invariably the next month, after not touching it, the number of sales would go down and I would be back to square one. Now I will only pat myself on the back when the change in action is significant enough to warrant it.

    Thanks so much

    Pete

    • Hey Pete, I’m glad you liked it! Yeah, the statistical significance portion is almost always overlooked – most people don’t realize that they have to measure it, and a lot of people who do realize don’t feel comfortable doing the math (they don’t realize that you don’t have to – there are tools that can do it for you). 🙂
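For anyone curious what those tools are doing under the hood, here is a minimal sketch of one common approach, a two-proportion z-test. This is an assumption about the kind of math involved, not necessarily what any particular tool uses, and the function name and the visitor and sales counts are made up for illustration.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the assumption that A and B really convert the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical numbers: sales go from 10 to 11 out of 1,000 visitors each.
z_small, p_small = two_proportion_z_test(10, 1000, 11, 1000)
# Hypothetical numbers: sales go from 10 to 40 out of 1,000 visitors each.
z_big, p_big = two_proportion_z_test(10, 1000, 40, 1000)
print(f"one extra sale: p = {p_small:.3f}")
print(f"thirty extra sales: p = {p_big:.6f}")
```

With the usual 95% confidence threshold, a result counts as significant when the p-value is below 0.05. One extra sale in a thousand visitors comes nowhere near that, which matches Pete's experience of the "win" vanishing the following month.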

  5. Another thing people forget to check is effectiveness. A may have fewer click throughs than B, but maybe A has a better conversion rate…and that’s what we’re really interested in, right? Fewer tire kickers, more buyers.

    • That’s a very important point, Lara, and it comes down to what outcome you are measuring – clicks isn’t always the best one (in fact, with PPC, it often isn’t – clicks is more a measure of how much you’re paying than how much you’re making!).

  6. Just don’t forget the golden rule of split testing, which also applies to sex: make sure to test one partner (or variable) at a time, otherwise things could get messy.

  7. The simple fact is, most lists aren’t large enough to effectively split test. Unless you have a really large list, and I’m talking six figures minimum, then you really don’t have enough to split test.

    Say you have a list of 30 thousand. First you test the headline: 15,000 to one list, 15,000 to the other. Now you aren’t sure about the lead graph, better test it. But you don’t want to test the same list. So you take 15,000 and split it in half: 7,500 to one list, 7,500 to the other. Okay, but what about that graphic? Wouldn’t a Clickbank clip look better than that monitor with the money flying out of it? Okay, 3,750/3,750. And so forth. It isn’t long before your samples become so small they are insignificant.

    I can see the value of split testing. I can also see how worthless it becomes if it’s overused.

    • You’re right, Joelin, that the numbers need to be large, but I think you’re wrong about how large the numbers have to be… what you’re describing is multi-variate testing, in which you’re testing a lot of things at the same time – for that, you need really huge numbers.

      If you’re just testing one thing at a time, though, and changing to a new experiment once you’ve reached statistical significance, then you don’t need nearly as large a sample size to do the tests (see the post above for specific numbers, or the tools that were linked to).
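To put rough numbers on that, here is a sketch using the standard sample-size approximation for comparing two proportions. The 2% and 3% conversion rates, the 95% confidence level, and the 80% power are assumptions picked for illustration; they are not figures from the post, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def visitors_per_variation(base_rate, new_rate, alpha=0.05, power=0.8):
    """Approximate visitors needed per variation to detect base_rate vs new_rate."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided confidence level
    z_beta = NormalDist().inv_cdf(power)           # chance of catching a real effect
    variance = base_rate * (1 - base_rate) + new_rate * (1 - new_rate)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (new_rate - base_rate) ** 2)

# Assumed rates: a big change (2% to 3%) versus a subtle one (2% to 2.2%).
big_change = visitors_per_variation(0.02, 0.03)
subtle_change = visitors_per_variation(0.02, 0.022)
print(big_change, subtle_change)
```

Under these assumptions, the big change needs a few thousand visitors per variation, well within a 15,000/15,000 split of a 30,000 list, while the subtle change needs many times more. The required sample depends far more on the size of the difference you are trying to detect than on the list being six figures.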

  8. Love the headline too… 🙂 Thanks for the great breakdown of split testing and the fond memories of high school shenanigans.

    I’ve also tested pages with and without a video intro, and find that video most often gets better results. Back in the day, I’d put an audio greeting on sales pages and that had the same warmth effect. I haven’t tried Premise, but one thing I’m personally clicking away from is too many template look-alike sales pages. I’ll have to check that out to see if it gives me the click-away urge or the stick-around urge.

    • Thanks, Nancy!

      Premise doesn’t create template-y sales pages, it’s just a platform for easily testing whatever you want to test, and doing it within WordPress.

      One thing about split-testing – it’s important to test how your *audience* responds to your sales page, because it isn’t always the same way that you would respond… 😉

  9. Nice post Danny! I find that people don’t test because they don’t know what to test or they’re lazy.
    -AJ

    • Yup, those are common reasons… but I think overwhelm is the most common one – the unfortunate reality is that split-testing is just complicated enough to be intimidating, which is why tools like Premise are so important – especially if the math makes you squirm…

  10. I find that when I get something that works, I get stuck in a comfort zone and do not want to change anything; it is only when something is not working that I test to improve it. I should aim to split test more often, but it is hard to actually get down to it.

    • That’s a really common challenge, Phil! I find the best way to go about it is to plan a bunch of things to test in advance, and that way each time an experiment is done, you pick the winner and continue on to the next test – all you need is the implementation, which is really quick. 🙂

  11. I am running my first serious split testing on two Facebook ads. Exactly the same market criteria; both ads point to the same website page for a women’s event.
    The second ad initially had a much higher click through rate, on a lower display rate – but has now levelled out to be the same numbers as the first one. With no sales, we have now cancelled the campaign.

    • Hey Lesley, if click-throughs started off different, and now they’ve become the same, it is possible that you’ve saturated the audience with the ads, so the first one isn’t “pulling” as well anymore – are you running your ad to a small, targeted group of people?

      One thing, though – if the click-throughs vary depending on the ads, and the conversions are remaining flat, it usually means that the problem is with the sales page or offer, rather than the ad.

  12. Great writeup, Danny, especially the clear and concise section on statistical significance, which many people overlook.

    To quote Ogilvy (and many others) – always, always be testing.

    • Thanks, Steven! Yeah, that’s the thing, most people skip over the statistical significance portion – I think it’s because they’re intimidated by the math. But without it, split-testing doesn’t really mean anything…

    • Hey Sherice, thanks for the link – I’ll go check it out. 🙂

      It’s not just that conversion rates can improve over time – it’s that sometimes you get lucky, and you need to filter out the luck to get at the real science! 😀

  13. Wow, thanks so much for this article. I will be honest that I cringe every time I see articles in praise of split testing – what IS that and how do you DO it?! Anyway, looks like I’ve kinda been doing it and didn’t even know it. My blog isn’t quite a year old yet, but I’ve continually tweaked the opt-in look and location, tag lines, colors, etc., and get such a kick out of finding things that work. I’m a very visual person anyway so I actually enjoy coming up with new things to test. At least I understand what split testing is now. Thanks again!

    • You’re very welcome, Marquita, I’m glad I could help! It isn’t really all that complicated – just one of those things that people make out to be a lot trickier than it really is. 🙂

  14. I have to say, I have never ever done split testing on the internet. I have heard the stories. Not to say that I don’t experiment with design changes… all the time… but market testing the changes would probably be very useful, and dull, tedious, and all those things us creative types don’t really want. It would also be most useful to test over a week on a website that gets a lot of traffic rather than one that gets almost no traffic.

    I really enjoyed the article btw. It really highlights the kind of marketing (research) you can do on the internet.

    • Hey Brian, thanks for the kind words, I appreciate it! You’re right – if traffic is minimal, then you’re wasting your time split testing, because you aren’t going to get results that are meaningful anyway – your efforts should go to traffic first. But until then you shouldn’t really focus on design changes either, right? 😉

  15. Great Job, Danny!

    I definitely engage in testing, but not anything too high-level. Mostly because too much “data” makes my brain freeze. That said, I can vouch for the fact that testing can make all the difference in knowing what your target market really wants. If you don’t have anything to assess, you’re constantly throwing darts blindly. And in all likelihood you can be hurting because of it.

  16. Awesome Post!

    Personally not a big fan of multivariate testing, probably because I suck at it.

    A/B testing definitely works better for me.

    You’ve sold me on Premise though.

    Thanks again.
    Niv

    • Hey Nivin, honestly, unless your traffic is in the stratospheric numbers, multi-variate testing isn’t practical. Maybe Brian can weigh in, but does even Copyblogger have the traffic to effectively do multi-variate testing?

      I think you’ve really got to be on the scale of an AOL or Yahoo or Google to be doing that stuff – the rest of us can stick to A/B testing. And Premise is a great tool for that! 🙂

        • What I’d like to know is if you are doing it sequentially or in parallel?

          I mean, do you run variation A for a few hours/days and then variation B for the same time, or do you run them in parallel, serving A to some random visitors and serving B to some other random visitors?
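For what it's worth, the parallel option can be sketched in a few lines. This is one possible way to do it, not a description of how Premise or any other tool works: hash a visitor identifier so each visitor is assigned effectively at random but always sees the same variation. The function name, the experiment label, and the visitor IDs are all hypothetical.

```python
import hashlib

def assign_variation(visitor_id, experiment="headline-test", variations=("A", "B")):
    """Deterministically map a visitor to one variation, roughly evenly."""
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Use the hash as a large number and take it modulo the number of variations.
    return variations[int(digest, 16) % len(variations)]

# Hypothetical visitor IDs, e.g. values stored in a cookie.
for visitor in ("visitor-1", "visitor-2", "visitor-3"):
    print(visitor, assign_variation(visitor))
```

Because both variations run over the same hours, days, and traffic sources, anything that changes the traffic affects A and B alike, which is the main advantage over running them one after the other.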

  17. I do split testing online all the time. First we test price, then we test headlines. It’s easy to do in 1shoppingcart and if you can do it and you don’t, you’re losing money!

    • Interesting – I always test headlines first, and then price – I figure that the headline is the gateway to the site, so I’ll get statistically meaningful results about price faster that way. 🙂

  18. The point is well made. Don’t believe the numbers unless you know what they are telling you. Monitoring is the greatest gift the Internet has to offer a business. But it is a complex area, and whilst numbers may not lie, it is crucial to make sure that we understand what they are telling us.

  19. I love what you say about how everyone is talking about it but no one is really doing it. I feel this way about a lot of things. A great idea comes along and everyone whines that it will be ruined because everyone will jump up and down and copy said idea. Thing is, unless you’re a daily deal site, this rarely happens.

    Why? Because people are lazy.

    Figuring out how to do a good split test takes time to learn and, yes, it’s frustrating. But take a day and devote it to testing colors, fonts, the first sentence of your intro, whatever. If you actually make a commitment and do it you’ll be way ahead of the game.

    • You’re so right, Marian! Everybody does jump on ideas, but only after they’ve proven themselves time and again, and only when they’re simple (at least in concept).

      I don’t think the issue is laziness, though – I think the issue is intimidation and overwhelm.

      You’re right, though, the best way to get over that is just to try… 🙂

  20. G’Day Danny,
    I thought that I knew what split testing was all about. I’d even tried it a few times with mixed results.

    Thanks to your post, I now have an idea of just how much I didn’t know
    Thanks

    Regards
    Leon

  21. Hi Danny,

    A super in depth article here.

    If you have an analytical mind split testing is certainly for you. Taking the time to run detailed tests helps you find out what’s working and what’s not.

    Thanks for sharing your insight.

    Ryan

  22. Split testing with or without Google?

    I started with Google using the A/B approach, but it fell by the wayside because the process is complicated and the resulting data was insufficient.

    Now I do my split testing the hard way:

    For the first page, the traffic comes from SEs (it is indexed) and my web sites; I use Pay Pal for the pay processor.

    The second page is for affiliates only and is not indexed; all the traffic comes from Click Bank affiliates.

    I track each page daily by the number of unique readers and sales.

    I use the page for affiliates to do my changes. Like most other split testers, I can’t do a one-word change and let it go at that. Patience is a virtue, and it is wearing very thin.

    I do a change on the affiliate page after 100 unique readers (or hits as Click Bank ranks them) then wait to see what the result is.

    I rate the page by the month, that is the conversions for the monthly total of sales divided by the total number of unique readers.

    Currently the two pages are running at 0.26% for the Pay Pal and 0.07% for the Click Bank. However the Click Bank page has a higher hit ratio than the Pay Pal page.

    You have some excellent tips in your article that I will incorporate into my affiliate page.
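The arithmetic in this comment is easy to sketch. The 0.26% and 0.07% rates and the 100-reader window come from the comment itself; the raw sales and reader counts below are made up, and the helper names are hypothetical.

```python
def conversion_rate(sales, unique_readers):
    """Monthly conversion rate: total sales divided by total unique readers."""
    if unique_readers == 0:
        return 0.0
    return sales / unique_readers

def chance_of_no_sales(rate, readers):
    """Probability that `readers` visitors in a row produce zero sales at this rate."""
    return (1 - rate) ** readers

# Hypothetical counts that would produce a 0.26% rate.
print(f"{conversion_rate(26, 10000):.2%}")

# Rates quoted in the comment, over a 100-reader window.
for rate in (0.0026, 0.0007):
    print(f"rate {rate:.2%}: chance of zero sales in 100 readers = "
          f"{chance_of_no_sales(rate, 100):.0%}")
```

At conversion rates this low, a window of 100 unique readers will usually contain no sales at all, so the result of a change is very hard to read from one window alone.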

    • Hi Monte, I’m glad you found it helpful. It sounds like your process is pretty thorough – myself, I’ve mostly used Google, and been fine with it – are you sure you’re using it right?

      • Pretty sure, I have five products, three I was using Google for split testing.

        After a month of not getting the data I was looking for (really basic stuff) I decided to drop in and track it manually.

        Now it may have improved because that was over four years ago. 🙂

  23. Love this! I’m always trying to encourage people to split test – or at the very least sequential test where you run one page and then another and then compare results. Maybe for some of us that’s an easier entry?

    Cathy

    • Any testing is better than no testing, but I think simultaneous is a lot better, because it controls for variables like different posts, different promotions, and so forth – testing is only useful when the data is valid, after all. 🙂

    • I think you’ll find a better explanation in wikipedia but let me try in my simple words:

      The idea is that you create two alternatives of something (e.g. the form at the top right of this page): one called A that looks as it is now, and another called B that uses the word “subscribe” instead of “join us”
      (hence split testing is also called A/B testing).

      Then you let both versions run for some time. (see the original blog entry for details of what this “some time” might need to be) and then you figure out which version generated more new sign-ups and if the difference is significant enough. Based on this you either keep version A or version B.

  24. I agree with everything but one thing: it’s easy to say that you should let a test run until you have 95% confidence, but some tests just never get there. I’ve seen sites (and owned/own them) where 50k impressions still did not amount to a high enough confidence level.

    And that’s a big problem. A huge problem. A problem that’s not often addressed: tests with minor differences. Don’t test headline A versus B; in my experience even that is too small of a difference on many occasions. At first, test radical differences: long copy versus short copy, video + opt in form versus text + opt in form, then test smaller differences.

    I’ve set a 3 month max to a test at a certain point in time because too many tests never reached 95% confidence. If there are still no significant differences on a site with plenty of traffic, then there is NO winner. The different version needs to become even more different. Can’t stress this enough.

    • That’s a really good point, Dennis – thank you for pointing it out. If the difference is too small, then even a large number of impressions won’t easily get you to statistical significance – which is why it is important to test things that are likely to make a big difference, just as you said. 🙂

  25. I really enjoyed this post Danny. As a rule of thumb (I realize this would change depending on the industry), would you recommend changing the format every month? Do you often find yourself going back to the original format after you have changed it?

    • Hi Marc, I’m glad you liked the post. It’s not a question of how many visitors, it’s a question of statistical significance; the more extreme the difference in results, the fewer visitors you need. Check out our split test checker (linked to in the post), to see how the numbers work out. 🙂

  26. Great post, Danny. You brought up very awesome points and I agree with most of them. I think people don’t do split testing enough because of the time it takes to set up and execute some of these systems. Most people just want to do small things, like split test a single image.

    This is why I created ClickAppeal. I wanted something that would allow the average blogger or Internet marketer the ability to split test images on their site without having to split test an entire page. Something you could set up within 2 min. http://ClickAppealapp.com

This article's comments are closed.