Allan & Steve are the chubby founders of LessEverything. This is their blog: hear them rant, praise, give advice and talk about Just Stuff, Less Accounting, Lovd by Less, More Honey, Events, Less Memories, Code, Business, Design, Marketing.

UI Test Results #3

written by Allan Branch on June 15th, 2009

In my last test results blog post, Kevin Burg said:
With equal weight to both buttons you may tend to read the whole text on both buttons as a sentence, and instead of going BACK to the beginning (Try it) you go with the last thing you read (See the tour). How about flipping them – “See the Tour or Try Less Accounting Free”, and configured like the 2nd option?

So I ran this test for thousands of users over the course of 12 days. The results were surprising.

Conversions 12.3%

Conversions 13.8%

29 Responses to “UI Test Results #3”

  1. Adam Wride June 15th, 2009

    Great little test. The difference in % is not huge, but if this is really across thousands of different users then Kevin’s idea can give people a little bump.

    Where else could this be used? How you display your products?

  2. ActsAsFlinn June 15th, 2009

    So subtle a change but effective. Thanks for sharing, I’ve enjoyed seeing these UI test results.

  3. allan branch June 15th, 2009

    @Adam, I’d be happy with any +% change. 1.5% is HUGE in my opinion.

  4. Pygy June 15th, 2009

    Did you do any statistics? A simple chi square would be in order.

  5. allan branch June 15th, 2009

    @Pygy is that a joke? Why would I waste time building a chi square? I can see the results, this isn’t rocket science.

  6. jack June 15th, 2009

    Very interesting. Although +1.5% sounds small, it is actually a huge improvement: Before, 123 of 1000 visitors converted, now it’s 138. 138/123 is roughly a 12% improvement.
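The two ways of reading the same change (percentage points versus improvement relative to the old rate) can be sketched in a few lines. This is only an illustration using the rates from the post; the function name `lifts` is made up for the example.

```python
def lifts(rate_before, rate_after):
    """Return (absolute lift in percentage points, relative lift as a fraction)."""
    absolute = rate_after - rate_before      # e.g. 13.8 - 12.3 percentage points
    relative = rate_after / rate_before - 1  # improvement relative to the old rate
    return absolute, relative

absolute, relative = lifts(12.3, 13.8)
print(f"absolute lift: {absolute:.1f} points, relative lift: {relative:.1%}")
```

The absolute lift is the 1.5 points quoted in the thread, and the relative lift is the "roughly 12%" figure.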

  7. Pygy June 15th, 2009

    Absolutely not. The difference could be the result of random fluctuations.

    A chi square takes about 20 seconds to complete.

    Go there: http://faculty.vassar.edu/lowry/newcs.html Set cols and rows to 2.

    fill in the grid like this: assign rows to the design type and cols to the conversion boolean e.g. assuming you have 1000 visits in each group:

    123 877 138 862

    And look at the uncorrected Pearson p value. That’s the probability of seeing a difference at least this large if the two designs actually convert equally well, i.e. if the gap is just random fluctuation.

  8. Pygy June 15th, 2009

    Aw, the numbers should have been displayed in a tabular form (one line break doesn’t seem to be enough).

    Test:

    123 877

    138 862
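For anyone who would rather script the check than use the web form, here is a minimal sketch of the uncorrected Pearson chi-square test on the 2×2 grid above. It assumes 1000 visits per design, which is the commenter's illustration and not a figure from the post, and the function name `chi_square_2x2` is hypothetical. It uses only the standard library: with one degree of freedom the p value can be computed from `math.erfc`.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the table [[a, b], [c, d]].

    Rows are the two designs; columns are converted / did not convert.
    Returns (chi-square statistic, p value).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Assumed sample: 1000 visits per design (the post only says "thousands").
chi2, p = chi_square_2x2(123, 877, 138, 862)
print(f"1,000 per design:  chi2 = {chi2:.3f}, p = {p:.3f}")

# Same conversion rates with an assumed 10,000 visits per design.
chi2_big, p_big = chi_square_2x2(1230, 8770, 1380, 8620)
print(f"10,000 per design: chi2 = {chi2_big:.3f}, p = {p_big:.4f}")
```

Under the 1000-per-design assumption the p value comes out around 0.32, nowhere near the usual 0.05 cutoff. With ten times the traffic and the same rates it drops below 0.01.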

  9. allan branch June 15th, 2009

    Google Website Optimizer will tell you when you have enough data and the results are confident.

  10. Nate June 15th, 2009

    Big fan of what you guys do with your apps and your writing. But I too am serious about statistics. The results here don’t appear to be significant.

    It’s not about how 1.5% feels for someone. Sure, that can be a very “significant” number for your bank account, but did the test actually provide results that weren’t random in nature? If they are random, then if you were to do this test again, it’s very possible that you won’t see this increase again, or that the other design will show the increase.

    I haven’t done a chi squared test in a while. So I looked at

    http://blog.asmartbear.com/blog/easy-statistics-for-adwords-ab-testing-and-hamsters.html

    Which seems right, I think?

    So, doing the test from this article and assuming 1000 views of each design, your “trials” look like this: 123 converted from Design 1, and 138 converted from Design 2. That’s 261 total conversion trials. D is (138-123)/2, or 7.5, which has a D squared of 56.25. Since 56.25 is not greater than 261, the results don’t appear to be significant, in a statistical sense.

    Now, if you had run your designs past 10,000 people this would have been significant.

    You’d have had 1230 conversions from Design 1, and 1380 conversions from Design 2, which makes a total of 2610 trials. D would be in this case 75, which is a D squared of 5625. And now D squared is greater than the number of trials.

    So to truly prove your hypothesis, you need to increase the number of trials run, or have a larger increase in conversions. And maybe you have the larger number of trials to use, but you just said “thousands” which might be closer to 1,000 than 10,000, but only you guys know that :)
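The rule of thumb described in this comment is easy to write down. The sketch below simply restates the arithmetic above: D is half the difference between the two conversion counts, and the result is called significant when D squared exceeds the total number of conversions. The function name `d_squared_rule` is made up for the example, and the rule as described assumes both designs received the same number of views.

```python
def d_squared_rule(conversions_a, conversions_b):
    """Apply the D-squared rule of thumb from the comment above.

    Returns (D squared, total conversion "trials", significant?).
    Assumes both designs were shown to the same number of visitors.
    """
    trials = conversions_a + conversions_b
    d = (conversions_b - conversions_a) / 2
    d_squared = d * d
    return d_squared, trials, d_squared > trials

# Assumed 1000 views per design: 123 vs. 138 conversions.
print(d_squared_rule(123, 138))

# Assumed 10,000 views per design at the same rates: 1230 vs. 1380 conversions.
print(d_squared_rule(1230, 1380))
```

The first call reproduces the 56.25 versus 261 comparison (not significant); the second reproduces 5625 versus 2610 (significant).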

  11. William Lang June 15th, 2009

    Thanks again for sharing. Left is not always better!

    Maybe for your next version of these UI tests – post the screenshots at the top of the blog posts without revealing the results (show the results lower on the page – requiring the reader to scroll). It will be fun to guess which we think had the higher conversion rate.

  12. allan branch June 15th, 2009

    @nate check out the Google Website Optimizer. It will tell you when it has received enough impressions that the results are not random. That took WAY more than 1000 impressions, closer to the 10,000 you mention.

    @william Having a UI Poll is a great idea!

  13. Shane June 15th, 2009

    Make no mistake about it: 1.5 points is huge.

  14. Tamer Salama June 15th, 2009

    I tend to think that the conversion increase was due to the location of the button rather than the sequence. In the second case it was more towards the center of the page, making it easier to capture going vertically through contents.

  15. Mike Brown June 15th, 2009

    Were the tests run at the same time or one after the other? Day of the week and even week of the month make a HUGE difference in my stats.

  17. Sourav Sharma June 15th, 2009

    I like that, and I have always believed that with testing you can really achieve great results… but in the real world only 5% of sites run a test.

  19. Hunter Al. Gonzojourn June 15th, 2009

    This is a good find. I’m going to change a couple pages on a site I’m working on now and move the decision I want the reader to make to the end to see how it affects conversions on those pages.

    Hopefully I will have similar results to yours.

    This is good info, so I’ll have to check back for more useful info later.

    I’ll let you know how it works for me.

  20. Jeff Potter June 15th, 2009

    I’d have to agree with both the statistically significant comments and Tamer Salama’s point of positioning being more centered.

    Is it possible for you to re-run this with a large sample size, and to also add in a variant that drops the “See the Tour” link? (Or a fourth option, where See the Tour is below the green button.)

  21. StoreCrowd June 15th, 2009

    Keen to know if you guys used Website Optimizer to test or did you write a Rails app to do it?

    Cheers Stuart

  22. allan branch June 15th, 2009

    @storecrowd, we used the google website optimizer.

  23. deepak June 16th, 2009

    which split testing solution did you use? Google Website Optimizer or any other program?

  24. Tim June 16th, 2009

    Maybe it is a case of people being ill informed, but every time I read a post about people using tools like Google Website Optimizer for testing, the same questions always come up: Was your sample size big enough? Did you test it on the right day of the week?

    Google Website Optimizer takes this into account when running the test. It doesn’t shut off the test or declare a winner until it is statistically significant and yes it normally takes over a week to determine this.

    For those that don’t believe it: if you came back with a clear winner statistically, like the tool says, why wouldn’t you run the winner? I always encourage follow-up tests, but you have to run the winner. You would be risking your neck if you didn’t. There are outside factors that could account for different things working that don’t, but it is your job to check the analytics to see what might have changed the outcome.

    In general the winner of the Google Website Optimizer test will be the correct one. It doesn’t take a tiny sample size to determine a winner.

    And keep sharing your findings. I love reading them.

  25. Steven Bristol June 16th, 2009

    @tim is totally correct. Google tells us when our test is finished. But since we get > 100,000 uniques per day, it doesn’t take us very long to get test results.

  26. Luke Stevens June 18th, 2009

    Fantastic result, incredible really given it’s such an apparently minor change.

    Great to see you guys testing and publishing the results too. GWO is great for this kind of thing.

    One thing I guess would be interesting is what happens to the overall conversion rate, with regards to those who convert via the future tour (I assume in this test you were still counting conversions as home page -> sign up page), and whether it remains unchanged, dips, or even goes up.

    I’m sure you guys have a million things in mind to test (I can think of a few! :) so look forward to seeing more results when you get there! :)

  27. David July 5th, 2009

    A lot of people posting here do not understand the core issue. This was a good test, but just because two things happen together, it cannot be concluded that one caused the other. This is a common thinking error. There are many other variables that could have caused this outcome. As an example, perhaps users were starting to recommend the product to other users. Perhaps there was a recommendation on a user forum somewhere that increased adoption. Or perhaps a good blog post.

    Or perhaps the author is exactly right. But to conclude that this was the cause is to potentially convince yourself of something that is false. Anyone who thinks this limited study proves the button change is responsible needs to learn a little bit about statistics and science.

  28. Steven Bristol July 5th, 2009

    @David and the rest of you detractors are wrong, wrong, wrong. I would suggest you all go and read the docs on the Google Website Optimizer to see how it works.

    It randomly shows one of the two versions of a page to each person who comes to the page you are testing (the home page in this case), and it runs until it collects enough data to be statistically significant. It’s not like we’re just changing the page and looking at the difference. This prevents errors from outside events, like a good blog post.

    Try it yourself, do your own statistics and judge for yourselves.
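To see why random assignment guards against outside events, here is a small simulation sketch. It is not how Google Website Optimizer is implemented; it only illustrates the idea in this comment. The traffic numbers, the spike on day 3 and the names `assign_variant` and `simulate` are all hypothetical.

```python
import random

def assign_variant(rng):
    """Pick design 'A' or 'B' with equal probability for one visitor."""
    return "A" if rng.random() < 0.5 else "B"

def simulate(daily_visitors, seed=0):
    """Randomly split each day's visitors between the two designs."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    split = []
    for visitors in daily_visitors:
        counts = {"A": 0, "B": 0}
        for _ in range(visitors):
            counts[assign_variant(rng)] += 1
        split.append(counts)
    return split

# Hypothetical traffic: day 3 has a spike, e.g. from a popular blog post.
traffic = [1000, 1000, 5000, 1000]
for day, counts in enumerate(simulate(traffic), start=1):
    print(f"day {day}: {counts}")
```

Because every visitor is assigned at random, the spike lands on both designs in roughly equal shares, so it cannot favor one design the way a before-and-after comparison could.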

  29. Феликс January 10th, 2010

    That’s exactly why sometimes you just don’t feel like moving forward!
