In my last test results blog post, Kevin Burg said:
With equal weight to both buttons you may tend to read the whole text on both buttons as a sentence, and instead of going BACK to the beginning (Try it) you go with the last thing you read (See the tour). How about flipping them – “See the Tour or Try Less Accounting Free”, and configured like the 2nd option?
So I ran this test for thousands of users over the course of 12 days. The results were surprising.
Original order (Try it, then See the tour): conversions 12.3%
Flipped order (See the tour, then Try it): conversions 13.8%
Great little test. The difference in % is not huge, but if this is really across thousands of different users then Kevin’s idea can give people a little bump.
Where else could this be used? How you display your products?
So subtle a change but effective. Thanks for sharing, I’ve enjoyed seeing these UI test results.
@Adam, I’d be happy with any +% change. 1.5% is HUGE in my opinion.
Did you do any statistics? A simple chi square would be in order.
@Pygy is that a joke? Why would I waste time building a chi square? I can see the results, this isn’t rocket science.
Very interesting. Although +1.5% sounds small, it is actually a huge improvement: Before, 123 of 1000 visitors converted, now it’s 138. 138/123 is roughly a 12% improvement.
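The two ways of quoting the change (1.5 percentage points versus roughly 12%) can be checked with a few lines of Python. This is only a sketch of the arithmetic in the comment above; the function name `lift` is made up for illustration.

```python
def lift(control_rate, variant_rate):
    """Return (absolute lift in percentage points, relative lift in percent)."""
    absolute_points = (variant_rate - control_rate) * 100
    relative_percent = (variant_rate / control_rate - 1) * 100
    return absolute_points, relative_percent


# Conversion rates reported in the post: 12.3% before, 13.8% after.
absolute_points, relative_percent = lift(0.123, 0.138)
print(f"absolute: {absolute_points:.1f} points, relative: {relative_percent:.1f}%")
```

The absolute figure is the gap between the two rates; the relative figure is that gap measured against the original rate.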
Absolutely not. The difference could be the result of random fluctuations.
A chi square takes about 20 seconds to complete.
Go there: http://faculty.vassar.edu/lowry/newcs.html Set cols and rows to 2.
Fill in the grid like this: assign rows to the design type and columns to the conversion outcome (converted or not). For example, assuming you have 1000 visits in each group:
123 877 138 862
And look at the uncorrected Pearson p value. That’s the probability of seeing a difference at least this large from random fluctuation alone, if the two designs really converted at the same rate.
Aw, the numbers should have been displayed in tabular form (one line break doesn’t seem to be enough).
Test:
123 877
138 862
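For anyone who would rather script the chi-square check than use the web form, here is a minimal Python sketch of the uncorrected Pearson test on that 2x2 grid. It assumes 1000 visits per design, as the comment does (the real counts were not published), and the function name `chi_square_2x2` is made up for illustration. With one degree of freedom the p value can be computed from the standard library's `math.erfc`, so nothing needs installing.

```python
import math


def chi_square_2x2(conv_a, n_a, conv_b, n_b):
    """Uncorrected Pearson chi-square test for a 2x2 table.

    Rows are the two designs; columns are converted / not converted.
    Returns (chi2, p_value).
    """
    observed = [
        [conv_a, n_a - conv_a],
        [conv_b, n_b - conv_b],
    ]
    row_totals = [n_a, n_b]
    col_totals = [conv_a + conv_b, (n_a - conv_a) + (n_b - conv_b)]
    total = n_a + n_b

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if design and conversion were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Assumed sample: 1000 visits per design, 123 vs 138 conversions.
chi2, p = chi_square_2x2(123, 1000, 138, 1000)
print(f"chi2 = {chi2:.2f}, p = {p:.2f}")
```

Under the 1000-per-group assumption this gives a p value of roughly 0.32, far above the usual 0.05 cutoff. The same rates with 10,000 visits per group fall well below it, so the verdict depends entirely on the real sample size.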
Google Website Optimizer will tell you when you have enough data and the results are confident.
Big fan of what you guys do with your apps and your writing. But I too am serious about statistics. The results here don’t appear to be significant.
It’s not about how 1.5% feels to someone. Sure, that can be a very “significant” number for your bank account, but did the test actually provide results that weren’t random in nature? If they are random, then if you ran this test again, it’s quite possible that you wouldn’t see the same increase, or that the other design would come out ahead.
I haven’t done a chi squared test in a while. So I looked at
http://blog.asmartbear.com/blog/easy-statistics-for-adwords-ab-testing-and-hamsters.html
Which seems right, I think?
So, doing the test from this article: assuming each design got 1000 views, your “trials” look like this: 123 converted from Design 1, and 138 converted from Design 2. That’s 261 total conversion trials. D is (138-123)/2 = 7.5, so D squared is 56.25. Since 56.25 is not greater than 261, the results don’t appear to be significant in a statistical sense.
Now, if you had run each design past 10,000 people, this would have been significant.
You’d have had 1230 conversions from Design 1 and 1380 conversions from Design 2, which makes a total of 2610 trials. D in this case would be 75, which gives a D squared of 5625. And now D squared is greater than the number of trials.
So to truly prove your hypothesis, you need to increase the number of trials run, or have a larger increase in conversions. And maybe you have the larger number of trials to use, but you just said “thousands” which might be closer to 1,000 than 10,000, but only you guys know that :)
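The rule of thumb used above (take N as the total number of conversions and D as half the difference between the two counts, then call the result significant when D squared exceeds N) is easy to script. This is a rough sketch of that shortcut only, not a replacement for a proper test, and the function name is made up for illustration.

```python
def rule_of_thumb_significant(conversions_a, conversions_b):
    """Quick check from the linked article: significant if D^2 > N."""
    n = conversions_a + conversions_b            # total conversions ("trials")
    d = abs(conversions_a - conversions_b) / 2   # half the difference
    return d * d > n


# 1000 views per design (assumed): D^2 = 56.25 vs N = 261
print(rule_of_thumb_significant(123, 138))
# 10,000 views per design (assumed): D^2 = 5625 vs N = 2610
print(rule_of_thumb_significant(1230, 1380))
```

The first call prints `False` and the second prints `True`, matching the hand calculation.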
Thanks again for sharing. Left is not always better!
Maybe for your next version of these UI tests – post the screenshots at the top of the blog posts without revealing the results (show the results lower on the page – requiring the reader to scroll). It will be fun to guess which we think had the higher conversion rate.
@nate check out the Google Website Optimizer. It will tell you when it has received enough impressions that the results are not random, which was WAY more than 1000 impressions, closer to 10,000 like you mention.
@william Having a UI Poll is a great idea!
Make no mistake about it: 1.5 points is huge.
I tend to think that the conversion increase was due to the location of the button rather than the sequence. In the second case it was closer to the center of the page, making it easier to catch when scanning vertically through the content.
Were the tests run at the same time or one after the other? Day of the week and even week of the month make a HUGE difference in my stats.
I like that and have always believed that with testing you can really achieve great results… but in the real world only 5% of sites run tests.
This is a good find. I’m going to change a couple pages on a site I’m working on now and move the decision I want the reader to make to the end to see how it affects conversions on those pages.
Hopefully I will have results similar to yours.
This is good info, so I’ll have to check back for more useful info later.
I’ll let you know how it works for me.
I’d have to agree with both the statistically significant comments and Tamer Salama’s point of positioning being more centered.
Is it possible for you to re-run this with a large sample size, and to also add in a variant that drops the “See the Tour” link? (Or a fourth option, where See the Tour is below the green button.)
Keen to know if you guys used Website Optimizer to test or did you write a Rails app to do it?
Cheers Stuart
@storecrowd, we used the Google Website Optimizer.
Which split testing solution did you use? Google Website Optimizer or some other program?
Maybe it is a case of people being ill-informed, but every time I read a post about people using tools like Google Website Optimizer for testing, the same questions always come up: was your sample size big enough, or did you test it on the right day of the week?
Google Website Optimizer takes this into account when running the test. It doesn’t shut off the test or declare a winner until it is statistically significant and yes it normally takes over a week to determine this.
For those who don’t believe it: if you came back with a clear winner statistically, as the tool says, why wouldn’t you run the winner? I always encourage follow-up tests, but you have to run the winner. You would be risking your neck if you didn’t. Outside factors can make a variation appear to work when it doesn’t, but it is your job to check the analytics to see what might have changed the outcome.
In general the winner of the Google Website Optimizer test will be the correct one. It doesn’t declare a winner from a tiny sample size.
And keep sharing your findings. I love reading them.
@tim is totally correct. Google tells us when our test is finished. But since we get > 100,000 uniques per day, it doesn’t take us very long to get test results.
Fantastic result, incredible really given it’s such an apparently minor change.
Great to see you guys testing and publishing the results too. GWO is great for this kind of thing.
One thing that I guess would be interesting is what happens to the overall conversion rate with regard to those who convert via the feature tour (I assume in this test you were still counting conversions as home page -> sign up page), and whether it remains unchanged, dips, or even goes up.
I’m sure you guys have a million things in mind to test (I can think of a few! :) so look forward to seeing more results when you get there! :)
A lot of people posting here do not understand the core issue. This was a good test, but just because two things happen together, it cannot be concluded that one caused the other. This is a common thinking error. There are many other variables that could have caused this outcome. As an example, perhaps users were starting to recommend the product to other users. Perhaps there was a recommendation on a user forum somewhere that increased adoption. Or perhaps a good blog post.
Or perhaps the author is exactly right. But to conclude that this was the cause is to potentially convince yourself of something that is false. Anyone who thinks this limited study proves the button change is responsible needs to learn a little bit about statistics and science.
@David and the rest of you detractors are wrong, wrong, wrong. I would suggest you all go and read the docs on the Google Website Optimizer to see how it works.
It randomly shows one of the two versions of a page to each person who comes to the page you are testing (the home page in this case), and it runs until it collects enough data to be statistically significant. It’s not like we’re just changing the page and looking at the difference. This prevents errors from outside events, like a good blog post.
Try it yourself, do your own statistics and judge for yourselves.
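To see why random assignment guards against outside events, here is a small Python simulation. It is not how Google Website Optimizer is implemented, just an illustrative sketch: each visitor gets version A or B by a coin flip, and a hypothetical outside event (say, a popular blog post) starts partway through the test. The function and parameter names are made up.

```python
import random


def assign_variant(rng):
    """Pick a page version for one visitor with a fair coin flip."""
    return "A" if rng.random() < 0.5 else "B"


def split_traffic(num_visitors, event_starts_at, seed=0):
    """Count visitors per variant, before and after an outside event.

    The event begins at visitor index `event_starts_at` (hypothetical).
    """
    rng = random.Random(seed)
    counts = {"A": {"before": 0, "after": 0}, "B": {"before": 0, "after": 0}}
    for visitor in range(num_visitors):
        period = "after" if visitor >= event_starts_at else "before"
        counts[assign_variant(rng)][period] += 1
    return counts


counts = split_traffic(num_visitors=20_000, event_starts_at=10_000)
print(counts)
```

Both versions end up with close to half of the visitors who arrived after the event, so whatever boost the event gives applies to both sides of the comparison. Running the two designs one after the other offers no such protection.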
That’s exactly why sometimes you don’t feel like moving forward!