How to Conduct A/B Tests in Mobile Apps: Part I

Why do so many apps go without A/B testing? Three misconceptions come up again and again:

  • The developer assumes they already know how to improve the app, so testing feels unnecessary.
  • The developer believes they can simply compare metrics “before” and “after” a change.
  • A/B tests look too time-consuming and expensive to be worth the effort.
Statistical significance. No result means anything until it is statistically significant, and reaching significance is exactly where radical tests (variants that differ drastically) beat trivial ones (variants that differ only slightly). We prefer radical tests for four reasons:
  • They are revealing. A radical change has a strongly positive or strongly negative effect, so its impact is easy to assess. Even a negative outcome tells you which direction to move in, whereas trivial tests inspire the illusion that an optimum has been found. Say the $5 option lost to the $4 option: it is tempting to conclude that testing even higher prices is meaningless, since they will definitely lose. In our experience, it does not work that way.
  • They save money. A radical test may take more iterations, but each one is cheaper to run and reaches the desired significance with fewer conversions: the bigger the expected effect, the less data you need for a confident conclusion (see the sketch after this list).
  • They have a lower error probability. The closer the tested variations are to each other, the higher the chance that any observed difference is just random noise.
  • They leave room for a pleasant surprise. Once at AppQuantum we tested an unreasonably high offer price of $25. Our entire team and the team of our developer partner were convinced it was too expensive and that no one would buy the offer at that price; similar competitor offers cost $15 at most. And yet our variation won. Pleasant surprises happen!
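To see why radical tests need fewer conversions, here is a minimal sketch in plain Python (standard library only). The 5% baseline conversion rate and the uplifts are invented numbers; the formula is the standard normal approximation for a two-proportion z-test:

```python
from statistics import NormalDist

def sample_size_per_variant(p_control, p_variant, alpha=0.05, power=0.8):
    """Normal-approximation sample size for a two-proportion z-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance threshold
    z_power = z.inv_cdf(power)           # desired statistical power
    variance = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    effect = abs(p_variant - p_control)
    return int((z_alpha + z_power) ** 2 * variance / effect ** 2) + 1

baseline = 0.05  # assumed 5% purchase conversion in the control group
for uplift in (0.005, 0.01, 0.02, 0.05):  # from trivial to radical effects
    n = sample_size_per_variant(baseline, baseline + uplift)
    print(f"expected uplift {uplift:.1%}: ~{n:,} users per variant")
```

Under these assumptions, a trivial 0.5% absolute uplift needs roughly 31,000 users per variant, while a radical 5% uplift needs only a few hundred.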
Prices are far from the only thing worth testing. Typical targets also include:

  • Narrative and quality of localisation
  • User interface
  • User experience design
  • Tutorial and onboarding
Sometimes we bundle several changes into a single variation instead of testing them one by one. This makes sense when:

  • We know that these changes only work effectively together.
  • We are sure no change will give a negative result.
  • The elements are inexpensive to design, so it is easier to test several of them simultaneously (see the sketch after this list).
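Part of the reasoning here is sample budget: testing every combination of cheap elements separately multiplies the number of test cells. A back-of-the-envelope illustration with hypothetical change names:

```python
# When changes are believed to work only together, a bundled test needs just
# two cells (control vs everything-at-once), while a full factorial over the
# same three binary changes needs 2**3 = 8 cells, each with its own sample.
changes = ["new_tutorial", "reworked_ui", "polished_localisation"]

full_factorial_cells = 2 ** len(changes)
bundled_cells = 2
print(f"full factorial: {full_factorial_cells} cells, bundled: {bundled_cells} cells")
```

The trade-off is attribution: a bundled win cannot tell you which change drove the effect, which is why the first two conditions above have to hold.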
When a change looks like a potential superfeature, one with both a big possible bonus and a big possible risk, we evaluate it step by step:

  • Defining the feature’s biggest potential bonus and its biggest risk;
  • Asking why this superfeature could succeed and why it could fail;
  • Estimating whether the bonus is worth the possible risks at all;
  • Determining the minimum implementation needed to capture the bonus;
  • Formalising how the bonus and the risk will be measured in the test;
  • Finally, comparing the variation not only with the control group but also with alternative variations (a sketch of this comparison follows the list).
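One simple way to run that final comparison (a sketch, not AppQuantum’s actual tooling; the conversion counts are invented) is a pairwise two-proportion z-test of each variation against control, with a Bonferroni correction because several comparisons are made at once:

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test p-value under H0: equal conversion rates."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (conv_a / n_a - conv_b / n_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

groups = {                        # (conversions, users): invented numbers
    "control":      (480, 10_000),
    "superfeature": (560, 10_000),
    "alternative":  (510, 10_000),
}
alpha = 0.05 / (len(groups) - 1)  # Bonferroni correction, 2 comparisons
ctrl_conv, ctrl_n = groups["control"]
for name, (conv, n) in groups.items():
    if name == "control":
        continue
    p = two_sided_p(conv, n, ctrl_conv, ctrl_n)
    verdict = "significant" if p < alpha else "not significant"
    print(f"{name} vs control: p = {p:.4f} ({verdict} at corrected alpha)")
```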
Before launching a test, we split the audience into segments:

  1. By demography. The audience is commonly split by country or by gender and age. This factor decides which traffic sources to use for the campaign;
  2. By payers. If there is enough data, we build several payer segments;
  3. By new and old users. Where possible, though, it is worth testing only new users;
  4. By platform and traffic source (see the sketch after this list).
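To keep these segments analysable, one common pattern is to assign variants deterministically and record the segment fields with every exposure event. A minimal sketch with hypothetical helper names and event fields (an assumption on our part, not AppQuantum’s real pipeline):

```python
import hashlib

VARIANTS = ("control", "test")

def assign_variant(user_id: str, experiment: str) -> str:
    """Deterministic split: the same user always lands in the same variant."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return VARIANTS[int(digest, 16) % len(VARIANTS)]

def log_exposure(user_id, experiment, country, is_payer, is_new_user, platform):
    """Record the exposure together with the segment fields listed above."""
    event = {
        "experiment": experiment,
        "variant": assign_variant(user_id, experiment),
        "country": country,          # 1. demography
        "is_payer": is_payer,        # 2. payer segments
        "is_new_user": is_new_user,  # 3. new vs old users
        "platform": platform,        # 4. platform and traffic source
    }
    print(event)  # in production this would go to your analytics pipeline

log_exposure("user-42", "offer_price_25usd", "US", True, True, "ios")
```

Hashing the experiment name together with the user id keeps assignments stable within one experiment but independent across experiments.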
Finally, three prerequisites before running A/B tests at all:

  1. Embed analytics and tracking in the app.
  2. Understand how much one user in your app costs and whether that acquisition can be scaled (see the sketch after this list).
  3. Have the resources for constant hypothesis testing.
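Point 2 boils down to comparing what a user costs with what a user earns. A toy check with invented numbers:

```python
# Traffic is worth scaling only while the value a user brings back exceeds
# what the user costs to acquire. Both figures below are assumptions.
cpi = 1.20        # hypothetical cost per install, USD
ltv_180d = 1.55   # hypothetical 180-day lifetime value per user, USD

roas = ltv_180d / cpi  # return on ad spend at day 180
verdict = "worth scaling" if roas > 1 else "not worth scaling"
print(f"ROAS(180d) = {roas:.2f}: this traffic source is {verdict}")
```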
