Posts tagged split testing
That’s right, a computer game about football management can help you develop skills that are useful for your career... and I don’t mean getting a job as a football manager. For those who don’t know what it is, Football Manager (FM) is a simulation game, developed by Sports Interactive (SI), that puts you in the hot seat of managing a football club.
So how can you weave ‘Playing FM’ into an interview? What is the skill that FM can help you to develop? It’s analytics, which is why it fits in this blog. If we look at what skills are currently in demand in the workplace, it’s big data. Companies everywhere are looking for people who can analyze the data that they generate.
So how does Football Manager help you develop valuable career skills?
I started playing FM under its previous incarnation (Championship Manager). One of the common complaints was that after playing the game over many seasons, some of the newly generated players looked odd and the game appeared unbalanced (e.g. defenders were no longer brave). So I built a tool in my spare time that analyzed player attributes to check for big differences in how players evolved over time. This taught me to code a little and how to analyze data.

It’s a simple tool that spat out a text file showing the differences between the data at 2 different points in time (e.g. do all defenders lose the ability to be aggressive in 20 years’ time?). SI used it at the time; however, I eventually became too busy to keep it going, and I’m pretty sure SI developed their own tools that do this much better than the buggy code I created.
The later versions of FM make it possible for anyone to develop analytical skills. I’ve purposely selected some screenshots below that show you what Football Manager is all about.
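The original tool is long gone, so what follows is only a rough Python sketch of the idea, not the code SI used. It assumes each snapshot is a list of player records with a position and a numeric attribute; the field names and the sample numbers are made up for illustration.

```python
from statistics import mean

def average_by_position(players, attribute):
    """Average one attribute per position for a list of player records."""
    grouped = {}
    for player in players:
        grouped.setdefault(player["position"], []).append(player[attribute])
    return {position: mean(values) for position, values in grouped.items()}

def attribute_drift(before, after, attribute):
    """Change in the per-position average between two snapshots."""
    old = average_by_position(before, attribute)
    new = average_by_position(after, attribute)
    return {pos: round(new[pos] - old[pos], 2) for pos in old if pos in new}

# Hypothetical snapshots: the same database at the start and 20 seasons later.
season_1 = [
    {"position": "DEF", "bravery": 15},
    {"position": "DEF", "bravery": 13},
    {"position": "FWD", "bravery": 10},
]
season_20 = [
    {"position": "DEF", "bravery": 9},
    {"position": "DEF", "bravery": 11},
    {"position": "FWD", "bravery": 10},
]

drift = attribute_drift(season_1, season_20, "bravery")
for position, change in sorted(drift.items()):
    print(f"{position}: {change:+.2f}")
```

A large negative number for one position is the kind of imbalance the complaints were about.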

This screenshot shows you how much a player has improved.

This one shows you their training regime.
One aspect of improving any product is testing. This means changing variables and checking the results. For example, in a web page you may want to improve the % of people who sign up. You usually do this by split testing, or A/B testing. Then you analyze the data set afterwards.
Look at those 2 screens… it’s the same principle. You tweak the training program, you assign different players to each program and then you compare the results to see which is more effective. There are a lot of people who are doing this already, unaware of the skills they are developing and how transferable they are to the real world. Take a look at the Tactics and Training Forum and you’ll find a lot of deep analytical talk where people discuss how to tweak training programs to improve players the most. It’s a hotbed of statistical analysis, A/B testing, split testing, metrics, measurement… no different to a professional analytics group on LinkedIn.
If you’re looking to develop your skills using FM, I recommend you use FM Genie Scout… some people may call it cheating… but it’s the ability to use it for data analysis that makes it so useful. Look at this screenshot -

It could be Google Analytics. The history function lets you record multiple data points. Here’s how you would perform a split test playing the game and using this tool -
It makes it easier to discover which changes are more effective for which player attribute, not much different from optimizing a website or product… the fundamental skills are the same. For those who are more advanced, you can spit the data out into a spreadsheet. Once you’ve done this over several data points, you can plot any graph or create pivot tables to analyze player progression.
And that is how Football Manager can help you to develop real skills that are needed today.

In my previous post, I talked about getting retention and engagement metrics out of split testing.
Here’s a practical example of how to do A/B testing using Flurry.
Create an App_Launch event that happens whenever your app is started or brought back from the background.
When you log the event, pass it with a parameter A(name of split test) or B(name of split test). You can decide in advance whether the app should use the ‘A’ version or the ‘B’ version using some device variable such as the MAC address or UDID.
For the purpose of this post, I will use a conventional ‘marketing’ conversion split test, with the position of the in-app purchase button as the illustration. However, it could be for anything: the number of coins a new user receives on starting a game, the order of the tabs at the bottom, the layout of a particular screen, etc. In this example, people in group A have the in-app purchase at the top of the screen. People in group B have it pop up. Marketing wants to know which positioning maximises conversions… but we also want to see the impact on engagement and retention, which I will talk about in a later post.
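As a rough illustration of the group assignment, here is a Python sketch that hashes a device identifier so the same device always lands in the same group. The function names, the `split_test` parameter key and the test name are my own placeholders, and the actual call that logs the event to Flurry is left out because it depends on the SDK you use.

```python
import hashlib

def assign_group(device_id: str, test_name: str) -> str:
    """Deterministically place a device in group 'A' or 'B' for one test."""
    # Hashing the test name together with the device ID means the same
    # device can fall into different groups for different tests.
    digest = hashlib.sha256(f"{test_name}:{device_id}".encode("utf-8")).digest()
    return "A" if digest[0] % 2 == 0 else "B"

def launch_event_params(device_id: str, test_name: str) -> dict:
    """Parameters to attach to the App_Launch event (key name is illustrative)."""
    group = assign_group(device_id, test_name)
    return {"split_test": f"{group}({test_name})"}

# Hypothetical device ID and test name.
print(launch_event_params("device-0001", "iap_button"))
```

Hashing rather than picking at random on each launch keeps a user in one group for the life of the test, which matters once you start looking at retention.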
1) Create 2 segments inside Flurry

You have now created 2 segments that let you dissect the user behaviour of both groups.
2) Check the conversion


This shows 608 people converted. That’s a conversion rate of just below 10%.
Do the same for B, and now you can compare the conversion rates. In my example, B has 478 conversions out of 6320. Use a tool such as this online calculator and we find that the difference is statistically significant.
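If you would rather check the numbers yourself than trust a calculator, a two-proportion z-test is one common way to do it. I don’t know which test the online calculator runs, so treat this as a sketch of one reasonable choice. The total of 6200 for group A is an assumption chosen so that 608 conversions comes out just below 10%; the B figures are the ones from the example.

```python
import math

def two_proportion_z_test(conv_a, total_a, conv_b, total_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conv_a / total_a
    rate_b = conv_b / total_b
    # Pooled rate under the assumption that A and B convert equally well.
    pooled = (conv_a + conv_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_a - rate_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

# Group A total (6200) is assumed; group B figures come from the post.
z, p_value = two_proportion_z_test(608, 6200, 478, 6320)
print(f"z = {z:.2f}, p = {p_value:.6f}")
print("significant at 5%" if p_value < 0.05 else "not significant at 5%")
```

A positive z here means A converts better than B; a negative z means the opposite.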
3) The Bonus Engagement and Metrics
If you go back to the event summary, you can download all the A and B data in CSV format. Go ahead and do that.
Then create a spreadsheet with 3 sheets: A, B and statistical significance. You can then test your whole app across all its events by adding a statistical test to each event. I used this basic formula -
=(((0.5*('Split Test A'!B2+'Split Test B'!B2))-'Split Test A'!B2)^2)/(0.5*('Split Test A'!B2+'Split Test B'!B2))+(((0.5*('Split Test A'!B2+'Split Test B'!B2))-'Split Test B'!B2)^2)/(0.5*('Split Test A'!B2+'Split Test B'!B2))
But there are plenty of other formulas that may be more suitable for you.
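For anyone who finds the spreadsheet formula hard to read, here it is again as a small Python function. It is a chi-square statistic that compares the two counts against their average, so it quietly assumes groups A and B are about the same size. The 3.841 threshold is the standard 5% critical value for 1 degree of freedom; the function and constant names are just mine.

```python
CRITICAL_VALUE_95 = 3.841  # chi-square, 1 degree of freedom, 5% level

def chi_square_two_counts(count_a: float, count_b: float) -> float:
    """Same calculation as the spreadsheet formula for one event row."""
    expected = 0.5 * (count_a + count_b)  # what each group would get if equal
    if expected == 0:
        return 0.0  # no events in either group, nothing to compare
    return ((expected - count_a) ** 2) / expected + ((expected - count_b) ** 2) / expected

# The conversion counts from the example: 608 for A, 478 for B.
statistic = chi_square_two_counts(608, 478)
print(f"chi-square = {statistic:.2f}")
print("significant" if statistic > CRITICAL_VALUE_95 else "not significant")
```

If your groups are very different in size, compare rates rather than raw counts.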
Next, colour code the spreadsheet so that any significant differences are highlighted and then you can see the impact of the A/B test beyond the scope of just the conversion.

This can bring you many different insights. For example, conversion may be higher for in-app purchases, but the number of people recommending or sharing the app using a tweet or Facebook button may decrease for that group.
Remember to check in which direction the result is statistically significant.

For most people, A/B testing analytics (or split testing) is all about conversion. How can I redesign this webpage to get more people to click through to my goal? How can I get more people to sign up? It’s all about acquisition, acquisition, acquisition… actually there’s more to it than that.
For mobile app analytics, A/B testing tools generally support campaign or content optimization focused on conversion, clicks or triggers. This is fine if what you are doing is marketing focused, acquisition focused or activation focused... but it doesn’t really help much with engagement or retention. What if you wanted to find out the impact of :-
etc. What change will increase retention, not just conversions? What change will increase use of other features?
What conventional A/B testing doesn’t tell you -
In fact there’s a very easy way to do this using event attributes or event parameters (the name depends on the app analytics tool you use). In your app framework, tag people who are in the ‘A’ group with an event attribute/parameter such as ‘Test Group A’, and likewise for ‘B’, in their App Launch event, or use whatever naming convention makes sense for you. You can give more unique names for different tests.
By doing this, you have segmented your users into A & B and now you can test the impact across every event/metric in your app.
Segment and extract all the data into a spreadsheet and you can then statistically test them like this -

In the above chart anything in green tested as statistically significant. This gives a deeper insight into what other changes occurred due to the A/B test. It means you can design A/B tests that go beyond ‘click’ or ‘tap’ conversions at the top level, and remove features that don’t add to engagement, retention or revenue.
Secondly, using tools which provide retention or lifecycle metrics, you can create a segment using the event parameter and see which version really does provide more retention, which allows you to make more discoveries.
These are some of the potential discoveries you can make just by looking beyond the basic A/B test.
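If you prefer a script to a spreadsheet, the per-event comparison behind a chart like that could be sketched in Python as below. The two-column CSV layout (`event,count`), the event names and every number are assumptions for illustration; a real export will look different and need its own parsing. The test compares raw counts against their average, so it assumes the two groups are roughly equal in size.

```python
import csv
import io

CRITICAL_VALUE_95 = 3.841  # chi-square, 1 degree of freedom, 5% level

def read_counts(csv_text):
    """Parse a hypothetical 'event,count' export into {event: count}."""
    reader = csv.DictReader(io.StringIO(csv_text.strip()))
    return {row["event"]: int(row["count"]) for row in reader}

def compare_events(counts_a, counts_b):
    """Test every event present in both groups and note which side is higher."""
    results = []
    for event in sorted(set(counts_a) & set(counts_b)):
        a, b = counts_a[event], counts_b[event]
        expected = 0.5 * (a + b)
        stat = 0.0 if expected == 0 else ((a - expected) ** 2 + (b - expected) ** 2) / expected
        results.append({
            "event": event,
            "chi_square": round(stat, 2),
            "significant": stat > CRITICAL_VALUE_95,  # the 'green' cells
            "higher": "A" if a > b else "B" if b > a else "-",
        })
    return results

# Made-up exports for the two segments.
export_a = """
event,count
App_Launch,6200
Purchase,608
Share_Tweet,300
"""
export_b = """
event,count
App_Launch,6320
Purchase,478
Share_Tweet,310
"""

results = compare_events(read_counts(export_a), read_counts(export_b))
for row in results:
    flag = "GREEN" if row["significant"] else "     "
    print(f"{flag} {row['event']:<12} chi2={row['chi_square']:<6} higher={row['higher']}")
```

The `higher` column is there because a significant result only tells you the groups differ, not which one came out on top.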
Do you have any mobile analytics tricks to share?