At Jana, we have four core values: Get Stuff Done, Own It, Measure It, and Have Fun. On the Data Science team, we take a special interest in helping the rest of the organization “measure it.” Whether it’s designing experiments, selecting measurable OKRs, or building dashboards, our team finds ways to nudge the company towards data-backed decision making. Recently, I spearheaded an initiative to host Data Science office hours, where our Data Science team volunteers time to address ad hoc requests. The first project I got my hands on came from the Consumer Marketing team.
mCent at IIT TechFest
Jana’s Consumer Marketing team works on awesome projects related to promoting mCent. Some of their initiatives include partnerships with social media influencers in Mexico, an online commercial about female empowerment in India, and managing mCent’s Facebook page with 1.4 million likes. Their most recent offline marketing initiative was at IIT TechFest in Mumbai, Asia’s largest science and technology festival, where mCent was a major sponsor. The goal of the event was to reach 100,000+ college students in India and introduce them to mCent. Prior to the event, the team’s plan was to write down the name, phone number, and email address of each student who downloaded mCent at the booth. For doing so, users would receive bonus mobile data. After just a few hours, it became clear that this plan was not going to work—word got out about mCent, and the team’s booth became one of the most popular at the festival. With long lines developing, writing down every name became impractical, and they lost track of how many users were signing up.
Offline marketing initiatives like this one are notoriously tricky to measure. Still, the Consumer Marketing team needed to estimate how many users they signed up in order to calculate ROI. To crack this mystery, I began investigating our location data. We use Android’s location services API so that we can customize the mCent app experience based on which city a user is located in. When app developers run user acquisition campaigns in cities their app is not available in, app reviews take a hit. So, collecting this information allows us to offer smarter, targeted user acquisition for our customers.
Measurement Challenge: User Locations
With no way to know for certain why each user downloaded mCent, I decided to estimate how many users I would typically expect to see downloading mCent in Mumbai during the time of the festival, and compare it to what actually happened. The first challenge came from the coverage of our location data. In emerging markets, where phones often have less battery power, many users turn off location sharing for all apps to save phone battery. Location sharing is on by default in Android, but for more recent versions of the OS, this switch is in Quick Settings, so users on those versions turn off location sharing more frequently. Additionally, for most of our users’ devices that run Android versions before Marshmallow, location tracking is global, so users have to choose to turn it on or off for all apps without distinguishing privileges between apps.
With an estimate of our location data coverage, I could take the difference between actual and expected user growth in Mumbai and divide it by the coverage rate. However, I didn’t want to assume that users at TechFest would be just as likely as other mCent users to turn off location sharing, because I believed this particular population might be at a different socioeconomic level than the average mCent user and would therefore own a different mix of phones.
Luckily, I had access to a list of 300 users who had signed up for mCent in the first hours of the festival, before the craziness set in. Comparing these new users to the rest of our current users, I found that the assumption held: TechFest attendees were just as likely as other users to turn location services off. One surprise did emerge: the city-level location accuracy was somewhat limited. Only 85% of the 300 users were located in the correct city, but all were in the right state. Knowing this, I assumed the same accuracy rate held for the rest of the festival, which let us divide the measured effect by this rate to account for users placed in the wrong city.
Measurement Challenge: Outside Marketing Initiatives
To estimate expected growth, I plotted typical organic mCent user growth in Mumbai during the month before and after the festival. After that, I compared the trendline to actual user growth in Mumbai during the festival. The difference between the actual growth during the festival and the expected growth under normal circumstances, divided by the rate at which I expected new users to be correctly labelled as in Mumbai, became the basis for my estimate of the festival’s effect.
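As a rough illustration of that arithmetic, here is a minimal Python sketch. The function name and the user counts are hypothetical and chosen only to make the numbers easy to follow; the 0.85 is the city-level accuracy observed in the 300-user sample.

```python
def estimate_festival_signups(actual_new_users, expected_new_users, city_accuracy):
    """Scale the observed lift in Mumbai by the share of users located in the right city."""
    if not 0 < city_accuracy <= 1:
        raise ValueError("city_accuracy must be in (0, 1]")
    observed_lift = actual_new_users - expected_new_users
    # Dividing (not multiplying) scales the lift up to cover users
    # whose location was assigned to a different city.
    return observed_lift / city_accuracy


# Hypothetical counts, for illustration only.
estimate = estimate_festival_signups(
    actual_new_users=5000,
    expected_new_users=1600,
    city_accuracy=0.85,
)
print(round(estimate))  # 4000
```

Because only 85% of festival sign-ups show up as Mumbai users, the observed lift understates the true effect, which is why the correction divides by the rate.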
However, there was a separate initiative that could bias my measurements—our recent mCent commercial. Promotions for the commercial ended a day before the festival, so I anticipated a subsequent decline in the growth rate across India. To isolate typical organic growth from growth influenced by the commercial, I decided to plot a trendline for new users based on the days after both the commercial and the festival had ended. To develop the trendline, I used a simple linear regression model to compute how many users I would have expected mCent to gain in Mumbai during the days of the festival, had the Consumer Marketing team decided to skip the event.
This trendline allowed me to calculate the difference between actual and expected new users. Since there is day-to-day variation, I calculated a confidence interval around this estimate. In the end, the Consumer Marketing team was able to measure the ROI of this effort and use it to plan other marketing campaigns for next year.
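The trendline and interval can be sketched in a few lines of Python. This is not the exact analysis code: the function name and daily counts are made up, the interval uses a normal approximation (z = 1.96) rather than a t-distribution, and it assumes day-to-day noise is independent with constant variance.

```python
import math


def festival_lift(baseline_days, baseline_users, festival_days, festival_users, z=1.96):
    """Return (lift, lower, upper): total festival sign-ups above the fitted trend."""
    n = len(baseline_days)
    k = len(festival_days)
    t_bar = sum(baseline_days) / n
    y_bar = sum(baseline_users) / n

    # Ordinary least squares fit on the baseline (post-festival) days.
    sxx = sum((t - t_bar) ** 2 for t in baseline_days)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(baseline_days, baseline_users))
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar

    # Residual variance around the trendline.
    rss = sum((y - (intercept + slope * t)) ** 2
              for t, y in zip(baseline_days, baseline_users))
    s2 = rss / (n - 2)

    # Expected sign-ups on the festival days if the event had been skipped.
    expected_total = sum(intercept + slope * t for t in festival_days)
    lift = sum(festival_users) - expected_total

    # Variance of (actual total - predicted total): daily noise on the k
    # festival days plus uncertainty in the fitted intercept and slope.
    d = sum(t - t_bar for t in festival_days)
    variance = s2 * (k + k ** 2 / n + d ** 2 / sxx)
    half_width = z * math.sqrt(variance)
    return lift, lift - half_width, lift + half_width


# Hypothetical daily new-user counts, for illustration only.
baseline_days = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
baseline_users = [410, 395, 420, 405, 398, 412, 401, 390, 408, 399]
festival_days = [1, 2, 3]
festival_users = [1500, 1900, 1700]

lift, lower, upper = festival_lift(baseline_days, baseline_users,
                                   festival_days, festival_users)
print(f"Estimated lift: {lift:.0f} (approx. 95% interval {lower:.0f} to {upper:.0f})")
```

Dividing the lift and both interval bounds by the location accuracy rate would then give a range for the total number of festival sign-ups.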
This project is just one example of how data science can push the business toward greater efficiency. If you’re interested in joining our collaborative, data-driven team, you’re in luck: we’re hiring!