@andrewchen


How to measure if users love your product using cohorts and revisit rates

Do users really love your product?

If they did, how would you be able to tell?

I would argue that the single most telling metric for a great product is how many of its users become dedicated, repeat users. This angle of thinking naturally leads to a number of metrics around user retention, which we’ll examine in this blog post.

User retention is especially important for social web products. Failure to consider the backend retention of a userbase can lead to catastrophic results. In particular, without the proper mechanics in place, it’s easy to hit the “shark fin” user curve, as well as the death spiral caused by reverse Metcalfe’s Law. In both cases, once the core audience of a site starts to erode, the erosion can feed on itself in a self-reinforcing loop until the entire audience falls away.

This raises a series of questions:

  • What are the right metrics to track for user retention?
  • (And as a corollary, what are the wrong ones?)
  • What is a “good” retention number? What are bad retention numbers?
  • How do you optimize and improve retention rates?

Let’s tackle these below.

Retention versus Engagement
First off, there’s an important distinction between engagement and retention, which some folks track in one bucket. I generally define retention as simply the act of getting users BACK to revisit, regardless of their actual activity on the site. Contrast this with engagement, which measures how much time they spend with the product, how many features they interact with, etc.

An implication of this is that the right metric to follow is visits rather than something like pageviews or time-on-site.

Here are a couple examples of the separation of engagement versus retention:

  • Google is a high retention, low engagement site
  • MySpace is a high retention, high engagement site
  • News sites are often medium/high retention, low engagement sites (like checking a headline)
  • etc.

Note the important point that engagement doesn’t necessarily correlate with monetization. Because many retail sites and reference properties are transactional in nature, oftentimes this implies that the closer you are to the money, the lower the engagement is.

Keep this in mind for people who espouse “addictiveness” and “engagement” as virtues for social media sites.

Retention versus Acquisition
Secondly, there’s the important issue of how to disambiguate newly acquired users from retained users. The problem with a traffic graph that’s going up-and-to-the-right is that it’s not clear what’s really happening: is the site bringing in lots of new users? Or is there a bunch of dedicated users who are extending their engagement? You need to figure out which of 4 scenarios is actually happening, which I’ve blogged about previously:

  1. Pageviews are coming ONLY from new users
  2. Pageviews are coming ONLY from one generation of users (like early adopters)
  3. Pageviews are coming ONLY from retained users
  4. Pageviews are coming from new users and retained users

The proper way to disambiguate retention from acquisition is to precisely track the following stats:

  • How many new users are joining the site?
  • Of these new users, what are the different funnels they are joining from? (be it SEO, direct navigation, etc.)

Then you separate out these users completely from the aggregate numbers, and the remaining folks you have left are ones who are coming back to the site. You can then further segment this group by cohort, which we’ll discuss below.
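Here’s a minimal sketch of that separation, assuming you have a raw visit log of (user_id, week) pairs. The log format and the function name are my own assumptions for illustration, not something any particular analytics package hands you. A user counts as “new” in the first week they show up, and as “returning” in any later week.

```python
from collections import defaultdict

def split_new_vs_returning(visits):
    """Count new vs. returning users per week.

    visits: list of (user_id, week) pairs (hypothetical log format).
    Returns {week: (new_user_count, returning_user_count)}.
    """
    # First pass: find the first week each user was ever seen.
    first_week = {}
    for user_id, week in visits:
        if user_id not in first_week or week < first_week[user_id]:
            first_week[user_id] = week

    # Second pass: bucket each week's distinct users as new or returning.
    buckets = defaultdict(lambda: (set(), set()))
    for user_id, week in visits:
        new_users, returning_users = buckets[week]
        if first_week[user_id] == week:
            new_users.add(user_id)
        else:
            returning_users.add(user_id)

    return {
        week: (len(new_users), len(returning_users))
        for week, (new_users, returning_users) in sorted(buckets.items())
    }

visits = [("a", 1), ("b", 1), ("a", 2), ("c", 2), ("a", 2)]
print(split_new_vs_returning(visits))
```

Note that a user who visits several times within their first week still counts only as new for that week, which keeps acquisition from leaking into the retention number.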

Building your first retention table: User cohorts vs Revisit rates
Using the points from above, you can now build a retention table that compares how many users are coming back. This table starts with three columns:

  • Time period the user joined
  • Number of users that joined that period
  • Revisit percentage rate

The reason why you separate it out into cohorts is that it gives the ability to compare performance of the site over time. As new product features are added, ideally the revisit rates would also continue to rise.

Let’s put this together in a table, imagining that we’re at Week 5:

Time period           User count   Revisit rate
Week 1 (4 wks ago)    1000         28%
Week 2 (3 wks ago)    1100         26%
Week 3 (2 wks ago)    1210         23%
Week 4 (1 wk ago)     1331         15%
Week 5 (now)          1464         0%

A couple points on the above table:

  • Looking back from Week 5, you can see that Week 1 is now the “oldest” cohort, and those users have had many weeks to revisit the site
  • The overall userbase is growing 10% per week, starting with an initial userbase of 1000
  • The revisit rate is naturally below 100%: whatever initial cohort you start out with can only shrink, not grow
  • Note that the retention rate of the site seems to be around 30%, although you’d want to let the Week 1 cohort run for a while and see if it eventually stabilizes
  • Week 5 is currently at 0% since in this example the week just started and no users have revisited yet
  • The actual number of visits on any given day is weird to calculate using this table, since the view is not based on aggregate numbers
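If you have the raw visit data, a table like this can be sketched in a few lines. This is only one way to do it, under a couple of assumptions of mine: the log is a list of (user_id, week) pairs, a user’s cohort is the first week they appear, and a “revisit” means at least one visit in any week after the one they joined. The function name is made up for the example.

```python
def retention_table(visits):
    """Build (cohort_week, user_count, revisit_rate) rows from a visit log.

    visits: list of (user_id, week) pairs (hypothetical log format).
    """
    first_week = {}   # user_id -> week the user joined
    revisited = set() # users seen again in a later week

    # Walk the log in week order so the first sighting defines the cohort.
    for user_id, week in sorted(visits, key=lambda visit: visit[1]):
        if user_id not in first_week:
            first_week[user_id] = week
        elif week > first_week[user_id]:
            revisited.add(user_id)

    # Tally cohort size and number of revisiting users per join week.
    cohorts = {}
    for user_id, week in first_week.items():
        size, came_back = cohorts.get(week, (0, 0))
        cohorts[week] = (size + 1, came_back + (user_id in revisited))

    return [
        (week, size, came_back / size)
        for week, (size, came_back) in sorted(cohorts.items())
    ]

visits = [("a", 1), ("b", 1), ("a", 3), ("c", 2), ("d", 2), ("c", 2)]
for week, users, rate in retention_table(visits):
    print(f"Week {week}: {users} users, {rate:.0%} revisit rate")
```

Repeat visits inside the join week are deliberately ignored here. If you define a revisit differently (say, any second session), the comparison on the `elif` line is the one to change.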

The key metric is really the number that the revisit rate converges to. You can use this number in your traffic models to understand whether you should be focused on acquiring new users, or if you can simply focus on extending the engagement levels of your site.
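As a rough illustration of plugging that number into a traffic model, here’s a sketch that uses the example’s 1000 starting users and 10% weekly growth. It makes a strong simplifying assumption that is mine, not something measured: every earlier cohort has already settled at the same converged revisit rate, and each of those retained users shows up every week. The function and parameter names are hypothetical.

```python
def project_weekly_users(initial_signups, weekly_growth, revisit_rate, weeks):
    """Return (week, new_users, returning_users) rows for a simple model.

    Assumes every earlier cohort retains the same converged revisit_rate
    and that each retained user visits every week (a simplification).
    """
    rows = []
    prior_signups = 0.0            # everyone who joined before this week
    signups = float(initial_signups)
    for week in range(1, weeks + 1):
        returning = prior_signups * revisit_rate
        rows.append((week, signups, returning))
        prior_signups += signups
        signups *= 1 + weekly_growth
    return rows

for week, new_users, returning in project_weekly_users(1000, 0.10, 0.28, 5):
    print(f"Week {week}: {new_users:.0f} new, {returning:.0f} returning")
```

With a 28% converged rate, Week 5 comes out to about 1,464 new users against roughly 1,300 returning ones. Try a lower rate and the traffic becomes almost entirely dependent on acquisition, which is the tradeoff this number is meant to expose.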

What’s your revisit rate? (Using Google Analytics to approximate it)
Google Analytics gives you an overall number for free, with some caveats. You can access this feature on the lefthand nav through “Visitors”, then “New vs. Returning.” Basically this is an OK approximation of the revisit rate, as long as you:

  • Maximize the window in which you are doing the analysis (ideally starting the analytics window when the site was first made public), otherwise the numbers will skew high since you’ll be counting too many dedicated users
  • Ideally, the site isn’t adding exponentially more users every day, since that would skew the number lower because newer users are less likely to have returned

Essentially there’s some skew that comes into play since Google Analytics doesn’t let you segment your users based on when they first joined the site.
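To get a feel for the size of that skew, here’s a small sketch that blends the cohorts from the Week 5 table above into a single number. To be clear, this is not how Google Analytics computes its report. It’s just the weighted average you’d get if you couldn’t tell the cohorts apart, and the function name is my own.

```python
# (cohort label, users who joined, revisit rate), copied from the Week 5 table
COHORTS = [
    ("Week 1", 1000, 0.28),
    ("Week 2", 1100, 0.26),
    ("Week 3", 1210, 0.23),
    ("Week 4", 1331, 0.15),
    ("Week 5", 1464, 0.00),
]

def blended_revisit_rate(cohorts):
    """Revisit rate you would see with all cohorts lumped together."""
    total_users = sum(size for _, size, _ in cohorts)
    revisited_users = sum(size * rate for _, size, rate in cohorts)
    return revisited_users / total_users

all_cohorts = blended_revisit_rate(COHORTS)
older_only = blended_revisit_rate(COHORTS[:3])
print(f"All cohorts blended: {all_cohorts:.1%}")  # about 17.1%
print(f"Weeks 1-3 only:      {older_only:.1%}")   # about 25.5%
```

The blended figure lands around 17%, well under the 28% of the oldest cohort, because the newest (and largest) cohorts haven’t had time to come back yet. Dropping the two most recent weeks pulls it back up to about 25.5%.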

Willing to share?
For readers who are willing to share the numbers on their site, please comment below and if I get enough responses I’ll do a followup blog post on the subject.
