@andrewchen

Facebook viral marketing: When and why do apps “jump the shark?”

Excel spreadsheet download
For those of you who are interested in the gory details, please download the following spreadsheet here:

Viral and Retention Excel Model (Click to download)

Math warning!
This blog post will be a little more technical than usual, so I apologize to those of you who are bored by this. Anyway, let’s get started.

See this image before? Many would describe that as an EPIC FAIL ;-)

That’s what happens when you “jump the shark” and your app goes from successful to completely unsuccessful. Why does this happen? This blog post dissects that exact issue.

Modeling user acquisition
First off, let’s look at some ways to model user acquisition. For those of you with the spreadsheet, this is the second tab. You first start with a couple constants:

  • Invite conversion rate % = 10%
  • Average invites per person = 8.00
  • Initial user base = 10,000
  • Carrying capacity = 100,000

(note that these are just example numbers)

To understand how these constants work, you basically want to think about how viral marketing works. What happens is that you start out with an initial userbase (=10k), and every time your userbase grows, each user ends up sending out invites (=8.00), which then have a specific conversion rate (=10%).

That means that in the first time period, you have 10k. In the second time period, you get 10k*8*10% more users, which equals 8k more users, who are the next round of users who send invites. Then in the third time period, it’s 8k*8*10%, and so on. Note that the new batch of users needs to exceed the previous batch in order to “go viral.” That ratio (average invites per person times invite conversion rate, or 8 * 10% = 0.8 with these example numbers) is often referred to as the viral coefficient. In fact, here’s the equation for unbounded viral growth:

u(t) = u(0) * (1 + i * conv)^t
where u(0) = 10k, i = 8.00, conv = 10%, and t is the # of time periods
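
Here’s a minimal Python sketch of that arithmetic. It isn’t taken from the spreadsheet, and the function names are made up for illustration. It compares the batch-by-batch description with the equation above. Note the assumption baked into each: in the batch version only the newest users send invites, while the equation has every user sending invites in every period.

```python
def batches(u0, invites, conv, periods):
    """New users per period, assuming only the newest batch sends invites."""
    k = invites * conv  # viral coefficient
    out = [float(u0)]
    for _ in range(periods):
        out.append(out[-1] * k)
    return out


def closed_form(u0, invites, conv, t):
    """u(t) = u(0) * (1 + i * conv)^t, assuming every user invites every period."""
    return u0 * (1 + invites * conv) ** t


new_users = batches(10_000, 8.0, 0.10, 3)
print([round(n) for n in new_users])              # [10000, 8000, 6400, 5120]
print(round(sum(new_users)))                      # 29520 cumulative users
print(round(closed_form(10_000, 8.0, 0.10, 3)))   # 58320 cumulative users
```

With a viral coefficient of 0.8, each batch in the first version is smaller than the one before it, so the example numbers only produce runaway growth under the second assumption.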

However, note that this assumes that your “carrying capacity,” that is, how many users are in the total network, is unlimited. On Facebook, that’s not true: once you burn through the 60 million new users, you don’t have any left. The equation also doesn’t reflect the reality that as you saturate the network, more of your invites end up going to people who have already evaluated or installed your app, and they are unlikely to install it again.

A simple model for network saturation
Thus, one simplifying assumption is that as you saturate the network, the conversion rate on your invites goes down. In one possible model, you’d argue:

  • If you have installs on 0% of the network, then your natural conversion rate (10%) holds
  • If you have installs on 50%, then your natural conversion rate is discounted 50%, which equals 5%
  • If you have installs on 99%, then your natural conversion rate is discounted 99%, which equals 0.1%, and so on
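
The three cases above can be sketched as a small function. This is a hypothetical helper, not something from the spreadsheet, and it assumes the discount is exactly proportional to the share of the network that has already installed.

```python
def adjusted_conversion(natural_conv, installs, carrying_capacity):
    """Discount the natural conversion rate by the share of the network already installed."""
    saturation = installs / carrying_capacity
    return natural_conv * (1 - saturation)


# Example numbers from the post: 10% natural conversion, 100,000 carrying capacity.
for installs in (0, 50_000, 99_000):
    rate = adjusted_conversion(0.10, installs, 100_000)
    print(f"{installs:>6} installs -> {rate:.2%}")
```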

Note that you might even argue that this is an optimistic view. You might argue, for example, that the “discount” on your conversion rate should be related to the total % of the userbase that’s been invited, not the total % that’s installed something.

In that version, if someone hates your app and doesn’t want to install it, it’s unlikely that they will ever install it. In the version I’m describing, the only people who won’t install your app are the people who have already done so.

To describe this mathematically, you might say that at each point, there’s an “adjusted conversion rate” which looks like:

adjusted conversion rate
= natural conversion rate * (1 - saturation %)
= natural conversion rate * (1 - current installs / total Facebook population)

So if you agree that’s true, then you can combine this last equation with the initial one. Because the adjusted rate now changes every period, the ^t shortcut no longer applies, and you step forward one period at a time:

u(t) can be defined as:
= u(t-1) * (1 + i * adjusted_conv)
= u(t-1) * (1 + i * conv * (1 - u(t-1) / carrying_capacity))

(This can then be simplified further, but I’ll leave the math to the reader – the spreadsheet reflects this thinking already)
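
For readers without the spreadsheet, here’s a rough Python sketch of the saturation model, stepped one period at a time. It assumes the conversion rate is discounted by the share of the network already installed, and that every existing user keeps sending invites each period. The function name is my own, and the spreadsheet may differ in its details.

```python
def simulate_installs(u0, invites, conv, carrying_capacity, periods):
    """Cumulative installs per period under the saturation discount.

    Assumes every existing user sends invites in every period.
    """
    users = [float(u0)]
    for _ in range(periods):
        current = users[-1]
        # The natural conversion rate shrinks as the network fills up.
        adjusted_conv = conv * (1 - current / carrying_capacity)
        users.append(current + current * invites * adjusted_conv)
    return users


curve = simulate_installs(10_000, 8.0, 0.10, 100_000, 20)
for t, u in enumerate(curve):
    print(t, round(u))
```

As a sanity check, starting from 3,000 users instead of 10,000 adds 3,000 * 8 * 10% * (1 - 3%) = 2,328 users in the first step, which matches the size of cohort 2 quoted later in this post.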

As a result of this, you see that your cumulative install base kinda looks like a logistic curve:

Now you can see that cumulative users follow an interesting trend: growth starts out exponential, but then starts to hit saturation. It takes some time, but eventually the curve plateaus as you reach the carrying capacity of the network.

Quick break for Cohort analysis re-introduction
Before reading through the rest of this post, you might want to glance over a previous blog post I wrote on cohort analysis and its relationship to user retention reports.

You may want to read that before going further…

Back to our story…
Previously, I discussed how you can mathematically model the viral acquisition process, particularly as you hit the network saturation point. However, while the model shows a growth curve for cumulative users, it doesn’t take into account how retention metrics fit in.

In the spreadsheet linked above, you can flip to the “User retention” tab, which shows a cohort analysis perspective of the hypothetical site. Here’s how to read it:

  • On the Y-axis are the “Time period cohorts,” which are defined by the group of users that joined in a particular time period. So #1 means the users that joined in period #1
  • On the X-axis is the “Time period,” which is the time period that the specific cohort is in

So for example, in 1×1, there are an initial 3,000 active users on the site.

By the next time period, the 3,000 active users have declined to 1,500 users. However, because there are a bunch of virally generated users, there’s a new cohort of 2,328 users who have joined as cohort 2. The size of each “new” cohort is defined by the rows in the other spreadsheet tab, “Viral acquisition.”

Then notice that at the bottom of each time period, there’s a count for how many users are active in total, in each specific time period.
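
If it helps, here’s a small Python sketch of how such a cohort table can be built and totaled. The layout mirrors the description above (rows are cohorts, columns are time periods), but the function names are hypothetical and the spreadsheet itself may be organized differently. The 50% retention comes from the 3,000 to 1,500 drop in the example.

```python
def cohort_table(new_users, retention):
    """Rows are cohorts, columns are time periods, cells are active users.

    new_users[c] is the size of the cohort that joined in period c.
    A cohort is 0 before it joins, then keeps `retention` of its users each period.
    """
    periods = len(new_users)
    table = []
    for joined, size in enumerate(new_users):
        row = [0.0] * joined
        row += [size * retention ** age for age in range(periods - joined)]
        table.append(row)
    return table


def totals(table):
    """Total active users in each period, i.e. the column sums."""
    return [sum(column) for column in zip(*table)]


table = cohort_table([3000, 2328], retention=0.5)
print(table)          # [[3000.0, 1500.0], [0.0, 2328.0]]
print(totals(table))  # [3000.0, 3828.0]
```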

Does this make sense? If not, shoot me an email at voodoo[at]gmail with what you’re confused by, and I’ll update this blog with more clarifications!

Introducing the retention coefficient
So the key driver for retention is the % of users that stay alive in a specific cohort from one period to the next. If it’s 50%, then if you start out with 3k users, in the next period you’ll be left with 1.5k users. If it’s 100% retention, then 3k users stay 3k users.
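
As a quick worked example, and assuming the retention coefficient stays constant from period to period (the simplification this model makes), the survivors of a single cohort can be sketched like this. The helper name is made up for illustration.

```python
def surviving(cohort_size, retention, periods):
    """Users left in one cohort after `periods` periods at a constant retention rate."""
    return cohort_size * retention ** periods


print(surviving(3000, 0.50, 1))           # 1500.0
print(surviving(3000, 1.00, 1))           # 3000.0
print(round(surviving(3000, 0.99, 20)))   # 2454
```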

So let’s play around with the numbers.

At 99% retention, which means each cohort still has about 82% of its users after 20 periods (0.99^20 ≈ 0.82), you get a graph of total active users that looks like this:

This chart looks pretty good, of course. You start with exponential growth, then hit a plateau, and you have a very slow burn on your userbase. I suspect that the Facebook site, among other highly popular sites, essentially has >99.999% retention between days. I say that because people seem to use the site for years at a time, and the early users of the site are probably mostly still on it.

Now for the EPIC FAIL.
OK, here’s the fun part, which is when you drop the retention coefficient down to 50%:

Ouch. Doesn’t look good. If you’ve read all the way this far, it’s pretty clear why this happens, but let’s summarize:

Key conclusion
The key in this calculation, if you look at the stats, is that:

  • Early on, the growth of the curve is carried by the invitations
  • However, over time the invitations start to slow down as you hit network saturation
  • The retention coefficient affects your system by creating a “lagging indicator” on your acquisition – if you have good retention, even as your invites slow down, you won’t feel it as much
  • If your retention sucks, then look out: The new invites can’t sustain the growth, and you end up with a rather dire “shark fin.”

Things look great at first, but if you can’t retain users long-term, then you don’t have a business.
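
To see the “shark fin” without opening the spreadsheet, here’s a rough Python sketch that combines the acquisition and retention pieces. It’s a simplification under stated assumptions: every existing install keeps sending invites, the conversion rate is discounted by network saturation, and retention is a constant per-period rate. The function names are my own, and the spreadsheet may differ in its details.

```python
def new_users_per_period(u0, invites, conv, carrying_capacity, periods):
    """Size of each new cohort, assuming every existing install keeps inviting."""
    cumulative = float(u0)
    cohorts = [cumulative]
    for _ in range(periods - 1):
        adjusted_conv = conv * (1 - cumulative / carrying_capacity)
        joined = cumulative * invites * adjusted_conv
        cohorts.append(joined)
        cumulative += joined
    return cohorts


def active_users(cohorts, retention):
    """Total active users per period, with each cohort decaying at `retention`."""
    totals = []
    for period in range(len(cohorts)):
        totals.append(sum(size * retention ** (period - joined)
                          for joined, size in enumerate(cohorts[:period + 1])))
    return totals


cohorts = new_users_per_period(3_000, 8.0, 0.10, 100_000, 20)
for retention in (0.99, 0.50):
    active = active_users(cohorts, retention)
    peak = max(range(len(active)), key=active.__getitem__)
    print(f"retention {retention:.0%}: peak in period {peak + 1}, "
          f"{round(active[peak])} active at peak, {round(active[-1])} at the end")
```

The same cohorts feed both runs, so the only difference between the plateau and the collapse is the retention coefficient.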

Improvements to the model
I want to make a couple comments on how the simplified model contained within the spreadsheet could be improved dramatically:

  • Don’t just model invites, model multiple viral channels
  • Include “usage loops” not just the “invite loops,” which are triggered by users trying out the product
  • Try both a global carrying capacity, as well as a “niche discount” for the number, if your app is super-niche and focused on a particular demographic or user behavior
  • Be able to handle realistic numbers – perhaps even retrofit it onto Adonomics data, for example
  • Factor in re-engagement channels
  • etc.

Obviously, if anyone would like to think about this more, feel free to, and shoot me an email.

Questions and comments?
I built this model very quickly while on the plane ride back from Graphing Social Patterns, but if anybody wants to discuss the model, make improvements, etc., please e-mail me:

voodoo[at]gmail

Thanks!

UPDATE: Dave Fry sent in a correction: only the new delta of users sends out new invites. The old guys have already done so, and are unlikely to again in the next period. Thanks Dave!

  • http://stevenkovar.com Steven Kovar

    GREAT Post Andrew! I’m currently developing a Facebook app with a partner and one of my chief concerns has been whether the value of the application is sustainable of the long-tail.

    This post gave me a few ideas to lessen the sting of user burn off to help re-engage them in the application.

    Thanks for taking the time to cover this!

  • http://minethatdata.blogspot.com Kevin Hillstrom

    Well done on realizing that the math used by biology and ecology folks applies to Facebook!

    I make my entire living on the concepts you outline. I apply those concepts to companies, outlining for them how their business may be struggling due to the issues you describe in your post.

    You correctly identified that adoption/acquisition, carrying capacity, and retention drive all dynamics, and do so not just for Facebook, but for all business models.

    Well done!

  • http://www.sexywidget.com lawrence

    that’s awesome, Andrew. Folks have been hinting at the math behind viral distribution on FB, but this is the first time that I’ve seen it broken down.

  • http://www.dogster.com Ted Rheingold

    Thanks. Great details. Most appreciated.

    As someone who built a viral app (pre-Facebook) I primarily nurtured and grew it for many more months than another entrepreneur might simply to make sure if I was going to commit to it that the users were sticking around.

    Of course I tried to get as much easy revenue as I could early on (gotta get the low hanging fruit ;), but I didn’t invest heavily into it until I confirmed it offered long-term value to the users and the long-term reward would match the long-term effort.

    I’ve really been appreciating you posting these formulas and calculations. Whereas in the early days it was fine to throw stuff on the wall and see what stuck, we really like to make sure if we put focus and effort into something it’s not just going to stick, but climb up to the ceiling ;>

    Thx

  • http://www.broadstuff.com alan p

    Wrote a piece on the same subject a few weeks ago….the similarities in our graphs are uncanny – great minds, eh?

    Here was mine:

    http://www.broadstuff.com/archives/750-The-Rollercoaster-Dynamics-of-Social-Net-Usage-Traffic-Crash.html

  • http://www.optimizeandprophesize.com/ Jonathan Mendez

    i have nothing to add except screw the math warning-this post kicks ass!

    thanks!

  • http://adonomics.com Jesse Farmer

    Andrew,

    Yay! We’ve used similar models internally at Adonomics. I have a BS in Mathematics, so this stuff makes me happy.

    If you’re interested in any of the data we have I’d be happy to share with you. We’re the only ones with data before the switch to DAU, so we actually have detailed graphs of user growth.

    You can email me at jesse[at]adonomics.com if you’re interested.

    Cheers,
    Jesse
    CTO, Adonomics

  • http://www.stanleywong.org Stanley Wong

    Love this great post.

    The logistic curve you refer to is actually also known as a Sigmoid function (aka S-Curve).

    On Wikipedia:
    http://en.wikipedia.org/wiki/Sigmoid_function

    There is also an Excel spreadsheet floating around for the S-Curve here:
    http://jcandkimmita.info/jc/2007/04/business/modeling-market-adoption-in-excel-with-a-simplified-s-curve/

    I’ve used this to model pretty closely Facebook application growth of some pretty popular apps.

    Best,

    Stanley

  • http://adecon101.blogspot.com/2008/03/analysis-facebook-viral-marketing.html Dash Chang

    Well written article, but I respectfully disagree. The hypothesis may not reflect social realities.

    Thousands of Facebook members have endorsed termination of widgets that force more friends to install. This push strategy is socially out of date.

    Social applications win by pull. As each friend installs, their friends discover the application and installs. Thus, as recommended by Mark Zuckerberg of Facebook, developers should focus on great products, not viral marketing.

    Bloggers win via persistent pull. As you write each post that expands the relevancy of your content, more readers install your feed – spreading the word to their friends.

    With an application or blog, the result should not be modelled as a one time event producing a return curve. It comes from incremental improvements to the application and content – producing long term growth.

    Social networks provide only the tools for potential viral growth.
    -Dash
    The New Economics of Advertising – http://adEcon101.blogspot.com

  • http://profile.typekey.com/amyjokim/ Amy Jo Kim

    Great post — appreciate the math breakdown.

    It’s interesting that the first curve you post IS the traditional “hits” model of entertainment sales — hit movies, hit singles, hit games. This curve dives quickly when it’s a one-time content experience: purchase the hit, enjoy the hit, then move on.

    With multiplayer games and social networks, we have ongoing streams of fresh content – and the potential to direct viral growth into an ongoing, ever-changing entertainment experience.

    SOME app is going to be the WOW of FB – the breakout hit that grows virally and takes over an entire category – not through a spammy invite system, but because it’s SO MUCH FUN to play together with your friends and family.

  • http://allfocus.com isayusay

    Venturebeat reported record growth on fubar.com on Mar 7, while Alexa already showed the traffic at the jump the shark stage, like your graph.

  • Chris Wexler

    Great post. Seems to me that this just shows why you need to always be in beta or working on the next version… There are few long-term home-runs in this business – and all you are really doing with great products is lengthening the shark tail, not eliminating it.

    Seems to me that a possible strategy with this reality is using your initial product to popularize the second and so on… Overlapping shark-tails means when you jump the shark, the users have somewhere to land. Just a thought.

  • http://lsvp.wordpress.com jeremyliew

    Andrew,

    You’ve generated an all star discussion forum in the comments! (myself excluded of course).

    One refinement I’d suggest based on cohort analysis of subscription businesses is to vary retention churn over time. It is typically not a constant x% per period. Rather you typically get high “infant mortality” over the first few periods, but then settle into a steady state with little loss once you’ve identified your core users. This data is now several years old so I feel comfortable sharing it, but in AOL’s dial up business around 2002 we would average around 6-7% churn per month. But this was an average across multiple cohorts; we would lose 50% of users in the first three months, but by month 12 churn was in the 2% range (which is close to the move/death range of unavoidable churn when dealing with a service tied to an address/phone number). This “mix” problem is particularly important when you’re modeling viral growth businesses that have meaningfully different user acquisition rates over time.

  • http://www.spongefish.com Adam Durfee

    Any particular reason why you took this approach instead of applying the Bass model to this? http://en.wikipedia.org/wiki/Bass_diffusion_model

  • http://www.facebookinsight.com Neil @ FacebookInsight.com

    Great post.

    Clearly you’ve mapped the short lifecycle of most Facebook applications at this point. This kind of theory would hold for any sort of engaging application online. The anomalies do exist though, like Top Friends or Compare People, where constant innovation keeps users coming back for more. I imagine this is what we’ll see after Facebook shakes down existing applications with the new Profile setup.

  • http://profile.typekey.com/desmondhaynes/ desmondhaynes

    I didn’t understand some of the maths bit – but understand the implications loud and clear. Thanks for the post!
    -DH
    Visit my tech blog @ http://techrunch.blogspot.com/

  • NevaehEastkbff

    This is good methad and good story.

    NevaehEastkbff

    —————-

    Wow, check out this site called http://www.fluc.com
    . Free SMS and free mobile ads!! Its fantastic

  • http://www.skmurphy.com/ Sean Murphy

    Great post, one thing that struck me is that it doesn’t capture the “competitive equilibria” issues that may also occur. Your social network isn’t the only one recruiting, so that there is competition for the total carrying capacity. I think there may also be a bandwagon effect, where users depart smaller groups (in the context of a particular demographic/market/segment that you measure with your carrying capacity) for larger ones because they offer more benefits. But I agree with your basic modeling approach.

  • http://joshuamarch.co.uk Joshua March

    Hi Andrew,

    I did a talk a while back on your findings to the Facebook Developer Garage in London, which I help organise. After posting about my talk, I ended up having a discussion with a reader over some inputs which create a saw-tooth in the results. You might find the conversation interesting: http://www.joshuamarch.co.uk/2008/04/jumping-shark.html

    Cheers,

    Josh

  • http://www.nickjag.com Nick Jag – Facebook Marketer

    Definitely agree with you Andrew, retention is really important when developing a new Facebook app, or any app for that matter. Great math, great stats, and great post.

  • http://www.lawebdejuan.com.ar/2008/07/crecimiento-viral-vs-crecimiento.html Juan

    Hey, I just found out about this theory at Google I/O. I will definitely look for more of your developments. I have made a post about organic growth vs viral growth on my blog. It’s actually in Spanish, so you might find it interesting :)

    http://www.lawebdejuan.com.ar/2008/07/crecimiento-viral-vs-crecimiento.html

    Awesome work!

  • Kati

    Hi Andrew!

    I played a bit with your excel-sheet and your equation and explored that your results in excel doesn’t match with your posted equations. You never used something like ^t.
    In excel you use something like
    u(t)=u(t-1)*((1-u(t-1)/carring_capacity)*conv*i)+u(t-1)

    Where is the mistake?

  • Pingback: How to measure if users love your product using cohorts and revisit rates | Futuristic Play by @Andrew_Chen

  • gargouri2001

    Nice write up and blog , Thanks for sharing all those good info

    best regards
    John
    http://thenewsempire.com/Technologies/

  • FP NICOLAS

    Great article, thanks !

  • Pingback: To my first 10,000 blog subscribers: Thank you! | Andrew Chen (@andrew_chen)

  • Pingback: What I’m reading: Viral Loop by Adam Penenberg | Andrew Chen (@andrew_chen)

  • http://socialfactory.net/facebook-connect-development.php facebook viral apps

    Nice post, It’s incredible to see how quickly this is all happening.
    Because Facebook is all about friends, I think the apps that deliver true value to friendship networks (like the carpool and couchswap apps) are the ones to watch.

  • http://www.kwiclick.com vinniv

    Great analysis! When you mention “Don’t just model invites, model multiple viral channels” are you suggesting that you break out each cohort into more granular segments, such as “new user from email invite, new user from twitter link, new user from blog widget, etc” then drill into each segment to view the performance? Have you seen disparate performance based on the 'type' of invite a user received from the viral loop?

  • http://andrewchen.typepad.com Andrew Chen

    yep, all the channels are going to be very different.

  • http://www.facebook.com/jgachanja James Murithi Gachanja

    Really enlarging my scope of understanding viral marketing by reading this blog.
