@andrewchen


Built to Fail: How companies like Google, IDEO, and 37signals build failure-tolerant systems for anything!

Planning for success, not failure

High-achieving people with a long history of success often plan accordingly – that is, they plan for success in whatever they do. And when you take a successful person and put them in a successful big company that’s already making money from its products, there’s even more reason to plan for high-achievement outcomes.

But let’s say that you take these successful people and put them in environments of great uncertainty, like a Silicon Valley startup – what happens? That’s when realities collide! When you apply the big successful company playbook to startups, you can end up with monolithic planning processes, products that can’t find their markets, and lots of money being spent on launches for the wrong products. It’s not that these tactics are stupid, it’s just that they don’t work as well when you’re dealing with ill-defined customer problems with unknown solutions.

At the heart of this conversation is a question: what happens when you take something that’s usually assumed to be successful, and you instead say that it’s very likely to fail?

In a way, you can think of this as planning to fail, but then building the support structure around the failure in order to create a failure-tolerant system. Let’s dive into this.

Planning for failure, not success
The title of this post refers to the fact that companies like Google, IDEO, and 37signals all have a culture of “Failure is OK” built into them.

At Google:

  • Google makes money by being always available, ubiquitous, and having a great product
  • To deliver their service, they have hundreds of thousands of servers (maybe more?)
  • Any one of these servers has a high likelihood of failing at any time
  • To create a fault-tolerant system, they have lots of redundancy and lots of sophistication around what happens when an individual box fails
  • Contrast this to a big-iron approach that builds all the redundancy into specialized hardware that’s designed to never fail
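
To make the redundancy argument concrete, here is a minimal sketch of the arithmetic. It assumes each box fails independently with the same probability, which is a simplification, and the 5% downtime figure is made up for illustration, not a real Google number.

```python
def availability(p_fail: float, replicas: int) -> float:
    """Chance that at least one of `replicas` boxes is up.

    Assumes every box fails independently with probability `p_fail`.
    """
    return 1.0 - p_fail ** replicas


# Hypothetical figure: each cheap box is down 5% of the time.
for n in (1, 2, 3, 5):
    print(f"{n} replica(s): {availability(0.05, n):.7f}")
```

Under these assumptions, each extra cheap box multiplies the chance of a total outage by the per-box failure rate, which is why lots of unreliable servers can still add up to a reliable service.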

At IDEO:

  • Companies hire IDEO to give them fresh designs based on a customer-focused approach
  • Part of every project involves lots of brainstorming and coming up with ideas
  • However, any specific idea is likely bad (for example, 12 out of 4,000 toy ideas were actually successful = 0.3%)
  • Thus, IDEO combines structured brainstorming, rapid prototyping, and field research to rapidly try out new concepts and get to good products
  • Contrast this to a process where the “Great Man” designer thinks about a design problem and then comes up with the right solution spontaneously
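
As a rough sketch of what a 0.3% hit rate implies, the snippet below asks how many ideas you would need to try for a given chance of at least one success. It assumes every idea has the same independent odds, which real brainstorming certainly doesn't guarantee; the function names are just for illustration.

```python
import math

HIT_RATE = 12 / 4000  # the toy-idea figure quoted above (0.3%)


def chance_of_a_hit(ideas: int, hit_rate: float = HIT_RATE) -> float:
    """Chance that at least one of `ideas` independent ideas succeeds."""
    return 1.0 - (1.0 - hit_rate) ** ideas


def ideas_needed(target: float, hit_rate: float = HIT_RATE) -> int:
    """Smallest number of ideas giving at least `target` chance of one hit."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - hit_rate))


print(ideas_needed(0.5))            # ideas needed for a coin-flip chance of one hit
print(ideas_needed(0.9))            # ideas needed for a 90% chance of one hit
print(f"{chance_of_a_hit(50):.3f}")  # chance of a hit from only 50 ideas
```

With odds like these, a coin-flip chance of a single hit already takes a couple of hundred ideas, so generating and testing ideas has to be cheap.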

At 37signals, in particular Ruby on Rails:

  • Rails is a framework built for programmers to build websites
  • Of course, every web project requires lots of lines of code which can easily break at any moment
  • If you assume that programmers will more often write code that is buggy and breaks, then you’ll want to make testing and iteration easy – this is at the heart of Agile, TDD, continuous integration, and other related disciplines
  • Contrast this to a waterfall engineering approach which assumes the correct design and architecture can be thought out by experienced software engineers

Each of these examples is unique in its own way – but similar themes run through all of these approaches.

Characteristics of failure-tolerant systems
Each one of these systems takes the central part of a process and assumes failure, and then builds up a support system around it.

This happens by building on a few core principles:

  • Acceptance of failure: You have to accept that shit happens and failure is commonplace – this needs to be internalized so that failure isn’t punished, but rather embraced!
  • Massive redundancy: Then, it needs to be easy to have lots of redundancy built into the system – for designers, that means lots of designs get generated. For startups, that means lots of ideas are tested, and for Google, that means lots of servers are used
  • Cheap, easy, fast: As a side-effect of the redundancy, it needs to be easy, cheap, and fast to have lots of ideas, lots of servers, or write lots of code. The harder it is, the harder it will be to create redundancy
  • Iterative, reality-based testing: Testing these individual components constantly becomes key – you need to force failure on the system to figure out how it reacts at a system-wide level
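
One way to picture “forcing failure on the system” is a tiny simulation like the one below. This is only a sketch: `make_replica`, `serve`, and `BoxDown` are hypothetical names invented for this example, not anything from Google, IDEO, or Rails.

```python
from typing import Callable, Sequence


class BoxDown(Exception):
    """Raised by a replica that has been deliberately failed."""


def make_replica(name: str, healthy: bool) -> Callable[[str], str]:
    """Build a fake server that either answers or raises BoxDown."""
    def handle(request: str) -> str:
        if not healthy:
            raise BoxDown(name)
        return f"{name} served {request}"
    return handle


def serve(request: str, replicas: Sequence[Callable[[str], str]]) -> str:
    """Try each replica in turn and return the first successful answer."""
    for replica in replicas:
        try:
            return replica(request)
        except BoxDown:
            continue  # failure is expected; move on to the next box
    raise RuntimeError("every replica failed")


# Force failure: take down two of three boxes and check the system still answers.
pool = [make_replica("a", False), make_replica("b", False), make_replica("c", True)]
print(serve("ping", pool))
```

The point of the exercise is the last two lines: you break individual components on purpose and then check that the system as a whole still does its job.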

Building up processes based on the ideas above makes it easier and easier to deal with failure and come out on the other side!

Conclusion and next ideas
There are lots of interesting directions that this line of thinking can go.

This area of thinking started out with the hiring process, and the idea that maybe interviews don’t work at all – there’s a bunch of academic research that implies that, actually. So how would you build a failure-tolerant system around the hiring process, if you assume that interviewing well has no correlation with being a successful employee?

For dating, what happens if you assume that people you like to date may not be the kind of person you’d have a successful marriage with? What if people suck at figuring out what kind of guy or gal is the “type you’d bring home to Mom?” I think anyone could attest to the idea that many people suck at figuring out the right person to date, much less the right kind of person to marry. I personally find it crazy that people make a 50+ year decision to be married based on an 18-month sample size :-)

For careers, what if it turns out that people are really bad at figuring out what they’ll actually want to do 40 hours a week, 50 weeks a year, for the rest of their life? How would you figure out the right career sooner rather than later?

All of these are great thought experiments, I think.

What else am I missing? :-) I’d love to take any suggestions and write up some thought experiments around it.

